Section 1
Endian Mode Considerations

Data represented in memory or media storage is said to be in big endian (BE) order when the most significant byte is stored at the lowest numbered address, and less significant bytes are at successively higher numbered addresses.

Data is stored in little endian (LE) order when it is stored with the order of bytes reversed from that of BE order. In other words, the most significant byte is stored at the highest numbered address. The endian ordering of data never extends past an 8-byte group of storage.

The reference design normally operates with big endian (BE) byte significance, which is the native mode of the PowerPC 604 CPU. Internally, the CPU always operates with big endian addresses, data, and instructions, which is ideal for operating systems such as AIX, which store data in memory and on media in big endian byte significance. In BE mode, neither the CPU nor the 660 Bridge performs address or data byte lane manipulations that are due to the endian mode. Addresses and data pass 'straight through' the CPU bus interface and the 660 Bridge.

The CPU also features a mode of operation designed to efficiently process code and operating systems such as Windows NT, which store data in memory and on media in LE byte significance. The reference design also supports this mode of operation.

When the reference design is in little endian mode, data is stored in memory with LE ordering. The 660 Bridge has hardware to select the proper bytes in the memory and on the PCI bus (via address transforms), and to steer the data to the correct CPU data lane (via a data byte lane swapper). Also, see the 604 CPU and 660 Bridge User's Manuals.

Table 1 summarizes the operation of the reference design in the two different modes.

Table 1. Endian Mode Operations

Mode                 What the 604 Does               What the 660 Bridge Does
Big Endian (BE)      No munge, no shift              No unmunge, no swap
Little Endian (LE)   Address Munged & Data Shifted   Address Unmunged & Data Swapped

In BE mode, the CPU emits the address unchanged, and does not shift the data. This is the native mode of the 604 CPU. In BE mode, the 660 Bridge passes the address and data through to the target without any changes (that are due to endian mode).

In LE mode, the CPU transforms (munges) the three least significant address bits, and shifts the data on the byte lanes to match the munged address. In LE mode, the 660 Bridge unmunges the address and swaps the data on the byte lanes.

1.1 What the 604 CPU Does

1.1.1 The 604 Address Munge

The 604 CPU assumes that the significance of memory is BE. When it operates in LE mode, it internally generates the same effective address as the LE code would generate. Since it assumes that the memory is stored with BE significance, it transforms (munges) the three low-order address bits when it activates the address pins. For example, in the 1-byte transfer case, address 7 is munged to 0, 6 to 1, 5 to 2, and so on. Table 2 shows the address transform rules for the allowed LE mode transfer sizes.

Table 2. 604 LE Mode Address Transform

Transfer Size (bytes)   Address Transform
8                       None
4                       Physical Address[29:31] XOR 100 => A[29:31]
2                       Physical Address[29:31] XOR 110 => A[29:31]
1                       Physical Address[29:31] XOR 111 => A[29:31]
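
The transform in Table 2 can be modeled in a few lines of software. The following is only an illustrative sketch; the names LE_MUNGE_MASK and munge_address are hypothetical and do not come from the 604 documentation.

```python
# XOR masks for the three low-order address bits, keyed by transfer size
# in bytes. The values are taken from Table 2.
LE_MUNGE_MASK = {8: 0b000, 4: 0b100, 2: 0b110, 1: 0b111}


def munge_address(addr: int, size: int) -> int:
    """Model the address the 604 drives on its pins in LE mode."""
    if size not in LE_MUNGE_MASK:
        raise ValueError("LE mode transfers must be 1, 2, 4, or 8 bytes")
    # Only the three low-order bits can change; upper bits pass through.
    return addr ^ LE_MUNGE_MASK[size]
```

For a 1-byte transfer, munge_address(7, 1) gives 0 and munge_address(6, 1) gives 1, matching the example in the text.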

1.1.2 The 604 Data Shift

The data transfer occurs on the byte lanes identified by the address pins and transfer size (TSIZ) pins in either BE or LE mode. In LE mode, the CPU shifts the data from the byte lanes pointed to by the unmunged address, over to the byte lanes pointed to by the munged address. This shift is linear in that it does not rotate or alter the order of the bytes, which are now in the proper set of byte lanes. Note that the individual bytes are still in BE order.
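
As a rough illustration of the shift, the sketch below computes which CPU byte lanes carry an aligned transfer in each mode. It assumes that byte lane n corresponds to address offset n within the 8-byte group (lane 0 is the MSB lane); the function name data_lanes is hypothetical.

```python
# XOR masks per transfer size in bytes, as listed in Table 2.
LE_MUNGE_MASK = {8: 0b000, 4: 0b100, 2: 0b110, 1: 0b111}


def data_lanes(addr: int, size: int, le_mode: bool) -> list[int]:
    """Return the CPU byte lanes (0 = MSB lane) that carry the transfer."""
    if size not in LE_MUNGE_MASK or addr % size:
        raise ValueError("sketch covers aligned 1-, 2-, 4-, 8-byte transfers")
    first = addr & 0b111              # lane selected by the unmunged address
    if le_mode:
        first ^= LE_MUNGE_MASK[size]  # lane selected by the munged address
    # The bytes keep their relative (BE) order; only the group of lanes moves.
    return list(range(first, first + size))
```

A 2-byte store to address 0 uses lanes 0 and 1 in BE mode and lanes 6 and 7 in LE mode.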

1.2 What the 660 Bridge Does

While the reference design is operating properly, data is stored in system memory in the same endian mode as the mode in which the CPU operates. That is, the byte significance in memory is BE in BE mode and it is LE in LE mode. Because of this, hardware is included in the 660 Bridge that (in LE mode) will swap the data bytes to the correct byte lanes, and that will transform (or un-munge) the address coming from the 604.

1.2.1 The 660 Bridge Address Unmunge

In LE mode, the 660 Bridge unmunges address lines A[29:31]. This unmunge merely applies the same XOR transformation to the three low-order address lines as did the CPU. This effectively reverses the effect of the munge that occurs within the CPU. For example, if the CPU executes a one-byte load coded to access byte 0 of memory in LE mode, it will munge its internal address and emit address A[29:31] = 7h. The 660 Bridge will then unmunge the 7 on A[29:31] back to 0, and use this address to access memory.
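
Because the munge is an XOR with a constant, applying it twice restores the original address. The sketch below checks that property; cpu_munge and bridge_unmunge are hypothetical names used only for illustration.

```python
# XOR masks per transfer size in bytes (1, 2, 4, or 8).
LE_MUNGE_MASK = {8: 0b000, 4: 0b100, 2: 0b110, 1: 0b111}


def cpu_munge(addr: int, size: int) -> int:
    """Address emitted by the CPU in LE mode."""
    return addr ^ LE_MUNGE_MASK[size]


def bridge_unmunge(pin_addr: int, size: int) -> int:
    """Address used by the bridge in LE mode: the same XOR, applied again."""
    return pin_addr ^ LE_MUNGE_MASK[size]


def round_trip_ok() -> bool:
    """True if unmunge(munge(a)) == a for every offset and transfer size."""
    return all(
        bridge_unmunge(cpu_munge(a, s), s) == a
        for s in LE_MUNGE_MASK
        for a in range(8)
    )
```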

1.2.2 The 660 Bridge Data Swapper

The 660 Bridge contains a byte swapper. As shown in Figure 1, the byte swapper is placed between the CPU data bus and the memory and PCI data busses. This allows the byte lanes to be swapped between the CPU bus and the PCI bus, or between the CPU bus and memory, but not between the PCI bus and memory. Thus, when a PCI busmaster accesses memory, the reference design does not change either the address or the data location to adjust for endian mode. In either mode, data is stored or fetched from memory at the address presented on the PCI bus.

The 660 Bridge cannot tell the endian mode of the CPU directly, and so cannot automatically change endian mode to match the CPU. There is a control bit located in ISA I/O space (port 0092) that the CPU can write to in order to set the endian mode of the motherboard.



In BE mode, the 660 Bridge byte swapper is off, and data passes through it with no changes. In LE mode, the byte swapper is on, and the order of the byte lanes is rotated (swapped) about the center. As shown in Table 3, the data on CPU byte lane 0 is steered to memory byte lane 7, the data on CPU byte lane 1 is steered to memory byte lane 6, and so on. During reads, the data flows in the opposite direction over the same paths.

Table 3. 660 Bridge Endian Mode Byte Lane Steering

CPU Byte Lane           BE Mode Connection                LE Mode Connection
CPU byte lane 0 (MSB)   Memory byte lane 0, PCI lane 0    Memory byte lane 7, PCI lane 7*
CPU byte lane 1         Memory byte lane 1, PCI lane 1    Memory byte lane 6, PCI lane 6*
CPU byte lane 2         Memory byte lane 2, PCI lane 2    Memory byte lane 5, PCI lane 5*
CPU byte lane 3         Memory byte lane 3, PCI lane 3    Memory byte lane 4, PCI lane 4*
CPU byte lane 4         Memory byte lane 4, PCI lane 4*   Memory byte lane 3, PCI lane 3
CPU byte lane 5         Memory byte lane 5, PCI lane 5*   Memory byte lane 2, PCI lane 2
CPU byte lane 6         Memory byte lane 6, PCI lane 6*   Memory byte lane 1, PCI lane 1
CPU byte lane 7 (LSB)   Memory byte lane 7, PCI lane 7*   Memory byte lane 0, PCI lane 0

Note: * In this table, PCI byte lanes 3:0 refer to the data bytes associated with PCI_C/BE[3:0]# when the third least significant bit of the target PCI address (PCI_AD[29]) is 0, as coded in the instruction. PCI byte lanes [7:4] refer to the data bytes associated with PCI_C/BE[3:0]# when PCI_AD[29] is a 1.
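
The lane steering in Table 3 amounts to reversing the order of the eight byte lanes when the swapper is on. A minimal software model, with the hypothetical name swap_byte_lanes, might look like this:

```python
def swap_byte_lanes(cpu_lanes: bytes, le_mode: bool) -> bytes:
    """Model the 660 Bridge byte swapper for one 8-byte bus cycle.

    cpu_lanes[0] is the byte on CPU lane 0 (MSB), cpu_lanes[7] the byte
    on CPU lane 7 (LSB). The result is indexed by memory byte lane.
    """
    if len(cpu_lanes) != 8:
        raise ValueError("the data bus is eight byte lanes wide")
    # BE mode: straight through. LE mode: lane n goes to lane 7 - n.
    return cpu_lanes[::-1] if le_mode else cpu_lanes
```

The same function models reads, since the data flows over the same paths in the opposite direction.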

1.3 Bit Ordering Within Bytes

The LE convention of numbering bits is followed for the memory and PCI busses, and the CPU busses are labeled in BE nomenclature. The various busses are connected to the 660 Bridge with their (traditional) native significance maintained (BE for CPU, and LE for PCI and memory), so that MSb connects to MSb and so on. The bit paths between the CPU and memory data busses are shown in Table 4 for both BE and LE mode operation.

Table 4. 660 Bit Transfer

CPU_DATA[ ]   BE Mode MEM_DATA[ ]   LE Mode MEM_DATA[ ]
0             7                     63
1             6                     62
2             5                     61
3             4                     60
4             3                     59
5             2                     58
6             1                     57
7             0                     56
8             15                    55
9             14                    54
10            13                    53
11            12                    52
12            11                    51
13            10                    50
14            9                     49
15            8                     48
16            23                    47
17            22                    46
18            21                    45
19            20                    44
20            19                    43
21            18                    42
22            17                    41
23            16                    40
24            31                    39
25            30                    38
26            29                    37
27            28                    36
28            27                    35
29            26                    34
30            25                    33
31            24                    32
32            39                    31
33            38                    30
34            37                    29
35            36                    28
36            35                    27
37            34                    26
38            33                    25
39            32                    24
40            47                    23
41            46                    22
42            45                    21
43            44                    20
44            43                    19
45            42                    18
46            41                    17
47            40                    16
48            55                    15
49            54                    14
50            53                    13
51            52                    12
52            51                    11
53            50                    10
54            49                    9
55            48                    8
56            63                    7
57            62                    6
58            61                    5
59            60                    4
60            59                    3
61            58                    2
62            57                    1
63            56                    0
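
The bit paths in Table 4 follow a simple rule, which the sketch below expresses in code. The function name mem_data_bit is hypothetical and serves only to summarize the table.

```python
def mem_data_bit(cpu_bit: int, le_mode: bool) -> int:
    """Return the MEM_DATA bit connected to CPU_DATA[cpu_bit] (Table 4)."""
    if not 0 <= cpu_bit <= 63:
        raise ValueError("CPU_DATA bits are numbered 0 through 63")
    if le_mode:
        # Byte lanes are swapped and bit numbering is reversed: 63 - n.
        return 63 - cpu_bit
    # BE mode: same byte lane, bit numbering reversed within the byte.
    byte_lane, bit_in_byte = divmod(cpu_bit, 8)
    return byte_lane * 8 + (7 - bit_in_byte)
```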

1.4 Byte Swap Instructions

The PowerPC architecture defines both word and halfword load/store instructions that have byte swapping capability. Programmers will find these instructions valuable for dealing with the BE nature of this architecture. For example, if a 32-bit configuration register of a typical LE PCI device is read in BE mode, the bytes will appear out of order unless the "load word with byte swap" instruction is used. The byte swap instructions are:

  lhbrx    Load Halfword Byte-Reverse Indexed
  lwbrx    Load Word Byte-Reverse Indexed
  sthbrx   Store Halfword Byte-Reverse Indexed
  stwbrx   Store Word Byte-Reverse Indexed

The byte-reverse instructions should be used in BE mode to access LE devices and in LE mode to access BE devices.
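
The effect of a byte-reversed load can be modeled in software. The sketch below compares an ordinary BE word load with a "load word with byte swap" (the PowerPC lwbrx instruction) on the same four bytes. The function names and the register value 0x12345678 are made up for the illustration.

```python
def load_word(mem: bytes, offset: int) -> int:
    """Model an ordinary 32-bit load in BE mode."""
    return int.from_bytes(mem[offset:offset + 4], "big")


def load_word_byte_reversed(mem: bytes, offset: int) -> int:
    """Model a 32-bit load with byte swap (as lwbrx does in BE mode)."""
    return int.from_bytes(mem[offset:offset + 4], "little")


# A hypothetical LE device register holding 0x12345678 presents its
# least significant byte at the lowest address.
device_register = bytes([0x78, 0x56, 0x34, 0x12])
```

Here load_word(device_register, 0) returns 0x78563412, while load_word_byte_reversed(device_register, 0) returns the intended 0x12345678.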

1.5 604 CPU Alignment Exceptions In LE Mode

The CPU does not support a number of instructions and data alignments in the LE mode that it supports in BE mode. When it encounters an unsupportable situation, it takes an internal alignment exception (machine check) and does not produce an external bus cycle. See the latest 604 CPU documentation for details. Examples include move multiple instructions, string instructions, and unaligned transfers.

1.6 Single-Byte Transfers

Figure 2 is an example of a single-byte write of data a at address xxxx xxx0.





Figure 3 is an example of a single-byte write of data a at address xxxx xxx2.

For single byte accesses to memory in BE mode, Table 5 applies.

Table 5. Memory in BE Mode

604           604   BYTE     BYTE    MEM BYTE   CAS
A31 30 29     add   LANE     LANE*   LANE       ACTIVE
0   0  0      0     0 MSB    0       0          0
1   0  0      1     1        1       1          1
0   1  0      2     2        2       2          2
1   1  0      3     3        3       3          3
0   0  1      4     4        4       4          4
1   0  1      5     5        5       5          5
0   1  1      6     6        6       6          6
1   1  1      7     7 LSB    7       7          7
              NOT MUNGED     SWAP OFF           NOT UNMUNGED

Note:
*At the CPU side.

For single byte accesses to memory in LE mode, Table 6 applies.

Table 6. Memory in LE Mode

                             663
604           604   BYTE     BYTE    MEM BYTE   CAS
A31 30 29     add   LANE     LANE*   LANE       ACTIVE
0   0  0      0     0 MSB    0       7          7
1   0  0      1     1        1       6          6
0   1  0      2     2        2       5          5
1   1  0      3     3        3       4          4
0   0  1      4     4        4       3          3
1   0  1      5     5        5       2          2
0   1  1      6     6        6       1          1
1   1  1      7     7 LSB    7       0          0
              MUNGED         SWAP ON            UNMUNGED

Note:
*At the CPU side.

For single byte accesses to PCI in BE mode, Table 7 applies.

Table 7. PCI in BE Mode

604           604   BYTE     BYTE    PCI BYTE   A/D**    BE#
A31 30 29     add   LANE     LANE    LANE       2 1 0    3 2 1 0
                                                         (0=active byte enable)
0   0  0      0     0 MSB    0       0          0 0 0    1 1 1 0
1   0  0      1     1        1       1          0 0 1    1 1 0 1
0   1  0      2     2        2       2          0 1 0    1 0 1 1
1   1  0      3     3        3       3          0 1 1    0 1 1 1
0   0  1      4     4        4       0          1 0 0    1 1 1 0
1   0  1      5     5        5       1          1 0 1    1 1 0 1
0   1  1      6     6        6       2          1 1 0    1 0 1 1
1   1  1      7     7 LSB    7       3          1 1 1    0 1 1 1
              NOT MUNGED     SWAP OFF           NOT UNMUNGED

Note:
**AD[0:1] set to 00 for all PCI transactions except I/O cycles.

For single byte accesses to PCI in LE mode, Table 8 applies.

Table 8. PCI in LE Mode

                             663*
604           604   BYTE     BYTE    PCI BYTE   A/D**    BE#
A31 30 29     add   LANE     LANE    LANE       2 1 0    3 2 1 0
                                                         (0=active byte enable)
0   0  0      0     0 MSB    0       3          1 1 1    0 1 1 1
1   0  0      1     1        1       2          1 1 0    1 0 1 1
0   1  0      2     2        2       1          1 0 1    1 1 0 1
1   1  0      3     3        3       0          1 0 0    1 1 1 0
0   0  1      4     4        4       3          0 1 1    0 1 1 1
1   0  1      5     5        5       2          0 1 0    1 0 1 1
0   1  1      6     6        6       1          0 0 1    1 1 0 1
1   1  1      7     7 LSB    7       0          0 0 0    1 1 1 0
              MUNGED         SWAP ON            UNMUNGED

Notes:
*At the CPU side.
**AD[0:1] set to 00 for all PCI transactions except I/O cycles.
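
The single-byte cases in Tables 5 through 8 can be reproduced with a short model. The sketch below takes the address on the 604 pins and the endian mode and returns the active CAS line, the low three PCI address bits, and the PCI byte enables. The function name and the dictionary keys are hypothetical.

```python
def single_byte_routing(pin_addr: int, le_mode: bool) -> dict:
    """Model Tables 5-8 for a 1-byte transfer at 604 pin address 0-7."""
    if not 0 <= pin_addr <= 7:
        raise ValueError("pin_addr is the value of A[29:31], 0 through 7")
    # In LE mode the bridge unmunges the address (XOR 111) and the swapper
    # moves the byte from CPU lane n to lane 7 - n, which is the same value.
    target = pin_addr ^ 0b111 if le_mode else pin_addr
    # One of four active-low byte enables, selected by the low two bits.
    be_n = 0b1111 & ~(1 << (target & 0b11))
    return {
        "cas": target,                     # active CAS line / memory lane
        "pci_ad": format(target, "03b"),   # A/D bits 2 1 0
        "pci_be_n": format(be_n, "04b"),   # BE# bits 3 2 1 0, 0 = active
    }
```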

1.7 Two-Byte Transfers

Figure 4 gives an example of a double-byte write of data ab at address xxxx xxx0.



Table 9 and Table 10 illustrate all cases that can occur for 2-byte transfers.

Table 9. Two Byte Transfer Information

PROG   BE MODE        LE MODE             BE OR LE   BE OR LE      BE OR LE
TARG   604            BE (XOR with 110)   Target     CAS# 0:7      PCI   CBE#
ADDR   add  a29:31    add   a29:31        bytes      0       7     AD2   3210
0      0    000       6     110           0-1          0011 1111   0       1100
1      1    001       7 E   111           1-2        E 1001 1111   0     E 1001
2      2    010       4     100           2-3          1100 1111   0       0011
3      3    011       5 E   101           3-4        E 1110 0111   1     E PPPP
4      4    100       2     010           4-5          1111 0011   1       1100
5      5    101       3 E   011           5-6        E 1111 1001   1     E 1001
6      6    110       0     000           6-7          1111 1100   1       0011
7      N    NNN       1 E   001           NNN        E NNNN NNNN   N     E NNNN

Notes:
N= not emitted by 60X because it crosses 8 bytes (transforms to 2 singles in BE, machine check in LE)
P= not allowed on PCI (crosses 4 bytes)
E= causes exception (does not come out on 604 bus) in LE mode

Table 10 contains the same information as found in Table 9, but it is arranged to show the CAS and PCI byte enables that activate as a function of the address presented at the pins of the 604 and as a function of BE/LE mode.

Table 10. Rearranged 2-Byte Transfer Information

2-BYTE XFERS        BE            BE             LE              LE
60X ADDRESS PINS    CAS# 0:7      PCI   CBE#     CAS# 0:7        PCI    CBE#
                    0       7     AD2   3210     0       7       AD2    3210
0   000             0011 1111     0     1100       1111 1100       1      0011
1   001             1001 1111     0     1001     E NNNN NNNN       N    E NNNN
2   010             1100 1111     0     0011       1111 0011       1      1100
3   011             1110 0111     0     PPPP     E 1111 1001       1    E 1001
4   100             1111 0011     1     1100       1100 1111       0      0011
5   101             1111 1001     1     1001     E 1110 0111     E 0    E PPPP
6   110             1111 1100     1     0011       0011 1111       0      1100
7   111             NNNN NNNN     N     NNNN     E 1001 1111     E 0    E 1001

Notes:
N= not emitted by 60X because it crosses 8 bytes (transforms to 2 singles in BE, machine check in LE)
P= not allowed on PCI (crosses 4 bytes)
E= causes exception (does not come out on 604 bus) in LE mode
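
The 2-byte rows of Table 10 can be generated from the same lane-steering rule used for single bytes. The sketch below is one possible model, not a description of the bridge logic; the name two_byte_enables is hypothetical. It returns the CAS# 0:7 pattern, PCI AD2, and PCI CBE# 3:0 for a given 604 pin address, with None for the PCI fields when the transfer would cross a 4-byte boundary (marked P in the table).

```python
def two_byte_enables(pin_addr: int, le_mode: bool):
    """Return (cas, ad2, cbe) for a 2-byte transfer at 604 pin address 0-6."""
    if not 0 <= pin_addr <= 6:
        raise ValueError("a 2-byte transfer at 7 would cross 8 bytes")
    cpu_lanes = (pin_addr, pin_addr + 1)
    # LE mode: the swapper steers CPU lane n to memory/PCI lane 7 - n.
    lanes = sorted(7 - n if le_mode else n for n in cpu_lanes)
    bits = "".join("0" if n in lanes else "1" for n in range(8))
    cas = bits[:4] + " " + bits[4:]           # CAS# 0:7, 0 = active
    if lanes[0] // 4 != lanes[1] // 4:
        return cas, None, None                # crosses 4 bytes: not on PCI
    ad2 = lanes[0] // 4
    # CBE# is written in 3 2 1 0 order, 0 = active.
    cbe = "".join("0" if ad2 * 4 + b in lanes else "1" for b in (3, 2, 1, 0))
    return cas, ad2, cbe
```

For example, pin address 0 gives ("0011 1111", 0, "1100") in BE mode and ("1111 1100", 1, "0011") in LE mode, as in the first row of Table 10.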

1.8 Four-Byte Transfers

Figure 5 gives an example of a word (4-byte) write of 0a0b0c0dh at address xxxx xxx4.



Table 11 and Table 12 illustrate the cases that can occur.

Table 11. 4-Byte Transfer Information

PROG   BE MODE        LE MODE             BE OR LE   BE OR LE      BE OR LE
TARG   604            BE (XOR with 100)   Target     CAS# 0:7      PCI   CBE#
ADDR   add  a29:31    add   a29:31        bytes      0       7     AD2   3210
0      0    000       4     100           0-3          0000 1111   0       0000
1      1    001       5 E   101           1-4        E 1000 0111   0     E PPPP
2      2    010       6 E   110           2-5        E 1100 0011   0     E PPPP
3      3    011       7 E   111           3-6        E 1110 0001   1     E PPPP
4      4    100       0     000           4-7          1111 0000   1       0000
5      5    NNN       1 E   NNN           N-N          NNNN NNNN   1     E NNNN
6      6    NNN       2 E   NNN           N-N          NNNN NNNN   1     E NNNN
7      7    NNN       3 E   NNN           N-N          NNNN NNNN   1     E NNNN

Notes:
N= not emitted by 60X because it crosses 8 bytes (transformed into 2 bus cycles)
P= not allowed on PCI (crosses 4 bytes)
E= causes exception (does not come out on 604 bus) in LE mode
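
The N, P, and E markings in the 2-byte and 4-byte tables follow from the program target address and the transfer size. The sketch below is a simplified reading of the notes, not a statement of the 604 rules; the name transfer_flags is hypothetical.

```python
def transfer_flags(prog_addr: int, size: int) -> set:
    """Return the subset of {'N', 'P', 'E'} that applies to a transfer.

    N: crosses an 8-byte boundary, so not emitted as one bus cycle.
    P: crosses a 4-byte boundary, so not allowed on PCI.
    E: unaligned, so it causes an exception in LE mode.
    """
    if size not in (2, 4):
        raise ValueError("sketch covers the 2-byte and 4-byte tables only")
    offset = prog_addr & 0b111
    flags = set()
    if offset % size:
        flags.add("E")
    if offset + size > 8:
        flags.add("N")
    elif (offset & 0b11) + size > 4:
        flags.add("P")
    return flags
```

A 4-byte transfer at target address 1 yields {"E", "P"} and one at address 5 yields {"E", "N"}, in line with Table 11.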

Table 12 contains the same information as found in Table 11, but it is arranged to show the CAS and PCI byte enables that activate as a function of the address presented at the pins of the 604 and as a function of BE/LE mode.

Table 12. Rearranged 4-Byte Transfer Information

4-BYTE XFERS        BE            BE             LE              LE
60X ADDRESS PINS    CAS# 0:7      PCI   CBE#     CAS# 0:7        PCI    CBE#
                    0       7     AD2   3210     0       7       AD2    3210
0   000             0000 1111     0     0000       1111 0000     0        0000
1   001             1000 0111     0     PPPP     E NNNN NNNN     0      E NNNN
2   010             1100 0011     0     PPPP     E NNNN NNNN     0      E NNNN
3   011             1110 0001     0     PPPP     E NNNN NNNN            E NNNN
4   100             1111 0000     1     0000       0000 1111     1        0000
5   101             NNNN NNNN     1     NNNN     E 1000 0111     1      E PPPP
6   110             NNNN NNNN     1     NNNN     E 1100 0011     1      E PPPP
7   111             NNNN NNNN     1     NNNN     E 1110 0001     1      E PPPP

Notes:
N= not emitted by 60X because it crosses 8 bytes (transformed into 2 bus cycles)
P= not allowed on PCI (crosses 4 bytes)
E= causes exception (does not come out on 604 bus) in LE mode
X= not supported in memory controller (crosses 4-byte boundary)

1.9 Three-Byte Transfers

There are no explicit Load/Store three-byte instructions; however, three-byte transfers occur as a result of unaligned four-byte loads and stores as well as a result of move multiple and string instructions.

The TSIZ=3 transfers with address pins = 0, 1, 2, 3, 4, or 5 may occur in BE mode. All of the other TSIZ and address combinations produced by move multiple and string operations are the same as those produced by aligned or unaligned word and half-word loads and stores.

Since move multiples, strings, and unaligned transfers cause machine checks in LE mode, they are not of concern in the BE design.

1.10 Instruction Fetches and Endian Modes

Most instruction fetches occur with the cache on, so memory is fetched eight bytes wide. Figure 6 shows the instruction alignment.

Example: 8 byte instruction fetch I1=abcd, I2=efgh at address xxxx xxx0



It is possible to fetch instructions with 4 byte aligned transfers when the cache is turned off. In that case, the 604 does not munge the address in LE mode. The memory controller does not differentiate between instruction and data fetches, but the unmunger is ineffective because the memory is always read 8 byte wide, and data is presented on all 8 byte lanes. If the unmunger were used, the wrong instruction would be read. The net result is illustrated in Figure 7.

Example: 4 byte instruction fetch, I2=efgh at address xxxx xxx4



1.11 Changing BE/LE Mode

There are two BE/LE mode controls. One is inside the 604 CPU and the other is a register bit on the motherboard. The 604 CPU interior mode is not visible to the motherboard hardware. The BE mode bit referred to in this document is the register bit on the motherboard. It is a bit in I/O space which is memory mapped just like other I/O registers. It defaults to BE mode.

The 604 CPU always powers up in the BE mode and begins fetching to fill its cache. Consequently, at least the first part of the ROM code must be BE code. It is beyond the scope of this document to define how the system will know to switch to LE mode; however, great care must be taken during the switch in order to synchronize the internal and external mode bits, to flush all caches, and to avoid executing extraneous code.
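
To see why the two mode bits must agree, consider a single-byte store in a simplified software model. The sketch below is only an illustration under the munge and unmunge rules described earlier; the function name memory_byte_address is hypothetical.

```python
def memory_byte_address(addr: int, cpu_le: bool, board_le: bool) -> int:
    """Return the memory address reached by a 1-byte access to addr.

    cpu_le   models the mode bit inside the 604 CPU.
    board_le models the mode bit on the motherboard (660 Bridge).
    """
    pins = addr ^ 0b111 if cpu_le else addr     # CPU munges in LE mode
    return pins ^ 0b111 if board_le else pins   # bridge unmunges in LE mode
```

When both bits match, memory_byte_address(0, True, True) and memory_byte_address(0, False, False) both return 0. When they differ, the access lands at address 7 instead, which is why the switch sequence changes both bits together with caches flushed and interrupts masked.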

The following process switches the system from BE to LE mode when used in this system:

  1. Disable L1 caching.
  2. Disable L2 caching.
  3. Flush all system caches.
  4. Turn off interrupts immediately after a timer tick so no timer interrupts will occur during the next set of cycles.
  5. Mask all interrupts.
  6. Set the CPU state and the motherboard to LE (see Figure 8). Note that the CPU is now in LE mode. All instructions must be in LE order.
  7. Put interrupt handlers and CPU data structures in LE format.
  8. Enable caches.
  9. Enable interrupts.
  10. Start the LE operating system initialization.

Figure 8 shows the instruction stream to switch endian modes.



1.12 Summary of Bi-Endian Operation and Notes