Understanding computer endianness

Preface

After a long break, I have decided to write a new post. It is my first post written in English, which is a challenge but surely a good start; a journey of a thousand miles begins with a single step. In this post I am going to talk about endianness.

Overview

Endianness problems confuse many of us. In this post I try to answer three questions: what is endianness, how does it affect our programs, and when do we need to pay attention to it?

Nuts and Bolts of Endianness

What is endianness? It is the byte-order convention used to store a multi-byte numeric object in memory.
Multi-byte numeric types in C, such as int, float, long, and double, are all affected by endianness. Endianness is determined by the CPU, not the OS. There are two common kinds of endianness: little-endian and big-endian. Suppose we write the following piece of code for a 32-bit system:

 // test.c 
 #include <stdio.h>
 
 int main(int argc, char** argv) {
     printf("%x", 0x01234567);
 }

What does 0x01234567 look like in memory on a little-endian computer and on a big-endian one?
Little-endian means that the right-most byte of a numeric value is stored at the lowest address and the left-most byte at the highest address. The value is stored in memory in order from the least significant byte[1] to the most significant byte[2].
Conversely, big-endian means that the left-most byte of a numeric value is stored at the lowest address and the right-most byte at the highest address. The value is stored in memory in order from the most significant byte to the least significant byte. As an example, the following table shows both representations of 0x01234567, one column per memory address.

Endianness     0x0000  0x0001  0x0002  0x0003
little endian  0x67    0x45    0x23    0x01
big endian     0x01    0x23    0x45    0x67

The terms “little-endian” and “big-endian” come from Gulliver’s Travels by Jonathan Swift, where two warring factions could not agree on how a soft-boiled egg should be opened, by the little end or by the big end. There is no strong technical reason to prefer little-endian or big-endian; the choice is largely arbitrary.
Now we know that endianness is the byte-order convention of a number stored in memory. It is invisible to most programmers, except in scenarios such as developing a network library, which I discuss later.

Is endianness determined at compile time?

Before we discuss the endianness problems of network programming, it is worth exploring when endianness is determined. As we know, at runtime the byte order of a number stored in memory matches the hardware the program is running on, either little-endian or big-endian. The question is: is endianness determined at compile time, or is the data converted when the OS loads the program from disk into memory?

To get a correct answer, we should run an experiment: use GCC to compile the code in test.c, then inspect the binary file and look for the representation of 0x01234567 in it.

Before compiling the source file, we need two different machines, one little-endian and one big-endian. A little-endian machine is easy to get, but a big-endian one is hard to find. I logged in to my Tencent Cloud server and ran the “lscpu” command, which printed the following information:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-26xx v3
Stepping:              2
CPU MHz:               2294.686
BogoMIPS:              4589.37
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0

Look at line 3: it tells us this is a little-endian machine. For the second machine, the cheapest option for me was an emulator of a big-endian machine. I downloaded QEMU for Windows, installed it, and booted debian-powerpc on it. After it booted successfully, I ran “lscpu” again, and the console showed the following information:

Architecture:          ppc
Byte Order:            Big Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
Model:                 Power Macintosh

Good! Now I have both environments I need. I uploaded test.c to the little-endian machine and ran “gcc -c test.c”, which generates a new file called test.o containing machine code for that specific machine. Then I ran “objdump -d test.o”, and the console showed the following information:

test.o:     file format elf64-x86-64
Disassembly of section .text:

0000000000000000 <main>:
line 1    0:   55                      push   %rbp
line 2    1:   48 89 e5                mov    %rsp,%rbp
line 3    4:   48 83 ec 10             sub    $0x10,%rsp
line 4    8:   89 7d fc                mov    %edi,-0x4(%rbp)
line 5    b:   48 89 75 f0             mov    %rsi,-0x10(%rbp)
line 6    f:   ba 00 00 00 00          mov    $0x0,%edx
line 7   14:   be 67 45 23 01          mov    $0x1234567,%esi
line 8   19:   bf 00 00 00 00          mov    $0x0,%edi
line 9   1e:   b8 00 00 00 00          mov    $0x0,%eax
line 10  23:   e8 00 00 00 00          callq  28 <main+0x28>
line 11  28:   b8 00 00 00 00          mov    $0x0,%eax
line 12  2d:   c9                      leaveq 
line 13  2e:   c3                      retq   

Pay attention to line 7: the immediate operand 0x01234567 is encoded as the bytes 67 45 23 01, which is little-endian order.

I repeated the same steps on my big-endian machine and got the following information:

test.o:     file format elf32-powerpc
Disassembly of section .text:


00000000    <main>:
line 1   0:    94 21 ff e0             stwu     r1,-32(r1)
line 2   4:    7c 08 02 a6             mflr     r0
line 3   8:    90 01 00 24             stw      r0,36(r1)
line 4   c:    93 e1 00 1c             stw      r31,28(r1)
line 5   10:   7c 3f 0b 78             mr       r31,r1
line 6   14:   90 7f 00 08             stw      r3,8(r31)
line 7   18:   90 9f 00 0c             stw      r4,12(r31)
line 8   1c:   3d 20 00 00             lis      r9,0
line 9   20:   38 69 00 00             addi     r3,r9,0
line 10  24:   3d 20 01 23             lis      r9,291
line 11  28:   61 24 45 67             ori      r4,r9,17767
line 12  2c:   4c c6 31 82             crclr    4*cr1+eq
line 13  30:   48 00 00 01             bl       30 <main+0x30>
line 14  34:   7d 23 4b 78             mr       r3,r9
line 15  38:   39 7f 99 29             addi     r11,r31,32
line 16  3c:   80 0b 00 04             lwz      r0,4(r11)
line 17  40:   7c 08 03 a6             mtlr     r0
line 18  44:   83 eb ff fc             lwz      r31,-4(r11)
line 19  48:   7d 61 5b 78             mr       r1,r11
line 20  4c:   4e 80 00 20             blr

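Pay attention to lines 10 and 11: the value 0x01234567 is loaded in two 16-bit halves, 0x0123 (291) and 0x4567 (17767), and the halves are encoded as the bytes 01 23 and 45 67, which is big-endian order.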

The byte order of numeric values in the compiled binary file matches the hardware it was compiled for. This shows that the compiler knows the byte order of its target and fixes the machine-level representation of numeric constants, so we can conclude that endianness is determined at compile time.

When shall we pay attention to the problems of endianness?

In most cases endianness is invisible to us. But when we develop a network library, we have to write a “send” method to send packages to the remote side and a “receive” method to receive them. We need to encode the length of each package into its header so that the remote side can read the package correctly. Suppose every package we send is smaller than 64 KB, so 2 extra bytes are enough to represent its length. To keep the scenario simple, I will not copy code from any real network library. The example below is not used in any project; it only demonstrates incorrect endianness handling. The pseudocode is as follows:

1  int send(fd, package) {
2      len = calc_len(package)
3      buffer = new byte[len + 2]
4      memcpy(buffer, &len, 2) // copy len value to first two bytes of buffer (memcpy takes the destination first)
5      memcpy(buffer + 2, package, len) // copy content of the package into buffer
6      socket_send(fd, buffer)
7      
8      return true
9  }
10 
11 package receive(fd) {
    ...
21  // this method 
22  header_buff = new byte[2]
23  read_header_until_complete(header_buff)
24    
25  header_len = 0
26  memcpy(&header_len, header_buff, 2)  // copy len value into header_len
    ...
33  return package
}

If the CPUs of the sender and the receiver have the same endianness, everything is fine. But if the sender is little-endian and the receiver is big-endian, or vice versa, something goes wrong.
Now suppose the sender is little-endian and the receiver is big-endian, and assume the length of the package on the sender side is 15. Its two bytes in memory are 0x0f 0x00. If we copy these bytes into the first two bytes of the buffer, the receiver also receives 0x0f 0x00. Everything looks fine so far, but after the “receive” method executes the code at line 26, the value of header_len is 0x0f00, which is 3840 in decimal. The “receive” method therefore cannot read the package correctly, and the connection becomes unusable.

To resolve this problem, we should write endian-independent code. The solution is simple: use shift operations instead of memcpy. For example, the following code prints the same result on both little-endian and big-endian machines.

 #include <stdio.h>
 
 int main(int argc, char** argv) {
     int x = 0x01234567;
     int lsb = x & 0xff;
     int msb = (x >> 24) & 0xff;
     printf("msb:%x lsb:%x", msb, lsb);
     return 0;
 }
 
 // print msb:1 lsb:67

You can copy the above code into a file on your machine and run “gcc yourfilename.c”. GCC generates a binary file called a.out; run “./a.out” and it prints the result to the screen.

Now that we have a nice solution to the endianness problem, we just modify the “send” and “receive” methods as follows:

  int send(fd, package) {
      len = calc_len(package)
      buffer = new byte[len + 2]
      buffer[0] = (len >> 8) & 0xff // copy the high 8 bits of len to buffer[0]
      buffer[1] = len & 0xff        // copy the low 8 bits of len to buffer[1]
      memcpy(buffer + 2, package, len) // copy content of the package into buffer
      socket_send(fd, buffer)
      
      return true
  }
 
 package receive(fd) {
  ...
  // this method 
  header_buff = new byte[2]
  read_header_until_complete(header_buff)
    
  header_len = (header_buff[0] << 8) | header_buff[1] // now header_len is 15
  
  ...
  return package
}

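Why do shift operations do the right thing? Because shifts and masks operate on the numeric value, not on its layout in memory: (len >> 8) & 0xff is always the high byte of len, whichever byte order the CPU uses to store it. The code therefore chooses the byte order on the wire explicitly, and every CPU computes the same value.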

Summary

In this post we have discussed several questions about endianness. We now know what endianness is and that it is determined at compile time, and we have seen when we need to pay attention to numeric endianness and how to handle it. I hope this is helpful to you. Thanks for reading.

Reference

[1] If you want to know what a significant byte is, it helps to understand the significant bit first. The article “Understanding the most and least significant bit” explains this concept.
[2] Computer Systems: A Programmer’s Perspective, 3rd edition, section 2.1.3, “Addressing and Byte Ordering”.
