Chapter: Multicore Application Programming For Windows, Linux, and Oracle Solaris : Hardware, Processes, and Threads

The Translation of Source Code to Assembly Language

Processors execute instructions. Instructions are the basic building blocks of all computation; they perform tasks such as adding numbers together, fetching data from memory, or storing data back to memory. The instructions operate on registers that hold the current values of variables and other machine state. Consider the snippet of code shown in Listing 1.1, which increments an integer variable pointed to by the pointer ptr.
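A routine of the kind described is small enough to sketch in C. This is a minimal illustration rather than a reproduction of the listing; the function name increment is an assumption, since the text names only the pointer ptr.

```c
/* Minimal sketch: increment the integer that ptr points to.
   The name "increment" is hypothetical; only "ptr" comes from the text. */
void increment(int *ptr)
{
    (*ptr)++;   /* read the value at *ptr, add 1, write it back */
}
```

Calling increment(&x) on a variable x holding 41 leaves 42 in x. How that single source line becomes machine instructions depends on the target processor.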

In the SPARC code, the pointer ptr is passed in through register %o0. The load instruction loads the integer at this address into register %o5. Register %o5 is then incremented. The store instruction writes the new value held in register %o5 back to the memory location pointed to by %o0, and then the return instruction exits the routine.
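The load, add, and store steps can be mirrored one-for-one in C by using a local variable to stand in for each register. This is only a sketch: the function name is hypothetical, and the SPARC mnemonics in the comments are an assumption about what such code typically looks like, not the original listing.

```c
/* Sketch of the three-step sequence described for SPARC.
   o0 stands in for register %o0 (the pointer), o5 for register %o5.
   The assembly in the comments is illustrative, not the book's listing. */
void increment_load_store(int *o0)
{
    int o5;

    o5 = *o0;       /* load:  ld  [%o0], %o5   */
    o5 = o5 + 1;    /* add:   add %o5, 1, %o5  */
    *o0 = o5;       /* store: st  %o5, [%o0]   */
}                   /* return to the caller    */
```

Each statement does exactly one thing, either a memory access or an arithmetic operation on a register, which is the style of work a load/store instruction set requires.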

Listing 1.3 shows the same source code compiled for 32-bit x86. The x86 code is somewhat different. The first difference is that x86 in 32-bit mode has a stack-based calling convention. This means that all the parameters passed into a function are stored on the stack, and the first thing the function does is retrieve them. Hence, the code begins by loading the value of the pointer from the stack (in this case, from the address %esp+4) and placing it into the register %eax.

We then encounter a second difference between x86 and SPARC assembly language. SPARC is a reduced instruction set computer (RISC), meaning it has a small number of simple instructions, and all operations must be made up from these simple building blocks. x86 is a complex instruction set computer (CISC), so it has instructions that perform more complex operations. The x86 instruction set has a single instruction that adds an increment to a value at a memory location. In the example, the instruction is used to add 1 to the value held at the address held in register %eax. This is a single CISC instruction, which contrasts with three RISC instructions on the SPARC side to achieve the same result.
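The two x86 steps, fetching the parameter from the stack and then updating memory in place, can be modeled in C with an array standing in for the stack. This is a toy model under stated assumptions: the function names are hypothetical, slot 0 plays the role of the return address at %esp, slot 1 plays the role of the first parameter at %esp+4, and the AT&T-style mnemonics in the comments are illustrative rather than the original listing.

```c
#include <stddef.h>

/* Toy model of a stack-based calling convention.
   esp[0] stands in for the return address at %esp,
   esp[1] for the first parameter at %esp+4. */
void increment_x86_model(void **esp)
{
    int *eax = (int *)esp[1];   /* movl 4(%esp), %eax : fetch ptr from the stack */
    *eax += 1;                  /* addl $1, (%eax)    : add 1 directly in memory */
}

/* Caller side: place the parameter on the model stack, then call. */
void call_increment(int *ptr)
{
    void *stack[2] = { NULL, ptr };   /* slot 0 unused here, slot 1 holds ptr */
    increment_x86_model(stack);
}
```

The value being incremented never appears in a local variable of its own; the only "register" the model needs is the one holding the address.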

Both snippets of code used two registers for the computation. The SPARC code used registers %o0 and %o5, and the x86 code used %esp and %eax. However, the two snippets of code used the registers for different purposes. The x86 code used %esp as the stack pointer, which points to the region of memory where the parameters to the function call are held. In contrast, the SPARC code passed the parameters to functions in registers. The method of passing parameters is called the calling convention, and it is part of the application binary interface (ABI) for the platform. This specification covers how programs should be written in order to run correctly on a particular platform.

Both code snippets use a single register to hold the address of the memory being accessed. The SPARC code used %o0, and the x86 code used %eax. The other difference between the two code snippets is that the SPARC code used the register %o5 to hold the value of the variable. The SPARC code had to take three instructions to load this value, add 1 to it, and then store the result back to memory. In contrast, the x86 code took a single instruction.

A further difference between the two processors is the number of registers available. SPARC actually has 32 general-purpose registers, whereas the x86 processor has eight general-purpose registers. Some of these general-purpose registers have special functions. The SPARC processor ends up with about 24 registers available for an application to use, while in 32-bit mode the x86 processor has only six. However, because of its CISC instruction set, the x86 processor does not need to use registers to hold values that are only transiently needed; in the example, the current value of the variable in memory was not even loaded into a register. So although the x86 processor has far fewer registers, it is possible to write code so that this does not cause an issue.

However, there is a definite advantage to having more registers. If there are insufficient registers available, a register has to be freed by storing its contents to memory and then reloading them later. This is called register spilling and filling, and it both takes additional instructions and uses space in the caches.
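The cost of spilling and filling can be made concrete with a toy model that counts the extra stores and loads needed when a sequence of values is used with too few registers. Everything here is an assumption for illustration: the two-entry register file, the round-robin choice of which register to free, and all names are hypothetical, and real compilers use far more sophisticated allocation.

```c
#define NUM_REGS   2   /* deliberately tiny, hypothetical register file */
#define NUM_VALUES 4   /* number of distinct values the toy program uses */

typedef struct {
    int holds[NUM_REGS];      /* index of the value in each register, -1 if empty */
    int spilled[NUM_VALUES];  /* 1 if the value currently lives in spill memory */
    int next_victim;          /* round-robin choice of register to free */
    int spills;               /* extra stores caused by freeing a register */
    int fills;                /* extra loads caused by reloading a spilled value */
} RegFile;

void regfile_init(RegFile *rf)
{
    for (int r = 0; r < NUM_REGS; r++) rf->holds[r] = -1;
    for (int v = 0; v < NUM_VALUES; v++) rf->spilled[v] = 0;
    rf->next_victim = 0;
    rf->spills = 0;
    rf->fills = 0;
}

/* Make sure value v is in a register before it is used. */
void use_value(RegFile *rf, int v)
{
    for (int r = 0; r < NUM_REGS; r++)
        if (rf->holds[r] == v) return;       /* already in a register */

    int r = rf->next_victim;
    rf->next_victim = (rf->next_victim + 1) % NUM_REGS;

    if (rf->holds[r] != -1) {                /* register busy: spill its contents */
        rf->spilled[rf->holds[r]] = 1;
        rf->spills++;
    }
    if (rf->spilled[v]) {                    /* value was spilled earlier: fill it */
        rf->spilled[v] = 0;
        rf->fills++;
    }
    rf->holds[r] = v;
}

/* Run a sequence of value uses and report the spill and fill counts. */
void run_trace(const int *trace, int n, int *spills, int *fills)
{
    RegFile rf;
    regfile_init(&rf);
    for (int i = 0; i < n; i++) use_value(&rf, trace[i]);
    *spills = rf.spills;
    *fills = rf.fills;
}
```

In this model, a trace that alternates between two values (0, 1, 0, 1) fits in the two registers and causes no spills or fills, whereas a trace that touches a third value and then returns to the first (0, 1, 2, 0) costs two spills and one fill. With a larger register file, the second trace would also run without any extra memory traffic.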

The two performance advantages introduced with the 64-bit instruction set extensions for x86 were a significant increase in the number of registers and a much improved calling convention.
