Chapter: Multicore Application Programming For Windows, Linux, and Oracle Solaris : Hand-Coded Synchronization and Sharing

Atomic Operations

Atomic memory operations appear to the rest of the system as operations that either succeed or fail; there’s no partial state or state where the operation completes but the result is incorrect. Loads and stores, in most instances, are atomic. A load instruction will not fetch half the data from the most recent store to that cache line and half from what was previously held in the cache line. Similarly, a store will not perform a partial update of a memory address.

More complex operations are not atomic. For example, incrementing a value held in memory is usually implemented as a load of the value, the increment, and then a store of the new value back to memory. Unfortunately, in a multithreaded environment, another thread could interrupt this sequence and replace the original value held in memory with a new value. The final store would store the calculated value back to memory, but the entire operation would not reflect an increment of the new value held in memory. This is an example of a data race, as we have previously discussed.
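The interleaving described above can be replayed deterministically in a single thread. The following is only an illustrative sketch: the function name simulate_lost_update and the locals t1 and t2, which stand in for the registers of two hypothetical threads, are assumptions and not part of the original text. A real race depends on how the threads happen to be scheduled.

```c
/* Replays, in one thread, an interleaving that loses an update.
   t1 and t2 stand in for the registers of two hypothetical threads. */
int simulate_lost_update(void)
{
    int shared = 0;      /* value held in memory */

    int t1 = shared;     /* thread 1: load  (reads 0) */
    int t2 = shared;     /* thread 2: load  (reads 0) */
    t1 = t1 + 1;         /* thread 1: increment */
    t2 = t2 + 1;         /* thread 2: increment */
    shared = t1;         /* thread 1: store (writes 1) */
    shared = t2;         /* thread 2: store (writes 1 again) */

    return shared;       /* 1, although two increments were performed */
}
```

Two increments were executed, but the second store overwrites the first with the same value, so one update is lost.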

In this situation, it would be useful to have an atomic increment instruction. This would take the value in memory, increment it, and store it back to memory as a single operation, without the possibility of other threads updating the value between the load and store parts of the operation. On x86 processors, the xadd instruction can be combined with the lock prefix to produce an atomic add, or the inc instruction can be combined with the lock prefix to produce an atomic increment. Listing 8.1 shows the code snippets to do this.

Listing 8.1 x86 Assembly Language Atomic Addition Variants

int atomic_add_int( volatile int *address, int value )
{
  /* Both operands are read and written, so both are outputs marked "+" */
  asm volatile( "lock xadd %0,%1"
                : "+r"(value), "+m"(*address)
                :
                : "memory" );
  return value;
}

int atomic_inc_int( int *address )
{
  /* The l suffix gives the operand size, since no register supplies it */
  asm volatile( "lock incl %0"
                : "+m"(*address)
                :
                : "memory" );
  return (*address);
}

The routines are coded using gcc inline assembly language. Although it is not the intention of this book to dwell at this low level, it is appropriate to describe how the statements are put together. The keyword asm identifies the following text as an assembly language statement that will be inlined directly into the code. The keyword volatile tells the compiler not to remove the statement or move it from where it has been placed, because doing so could change the behavior of the program.

The assembly language code is enclosed in the parentheses. There are multiple parts to the code. The first item in the parentheses, surrounded by quotes, is the instruction. The instruction uses virtual registers %0 and %1 as parameters. The compiler will allocate real registers later and will be responsible for populating these registers with the appropriate values.

After the assembly language instruction, there are up to three colon-delimited lists. These are optional extended syntax. The first list names the output operands and states whether each is held in a register or in memory. In the example, the expression "+r"(value) means that the output parameter is a register that should be used to update the variable value. The plus sign means that the register will be both read and written by the instruction.

The second list contains the input values, that is, operands that the instruction only reads. gcc does not accept the plus sign in this list, so an operand that is both read and written must be placed in the output list instead. Both routines use the pointer address in this way: the expression "+m"(*address) appears in the output list, where the m indicates that the operand is a memory access and the plus sign indicates that the instruction will both read and write the location in memory. The input list is therefore empty in both routines.

The third list is the “clobber” list indicating what the instruction will modify. In both instances, the instruction will modify memory.

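The virtual registers are numbered in the order in which the operands are listed, starting with the output list, so register %0 refers to the variable value. The operand for the memory location comes next, so *address is assigned to register %1. This matches the operand order of the instruction, which takes the register first and the memory location second.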

It is also useful to look at the actual assembly language instructions. The xadd instruction is an exchange add, so it adds the variable value to the value held at the memory address, but it also returns the value held at the address before the add operation was performed; this is the exchange operation. The inc instruction just adds one to the value held in memory but does not return a value in any register. Both instructions are prefixed with lock. The lock prefix gives the executing processor exclusive access to the memory location for the duration of the instruction, historically by locking the system bus and on more recent processors typically by locking the cache line, so that no other processor can touch the location while it is being updated. Hence, it is the lock prefix that actually makes these instructions atomic. Without it, updates could be lost if multiple threads acted on the same memory location.

The routine atomic_add_int() adds the specified amount to the value held at the memory location and returns the value held in memory before the atomic operation.
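For comparison, the same exchange-add behavior is available portably through the C11 header stdatomic.h. The sketch below is not from the original text; it assumes a C11 compiler and POSIX threads, and the names counter, worker, run_counter, NTHREADS, and NITERS are hypothetical. Like atomic_add_int(), atomic_fetch_add() returns the value held before the addition.

```c
#include <pthread.h>
#include <stdatomic.h>

#define NTHREADS 4
#define NITERS   10000

static atomic_int counter;

/* Each thread adds 1 to the shared counter NITERS times. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < NITERS; i++) {
        /* Returns the previous value, like atomic_add_int(). */
        atomic_fetch_add(&counter, 1);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value,
   or -1 if a thread could not be created. */
int run_counter(void)
{
    pthread_t threads[NTHREADS];
    int started = 0;

    atomic_store(&counter, 0);
    for (; started < NTHREADS; started++) {
        if (pthread_create(&threads[started], NULL, worker, NULL) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started == NTHREADS ? atomic_load(&counter) : -1;
}
```

Because every addition is atomic, no update is lost and run_counter() returns NTHREADS * NITERS. With gcc, the code would typically be built with the -pthread option.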

The routine atomic_inc_int() increments the value held at the memory location and returns the value currently held in memory. Since the inc instruction does not return the new value, the return value is a load of the value held in memory. This need not be the true result of the operation; the value could have been modified between the atomic operation and the final load.
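One way to obtain the true result of the increment is to build it on an exchange-add rather than reloading the location afterward. The following is a minimal sketch under that assumption, using C11 atomic_fetch_add() in place of inline assembly; the routine name atomic_inc_return is hypothetical and does not appear in the original listing.

```c
#include <stdatomic.h>

/* Hypothetical variant of atomic_inc_int(): derives the return value from
   the exchange-add result instead of reloading the memory location. */
int atomic_inc_return(atomic_int *address)
{
    /* atomic_fetch_add yields the value held before the add, so
       adding 1 gives the value that this particular increment produced. */
    return atomic_fetch_add(address, 1) + 1;
}
```

The returned value is computed from the old value captured by the atomic operation itself, so a later update by another thread cannot change it.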
