Ensuring the Correct Order of Memory Operations
There is one more concern to discuss when dealing with systems that contain
multiple processors or multiple cores: memory ordering.
Memory ordering is the order in which memory operations are visible to the
other processors in the system. Most of the time, the processor does the right
thing without any need for the programmer to do anything.
However, there are situations where the programmer does need to step in. These
can be either architecture specific (SPARC processors and x86 processors have
different requirements) or implementation specific (one type of SPARC processor
may have different needs than another type of SPARC processor). The good news
is that the system libraries implement the appropriate mechanisms, so
multithreaded applications that rely on system libraries for synchronization
should not encounter memory ordering problems.
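As an illustration of leaving memory ordering to a system library, the sketch below protects a counter with a POSIX mutex. The choice of POSIX threads is an assumption (the text only says "system libraries"), and the names count_mutex and increment_count are hypothetical. The point is that the code contains no explicit barrier; any ordering the platform needs is expected to be handled inside the lock and unlock calls.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared counter and the mutex that protects it. */
static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static int count = 0;

/* Increment the counter under the mutex and return the new value.
 * No explicit memory barrier appears here: the library's lock and
 * unlock routines are relied on to order the memory operations. */
int increment_count(void)
{
    int rc = pthread_mutex_lock(&count_mutex);
    assert(rc == 0);
    (void)rc;

    int updated = ++count;

    rc = pthread_mutex_unlock(&count_mutex);
    assert(rc == 0);
    (void)rc;

    return updated;
}
```

Any thread that calls increment_count() sees the effects of earlier calls made by other threads, without the caller needing to know which barrier instructions the processor requires.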
On the other hand, there is some overhead from calling system libraries, so
there could well be a performance motivation for writing custom synchronization
code. This situation is covered in Chapter 8, “Hand-Coded Synchronization and
Sharing.”
The memory ordering instructions are called memory barriers (membar) on SPARC
and memory fences (mfence) on x86. These instructions stop memory operations
from becoming visible outside the thread in the wrong order. The following
example illustrates why this is important.
Suppose you have a variable, count, protected by a locking mechanism and you
want to increment that variable. The lock works by having the value 1 stored
into it when it is acquired and then the value 0 stored into it when the lock
is released. The code for acquiring the lock is not relevant to this example,
so the example starts with the assumption that the lock is already acquired,
and therefore the variable lock contains the value 1. Now that the lock is
acquired, the code can increment the variable count. Then, to release the
lock, the code would store the value 0 into the variable lock. The process of
incrementing the variable and then releasing the lock with a store of the
value 0 would look something like the pseudocode shown in Listing 1.6.
Listing 1.6 Incrementing a Variable and Freeing a Lock
LOAD [&count], %A
INC %A
STORE %A, [&count]
STORE 0, [&lock]
As soon as the value 0 is stored into the variable lock, another thread can
come along to acquire the lock and modify the variable count. For performance
reasons, some processors implement a weak ordering of memory operations,
meaning that stores can be moved past other stores or loads can be moved past
other loads. If the previous code is run on a machine with a weaker store
ordering, then the code at execution time could look like the code shown in
Listing 1.7.
Listing 1.7 Incrementing and Freeing a Lock Under Weak Memory Ordering
LOAD [&count], %A
INC %A
STORE 0, [&lock]
STORE %A, [&count]
At runtime, the processor has hoisted the store to the lock so that it becomes
visible to the rest of the system before the store to the variable count.
Hence, the lock is released before the new value of count is visible. Another
processor could see that the lock was free and load the old value of count
rather than the new value. If that processor then incremented the stale value
and stored it back, one of the two increments would be lost.
The solution is to place a memory barrier between the two stores to tell the
processor not to reorder them. Listing 1.8 shows the corrected code. In this
example, the membar instruction ensures that all previous store operations
have completed before the next store instruction is executed.
Listing 1.8 Using a Memory Barrier to Enforce Store Ordering
LOAD [&count], %A
INC %A
STORE %A, [&count]
MEMBAR #store, #store
STORE 0, [&lock]
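A rough C11 counterpart to Listing 1.8 might look like the sketch below. It assumes a compiler that provides <stdatomic.h>, and the function name increment_and_release is hypothetical. Note that the release fence used here is somewhat stronger than a pure store-store barrier: it keeps earlier loads as well as earlier stores from being reordered after the store that frees the lock.

```c
#include <assert.h>
#include <stdatomic.h>

atomic_int lock = 1;   /* 1 = held; assumed already acquired, as in the text */
int count = 0;         /* shared variable protected by lock */

/* Increment count, then free the lock, keeping the two stores in order. */
void increment_and_release(void)
{
    /* The example starts from the assumption that the lock is held. */
    assert(atomic_load_explicit(&lock, memory_order_relaxed) == 1);

    count = count + 1;                          /* LOAD, INC, STORE count */

    /* Plays the role of the MEMBAR in Listing 1.8: the update to count
     * must not be reordered after the store that frees the lock. */
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&lock, 0, memory_order_relaxed);  /* STORE 0, lock */
}
```

The fence also constrains the compiler, which could otherwise reorder the two stores itself before the processor ever sees them.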
There are other types of memory barriers to enforce other orderings of load
and store operations. Without these memory barriers, other memory ordering
errors could occur. For example, a similar issue could occur when the lock is
acquired. The load that fetches the value of count might be executed before
the store that marks the lock as acquired. In such a situation, it would be
possible for another processor to modify the value of count between the time
that the value was retrieved from memory and the point at which the lock was
acquired.
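The acquire-side problem described above can be sketched in C11 as a simple spin lock. This is a minimal illustration under stated assumptions rather than production locking code: it assumes <stdatomic.h> is available, and the function names acquire_lock, release_lock, and increment_count are hypothetical. The acquire ordering on the exchange is what keeps the later load of count from being performed before the lock is actually taken.

```c
#include <assert.h>
#include <stdatomic.h>

atomic_int lock = 0;   /* 0 = free, 1 = held */
int count = 0;         /* shared variable protected by lock */

/* Spin until this thread is the one that changes lock from 0 to 1. */
void acquire_lock(void)
{
    /* The exchange stores 1 and returns the previous value. Acquire
     * ordering prevents the loads and stores that follow from being
     * moved ahead of this operation. */
    while (atomic_exchange_explicit(&lock, 1, memory_order_acquire) != 0) {
        /* Another thread holds the lock; try again. */
    }
}

/* Free the lock; release ordering keeps the earlier update to count
 * from becoming visible after the lock appears free. */
void release_lock(void)
{
    assert(atomic_load_explicit(&lock, memory_order_relaxed) == 1);
    atomic_store_explicit(&lock, 0, memory_order_release);
}

/* Increment count under the lock and return the new value. */
int increment_count(void)
{
    acquire_lock();
    int updated = ++count;
    release_lock();
    return updated;
}
```

Which barrier instructions, if any, the compiler emits for these orderings depends on the target processor, which is why the source code states the required ordering rather than a specific instruction.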
The programmer’s reference manual for each family of processors will give
details about the exact circumstances when memory barriers may or may not be
required, so it is essential to refer to these documents when writing custom
locking code.