Dekker’s Algorithm
One of the first lockless algorithms was Dekker’s algorithm for mutual exclusion. Without using any atomic operations, the algorithm ensures that only one thread of a pair at a time can enter a critical region. Listing 8.16 shows an implementation of Dekker’s algorithm. To increment the counter, a thread calls the function increment() with its thread ID.
Listing 8.16 Implementation of Dekker’s Algorithm for Mutual Exclusion
volatile int priority = 0;
volatile int counter = 0;
volatile int waiting[2];

void increment( int id )
{
    waiting[id] = 1;
    while ( waiting[1-id] == 1 )
    {
        if ( priority != id )
        {
            waiting[id] = 0;
            while ( priority != id ) {}
            waiting[id] = 1;
        }
    }
    /* Critical section */
    counter++;
    /* Exit critical section */
    priority = 1 - id;
    waiting[id] = 0;
}
The algorithm works because each thread signals that it is waiting to get into the critical section. If the other thread has not signaled that it is waiting for the critical section and has not already entered it, then the current thread can enter. If both threads are waiting, then one of the threads gets priority, and the other thread waits until it gets priority.
The variables priority and counter, together with the array waiting, are shared between the two threads and as such need to be declared volatile. This ensures that the compiler stores these variables immediately after modification and loads them immediately before use. To test the correctness of this implementation, we can place the code in a harness. Listing 8.17 shows the harness, which creates the two threads. Each thread increments the variable counter 1 million times, so at the end of the program the variable should contain the value 2 million. The program reports the difference between this value and what the counter actually contains.
Listing 8.17 Test Harness for Dekker’s Algorithm
#include <stdio.h>
#include <pthread.h>

void * worker( void * param )
{
    for ( int i = 0; i < 1000000; i++ )
    {
        increment( (int)param );
    }
    return 0;
}

int main()
{
    pthread_t threads[2];
    pthread_create( &threads[0], 0, worker, (void*)0 );
    pthread_create( &threads[1], 0, worker, (void*)1 );
    pthread_join( threads[1], 0 );
    pthread_join( threads[0], 0 );
    printf( "Errors = %i\n", 2*1000000 - counter );
}
Listing 8.18 shows the results of compiling and running the program. Unfortunately, the program reports that the variable did not get incremented the correct number of times, which means that in some situations the two threads managed to enter the critical region at the same time.
Listing 8.18 Results of Compiling and Running Test Harness
$ cc -O -mt dekker.c
$ ./a.out
Errors = 14
The problem with the code is one of memory ordering. The two threads can simultaneously indicate that they are waiting to enter the critical section by storing to their index in the waiting array. In the very next cycle, they load the other thread’s waiting status. Since the other thread has only just issued its store, that store has not yet made its way through the pipeline to become visible to the rest of the system. So, the load instruction picks up a zero, indicating that the other thread is not waiting for the critical section. Both threads fail to see the other thread waiting, and both threads enter the critical region.
The way to fix this is to put memory barriers into the code. The memory barriers ensure that previous memory operations have completed before the next memory operation issues. In this case, we want to ensure that the store indicating that a thread is waiting completes before the load that checks whether the other thread is also waiting. Now consider the sequence of operations. Both threads hit the store at the same time; both threads now wait until their stores are visible to the other processor before issuing their loads. So, both threads will see the data stored by the other thread.
On SPARC processors, the appropriate memory barrier to use is membar #StoreLoad, which ensures that all previous stores have completed before the following load is issued. On x86 processors, it is necessary to use an mfence operation, which ensures that all previous memory operations complete before the next memory operation is issued.
Listing 8.19 shows the modified code for Dekker’s algorithm. The code requires two memory barriers: one before the loop is entered and a second inside the loop. With this modification, the code produces the correct result when run in the test harness.
Listing 8.19 Dekker’s Algorithm with Memory Barriers
#include <stdio.h>
#include <pthread.h>

volatile int priority = 0;
volatile int counter = 0;
volatile int waiting[2];

void increment( int i )
{
    waiting[i] = 1;
#ifdef __sparc
    asm( "membar #StoreLoad" : : : "memory" );
#else
    asm( "mfence" : : : "memory" );
#endif
    while ( waiting[1-i] == 1 )
    {
        if ( priority != i )
        {
            waiting[i] = 0;
            while ( priority != i ) {}
            waiting[i] = 1;
#ifdef __sparc
            asm( "membar #StoreLoad" : : : "memory" );
#else
            asm( "mfence" : : : "memory" );
#endif
        }
    }
    counter++;
    priority = 1 - i;
    waiting[i] = 0;
}