BARRIERS AND CONDITION VARIABLES
Let’s take a look at another
problem in shared-memory programming: synchronizing the threads by making sure
that they all are at the same point in a program. Such a point of
synchronization is called a barrier because no thread can proceed beyond the
barrier until all the threads have reached it.
Barriers have numerous
applications. As we discussed in Chapter 2, if we’re timing some part of a
multithreaded program, we’d like for all the threads to start the timed code at
the same instant, and then report the time taken by the last thread to finish,
that is, the “slowest” thread. We’d therefore like to do something like this:
/* Shared */
double elapsed_time;
. . .
/* Private */
double my_start, my_finish, my_elapsed;
. . .
Synchronize threads;
Store current time in my_start;
/* Execute timed code */
. . .
Store current time in my_finish;
my_elapsed = my_finish - my_start;
elapsed_time = Maximum of my_elapsed values;
Using this approach, we’re
sure that all of the threads will record my_start at approximately the same time.
Another very important use of
barriers is in debugging. As you’ve probably already seen, it can be very
difficult to determine where an error is occurring in a parallel program. We
can, of course, have each thread print a message indicating which point it’s
reached in the program, but it doesn’t take long for the volume of the output
to become overwhelming. Barriers provide an alternative:
point in program we want to reach;
barrier;
if (my_rank == 0) {
   printf("All threads reached this point\n");
   fflush(stdout);
}
Many implementations of
Pthreads don’t provide barriers, so if our code is to be portable, we need to
develop our own implementation. There are a number of options; we’ll look at
three. The first two only use constructs that we’ve already studied. The third
uses a new type of Pthreads object: a condition variable.
1. Busy-waiting and a mutex
Implementing a barrier using busy-waiting and a
mutex is straightforward: we use a shared counter protected by the mutex. When
the counter indicates that every thread has entered the critical section,
threads can leave a busy-wait loop.
/* Shared and initialized by the main thread */
int counter;                    /* Initialize to 0 */
int thread_count;
pthread_mutex_t barrier_mutex;
. . .

void* Thread_work(. . .) {
   . . .
   /* Barrier */
   pthread_mutex_lock(&barrier_mutex);
   counter++;
   pthread_mutex_unlock(&barrier_mutex);
   while (counter < thread_count);
   . . .
}
Of course, this implementation will have the
same problems that our other busy-wait codes had: we’ll waste CPU cycles when
threads are in the busy-wait loop, and, if we run the program with more threads
than cores, we may find that the performance of the program seriously degrades.
Another issue is the shared variable counter. What happens if we want to implement a second barrier and we try
to reuse the counter? When the first barrier is completed, counter will have the value thread_count. Unless we can somehow reset counter, the while condition we used for our first barrier counter < thread_count will be false, and the barrier won’t cause the
threads to block. Furthermore, any attempt to reset counter to zero is almost certainly doomed to failure. If the last thread
to enter the loop tries to reset it, some thread in the busy-wait may never see
the fact that counter == thread_count, and that thread may hang in the busy-wait. If some thread tries
to reset the counter after the barrier, some other thread may enter the second
barrier before the counter is reset and its increment to the counter will be
lost. This will have the unfortunate effect of causing all the threads to hang
in the second busy-wait loop. So if we want to use this barrier, we need one
counter variable for each instance of the barrier.
2. Semaphores
A natural question is whether we can implement
a barrier with semaphores, and, if so, whether we can reduce the number of
problems we encountered with busy-waiting. The answer to the first question is
yes:
/* Shared variables */
int counter;         /* Initialize to 0 */
sem_t count_sem;     /* Initialize to 1 */
sem_t barrier_sem;   /* Initialize to 0 */
. . .

void* Thread_work(. . .) {
   . . .
   /* Barrier */
   sem_wait(&count_sem);
   if (counter == thread_count - 1) {
      counter = 0;
      sem_post(&count_sem);
      for (j = 0; j < thread_count - 1; j++)
         sem_post(&barrier_sem);
   } else {
      counter++;
      sem_post(&count_sem);
      sem_wait(&barrier_sem);
   }
   . . .
}
As with the busy-wait barrier, we have a counter
that we use to determine how many threads have entered the barrier. We use two
semaphores: count_sem protects the counter, and barrier_sem is used to block threads that have entered the
barrier. The count_sem semaphore is initialized to 1 (that is, “unlocked”), so the first
thread to reach the barrier will be able to proceed past the call to sem_wait. Subsequent threads, however, will block until they can have
exclusive access to the counter. When a thread has exclusive access to the
counter, it checks to see if counter < thread_count - 1. If it is, the thread increments counter, “relinquishes the lock” (sem_post(&count_sem)), and blocks in sem_wait(&barrier_sem). On the other hand, if counter == thread_count - 1, the thread is the last to enter the barrier, so it can reset counter to zero and “unlock” count_sem by calling sem_post(&count_sem). Now it wants to notify all the other threads that they can proceed, so it executes sem_post(&barrier_sem) for each of the thread_count - 1 threads that are blocked in sem_wait(&barrier_sem).
Note that it doesn’t matter if the thread executing the loop of calls to sem_post(&barrier_sem) races ahead and executes multiple calls to sem_post before a thread can be unblocked from sem_wait(&barrier_sem). Recall that a semaphore is an unsigned int, and the calls to sem_post increment it, while the calls to sem_wait decrement it, unless it’s already 0, in which case the calling threads will block until it’s positive again, and they’ll decrement it when they unblock. Therefore, it doesn’t matter if the thread executing the loop of calls to sem_post(&barrier_sem) gets ahead of the threads blocked in the calls to sem_wait(&barrier_sem), because eventually the blocked threads will see that barrier_sem is positive, and they’ll decrement it and proceed.
It should be clear that this implementation of
a barrier is superior to the busy-wait barrier, since the threads don’t need to
consume CPU cycles when they’re blocked in sem_wait. Can we reuse the data structures from the
first barrier if we want to execute a second barrier?
The counter can be reused, since we were careful to reset it before releasing any of the threads from the barrier. Also, count_sem can be reused, since it is reset to 1 before any threads can leave the barrier. This leaves barrier_sem. Since there’s exactly one sem_post for each sem_wait, it might appear that the value of barrier_sem will be 0 when the threads start executing a second barrier.
However, suppose we have two threads, and thread 0 is blocked in sem_wait(&barrier_sem) in the first barrier, while thread 1 is
executing the loop of calls to sem_post(&barrier_sem). Also suppose that the operating system has seen that thread 0 is idle and has descheduled it. Then thread 1 can go on to the second barrier. Since counter == 0, it will execute the else clause. After incrementing counter, it executes sem_post(&count_sem), and then executes sem_wait(&barrier_sem).
However, if thread 0 is still descheduled, it
will not have decremented barrier_sem. Thus, when thread 1 reaches sem_wait(&barrier_sem), barrier_sem
will still be 1, so it will simply decrement barrier_sem and proceed. This will have the unfortunate consequence that when
thread 0 starts executing again, it will still be blocked in the first sem_wait(&barrier_sem), and thread 1 will proceed through the second
barrier before thread 0 has entered it. Reusing barrier_sem therefore results in a race condition.
3. Condition variables
A somewhat better approach to creating a
barrier in Pthreads is provided by condition variables. A condition variable is a data object that allows a thread to
suspend execution until a certain event or condition occurs. When the event or condition occurs, another thread can signal the thread to “wake up.” A condition variable
is always associated with a mutex.
Typically, condition variables are used in
constructs similar to this pseudocode:
lock mutex;
if condition has occurred
   signal thread(s);
else {
   unlock the mutex and block;
   /* when thread is unblocked, mutex is relocked */
}
unlock mutex;
Condition variables in Pthreads have type
pthread_cond_t. The function
int pthread_cond_signal(pthread_cond_t* cond_var_p /* in/out */);
will unblock one of the blocked threads, and
int pthread_cond_broadcast(pthread_cond_t* cond_var_p /* in/out */);
will unblock all of the blocked threads. The
function
int pthread_cond_wait(
pthread_cond_t* cond_var_p /* in/out */,
pthread_mutex_t* mutex_p /* in/out */);
will unlock the mutex referred to by mutex_p and cause the executing thread to block until it is unblocked by another thread’s call to pthread_cond_signal or pthread_cond_broadcast. When the thread is unblocked, it reacquires the mutex. So, in effect, pthread_cond_wait implements the following sequence of functions:
pthread_mutex_unlock(mutex_p);
wait_on_signal(cond_var_p);
pthread_mutex_lock(mutex_p);
The following code implements a barrier with a
condition variable:
/* Shared */
int counter = 0;
pthread_mutex_t mutex;
pthread_cond_t cond_var;
. . .
void* Thread_work(. . .) {
   . . .
   /* Barrier */
   pthread_mutex_lock(&mutex);
   counter++;
   if (counter == thread_count) {
      counter = 0;
      pthread_cond_broadcast(&cond_var);
   } else {
      while (pthread_cond_wait(&cond_var, &mutex) != 0);
   }
   pthread_mutex_unlock(&mutex);
   . . .
}
Note that it is possible that events other than
the call to pthread_cond_broadcast can cause a suspended thread to unblock (see, for example,
Butenhof [6], page 80). Hence, the call to pthread_cond_wait is usually placed in a while loop. If the thread is unblocked by some event other than a call to pthread_cond_signal or pthread_cond_broadcast, then the return value of pthread_cond_wait will be nonzero, and the unblocked thread will call pthread_cond_wait again.
If a single thread is being awakened, it’s also
a good idea to check that the condition has, in fact, been satisfied before
proceeding. In our example, if a single thread were being released from the
barrier with a call to pthread_cond_signal, then that thread should verify that counter == 0 before proceeding. This can be dangerous with the broadcast,
though. After being awakened, some thread may race ahead and change the
condition, and if each thread is checking the condition, a thread that awakened
later may find the condition is no longer satisfied and go back to sleep.
Note that in order for our barrier to function
correctly, it’s essential that the call to pthread_cond_wait unlock the mutex. If it didn’t unlock the
mutex, then only one thread could enter the barrier; all of the other threads
would block in the call to pthread_mutex_lock, the first thread to enter the barrier would
block in the call to pthread_cond_wait, and our program would hang.
Also note that the semantics of mutexes
requires that the mutex be relocked before we return from the call to pthread_cond_wait. We “obtained” the lock when we returned from the call to pthread_mutex_lock. Hence, we should at some point “relinquish” the lock through a
call to pthread_mutex_unlock.
Like mutexes and semaphores, condition
variables should be initialized and destroyed. In this case, the functions are
int pthread_cond_init(
pthread_cond_t* cond_p /* out */,
const pthread_condattr_t* cond_attr_p /* in */);
int pthread_cond_destroy(pthread_cond_t* cond_p /* in/out */);
We won’t be using the second argument to pthread_cond_init (we’ll call it with second argument NULL).
4. Pthreads barriers
Before proceeding we should
note that the Open Group, the standards group that is continuing to develop the
POSIX standard, does define a barrier interface for Pthreads. However, as we
noted earlier, it is not universally available, so we haven’t discussed it in
the text. See Exercise 4.9 for some of the details of the API.