Chapter: Multicore Application Programming For Windows, Linux, and Oracle Solaris : Using POSIX Threads

Creating Threads

An application initially starts with a single thread, which is often referred to as the main thread or the master thread. Calling pthread_create() creates a new thread. It takes the following parameters:

• A pointer to a pthread_t structure. The call will return the handle to the thread in this structure.
• A pointer to a pthread attributes structure, which can be a null pointer if the default attributes are to be used. The details of this structure will be discussed later.
• The address of the routine to be executed.
• A value or pointer to be passed into the new thread as a parameter.

Listing 5.1 shows how this API call can be used.


Listing 5.1 Creating a New Thread

#include <pthread.h>
#include <stdio.h>

void* thread_code( void * param )
{
    printf( "In thread code\n" );
    return 0;   /* a thread routine must return a void* */
}

int main()
{
    pthread_t thread;
    pthread_create( &thread, 0, &thread_code, 0 );
    printf( "In main thread\n" );
}

In this example, the main thread will create a second thread to execute the routine thread_code(), which will print one message while the main thread prints another. The call to create the thread has a value of zero for the attributes, which gives the thread default attributes. The call also passes the address of a pthread_t variable for the function to store a handle to the thread. Because main() does not wait for the new thread, the process can exit before the second message is printed; waiting for a thread is covered in the next section.

The return value from the pthread_create() call is zero if the call is successful; otherwise, it returns an error condition.

Thread Termination

Child threads terminate when they complete the routine they were assigned to run. In Listing 5.2, the child thread will terminate when it completes the routine thread_code(). The value returned by the routine executed by the child thread can be made available to the main thread when the main thread calls the routine pthread_join().

The pthread_join() call takes two parameters. The first parameter is the handle of the thread that is to be waited for. The second parameter is either zero or the address of a pointer to void, which will receive the value returned by the child thread.

The resources consumed by the thread will be recycled when the main thread calls pthread_join(). If the thread has not yet terminated, this call will wait until the thread terminates and then free the assigned resources. Listing 5.2 shows an expanded example where the main thread waits for the child thread to complete.

Listing 5.2 Creating a New Thread and Waiting for It to Complete

#include <pthread.h>
#include <stdio.h>

void* thread_code( void * param )
{
    printf( "In thread code\n" );
    return 0;   /* a thread routine must return a void* */
}

int main()
{
    pthread_t thread;
    pthread_create( &thread, 0, &thread_code, 0 );
    printf( "In main thread\n" );
    pthread_join( thread, 0 );   /* wait for the child thread to finish */
}

Another way a thread can terminate is to call the routine pthread_exit(), which takes a single parameter, either zero or a pointer to void. This routine does not return and instead terminates the thread. The parameter passed in to the pthread_exit() call is returned to the main thread through pthread_join(). The child threads do not need to explicitly call pthread_exit() because it is implicitly called when the thread routine returns. Unless the thread is a detached thread, which will be covered later, the resources used by the thread are not freed until another thread calls pthread_join(), passing in the handle of the exited thread.

Passing Data to and from Child Threads

In many cases, it is important to pass data into the child thread and have the child thread return status information when it completes. To pass data into a child thread, it should be cast as a pointer to void and then passed as a parameter to pthread_create(). It is critical to realize that the child thread can start executing at any point after the call, so the pointer must point to something that still exists and still retains the same value. This rules out passing in pointers to changing variables as well as pointers to information held on the stack (unless the stack is certain to exist until after the child thread has read the value).

Listing 5.3 shows an acceptable way of passing the value of a variable into the child thread. The value is type cast to a void* and then passed as the parameter to the thread. Casting by way of intptr_t (declared in <stdint.h>) avoids a size-mismatch warning on platforms where an int and a pointer differ in size.

Listing 5.3 Passing a Value into a Created Thread

for ( int i=0; i<10; i++ )
    pthread_create( &thread, 0, &thread_code, (void *)(intptr_t)i );

Listing 5.4 shows an unacceptable way of passing the value of a variable into the child thread.

Listing 5.4 Erroneous Way of Passing Data to a New Thread

for ( int i=0; i<10; i++ )
    pthread_create( &thread, 0, &thread_code, (void *)&i );

The code in Listing 5.4 is unacceptable for two reasons: First, the variable i will almost certainly have changed value before the child thread starts executing, and second, the variable i is allocated on the stack, and there is no guarantee that the stack space will even be in scope when the child thread starts executing.

The child thread will receive the value passed by the main thread as a parameter, which usually will need to be type cast back to an appropriate type. Listing 5.5 shows an example.

Listing 5.5 Reading the Parameter Passed to the New Thread

void* child_thread( void* value )
{
    int id = (int)(intptr_t)value;   /* intptr_t is declared in <stdint.h> */
    ...
}

The return value from a child thread will be made available to the main thread through the second parameter of the pthread_join() function call, as shown in Listing 5.6.

Listing 5.6 Reading the Return Value of an Exiting Child Thread

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

void* child_thread( void * param )
{
    int id = (int)(intptr_t)param;
    printf( "Start thread %i\n", id );
    return (void *)(intptr_t)id;
}

int main()
{
    pthread_t thread[10];
    void* return_value[10];   /* pthread_join() stores a full void*, not an int */
    for ( int i=0; i<10; i++ )
    {
        pthread_create( &thread[i], 0, &child_thread, (void*)(intptr_t)i );
    }
    for ( int i=0; i<10; i++ )
    {
        pthread_join( thread[i], &return_value[i] );
        printf( "End thread %i\n", (int)(intptr_t)return_value[i] );
    }
}

This code will pass a unique number into each child thread, and then the child thread will return this number as its return code. The value is then made available to the main thread through the pthread_join() call.

Detached Threads

The previous discussion focused on joinable threads, threads that terminate and then wait for the main thread to read their return value before the resources that they consume are recycled.

It is also possible to create detached threads that do not wait around for another thread to call pthread_join() before the resources they consume are recycled. One way to do this is to set the appropriate attribute in the thread attributes structure; this will be discussed in the next section. Another way is to call pthread_detach() on an existing thread. Calling pthread_join() on detached threads is an error; they do not need the call, and making the call will typically return an error value.

The handle of the detached thread may be recycled when the thread exits, so any cached version of the handle may no longer refer to the original thread; hence, care needs to be taken when writing code that uses detached threads.

Listing 5.7 shows an example of detaching a thread. The child thread calls pthread_self() to get its own handle, which it can then use to convert itself into a detached thread.

Listing 5.7 Detaching a Thread Using pthread_detach() Call

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

void* child_routine( void * param )
{
    int id = (int)(intptr_t)param;
    printf( "Detach thread %i\n", id );
    pthread_detach( pthread_self() );
    return 0;
}

int main()
{
    pthread_t thread[10];
    for ( int i=0; i<10; i++ )
    {
        pthread_create( &thread[i], 0, child_routine, (void*)(intptr_t)i );
    }
    pthread_exit( 0 );   /* ends only the main thread, so the children can finish */
}

Threads can also be created in the detached state. To do this, it is necessary to pass a set of attributes into the call to pthread_create().

Setting the Attributes for Pthreads

The attributes for a thread are set when the thread is created. Some attributes, such as the detached state, can be modified once the thread exists, but others cannot be changed. To set the initial thread attributes, first create a thread attributes structure, and then set the appropriate attributes in that structure, before passing the structure into the pthread_create() call. Once the attributes have been used to set up a thread, they can be destroyed with pthread_attr_destroy(). Listing 5.8 shows the basic outline of this.

Listing 5.8 Passing Attributes to pthread_create

#include <pthread.h>
...

int main()
{
    pthread_t thread;
    pthread_attr_t attributes;
    pthread_attr_init( &attributes );
    pthread_create( &thread, &attributes, child_routine, 0 );
    pthread_attr_destroy( &attributes );
}

There are various attributes that can be set using API calls. The most useful ones determine whether the thread is created in the detached or joinable state, as well as the size of the stack allocated to the new thread.

A thread can be created as either a detached or a joinable thread. The default is to create a joinable thread. The code in Listing 5.9 sets the attributes to create a detached thread.

Listing 5.9 Creating a Detached Thread

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

void* child_routine( void * param )
{
    int id = (int)(intptr_t)param;
    printf( "Thread %i\n", id );
    return 0;
}

int main()
{
    pthread_t thread;
    pthread_attr_t attributes;
    pthread_attr_init( &attributes );
    pthread_attr_setdetachstate( &attributes, PTHREAD_CREATE_DETACHED );
    pthread_create( &thread, &attributes, child_routine, 0 );
    pthread_attr_destroy( &attributes );
    pthread_exit( 0 );   /* ends only the main thread, so the child can finish */
}

The default stack size is dependent upon the operating system. The code in Listing 5.10 will print out the default stack size.

Listing 5.10 Reading the Stack Size Attribute for a New Thread

#include <pthread.h>
#include <stdio.h>

int main()
{
    size_t stacksize;
    pthread_attr_t attributes;
    pthread_attr_init( &attributes );
    pthread_attr_getstacksize( &attributes, &stacksize );
    printf( "Stack Size = %zu\n", stacksize );   /* %zu is the format for size_t */
    pthread_attr_destroy( &attributes );
}

Running it on Ubuntu produces the result shown in Listing 5.11, indicating that the default stack size is 8MB.

Listing 5.11 Compiling and Running Code to Show Default Stack Size on Ubuntu

$ gcc stack.c -lpthread

$ ./a.out

Stack Size = 8388608

However, running the same code on Solaris produces the result shown in Listing 5.12.

Listing 5.12 Compiling and Running Code to Show Default Stack Size on Solaris

$ cc stack.c

$ ./a.out

Stack Size = 0

Reading the man pages indicates that if zero is set for the stack size, Solaris defaults to 1MB for 32-bit code and 2MB for 64-bit code.

Another command that controls stack size is ulimit -s <stacksize>. On Linux, this command is used to set the default stack size for both the initial thread created and for subsequent threads. On Solaris, this command controls only the stack size for the initial thread. Consequently, to write portable code, it is best to explicitly control the size of the stack for any child threads, particularly if the code places large objects on the stack or uses recursion.

The obvious question to ask is, why not set the largest stack possible for all child threads? The answer to this question leads to a discussion on how stacks are created.

To allow both the heap (where malloc obtains its memory from) and the stack to grow, the heap is usually placed after the application at the low end of the addressable memory, and the stack is usually placed at the upper end of memory, as in Figure 5.1. The heap grows upward, and the stack can grow downward.

This approach works for a single-threaded application, but each thread in a multi-threaded application needs its own stack. To do this, the application must have a limit to the size of the initial stack for the main thread. Each child thread must also have a limit to its stack size. The resulting layout in memory is rather like Figure 5.2.

Each thread receives an adjacent block of memory of fixed size for its stack. There is a finite address space available, so memory used for stack space is taken from memory that can be used for the heap. For 64-bit applications, the address space is sufficiently large so that this is not a problem. For 32-bit applications, it is relatively easy to run out of address space. If each thread is allocated an 8MB stack, then there can be at most 512 simultaneous threads (512 × 8MB = 4GB, the entire 32-bit address space). Hence, for some applications, it can be a good idea to assess how much memory is actually required for the stack of each child thread. The absolute minimum acceptable memory to provide to a thread is given by the constant PTHREAD_STACK_MIN, which is defined in <limits.h>. This size would provide no space on the stack for local variables or making function calls, so it would rarely be used without also including some additional space.
