Chapter: Multicore Application Programming For Windows, Linux, and Oracle Solaris - Using Automatic Parallelization and OpenMP


Using OpenMP for Dynamically Defined Parallel Tasks

 

The OpenMP 3.0 specification introduced tasks. A task is a block of code that will be executed at some point in the future by one of the team of threads. Every time the task directive is encountered at runtime, a new task is created and added to the list of tasks to be completed. This facility enables OpenMP to tackle many of the problems that previously could only be elegantly addressed using threads. As an example, it is possible to write a version of the echo server from Chapter 5, “Using POSIX Threads,” using OpenMP tasks. This example combines parallelization across loops, parallel sections, and nested parallelization, together with parallel tasks.
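Before turning to the echo server, the idea of creating a task each time the directive is encountered can be sketched on a simpler problem. The example below is only an illustration under stated assumptions: the node type and the process() and process_list() functions are hypothetical names that do not appear in the original application. One thread walks a linked list and creates one task per node, and any thread in the team may then execute those tasks.

```c
#include <stddef.h>

/* Hypothetical node type, used only for this illustration. */
struct node {
    int value;
    int result;
    struct node *next;
};

/* Hypothetical unit of work: here it just squares the value. */
static void process(struct node *n)
{
    n->result = n->value * n->value;
}

/* One thread walks the list and creates a task per node. The number of
   tasks is not known in advance; it depends on the data found at runtime. */
void process_list(struct node *head)
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (struct node *n = head; n != NULL; n = n->next) {
                /* Each task captures the current value of n. */
                #pragma omp task firstprivate(n)
                process(n);
            }
        }
        /* Implicit barrier at the end of single: the tasks created
           above have completed once the threads get past it. */
    }
}
```

Because the code uses only directives and no OpenMP runtime calls, it should also compile and give the same results when OpenMP support is not enabled, in which case the loop simply runs serially.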

 

The application uses parallel sections to start both a server thread and a driver (client) thread. Listing 7.38 shows the source code to do this. The code uses nested parallelism, so this needs to be explicitly enabled by calling omp_set_nested() with a nonzero value. The parallel sections region explicitly requests two threads using the num_threads(2) clause. Note that for correct execution, the code relies on having at least two virtual CPUs. If the code is run on a system with only a single virtual CPU, the code will not function correctly because it will stall while executing the echothread() code and will never get to execute driverthread().

 

Listing 7.38  Using OpenMP Parallel Sections to Start Two Threads

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <string.h>     /* strlen() */
#include <strings.h>    /* bzero() */
#include <pthread.h>
#include <errno.h>
#include <omp.h>

 

...

 

int main()
{
    omp_set_nested( 1 );                /* Enable nested parallel regions */
    #pragma omp parallel sections num_threads( 2 )
    {
        #pragma omp section
        {
            echothread();               /* Server */
        }
        #pragma omp section
        {
            driverthread();             /* Client */
        }
    }
}

 

Listing 7.39 shows the code for the driver, or client, part of the application. This code uses a parallel for loop to launch multiple requests to the server in parallel. The driver code shares a single sockaddr_in structure between all the threads. Each thread gets a private copy of the variable s, which holds the ID of the socket that the thread has opened to the server. Each iteration of the loop opens a connection to the server, sends ten strings, and waits for the echoed response after each one.

 

Listing 7.39   Driver Thread That Generates Multiple Connections to the Server

void driverthread()
{
    int s;
    struct sockaddr_in addr;

    /* Address of the server: port 5000 on the local machine */
    bzero( &addr, sizeof(addr) );
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = inet_addr( "127.0.0.1" );
    addr.sin_port = htons( 5000 );

    #pragma omp parallel for shared( addr ) private( s )
    for( int count=0; count<10000; count++ )
    {
        s = socket( PF_INET, SOCK_STREAM, 0 );
        printf( "Driver thread %i ready\n", omp_get_thread_num() );
        if ( connect( s, (struct sockaddr*)&addr, sizeof(addr) )==0 )
        {
            char buffer[1024];
            for ( int i=0; i<10; i++ )
            {
                /* Send a string, including its terminating null */
                sprintf( buffer, "Sent %i\n", i );
                if ( send( s, buffer, strlen(buffer)+1 ,0 )!= strlen(buffer)+1 )
                {
                    printf( "send size mismatch\n" );
                }
                /* Wait for the server to echo it back */
                bzero( buffer, sizeof(buffer) );
                read( s, buffer, sizeof(buffer) );
            }
        }
        else
        {
            perror( "Connection refused" );
            exit( 0 );
        }
        shutdown( s, SHUT_RDWR );
        close( s );
    }
}
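The data-sharing clauses on the driver loop can be seen in isolation in a smaller sketch. This is an illustration only: the config structure and the scale_all() function are hypothetical and are not part of the echo application. The structure plays the role of addr (one copy, read by every thread), and tmp plays the role of s (one uninitialized copy per thread, assigned before use in every iteration).

```c
/* Hypothetical read-only settings, standing in for the shared addr. */
struct config {
    int scale;
};

/* Multiplies every element of in[] by cfg.scale and stores it in out[]. */
void scale_all(const int *in, int *out, int n)
{
    struct config cfg = { 3 };  /* one copy, shared by all threads */
    int tmp;                    /* each thread gets its own copy */

    #pragma omp parallel for shared( cfg ) private( tmp )
    for (int i = 0; i < n; i++) {
        /* tmp is uninitialized on entry, so it is written before
           it is read in every iteration. */
        tmp = in[i] * cfg.scale;
        out[i] = tmp;
    }
}
```

If tmp were left shared instead, two threads could overwrite each other's intermediate value, which is the same hazard that private( s ) avoids for the socket descriptor in the driver.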

Listing 7.40 shows the server code. This takes an incoming connection and launches a new task to handle that incoming connection.

 

Listing 7.40   Server Code to Handle Incoming Connections

void echothread()
{
    int s;
    int true = 1;
    struct sockaddr_in addr;

    s = socket( PF_INET, SOCK_STREAM, 0 );
    if ( s == -1 ) { printf( "Socket error %i\n", errno ); }
    if ( setsockopt( s, SOL_SOCKET, SO_REUSEADDR, &true, sizeof(int) )==-1 )
    {
        printf( "setsockopt error %i\n", errno );
    }

    /* Listen on port 5000 on any local interface */
    bzero( &addr, sizeof(addr) );
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl( INADDR_ANY );
    addr.sin_port = htons( 5000 );
    if ( bind( s, (struct sockaddr*)&addr, sizeof(addr) ) != 0 )
    {
        printf( "Bind error %i\n", errno );
    }
    listen( s, 4 );

    #pragma omp parallel
    {
        /* One thread accepts connections; the others execute the tasks */
        #pragma omp single
        while( 1 )
        {
            struct sockaddr client;
            socklen_t size = sizeof(client);    /* accept() expects a socklen_t* */
            int stream = accept( s, &client, &size );
            #pragma omp task
            {
                char buffer[1024];
                if ( stream >= 0 )
                { printf( "Accepted by thread ID %i\n", omp_get_thread_num() ); }
                else
                { printf( "Accept error %i\n", errno ); }
                /* recv() returns 0 at end of stream and -1 on error */
                while ( recv( stream, buffer, sizeof(buffer), 0 ) > 0 )
                {
                    send( stream, buffer, strlen(buffer)+1, 0 );
                }
                close( stream );
                printf( "Stream closed\n" );
            }
        }
    }
}

 

 

The code uses three OpenMP directives. We have already met the omp parallel directive, which denotes the start of a parallel region, but not the omp single directive, which tells the compiler that only one thread is to execute the enclosed code. As discussed earlier, all the threads will execute the code in the parallel region by default. This single thread is responsible for accepting incoming connections and then producing the new tasks that handle the details of the connection.
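The effect of omp single can be checked with a very small sketch. The function name count_single_executions() is hypothetical and exists only for this illustration. Every thread in the team reaches the single construct, but the counter is incremented once, whatever the number of threads.

```c
/* Returns how many times the single region ran. */
int count_single_executions(void)
{
    int executions = 0;     /* shared by all threads in the team */

    #pragma omp parallel
    {
        /* Every thread in the team reaches this point... */
        #pragma omp single
        {
            /* ...but only one of them executes this block. */
            executions++;
        }
        /* Implicit barrier: the other threads wait here until the
           single block has finished (unless nowait is specified). */
    }
    return executions;
}
```

In the server, the single region never finishes because of the while( 1 ) loop, so the remaining threads stay at that implicit barrier, where they are available to execute the tasks that the accepting thread creates.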

 

Finally, the omp task directive encloses the region of code that is to be executed as the task. The variable stream is scoped as firstprivate by default, so each task gets a private copy of the variable. Within the task, this variable is assigned the value that it holds at the time that the task was created. The new task then handles the echoing back of data that is sent on that particular socket.
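The default firstprivate scoping can be illustrated with a sketch that mirrors the structure of the server loop. The names capture_demo(), NTASKS, and id are hypothetical. The variable id is declared inside the region, as stream is in the server, so each task receives its own copy holding the value at the time the task was created, even though the loop has moved on by the time the task runs.

```c
#define NTASKS 8

/* Fills out[k] with k * 10, using one task per element. */
void capture_demo(int out[NTASKS])
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (int i = 0; i < NTASKS; i++) {
                int id = i;     /* declared inside the region, like stream */

                /* No data-sharing clause: id is firstprivate by default,
                   so the task sees the value id had when it was created. */
                #pragma omp task
                {
                    out[id] = id * 10;
                }
            }
        }
    }
}
```

Had id been scoped as shared, a task could run after the loop body had ended and read a variable that no longer holds its connection's value, which is why the default matters for the stream descriptor.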

 

Listing 7.41 shows the results of compiling and running this code on a four-way machine. On Oracle Solaris, the resulting application needs to be linked with the socket library (-lsocket) and the network services library (-lnsl). The key thing to observe is that the IDs of the threads that open connections and accept them change from one line of output to the next, indicating that the work is being distributed across all the available threads.

 

Listing 7.41 Output from Client-Server Code Parallelized Using Nested OpenMP Directives

% cc -O -xopenmp -xloopinfo omp_sockets.c -lsocket -lnsl

 

"omp_sockets.c", line 24: PARALLELIZED, user pragma used
"omp_sockets.c", line 33: not parallelized, loop inside OpenMP region
"omp_sockets.c", line 76: not parallelized, loop has multiple exits
"omp_sockets.c", line 95: not parallelized, not a recognized for loop
% ./a.out
Echo socket setup
Driver thread 0 ready
Driver thread 1 ready
Driver thread 3 ready
Driver thread 2 ready
Accepted by thread ID 1
Accepted by thread ID 3
Accepted by thread ID 2
Driver thread 3 ready
Stream closed
Accepted by thread ID 3
Driver thread 1 ready
Driver thread 0 ready
Stream closed
Accepted by thread ID 2

 

...

