Chapter: Network Programming and Management : Application Development

Handling the Interrupted System Calls

The basic rule that applies here is that when a process is blocked in a slow system call and the process catches a signal and the signal handler returns, the system call can return an error of EINTR ("interrupted system call").

A programme that catches signals must therefore be prepared for slow system calls to return EINTR. Even if a system supports the SA_RESTART flag, not all interrupted system calls are automatically restarted. Most Berkeley-derived implementations, for example, never automatically restart select.
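
As an illustration of how SA_RESTART is requested, the sketch below installs a handler with sigaction. The helper name install_handler and its restart parameter are assumptions made for this example; they are not part of the original server code. Even with the flag set, callers should still check for EINTR, since not every call is restarted on every system.

```c
#define _POSIX_C_SOURCE 200809L   /* expose sigaction under strict -std modes */
#include <signal.h>
#include <string.h>

typedef void (*sig_handler_t)(int);

/*
 * Hypothetical helper: install `handler` for signal `signo`.
 * If `restart` is non-zero, ask the kernel to restart interrupted
 * system calls (SA_RESTART); otherwise they fail with EINTR.
 * Returns 0 on success, -1 on error (as sigaction does).
 */
int install_handler(int signo, sig_handler_t handler, int restart)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);             /* block no extra signals in the handler */
    act.sa_flags = restart ? SA_RESTART : 0;

    return sigaction(signo, &act, NULL);
}
```

A server might call install_handler(SIGCHLD, sig_chld, 1) at start-up and still keep the EINTR check around accept.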

To handle an interrupted accept, we change the call to accept (in the server programme) in the following way:

for ( ; ; ) {
    clilen = sizeof(cliaddr);
    if ((connfd = accept(listenfd, (SA *) &cliaddr, &clilen)) < 0) {
        if (errno == EINTR)
            continue;                /* interrupted by a signal: call accept again */
        else
            err_sys("accept error");
    }
    /* ... handle the new connection on connfd ... */
}

In this code, the interrupted system call is restarted. This method works for accept and for functions such as read, write, select and open. But there is one function that we cannot restart ourselves: connect. If connect returns EINTR, we cannot call it again, as doing so will return an immediate error. In this case we must call select to wait for the connection to complete.

wait() and waitpid() functions:

#include <sys/wait.h>

pid_t wait(int *statloc);

pid_t waitpid(pid_t pid, int *statloc, int options);

wait and waitpid both return two values: the return value of the function is the process ID of the terminated child, and the termination status of the child (an integer) is returned through the statloc pointer. Three macros (WIFEXITED, WIFSIGNALED and WIFSTOPPED) examine the termination status and tell whether the child terminated normally, was killed by a signal, or was just stopped by job control. Additional macros (WEXITSTATUS, WTERMSIG and WSTOPSIG) then fetch the exit status, the signal that killed the child, or the signal that stopped it.
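
A short sketch of how the status macros are typically used is given below. The function names describe_status and demo_exit are hypothetical, chosen for this example only; demo_exit forks a child that exits with a given code, reaps it and describes the status.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write a description of `status` into buf. */
const char *describe_status(int status, char *buf, size_t buflen)
{
    if (WIFEXITED(status))
        snprintf(buf, buflen, "exited normally, exit status %d",
                 WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, buflen, "killed by signal %d", WTERMSIG(status));
    else if (WIFSTOPPED(status))
        snprintf(buf, buflen, "stopped by signal %d", WSTOPSIG(status));
    else
        snprintf(buf, buflen, "unrecognised status 0x%x", (unsigned) status);
    return buf;
}

/* Fork a child that exits with `code`, reap it, describe its status. */
const char *demo_exit(int code, char *buf, size_t buflen)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return NULL;                   /* fork failed */
    if (pid == 0)
        _exit(code);                   /* child: terminate at once */
    if (waitpid(pid, &status, 0) != pid)
        return NULL;                   /* parent: could not reap the child */
    return describe_status(status, buf, buflen);
}
```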

If there are no terminated children for the process calling wait, but the process has one or more children that are still executing, then wait blocks until the first of the existing children terminates.

waitpid gives us more control over which process to wait for and whether or not to block. The pid argument specifies the process ID that we want to wait for; a value of -1 says to wait for the first of our children to terminate. The options argument lets us specify additional options. The most common option is WNOHANG, which tells the kernel not to block if there are no terminated children; if children are still executing, waitpid then returns 0 at once instead of waiting.

The pid argument of waitpid specifies a set of child processes for which status is requested. The waitpid() function shall only return the status of a child process from this set.

• If pid is equal to (pid_t) -1, status is requested for any child process. In this respect, waitpid() is similar to wait().

• If pid is greater than 0, it specifies the process ID of a single child process for which status is requested.

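• If pid is 0, status is requested for any child process whose process group ID is equal to that of the calling process.

• If pid is less than (pid_t) -1, status is requested for any child process whose process group ID is equal to the absolute value of pid.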
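
The behaviour of WNOHANG and of the pid argument can be seen in a small experiment. The sketch below is an assumption-laden demo, not code from the server: the function name wnohang_demo is hypothetical, and it assumes the calling process has no other children. A pipe keeps the child alive until the parent decides to let it exit.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if waitpid behaved as described, else -1. */
int wnohang_demo(void)
{
    int pfd[2], status;
    pid_t pid;

    if (pipe(pfd) < 0)
        return -1;
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                    /* child */
        char c;
        close(pfd[1]);
        if (read(pfd[0], &c, 1) < 0)   /* blocks until the parent closes its end */
            _exit(1);
        _exit(0);
    }
    close(pfd[0]);                     /* parent keeps only the write end */

    /* Child is still running: with WNOHANG, waitpid returns 0 at once. */
    if (waitpid(pid, NULL, WNOHANG) != 0)
        return -1;

    close(pfd[1]);                     /* child sees end-of-file and exits */

    /* pid > 0 and no WNOHANG: block until that one child terminates. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* pid == -1 with no children left: waitpid fails with ECHILD. */
    if (waitpid(-1, NULL, WNOHANG) != -1 || errno != ECHILD)
        return -1;
    return 0;
}
```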

Difference between wait and waitpid:

To understand the difference, the TCP/IP client programme is modified as follows:

The client establishes five connections with the server and then uses only the first one (sockfd[0]) in the call to str_cli. The purpose of establishing multiple connections is to spawn multiple children from the concurrent server.

When the client terminates, all open descriptors are closed automatically by the kernel and all five server children terminate at about the same time. This causes five SIGCHLD signals to be delivered to the parent at about the same time, which we show in the figure below:

It is this delivery of multiple occurrences of the same signal that causes the problem that we are talking about. By executing ps, one can see that the other children still exist as zombies.

Establishing a signal handler and calling wait from the signal handler are insufficient for preventing zombies. The problem is that all five signals are generated before the signal handler is executed, and the signal handler is executed only one time because Unix signals are not normally queued. This problem is also nondeterministic.

(If the client and server are on the same host, the handler is executed once and reaps one child, leaving four zombies. If we run the client and server on different hosts, the handler is normally executed twice: once as a result of the first signal being generated and, since the other four signals occur while the signal handler is executing, only once more. This leaves three zombies. However, depending on the timing of the FINs arriving at the server host, the signal handler is sometimes executed three or even four times.)

The correct solution is to call waitpid instead of wait. The server version of our sig_chld function handles SIGCHLD correctly because it calls waitpid within a loop, fetching the status of any of our children that have terminated. We must specify the WNOHANG option; this tells waitpid not to block if there are running children that have not yet terminated. We cannot call wait in a loop instead, because there is no way to prevent wait from blocking if there are running children that have not yet terminated. The final version of the server correctly handles a return of EINTR from accept, and it establishes a signal handler that calls waitpid for all terminated children.

The following programme shows the signal handler that calls waitpid(). The second part is the implementation of the complete server programme incorporating the signal handler.
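
A minimal sketch of such a handler is given here; it is a reconstruction under stated assumptions, not the original listing. The counter reaped_count and the helpers install_sig_chld and spawn_and_reap are hypothetical and exist only to make the effect observable: however few times the handler runs, the waitpid loop reaps every terminated child.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped_count = 0;   /* illustration only */

/* SIGCHLD handler: reap every child that has already terminated. */
void sig_chld(int signo)
{
    int saved_errno = errno;           /* waitpid may overwrite errno */

    (void) signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_count++;                /* one zombie removed per iteration */
    errno = saved_errno;
}

/* Install sig_chld without SA_RESTART, so accept may return EINTR. */
int install_sig_chld(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = sig_chld;
    sigemptyset(&act.sa_mask);
    return sigaction(SIGCHLD, &act, NULL);
}

/*
 * Demo: fork n children that exit at once, wait until the handler has
 * reaped them all. Returns 0 if no zombie is left, -1 otherwise.
 * Assumes SIGCHLD is not already blocked by the caller.
 */
int spawn_and_reap(int n)
{
    sigset_t block, old;
    int forked = 0;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);   /* so no wakeup is missed */

    reaped_count = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            _exit(0);                  /* child terminates immediately */
        forked++;
    }
    while (reaped_count < forked)
        sigsuspend(&old);              /* unblock SIGCHLD and sleep atomically */
    sigprocmask(SIG_SETMASK, &old, NULL);

    /* With every child reaped, waitpid reports ECHILD. */
    if (forked == n && waitpid(-1, NULL, WNOHANG) == -1 && errno == ECHILD)
        return 0;
    return -1;
}
```

Replacing the loop in sig_chld with a single call to wait would leave zombies behind whenever several children terminate before the handler runs.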

The three scenarios that we encounter in network programming are:

• We must catch the SIGCHLD signal when forking child processes.

• We must handle interrupted system calls when we catch signals.

• A SIGCHLD handler must be coded correctly using waitpid to prevent any zombies from being left around.
