*BSD News Article 14314


Xref: sserve comp.unix.ultrix:17360 comp.unix.programmer:8913 comp.unix.bsd:11794
Newsgroups: comp.unix.ultrix,comp.unix.programmer,comp.unix.bsd
Path: sserve!newshost.anu.edu.au!munnari.oz.au!constellation!osuunx.ucc.okstate.edu!moe.ksu.ksu.edu!zaphod.mps.ohio-state.edu!uwm.edu!msuinfo!uchinews!ellis!besp
From: besp@ellis.uchicago.edu (Anna Pluzhnikov)
Subject: Re: Peculiar behavior of pclose() -- mistery solved (long)
Message-ID: <1993Apr11.044800.6582@midway.uchicago.edu>
Sender: news@uchinews.uchicago.edu (News System)
Reply-To: besp@midway.uchicago.edu
Organization: University of Chicago
References: <PRZEMEK.93Mar28212112@rrdstrad.nist.gov>
Date: Sun, 11 Apr 1993 04:48:00 GMT
Lines: 158

In article <PRZEMEK.93Mar28212112@rrdstrad.nist.gov> przemek@rrdstrad.nist.gov (Przemek Klosowski) writes:
>
>I came across a peculiar behavior of pclose(); when only one file is
>popen()'ed, everything works as the man page and Stevens' book 
>claim: pclose() closes the pipe and kills the child process.

pclose() doesn't kill the child. It just closes the pipe to the child and
then waits for it, so the child gets EOF on its stdin and exits by itself.

>
>However, if two filehandles are popen()'ed, the subsequent pclose() 
>never returns (on Ultrix it sits forever in wait()).
>
>Does anyone have an idea what causes it and how to get around it?
> 
>This behavior has been checked on ConvexOS and Ultrix 4.3. I enclose a
>short program demonstrating this behaviour.
>
As well as on Linux, but not on SunOS.
>
>------------------------cut here-------------------------------------
>
>#include <stdio.h>
>
>int
>main(){
>  FILE * fp1, *fp2;
>  
>  /* this works fine (it better!!!) */
>  fp1 = popen ("cat","w");
>  fprintf (fp1,"This is output to first filehandle \n");   fflush (fp1);
>  pclose(fp1);
>
>  /* this is broken (nobody ran two filters at the same time, eh?) */
>  fp1 = popen ("cat","w");
>  fp2 = popen ("cat","w");
>  fprintf (fp1,"This is output to first filehandle\n");   fflush (fp1);
>  fprintf (fp2,"This is output to second filehandle\n");  fflush (fp2);
>
>  pclose(fp1);			/* this call never returns */
>  pclose(fp2);
>}


This was a fun problem to solve. I might even use it when I next interview
somebody claiming to be a UNIX wizard :-)

Let's see what's going on. (My system is Linux mozart 0.99.pl6-2 03/07/93 i486)
21:35:10 > cc -o tmp tmp.c
21:35:23 > strace tmp
uselib("/usr/lib//libc.so.4") = -1 (No such file or directory)
uselib("/lib//libc.so.4") = 0
brk(0) = 0x2000
brk(5000) = 0x5000
brk(6000) = 0x6000
/* this is inside first call to popen() */
pipe([3,4]) = 0
fork() = 1666
close(3) = 0
fstat(4, [dev 0 0 ino 0 nlnks 1 ...]) = 0
brk(7000) = 0x7000
write(4, "This is output to first filehand".., 36This is output to first filehandle 
) = 36
/* and this is inside first call to pclose() */
close(4) = 0
sgetmask() = 0
ssetmask([SIGHUP SIGINT SIGQUIT]) = 0
wait4(-1,  - [SIGCHLD]
EXITED(0), 0, (struct rusage *)0) = 1666
sgetmask([SIGHUP SIGINT SIGQUIT]) = 0x7
ssetmask(0) = 0x7
/* second popen() */
pipe([3,4]) = 0
fork() = 1668
close(3) = 0
/* third popen() */
pipe([3,5]) = 0
fork() = 1670
close(3) = 0
fstat(4, [dev 0 0 ino 0 nlnks 1 ...]) = 0
write(4, "This is output to first filehand".., 35This is output to first filehandle
) = 35
fstat(5, [dev 0 0 ino 0 nlnks 1 ...]) = 0
write(5, "This is output to second filehan".., 36This is output to second filehandle
) = 36
/* second pclose() */
close(4) = 0
sgetmask() = 0
ssetmask([SIGHUP SIGINT SIGQUIT]) = 0
wait4(-1, 
/* second pclose() never returns ??? */

And here it sits forever ...
In another xterm:

21:35:40 > ps -l
 F   UID   PID  PPID PRI NI SIZE  RSS WCHAN      STAT TT   TIME COMMAND
 4   101  1664    58   1  0   50  148 wait4      S    p2   0:00 strace tmp
34   101  1665  1664   1  0   30  156 wait4      S    p2   0:00 tmp
 4   101  1668  1665   1  0  270  296 pause      S    p2   0:00 sh -c cat
 4   101  1669  1668   1  0   26   88 pipe_read  S    p2   0:00 cat
 4   101  1670  1665   1  0  270  296 pause      S    p2   0:00 sh -c cat
 4   101  1671  1670   1  0   26   88 pipe_read  S    p2   0:00 cat
 4   101  1681  1677  24  0   81  176            R    p1   0:00 ps -l
(Irrelevant processes deleted)

So, for some reason the first "cat" (pid 1669) doesn't die even though its
parent close()d the necessary file.

Let's look some more:
21:35:55 > fstat
USER     COMMAND    PID    FD  DEV   INUM   SZ|DV MODE       NAME
(this is the parent, as expected it only has a pipe to the second shell)
anna     tmp       1665    wd  3,2    750     480 drwx--x--x 
anna     tmp       1665  text  3,2   5667   14104 -rwxr-xr-x 
anna     tmp       1665   lib  3,2   1457  623620 -rwxr-xr-x /lib/libc.so.4.3.3
anna     tmp       1665     0  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     tmp       1665     1  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     tmp       1665     2  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     tmp       1665     5* (pipe -> 0x75d000 0 1r 1w)

(this is the first shell, and it still has the pipe open !!! to whom ???)
anna     sh        1668    wd  3,2    750     480 drwx--x--x 
anna     sh        1668  text  3,2      6  218112 -r-xr-xr-x 
anna     sh        1668   lib  3,2    206  173060 -r-xr-xr-x /lib/libc.2.2.2
anna     sh        1668     0* (pipe <- 0x1ac000 0 1r 1w)
anna     sh        1668     1  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     sh        1668     2  3,2    161   4,194 crw--w--w- /dev/ttyp2

anna     cat       1669    wd  3,2    750     480 drwx--x--x 
anna     cat       1669   lib  3,2   1457  623620 -rwxr-xr-x /lib/libc.so.4.3.3
anna     cat       1669     0* (pipe <- 0x1ac000 0 1r 1w)
anna     cat       1669     1  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     cat       1669     2  3,2    161   4,194 crw--w--w- /dev/ttyp2

(AHA!!! This is the second shell. Its stdin comes from the pipe from "tmp",
but it also holds an open write end of the pipe going to the first shell,
which it inherited from "tmp" when we fork()ed inside the third popen().)
anna     sh        1670    wd  3,2    750     480 drwx--x--x 
anna     sh        1670  text  3,2      6  218112 -r-xr-xr-x 
anna     sh        1670   lib  3,2    206  173060 -r-xr-xr-x /lib/libc.2.2.2
anna     sh        1670     0* (pipe <- 0x75d000 0 1r 1w)
anna     sh        1670     1  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     sh        1670     2  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     sh        1670     4* (pipe -> 0x1ac000 0 1r 1w)

anna     cat       1671    wd  3,2    750     480 drwx--x--x 
anna     cat       1671   lib  3,2   1457  623620 -rwxr-xr-x /lib/libc.so.4.3.3
anna     cat       1671     0* (pipe <- 0x75d000 0 1r 1w)
anna     cat       1671     1  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     cat       1671     2  3,2    161   4,194 crw--w--w- /dev/ttyp2
anna     cat       1671     4* (pipe -> 0x1ac000 0 1r 1w)
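
The effect seen in the fstat output can be reproduced without popen() at all.
What follows is only a minimal sketch under my own assumptions, not the libc
code: the function name eof_after_parent_close() and the extra "ready" pipe
are made up for illustration, and the parent plays the reader here (in the
real program the reader is the first "cat"). A forked sibling either keeps or
drops its inherited copy of the write end, and the parent then checks whether
the read end reports EOF once it has closed its own copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the read end sees EOF after the parent closes its write end,
 * 0 if a read would block because some other process still holds a write
 * end, and -1 on error.  (Hypothetical helper, for illustration only.)
 */
int eof_after_parent_close(int sibling_keeps_write_end)
{
    int data[2], ready[2];
    char c;

    if (pipe(data) < 0 || pipe(ready) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Sibling: stands in for the second "sh -c cat". */
        close(ready[0]);
        close(data[0]);
        if (!sibling_keeps_write_end)
            close(data[1]);              /* drop the inherited write end */
        (void)write(ready[1], "x", 1);   /* tell the parent we are set up */
        for (;;)
            pause();                     /* linger until killed */
    }

    /* Parent: wait until the sibling has finished its setup. */
    close(ready[1]);
    ssize_t got = read(ready[0], &c, 1);
    assert(got == 1);
    close(ready[0]);

    close(data[1]);                      /* the first thing pclose() does */

    /* Probe the read end without blocking. */
    fcntl(data[0], F_SETFL, O_NONBLOCK);
    ssize_t n = read(data[0], &c, 1);
    int result;
    if (n == 0)
        result = 1;                      /* EOF: no writer is left */
    else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = 0;                      /* a writer still exists somewhere */
    else
        result = -1;

    close(data[0]);
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return result;
}
```

eof_after_parent_close(0) should return 1 and eof_after_parent_close(1)
should return 0. The second case is the situation of the first "cat" above:
its pipe still has a writer, so its read() just keeps waiting.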

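So the first "cat" never sees EOF: the second shell and the second "cat"
still hold the write end of its pipe open, and pclose(fp1) waits forever.
To solve the problem, we should just reverse the order of the pclose()s.
Closing fp2 first lets the second pair exit and drop their copies of the
descriptor, and everything will be happy ever after.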
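
For the record, here is roughly what the reordered program looks like. It is
only a sketch: the wrapper run_two_filters() and its cmd parameter are my own
additions for illustration, not part of the original test program, and I have
added the error checks the original left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/*
 * Starts two filters, writes one line to each, and closes them in the
 * reverse order of creation.  Returns 0 if both filters were reaped with
 * exit status 0, -1 otherwise.  (Hypothetical wrapper, illustration only.)
 */
int run_two_filters(const char *cmd)
{
    assert(cmd != NULL);

    FILE *fp1 = popen(cmd, "w");
    if (fp1 == NULL)
        return -1;

    FILE *fp2 = popen(cmd, "w");
    if (fp2 == NULL) {
        pclose(fp1);
        return -1;
    }

    fprintf(fp1, "This is output to first filehandle\n");
    fflush(fp1);
    fprintf(fp2, "This is output to second filehandle\n");
    fflush(fp2);

    /*
     * Close the most recently opened stream first.  The second child may
     * hold an inherited copy of fp1's descriptor, but the first child was
     * forked before fp2's pipe existed, so nothing else keeps fp2's pipe
     * open.
     */
    int st2 = pclose(fp2);
    int st1 = pclose(fp1);

    return (st1 == 0 && st2 == 0) ? 0 : -1;
}
```

Calling run_two_filters("cat") corresponds to the second half of the original
program with the two pclose() calls swapped. This only sidesteps the leak; it
does not stop the second child from inheriting the first pipe's descriptor.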

But! There is yet another mystery to solve: how come the original program
works under SunOS?