*BSD News Article 71567


#! rnews 2859 bsd
Newsgroups: alt.sys.sun,comp.bugs.4bsd
Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.rmit.EDU.AU!news.unimelb.EDU.AU!munnari.OZ.AU!news.mel.connect.com.au!yarrina.connect.com.au!minotaur.labyrinth.net.au!zen!sjg
From: sjg@zen.void.oz.au (Simon J. Gerraty)
Subject: SunOS: shutdown(s, 1) ineffective on socketpair()
Organization: Zen programming...
Message-ID: <1996Jun20.164411.12575@zen.void.oz.au>
Date: Thu, 20 Jun 1996 16:44:11 GMT
Lines: 80
Xref: euryale.cc.adfa.oz.au alt.sys.sun:10330 comp.bugs.4bsd:2101

I have reason to believe that SunOS (4.0 and 4.1.4 tested so far) has
a bug wrt shutdown(2) on a socketpair.  I'm wondering whether this is
a known problem and if perhaps there is a patch for it.

Background etc:

I've got this great new tool: SSLrshd - rsh over SSL, so no
hosts.equiv, .rhosts etc. Totally cool, secure etc etc. (No, not ready
for release.)

Because of the encryption going on, the server cannot simply exec the
command as rshd normally does, so SSLrshd uses a socketpair to talk to
a child running the command.  The following code is used to shuffle
data between the client and the child.


int
shuffle(net, child)
	int net, child;
{
	int pid, ppid;

	ppid = getpid();

	/*
	 * <sjg> BSD really should support signal(SIGCHLD, SIG_IGN)
	 * and have automatic reaping as in Sys V.
	 * Anyway we have the parent do stdin since it normally runs
	 * out before stdout.  Inetd will reap the parent, and the
	 * child will be inherited by init.
	 * Thus we avoid zombies, having to clean up after ourselves
	 * and the portability issues that introduces.
	 */
	switch(pid = fork()) {			/* yes again! */
	case -1:
		error("cannot fork");
		exit(1);
	case 0:
		copy(child, net);
		kill(ppid, SIGTERM);
		break;
	default:
		copy(net, child);
		shutdown(child, 1);
#ifdef sun
		/*
		 * <sjg> looks a lot like a bug...
		 * child never sees EOF so never dies.
		 */
		sleep(10);			/* REVISIT */
		shutdown(child, 2);		/* this stops everything */
#endif
		exit(0);
		break;
	}
	return 0;
}
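
The copy() helper called above is not shown in this post. As a rough
illustration of what it has to do for the EOF handling to matter, here
is a minimal sketch. The name copy_until_eof() and its return
convention are assumptions, and it ignores the SSL layer entirely: it
just moves bytes from one descriptor to the other until read() returns 0.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for copy(); the real helper is not shown.
 * Copies bytes from `from` to `to` until read() reports EOF.
 * Returns 0 on EOF, -1 on a read or write error.
 */
int
copy_until_eof(int from, int to)
{
	char buf[4096];
	ssize_t n, off, w;

	for (;;) {
		n = read(from, buf, sizeof(buf));
		if (n == 0)
			return 0;	/* writer closed or shut down its side */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry the read */
			return -1;
		}
		/* write() may be partial, so loop until all n bytes are out */
		for (off = 0; off < n; off += w) {
			w = write(to, buf + off, (size_t)(n - off));
			if (w < 0) {
				if (errno != EINTR)
					return -1;
				w = 0;	/* interrupted, retry the write */
			}
		}
	}
}
```

The point of interest is the n == 0 branch: it is only reached in the
child command once the other end of the socketpair has really been
half-closed, which is what shutdown(child, 1) is supposed to arrange.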

Anyway, running this on a 4.4BSD derived system, everything works
as expected, and using fstat, I can confirm there are no extra fds
lying about that would cause the child not to see EOF.

Running on SunOS, the child never exits unless the above #ifdef sun
code is added, and then sure enough after a 10s delay everything
shuts down as expected.  This simple fact demonstrates that EOF is being
seen by the server and shutdown(child, 1) is being called as expected.
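
For anyone wanting to check another system, the behaviour can be
isolated from SSLrshd with a small standalone test. The following is a
sketch only: the function name half_close_delivers_eof() and the 5
second timeout are assumptions made for illustration. One process reads
from its end of a socketpair, the other calls shutdown(fd, 1) while
keeping the descriptor open, and the result says whether the reader saw
EOF.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a reader on one end of a socketpair sees EOF after the
 * other end calls shutdown(fd, 1), 0 if it does not (the reader is
 * killed by SIGALRM after 5 seconds), and -1 on a setup error.
 */
int
half_close_delivers_eof(void)
{
	int sv[2], status;
	pid_t pid;
	ssize_t n;
	char c;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	switch (pid = fork()) {
	case -1:
		close(sv[0]);
		close(sv[1]);
		return -1;
	case 0:
		/* reader: drop sv[0] so only the parent holds that end */
		close(sv[0]);
		alarm(5);	/* default SIGALRM action kills us if EOF never comes */
		while ((n = read(sv[1], &c, 1)) > 0)
			;	/* discard data until EOF or error */
		_exit(n == 0 ? 0 : 1);
	default:
		close(sv[1]);
		if (write(sv[0], "x", 1) != 1 || shutdown(sv[0], 1) < 0) {
			close(sv[0]);
			waitpid(pid, &status, 0);
			return -1;
		}
		/* sv[0] stays open here, so any EOF comes from shutdown() alone */
		if (waitpid(pid, &status, 0) < 0) {
			close(sv[0]);
			return -1;
		}
		close(sv[0]);
		return WIFEXITED(status) && WEXITSTATUS(status) == 0;
	}
}
```

Calling it from a trivial main() and printing the result should be
enough. A system that behaves like the 4.4BSD derived one described
above would be expected to return 1, while the SunOS behaviour
described here would show up as 0 after the timeout.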

The above hack is ok for the trivial case and for demos etc but it is
hardly workable - no fixed "delay then die" scheme is. E.g.:

	echo "sleep 20; echo You missed" | SSLrsh sunos

will not work.

So, is this a known bug, and is there a common workaround?
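
One direction that avoids shutdown() altogether, offered as an untested
assumption rather than a confirmed SunOS fix, is to give the command
two ordinary pipes instead of a socketpair. Closing the write end of a
pipe delivers EOF by normal pipe semantics, so no half-close is needed.
The helper name spawn_with_pipes() is hypothetical, and the sketch
leaves stderr alone.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `cmd` under /bin/sh with stdin and stdout
 * on two separate pipes.  Returns the child's pid, or -1 on error.
 * *to_child is the parent's write end, *from_child its read end.
 * Assumes descriptors 0 and 1 are already open in the caller, so the
 * new pipe descriptors never collide with them.
 */
pid_t
spawn_with_pipes(const char *cmd, int *to_child, int *from_child)
{
	int in[2], out[2];
	pid_t pid;

	if (pipe(in) < 0)
		return -1;
	if (pipe(out) < 0) {
		close(in[0]);
		close(in[1]);
		return -1;
	}
	switch (pid = fork()) {
	case -1:
		close(in[0]);
		close(in[1]);
		close(out[0]);
		close(out[1]);
		return -1;
	case 0:
		dup2(in[0], 0);		/* command reads from the parent */
		dup2(out[1], 1);	/* command writes to the parent */
		close(in[0]);
		close(in[1]);
		close(out[0]);
		close(out[1]);
		execl("/bin/sh", "sh", "-c", cmd, (char *)0);
		_exit(127);		/* exec failed */
	default:
		close(in[0]);		/* parent keeps only its own ends */
		close(out[1]);
		*to_child = in[1];
		*from_child = out[0];
		return pid;
	}
}
```

With this layout the stdin side would do close(to_child) where the code
above calls shutdown(child, 1), and the stdout side would read
from_child until EOF.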

--sjg
-- 
Simon J. Gerraty        <sjg@zen.void.oz.au>

#include <disclaimer>   /* imagine something _very_ witty here */