*BSD News Article 53085



Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!yarrina.connect.com.au!classic.iinet.com.au!swing.iinet.net.au!news.uoregon.edu!usenet.eel.ufl.edu!hookup!news.kei.com!news.mathworks.com!newsfeed.internetmci.com!news.sprintlink.net!news.charm.net!sowebo!cnordin
From: cnordin@charm.net (Craig Nordin)
Newsgroups: comp.unix.bsd.bsdi.misc
Subject: Re: Zombies? We don't need no stenkin' Zombies!
Date: 22 Oct 1995 03:23:32 -0400
Organization: Charm.Net : Baltimore Local Internet Access, Hon
Lines: 46
Message-ID: <cnordin.814346511@sowebo>
References: <cnordin.813997458@news.vni.net> <463ffn$gsd@lscruz.scf.lmsc.lockheed.com>
NNTP-Posting-Host: sowebo.charm.net


Idled looks like a plan!   Thank goodness for a plan!


Also, I received a very nice letter from a BSDI person
who writes really well and seems to know exactly where
my Zombies are coming from.  I'm reposting it here:


-----------------------------------------------------------------------
>Date: Wed, 18 Oct 1995 11:56:18 -0700
>From: Chris Torek <torek@elf.bsdi.com>
>Message-Id: <199510181856.LAA02975@elf.bsdi.com>
>To: cnordin@hq.vni.net
>Subject: Re: Zombies?  We don't need no stenkin' Zombies!
>Organization: Berkeley Software Design, Inc.

This is due to a combination of bugs, one large and one small.

The large bug is in the program that runs forever; the small bug
is in the kernel.

POSIX has a lot of details as to exactly how job control is to work.
One of the details involves sending SIGHUP and taking away access to
the terminal (serial port or pty or whatever).  The existing kernels
(through 2.0.1) get the latter slightly wrong: a program that tries
to read from the taken-away terminal gets an error instead of an EOF
(or maybe vice versa).

The large bug is that the program fails to check for error/EOF -- after
all, it is talking to a user, and therefore errors and EOFs are
`impossible', so if it gets one it just tries again, and again, and...

Some of these programs might appear to be OK if the kernel bug were
fixed (as it should be in 2.1, using a temporary hack until we can
fix it `right').  Others might still loop forever, and the ones that
would work on hangup would still fail on other errors (which may be
impossible now, but should always be considered possible someday).
-- 
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA	Domain:	torek@bsdi.com	+1 510 234 3167
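
To illustrate the check the letter describes (this is not code from
the letter), here is a minimal C sketch of an input loop that stops on
either EOF or error, so it behaves the same whichever one the kernel
reports after a hangup.  The function name consume_input and the
512-byte buffer are assumptions made for the example.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only.
 * Reads from fd until the input is exhausted.
 * Returns the total number of bytes read once EOF is seen,
 * or -1 on a real error (errno is left as set by read()).
 */
long
consume_input(int fd)
{
    char buf[512];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n > 0) {
            total += n;         /* real data: a program would process buf here */
            continue;
        }
        if (n == 0)
            return total;       /* EOF: stop reading, do not try again */
        if (errno == EINTR)
            continue;           /* interrupted by a signal: retrying is safe */
        return -1;              /* any other error: give up instead of spinning */
    }
}
```

A caller would typically exit (or at least stop reading from that
descriptor) on either return, rather than looping back to read again.
Retrying only on EINTR is the design choice that keeps the process
from spinning forever once its terminal has gone away.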

-- 

    cnordin@charm.net