Xref: sserve comp.bugs.4bsd:1937 comp.os.386bsd.bugs:539
Newsgroups: comp.bugs.4bsd,comp.os.386bsd.bugs
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!howland.reston.ans.net!noc.near.net!uunet!pipex!uknet!mcsun!sun4nl!eur.nl!pk
From: pk@cs.few.eur.nl (Paul Kranenburg)
Subject: Re: flock broken - I could use some help
Message-ID: <1993Apr21.184636.1121@cs.few.eur.nl>
Sender: news@cs.few.eur.nl
Reply-To: pk@cs.few.eur.nl
Organization: Erasmus University Rotterdam
References: <C5t8wH.Hs@moxie.hou.tx.us>
Date: Wed, 21 Apr 1993 18:46:36 GMT
Lines: 59
In <C5t8wH.Hs@moxie.hou.tx.us> hackney@moxie.hou.tx.us (Greg Hackney) writes:
>It appears that flock() is causing panic's and reboots.
>The problem was discovered using 'lprm' in the LPD package, and also in
>the retry feature of smail3.1.28:
>Scenario: A process flocks() a file for a long period (2 minutes
> or more). A 2nd process tries to flock the file in the
> blocking mode, but it alarms out and exits. Later, when the
> first process exits, the system panics and reboots.
>Below is some test code to make it fail. Compile master.c and slave.c,
>then type: ./master&
> ./slave
>In 120 seconds, the system will crash (if your's works like mine).
Your system isn't like mine. In mine, the kernel overwrote some process's
text :-)
The problem is a dangling pointer left in the lockf structure belonging to
the current lock holder. The waiting process frees its own lock structure
after a signal breaks it out of tsleep(), but that structure is still linked
on the holder's list of blocked locks. Possible fix: scan the list of waiting
locks and unlink the lock that isn't going to be used.
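To illustrate the general idea outside the kernel, here is a minimal user-space sketch of unlinking one entry from a singly linked list of waiters before its memory is freed. The names `struct waiter` and `unlink_waiter` are hypothetical and are not taken from ufs_lockf.c; the real lockf list layout is more involved, so treat this as an assumption-laden illustration rather than the actual fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a blocked lock request. */
struct waiter {
	int		 id;
	struct waiter	*next;	/* next request blocked on the same holder */
};

/*
 * Remove `w` from the singly linked list rooted at `*head`.
 * Returns 1 if `w` was found and unlinked, 0 if it was not on the list.
 * Walking with a pointer-to-pointer means the head entry needs no
 * special case.
 */
static int
unlink_waiter(struct waiter **head, struct waiter *w)
{
	struct waiter **pp;

	assert(head != NULL);
	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == w) {
			*pp = w->next;	/* bypass the entry being removed */
			w->next = NULL;	/* no stale link left in `w` */
			return 1;
		}
	}
	return 0;
}
```

A caller would run `unlink_waiter(&head, w)` before freeing `w`, so that nothing still on the list points at freed memory.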
This bug may be present in other NET/2 (or later) based systems as well,
so I am also following up to `comp.bugs.4bsd'.
-pk
------- ufs_lockf.c -------
*** /tmp/da16367 Wed Apr 21 20:36:22 1993
--- ufs/ufs_lockf.c Wed Apr 21 20:35:47 1993
***************
*** 155,160 ****
--- 155,174 ----
  		}
  #endif /* LOCKF_DEBUG */
  		if (error = tsleep((caddr_t)lock, priority, lockstr, 0)) {
+ 			struct lockf *b = lf_getblock(lock);
+
+ 			/* Don't leave a dangling pointer in block list */
+ 			if (b == block) {
+ 				/* Still there, find us on list */
+ 				while (b->lf_next != NOLOCKF) {
+ 					if (b != lock) {
+ 						b = b->lf_next;
+ 						continue;
+ 					}
+ 					b->lf_next = b->lf_next->lf_next;
+ 					break;
+ 				}
+ 			}
  			free(lock, M_LOCKF);
  			return (error);
  		}