*BSD News Article 14864


Return to BSD News archive

Xref: sserve comp.bugs.4bsd:1937 comp.os.386bsd.bugs:539
Newsgroups: comp.bugs.4bsd,comp.os.386bsd.bugs
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!howland.reston.ans.net!noc.near.net!uunet!pipex!uknet!mcsun!sun4nl!eur.nl!pk
From: pk@cs.few.eur.nl (Paul Kranenburg)
Subject: Re: flock broken - I could use some help
Message-ID: <1993Apr21.184636.1121@cs.few.eur.nl>
Sender: news@cs.few.eur.nl
Reply-To: pk@cs.few.eur.nl
Organization: Erasmus University Rotterdam
References: <C5t8wH.Hs@moxie.hou.tx.us>
Date: Wed, 21 Apr 1993 18:46:36 GMT
Lines: 59

In <C5t8wH.Hs@moxie.hou.tx.us> hackney@moxie.hou.tx.us (Greg Hackney) writes:


>It appears that flock() is causing panic's and reboots.

>The problem was discovered using 'lprm' in the LPD package, and also in
>the retry feature of smail3.1.28:

>Scenario: A process flocks() a file for a long period (2 minutes
>          or more). A 2nd process tries to flock the file in the
>          blocking mode, but it alarms out and exits. Later, when the
>          first process exits, the system panics and reboots.

>Below is some test code to make it fail. Compile master.c and slave.c,
>then type:    ./master&
>              ./slave

>In 120 seconds, the system will crash (if your's works like mine).

Your system isn't like mine. In mine, the kernel overwrote some process's
text :-)

The problem is a dangling pointer left in the lockf structure belonging to
the current lock holder. The offending process frees its lock structure
after breaking out of sleep() as a result of a signal. Possible fix:
scan the list of waiting locks to remove the lock that isn't going to be
used.

It might be the case that this bug present in other NET/2 (or later) based
systems as well, so I follow up to `comp.bugs.4bsd' as well.

-pk

------- ufs_lockf.c -------
*** /tmp/da16367	Wed Apr 21 20:36:22 1993
--- ufs/ufs_lockf.c	Wed Apr 21 20:35:47 1993
***************
*** 155,160 ****
--- 155,174 ----
  		}
  #endif /* LOCKF_DEBUG */
  		if (error = tsleep((caddr_t)lock, priority, lockstr, 0)) {
+ 			struct lockf	*b = lf_getblock(lock);
+ 
+ 			/* Don't leave a dangling pointer in block list */
+ 			if (b == block) {
+ 				/* Still there, find us on list */
+ 				while (b->lf_next != NOLOCKF) {
+ 					if (b != lock) {	
+ 						b = b->lf_next;
+ 						continue;
+ 					}
+ 					b->lf_next = b->lf_next->lf_next;
+ 					break;
+ 				}
+ 			}
  			free(lock, M_LOCKF);
  			return (error);
  		}