*BSD News Article 26087



Path: sserve!newshost.anu.edu.au!munnari.oz.au!bunyip.cc.uq.oz.au!harbinger.cc.monash.edu.au!yeshua.marcam.com!usc!howland.reston.ans.net!gatech!bloom-beacon.mit.edu!mcrcim.mcgill.edu!homer.cs.mcgill.ca!storm
From: storm@cs.mcgill.ca (Marc WANDSCHNEIDER)
Newsgroups: comp.os.386bsd.misc
Subject: Re: Power outage mangles /etc
Date: 16 Jan 1994 05:12:42 GMT
Organization: SOCS, McGill University, Montreal, Canada
Lines: 52
Message-ID: <2haica$js0@homer.cs.mcgill.ca>
References: <1994Jan13.222516.11328@allegra.att.com>
NNTP-Posting-Host: mnementh.cs.mcgill.ca

In article <1994Jan13.222516.11328@allegra.att.com>,
Henning G. Schulzrinne <hgs@allegra.att.com> wrote:
>Last night, we had a brief power outage. All the 250-odd Suns and SGIs
>in this part of the lab came back up, with no dire consequences. My
>eight NCR PCs running NetBSD 0.9 all had random stuff (apparently from
>some include files) written over the /etc/ directory and who knows
>where else, forcing a reinstall of the operating system. fsck
>complained, but couldn't fix anything. While the NCR PCs were on a
>surge suppressor, most workstations are not. Is this expected behavior
>(i.e., are commercial derivatives of BSD like SunOS just plain more
>robust), is it the PC hardware or just a very bad day?

	i will guess that you have wd type drives running on
	these machines (i.e. ide, mfm, etc.).

	if this is the case, then you might have been hit by a pretty serious
	bug in the wd driver.  nobody seems to be sure what causes it, but
	there are some pretty annoying consequences.

	i had 50% of the files in the /etc directory overwritten once with
	the output of ls -alF from /src/XFree86/mit/server/ddx/x386 and
	on another occasion, a large percentage of the files in the /
	partition replaced with random block and character special
	files.

	somebody who was once doing some investigating found that
	things seemed to be off by exactly 512 bytes or something like
	that, and the wrong buffers were being written out [i honestly don't
	remember the details...].

	what is curious is that it happened to all 8 machines at the same
	time...  that tends to suggest that this -wasn't- the driver bug,
	seeing as it's such a coincidence, but the symptoms are exactly
	the same.....

	could you provide more details about the machines?

	i'm going to be pounding on a spare machine starting this week
	to see if i can shed any light on the problem.

	the problem is very easily solved by going to scsi drives, as i did
	[also for the dma transfer], but this really is an unacceptable
	way to fix a broken driver ;-)


							marc 'em.

-- 
-----------------------------------------------------------------------------
Marc Wandschneider					    Seattle, WA
Barney the Dinosaur sings! You faint... Barney sings!  Barney sings! --More--
You Die... --More--