*BSD News Article 19223



Path: sserve!newshost.anu.edu.au!munnari.oz.au!constellation!osuunx.ucc.okstate.edu!moe.ksu.ksu.edu!vixen.cso.uiuc.edu!uwm.edu!wupost!uunet!decwrl!decwrl!rtech!amdahl!amdahl!agc
From: agc@uts.amdahl.com (Alistair G. Crooks)
Newsgroups: comp.os.386bsd.questions
Subject: Re: Hard Disk Errors
Message-ID: <31oq030bd8jL00@amdahl.uts.amdahl.com>
Date: 6 Aug 93 09:35:34 GMT
References: <23ho68$3ms@stimpy.css.itd.umich.edu> <1993Aug05.175822.1082@crash>
Organization: Amdahl Corporation, Sunnyvale CA
Lines: 55

> Alex Tang (altitude@css.itd.umich.edu) wrote:
> : Hi.  I've got NetBSD with one IDE drive and one SCSI drive.  Currently, I have
> : a swap partition on both.  My problem is that i keep getting both soft and
> : hard errors on my ide drive (both in the filesystem and in the swap.).  I'll
> : get a message that says something like 
> : 
> : hard error writing wd0e
> : fsbn 2352 of 2352-2359 (bn 39343 cn 154 bn 4 sn 5)
> 
> I get the same thing. When my system gets this hard error it crashes.

OK, here is what I've found out so far.  Chris Demetriou told me that an
IDE disk is just an st506 with bad sector forwarding.  My IDE drive (an
st2383a) is very old, so I reasoned that it was not doing bad sector
forwarding properly: NetBSD was finding bad blocks just where the BIOS
media analysis had found them.  Remembering that low-level formatting
of IDE disks won't work, I tried installing NetBSD with the disk
labelled as st506, and telling it to use bad144 (the DEC standard) to
forward bad sectors.
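
For anyone who hasn't met bad144-style forwarding, here is a rough
sketch of the lookup as I understand it from the bad144(8) manual page:
the table holds up to 126 entries, and entry number i is replaced by a
spare sector counted backwards from just before the last track (the
track that holds the table).  The struct and function names below are
my own, made up for illustration, and are not the kernel's.

```c
#include <stdint.h>

#define NBT_BAD 126   /* bad144 supports at most 126 bad sectors */

/* Illustrative table entry: cylinder, plus track in the high byte and
 * sector in the low byte (hypothetical names, not the kernel's). */
struct bad_entry {
    uint16_t cyl;
    uint16_t trksec;
};

/*
 * Return the absolute sector number of the replacement for the given
 * cylinder/track/sector, or -1 if that sector is not in the table.
 * secperunit is the total number of sectors on the disk, nsectors the
 * number of sectors per track.  Entry i maps to the sector i+1 places
 * before the start of the last track.
 */
long bad144_replacement(const struct bad_entry *tbl, int nbad,
                        unsigned cyl, unsigned trk, unsigned sec,
                        long secperunit, long nsectors)
{
    for (int i = 0; i < nbad && i < NBT_BAD; i++) {
        if (tbl[i].cyl == cyl && tbl[i].trksec == ((trk << 8) | sec))
            return secperunit - nsectors - i - 1;
    }
    return -1;   /* not a known bad sector: use it as-is */
}
```

The point is that both the table and the spares sit at the very end of
the disk, so anything that misjudges where the disk ends will look for
them in the wrong place.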

It went away, correctly left a bad sector cylinder at the end of the
disk, and said that it had correctly written the bad sector table.
However, once I had installed the kernel on the disk (at the end of
the install procedure), it tried to boot that kernel, and then
said "Bad badsect table".  This is because it does a BIOS read of the
last disk cylinder, and finds no bad sector table there.  Now this
could be for one of three reasons:

1. the table wasn't written correctly in the first place.
2. using the BIOS means that translation may take place during the read,
and I've got a 1747 cylinder IDE disk, way above the 1024 cylinder limit
set by the BIOS (thanks, IBM/Microsoft).
3. the badsect table was overwritten when the disklabel was written.
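
To put numbers on reason 2: the BIOS INT 13h interface passes the
cylinder in 10 bits (the low 8 in CH, the top 2 in bits 6-7 of CL), so
only cylinders 0 to 1023 can be named.  The sketch below shows what
would reach the BIOS if the last cylinder of a 1747 cylinder disk were
packed that way with no translation.  The function names are my own,
and whether a given BIOS wraps, rejects or translates such a request is
an assumption I haven't checked.

```c
#include <stdint.h>

/* INT 13h carries a 10-bit cylinder number: 0..1023. */
#define BIOS_MAX_CYLS 1024u

/* Can this cylinder be named at all through INT 13h? */
int bios_can_address(unsigned cyl)
{
    return cyl < BIOS_MAX_CYLS;
}

/*
 * Cylinder number that survives the 10-bit packing (illustrative
 * helper): low 8 bits go in CH, the next 2 bits in the top of CL,
 * and anything above that is simply lost.
 */
unsigned bios_visible_cyl(unsigned cyl)
{
    uint8_t ch    = (uint8_t)(cyl & 0xFFu);          /* low 8 bits  */
    uint8_t cl_hi = (uint8_t)((cyl >> 8) & 0x03u);   /* high 2 bits */
    return ((unsigned)cl_hi << 8) | ch;
}
```

With 1747 cylinders the last one is number 1746, which doesn't fit;
truncated to 10 bits it comes out as cylinder 722, nowhere near the end
of the disk.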

The bad144 code needs sorting out anyway.  In the NetBSD kernel, there
are four places where the DKBAD_MAGIC number is defined - it should be
defined once, in .../sys/dkbad.h (this is mentioned in an "XXX" comment -
I've not looked in the NetBSD current sources yet, so I don't know if
it's been done).  And I haven't checked out the source to /sbin/bad144
yet.

However, help is at hand here - I'm getting another disk today, and
hope to be able to build a NetBSD system on that - then I'll be able
to play about with the other (bad) disk, and see what the score is.

And if you think this is frustrating, I tried loading Linux (SLS 1.03),
and it couldn't even make a file system on the disk.

More as it happens...

Regards,
Alistair
--
Alistair G. Crooks (agc@uts.amdahl.com)			     +44 252 346377
Amdahl European HQ, Dogmersfield Park, Hartley Wintney, Hants RG27 8TE, UK.
[These are only my opinions, and certainly not those of Amdahl Corporation]