*BSD News Article 29461


Newsgroups: comp.os.386bsd.questions
Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!bunyip.cc.uq.oz.au!munnari.oz.au!news.Hawaii.Edu!ames!elroy.jpl.nasa.gov!swrinde!ihnp4.ucsd.edu!usc!howland.reston.ans.net!sol.ctr.columbia.edu!usenet.ucs.indiana.edu!bigbang.astro.indiana.edu!ahabig
From: ahabig@bigbang.astro.indiana.edu (Alec Habig)
Subject: Re: AHA1542 hanging with FreeBSD 1.1beta
Message-ID: <Co3zGC.64o@usenet.ucs.indiana.edu>
Sender: news@usenet.ucs.indiana.edu (USENET News System)
Nntp-Posting-Host: bigbang.astro.indiana.edu
Organization: Indiana University Astrophysics, Bloomington, IN
References: <Co3MBA.s3@pegasus.com>
Date: Mon, 11 Apr 1994 19:00:11 GMT
Lines: 51

richard@pegasus.com (Richard Foulk) writes:
>I have a 486/66 with an AHA1542CF SCSI card that locks up about
>once a day with the disk activity light on.
>
>Does this sound familiar to anyone?

Heh.  I was just going to post about a similar problem on my computer.  Here's
what the FAQ has to say.  I'd like some more information about the suggested
fix, though:

	a) Will the timeout be included in any future versions of FreeBSD's wd
	   drivers?
	b) Does anyone have the hacked source, to save us the time/danger of
	   hacking through the drivers ourselves, looking for the right while
	   loop?

My problem is similar to Richard's, but doesn't leave the HD light on.  OS/2 on
my system will also experience the hangs, but continues happily after about 10
seconds.  I have an ISA IDE drive (generic controller) on a sluggish 25MHz (no
cache) 386DX, running FreeBSD 1.1-beta, with 8MB RAM.

	Alec

From the FAQ : ----------------------------------------------------------------

4.1.7	The system hangs with the HD light on after intense disk usage.

	Brett Lymn (blymn@mulga.awadi.com.AU) provides us with a
	description of the problem and the steps that he had to take
	to fix it:

        It seems that, on some disk subsystems, the controller and the
	hard disk get out of synchronization when they are being used 
	intensively.  The result of this is that the disk completes a 
	command but the controller still believes the disk not to have 
	completed the command, so the controller status register 
	indicates the disk is busy when it is not really.  The standard 
	wd drivers are too trusting of the hardware and expect it to do 
	the right thing all the time.  There are a few while loops in 
	the wd drivers that poll for a status change from the disk 
	controller.  If the problem I have described takes place, the wd 
	driver gets stuck in one of these loops, waiting for the disk to 
	stop being busy, which never happens.  Because this is a 
	kernel-level wait, the whole machine locks up.  To fix this 
	problem I put a timeout 
	into the while loops so that after a specified time the wd driver 
	will give up waiting for the drive to become ready, reset the 
	controller and retry the command.  In my experience the retry 
	always succeeds.

	Ed. Note:  The retry doesn't ALWAYS work, but it IS better than 
	just waiting for the drive to wake back up (which it never does).
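
For anyone hunting for the loops in question, the fix described above might
look roughly like the sketch below.  This is NOT the actual wd driver source
and not Brett's patch: it is a user-space illustration, and every name in it
(fake_ctlr, read_status, reset_controller, wait_not_busy, do_command,
MAX_POLLS, MAX_RETRIES) is made up for the example.  The controller is
simulated so the logic can be tried without touching real hardware; the busy
bit is assumed to be 0x80 in the status register.

```c
#define STATUS_BUSY 0x80    /* assumed busy bit in the status register */
#define MAX_POLLS   100000L /* hypothetical poll budget before giving up */
#define MAX_RETRIES 3       /* hypothetical number of reset-and-retry rounds */

/* Simulated controller; a real driver would read an I/O port instead. */
struct fake_ctlr {
    int stuck;       /* nonzero: busy bit never clears until a reset */
    int polls_left;  /* polls remaining before busy clears normally */
    int resets;      /* how many times the controller has been reset */
};

/* Return the simulated status register contents. */
static unsigned read_status(struct fake_ctlr *c)
{
    if (c->stuck)
        return STATUS_BUSY;
    if (c->polls_left > 0) {
        c->polls_left--;
        return STATUS_BUSY;
    }
    return 0;
}

/* Simulated reset: un-wedges the controller and counts the reset. */
static void reset_controller(struct fake_ctlr *c)
{
    c->stuck = 0;
    c->polls_left = 10;
    c->resets++;
}

/*
 * Bounded replacement for an open-ended
 *     while (read_status(c) & STATUS_BUSY) ;
 * Returns 0 once the busy bit clears, -1 if the poll budget runs out.
 */
static int wait_not_busy(struct fake_ctlr *c)
{
    long n;

    for (n = 0; n < MAX_POLLS; n++)
        if (!(read_status(c) & STATUS_BUSY))
            return 0;
    return -1;
}

/* Issue a command; on timeout, reset the controller and retry. */
static int do_command(struct fake_ctlr *c)
{
    int attempt;

    for (attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        /* a real driver would write the command to the controller here */
        if (wait_not_busy(c) == 0)
            return 0;
        reset_controller(c);
    }
    return -1;  /* gave up; the caller should report an I/O error */
}
```

A real driver would presumably bound the wait by elapsed time rather than by
a raw poll count, since a fixed count means very different delays on a 386/25
and a 486/66.  As the Ed. Note says, the retry can still fail, which is why
do_command() gives up after MAX_RETRIES instead of looping forever.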