*BSD News Article 3160


Path: sserve!manuel!munnari.oz.au!hp9000.csc.cuhk.hk!uakari.primate.wisc.edu!sdd.hp.com!cs.utexas.edu!uunet!usc!sol.ctr.columbia.edu!destroyer!ubc-cs!uw-beaver!yenbut
From: yenbut@cs.washington.edu (Voradesh Yenbut)
Newsgroups: comp.unix.bsd
Subject: Re: Hang up on heavy disk load (Re: Adaptec SCSI bug?)
Message-ID: <1992Aug6.055725.8222@beaver.cs.washington.edu>
Date: 6 Aug 92 05:57:25 GMT
References: <1992Aug3.150008.188@cotton.nc.u-tokyo.ac.jp>> <TMH.92Aug3185956@hektor.cs.tu-berlin.de> <1552@hcshh.hcs.de>
Sender: news@beaver.cs.washington.edu (USENET News System)
Organization: Computer Science & Engineering, U. of Washington, Seattle
Lines: 41


Let me share some of my experience. 

My system has had hang problems with the SCSI driver ever since I was
running MINIX.  The problem occurred after a SCSI drive had been
accessed for a while.  I found that invalid CCB messages could occur
(if I recall correctly, the actual message showed only a code number
or nothing at all), after which the 1542B board did not raise an
interrupt, so the system hung.  After spending some time debugging, I
found that lowering the DMA transfer speed of the 1542B to its lowest
setting, i.e., 3.3 MB/s, fixed the problem.

The same problem occurred when I ran BSD386, so I applied my patch to
as.c to lower the speed, but I then found two other problems.

   1) The SCSI disk could not be read reliably (multiple runs of fsck
on the same file system did not return the same result), and the
system would hang or crash (usually with vm_fault) after heavy disk
load; etcdist could never be extracted completely.  These problems did
not occur with the ESDI disk on my system.

After several desperate attempts, last weekend I found that if a
(64KB) buffer is provided inside as.c for the 1542B to put data into
on disk reads, and the buffer contents are then copied back to the
requester's buffer, the disk can be read more reliably and etcdist can
be fully extracted.  It looks to me as if a buffer may be paged out
during a DMA operation, or something is wrong with scatter reads into
multiple buffers.
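
To make the workaround concrete, here is a rough user-space sketch of
the idea, not the actual change to as.c.  All names in it
(bounce_buf, bounce_read, hw_read_fn, fake_disk_read) are made up for
illustration, the controller is replaced by a fake read function, and
splitting larger requests into 64KB chunks is an assumption on top of
the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE (64u * 1024u)   /* 64KB, as described above */

/* Driver-owned buffer.  In a real driver this would have to be memory
 * that is never paged out and that the controller can DMA into. */
static unsigned char bounce_buf[BOUNCE_SIZE];

/* Hypothetical stand-in for the controller: fill buf with len bytes
 * starting at byte offset off of the disk. */
typedef void (*hw_read_fn)(unsigned char *buf, size_t off, size_t len);

/* Read len bytes at disk offset off into dst.  The "hardware" only
 * ever writes into bounce_buf; the CPU then copies to the requester. */
static void bounce_read(hw_read_fn hw_read, unsigned char *dst,
                        size_t off, size_t len)
{
    assert(hw_read != NULL && dst != NULL);
    while (len > 0) {
        size_t chunk = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;
        hw_read(bounce_buf, off, chunk);   /* transfer lands in bounce_buf */
        memcpy(dst, bounce_buf, chunk);    /* copy back to the requester */
        dst += chunk;
        off += chunk;
        len -= chunk;
    }
}

/* Fake disk for testing: the byte at offset i is (i * 7 + 3) mod 251. */
static void fake_disk_read(unsigned char *buf, size_t off, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(((off + i) * 7u + 3u) % 251u);
}
```

Calling bounce_read(fake_disk_read, out, 1000, n) should fill out with
the same bytes that a direct fake_disk_read(out, 1000, n) would, at
the cost of one extra memory copy per read.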

   2) Occasionally "NMI port 61 a0, port 70 ff" appears, or the system
panics with trap type 19, when a SCSI disk is accessed.  When I get
the message, the system may have to be rebooted afterward.  So far I
haven't figured out what is wrong.  I would really appreciate it if
anybody has information on this.

My system uses:
	Micronics 486  80486-25 MHz ISA system CPU board

with the following disks and controllers:

	Ultrastor 12(F) with WREN 3 ESDI as the first drive
	Adaptec 1542B	with CDC94171 and Microp 1598