*BSD News Article 91682


Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!msunews!agate!ihnp4.ucsd.edu!munnari.OZ.AU!news.ecn.uoknor.edu!feed1.news.erols.com!news.maxwell.syr.edu!news.apfel.de!fu-berlin.de!news.belwue.de!LF.net!pi
From: pi@complx.LF.net (Kurt Jaeger)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: 2.1.7 && adaptec 2940U[W] crashing
Date: 20 Mar 1997 21:19:16 GMT
Organization: LF.net GmbH
Lines: 54
Message-ID: <5gs9kk$5bu$1@news.LF.net>
NNTP-Posting-Host: complx.lf.net
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:37499

Hi!

We recently upgraded our news server to 2.1.7 (it was 2.1.0 before that),
because with 2.1.0 it consistently hung (no panic).

That problem was solved with 2.1.7; it now crashes and reboots
reliably after approx. 24 hours of operation.

The error messages look something like the following:
 
[crash 1]
Mar 13 18:59:39 news /kernel: sd0(ahc0:0:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
Mar 13 18:59:39 news /kernel: SEQADDR == 0x11

[crash 2]
Mar 13 23:35:53 news /kernel: sd1(ahc0:1:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
Mar 13 23:35:53 news /kernel: SEQADDR == 0xd
Mar 13 23:41:09 news /kernel: panic: ahc0: Timed-out command times out again

At least it's not one specific disk that causes the panic. All the SCSI
features (disconnect/reconnect, SCSI speed > 5 MB/sec, etc.) were
disabled after that, with no effect. There are 4 disks in the system.
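
To check that the timeouts really are spread over several disks, the
kernel messages can be tallied per device. The following is only a
minimal sketch in Python, assuming the log lines have the same format
as the ones quoted above; the names TIMEOUT_RE and count_timeouts are
made up for this illustration and are not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches timeout lines in the format quoted above, e.g.
#   sd0(ahc0:0:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
# Assumption: every timeout message starts with "sdN(ahcM:target:lun): timed out".
TIMEOUT_RE = re.compile(r"(sd\d+)\((ahc\d+):(\d+):(\d+)\): timed out")


def count_timeouts(lines):
    """Return a Counter mapping a device name such as 'sd0' to its timeout count."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # The timeout lines from the crashes reported in this article.
    sample = [
        "Mar 13 18:59:39 news /kernel: sd0(ahc0:0:0): timed out while idle, "
        "LASTPHASE == 0x1, SCSISIGI == 0x0",
        "Mar 13 23:35:53 news /kernel: sd1(ahc0:1:0): timed out while idle, "
        "LASTPHASE == 0x1, SCSISIGI == 0x0",
        "Mar 13 23:41:09 news /kernel: panic: ahc0: Timed-out command times out again",
    ]
    for device, n in sorted(count_timeouts(sample).items()):
        print(device, n)
```

Feeding it the complete log file instead of the sample list would show
whether one drive dominates or whether, as here, every disk on the bus
is affected.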

We also switched controllers (2940 first, 2940UW afterwards), which made
no difference.

I compared the aic7xxx.c code of the 2.1.0 kernel with that of the 2.1.7
one. The difference is big. Maybe the older kernel hung because it was
not able to handle additional timeouts in a reasonable way? The new one
at least tries to.

Anyway, the next crash didn't take long, this time on yet another drive,
and it looks like it left the buffer cache pretty confused.

---------
sd2(ahc0:2:0): timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR == 0x8
pid 25152 (in.nnrpd), uid 23: exited on sig\M^?\^Ganic: ahc0: Timed-out command times out again

syncing disks... 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 giving up
Automatic reboot in 15 seconds - press a key on the console to abort

---------

The next things I'd like to do are a) shorten the SCSI cable (currently
approx. 1.5 m) and b) enable savecore. Does anyone have a guide on
how to use its result 8-} ?

-- 
MfG/Best regards, Kurt Jaeger                                  23 years to go !
LF.net GmbH        pi@LF.net
Vor dem Lauch 23   fon +49 711 90074-23
D-70567 Stuttgart  fax +49 711 7289041