*BSD News Article 46585


Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!simtel!news.kei.com!newshost.marcam.com!zip.eecs.umich.edu!newsxfer.itd.umich.edu!europa.chnt.gtegsc.com!howland.reston.ans.net!vixen.cso.uiuc.edu!sdd.hp.com!svc.portal.com!news1.best.com!shell1.best.com!not-for-mail
From: rcarter@best.com (Russell Carter)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: two crash problems, anyone have any ideas? (similar experiences?)
Date: 8 Jul 1995 20:44:23 -0700
Organization: Best Internet Communications, Inc. (info@best.com)
Lines: 49
Distribution: world
Message-ID: <3tnjan$35g@shell1.best.com>
References: <3tfjvp$hbe@blob.best.net>
NNTP-Posting-Host: shell1.best.com

In article <3tfjvp$hbe@blob.best.net>, Matt Dillon <dillon@best.com> wrote:
>Configuration:
>    
>    128M memory, 130+ users, three SCSI disks (barracudas), load averages
>    around 10, NCR PCI SCSI controller, Etherlink III (ISA) ethernet.
>    pentium-90.
>
>    FreeBSD 2.0.5-RELEASE-BEST (SHELL) #4: Thu Jun 29 01:57:18 PDT 1995
>    (with last set of patches patched in).
>
>Problem #1:
>
>    Heavily loaded machine is running along.  Then, for no good reason,
>    anything requiring disk I/O comes to a screaming halt... if I happen
>    to have a vmstat running, it continues to go, but attempting to do
>    anything (such as ^C or run a program from an existing shell prompt)
>    blocks forever.
>
>    The vmstat shows a large number of processes blocking, virtually none
>    running, and disk I/O going to zero.
>
>    It is possible that some paging still works... I am able to send a packet
>    to my little rebooter daemon (which was swapped out and managed to swap
>    itself in) and it responds that it is calling reboot(), and at that 
>    point the kernel is able to sync its disks, but is UNABLE to dump ... 
>    just freezes solid.
>
>    So, unfortunately, it is impossible to get a crash dump out of the 
>    situation.  Worse, the machine doesn't panic, and it is impossible
>    to reboot it without hitting the hard reset.
>
>    The machine lasts anywhere from 30 minutes to a day and a half before
>    crashing and burning.
>

I can reproduce exactly these symptoms on my lightly loaded P54C-100
(64 MB, ncr + cdrom + st32550N + conner4326 DAT) *WHENEVER* I try to back
up 1GB+ of stuff to the DAT *AND* the DAT is living inside the case.  I
have watched the tps/s go to zero using ncrcontrol, then the file systems
just *ping* vanish.  Any further access causes an input/output error.  My
solution: pull the DAT out of the case.  Your possible problem: a drive
overheating?
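
One way to pin down when the I/O actually stops (and so to correlate it
with a backup run or a hot drive) is a small probe that times a write
plus fsync and reports whether it finished.  What follows is only a
minimal sketch, assuming Python is available on the box; probe_disk and
its status strings are made-up names for illustration, not part of
FreeBSD or ncrcontrol.

```python
import os
import tempfile
import threading
import time


def probe_disk(directory, timeout=5.0):
    """Time a 4 KB write+fsync in `directory` (hypothetical helper).

    Returns (status, elapsed_seconds):
      ("ok", t)         the write reached the disk in t seconds
      ("error: ...", t) the kernel returned an I/O or other OS error
      ("stalled", None) the write did not finish within `timeout`
    """
    result = {}

    def worker():
        start = time.monotonic()
        try:
            fd, path = tempfile.mkstemp(dir=directory)
            try:
                os.write(fd, b"x" * 4096)
                os.fsync(fd)  # force the data out to the device
            finally:
                os.close(fd)
                os.unlink(path)
            status = "ok"
        except OSError as exc:
            status = "error: %s" % exc
        # Single assignment so the caller never sees a half-filled result.
        result["outcome"] = (status, time.monotonic() - start)

    # Daemon thread: if the write blocks forever, the probe itself
    # can still return and the process can still exit.
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    return result.get("outcome", ("stalled", None))
```

Calling probe_disk("/var/tmp") once a minute and logging the result with
a timestamp to the console (not to the disk under test) would show the
moment the status flips from "ok" to "stalled" or to an error.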

Don't have any ideas about Problem #2, though.

If you find the solution to #1 I'm interested in hearing it.

Regards,
Russell