*BSD News Article 85594



Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!news.rmit.EDU.AU!news.unimelb.EDU.AU!munnari.OZ.AU!news.ecn.uoknor.edu!news.wildstar.net!newsfeed.direct.ca!portc01.blue.aol.com!cliffs.rs.itd.umich.edu!howland.erols.net!feed1.news.erols.com!news
From: Ken Bigelow <kbigelow@www.play-hookey.com>
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: Major system problems...
Date: Wed, 25 Dec 1996 20:35:49 +0000
Organization: Erol's Internet Services
Lines: 89
Message-ID: <32C19025.4FE0@www.play-hookey.com>
References: <59nn0r$h64@robin.theramp.net>
Reply-To: kbigelow@www.play-hookey.com
NNTP-Posting-Host: kenjb05.play-hookey.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 3.0 (Win16; U)
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:33101

Jordan Klein wrote:
> 
> I am having major failures on my system, with FreeBSD and need some
> assistance interpreting the errors.  Let me describe my hardware
> before going further:
> 
> 486/120 system board, AMD processor, 256 K cache
> 48mb non-parity ram, 3 16mb simms
> Genoa Phantom 64V-2001 video with 2mb vram (S3 Trio64+)
> Adaptec AHA-2940 PCI SCSI controller
> Generic NE2000 ethernet card (isa)
> Sound Blaster 16 (isa)
> Fujitsu M1606S-512 1 gig drive (internal, id 0)
> Fujitsu M2694ES-512 1 gig drive (internal, id 1)
> NEC 8Xi CDROM (internal, id 2)
> Quantum FIREBALL1080S 1 gig drive (external, id 3)
> Archive Python 28388-XXX 4mm DAT (external, id 4)
> 
> My SCSI bus is terminated properly (last internal drive, and last
> external drive).  This system was working fine when I first
> started with FreeBSD, and for about a year with Linux, Windows 95,
> and Windows NT (3.51 and 4.0).
> 
> Here's the problem.  When I'm using it, things seem fine at first.
> It boots just fine, I have no problem getting my PPP networking
> going, X, etc.  But after a while, things deteriorate.  Worse,
> it seems things deteriorate faster if I do something that does
> a lot of disk access.  I've had data corruption, random reboots,
> kernel panics, and so forth.  I know that I'm having hardware
> problems, but I can't identify the source.  I've already swapped
> the system board, that didn't fix anything.  I've run memory tests
> on my simms, and that showed them working fine.  If someone can
> identify the following error messages and give me an idea what's
> failing, I would be VERY grateful.  Most of the hardware is
> still under warranty, so it's just a matter of getting it
> replaced.
> 
> Two most recent errors:
> ahc0: Issued Channel A Bus Reset.  2 SCBs aborted
> sd1(ahc0:1:0) timed out in dataout phase, SCSISIGI==0x0
> SEQADDR==0x0
> spec_getpages: I/O read error
> vm_fault: pager input (probably hardware) error, PID 236 failure
> Fatal trap 12: page fault while in kernel mode
> 
> This was when I tried to play an audio cd with cdcontrol.
> 
> And next, while rebooting because netscape 3.01 was segfaulting
> (signal 11) on me:
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x83015a61
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xf0178590
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL=0
> current process         = 9029 (reboot)
> interrupt mask          =
> panic: page fault
> 
> I get these kinds of errors at random times, and all I can think
> is that it's probably either my memory, or something on my
> SCSI bus is failing.  Any help in pinpointing the offending
> hardware would be greatly appreciated.  Thank you.

I ran into a wide range of random faults for a while, on hardware that
every diagnostic program I had reported as perfectly good. After a long
series of trial-and-error steps, I finally isolated the problem to the
cache RAM. I can't be sure that your problem is the same, of course, but
it's easy to check.

Try disabling the external cache in the ROM BIOS setup (the internal
cache is on the CPU; you can leave it enabled). Then run the system and
see what happens. If it now works correctly, you may want to check the
cache chips directly. I discovered that FreeBSD is so much more
efficient than MeSs-DOS that it will almost certainly overrun 20-ns
cache chips. With 15-ns cache chips, however, I could run the cache
with no trouble.

It could also be a pattern fault in one of your SIMMs, but I'd check the
cache first.
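
To show what a pattern fault means in practice, here is a minimal
sketch in Python (all function and variable names are hypothetical).
It fills a buffer with several bit patterns and reads each one back; a
pattern-sensitive fault is one that appears only for some of them. It
runs in user space and only touches the pages the OS happens to hand
it, so a clean run does not clear the SIMMs. Treat it as an
illustration of the idea, not a substitute for a real memory tester.

```python
# Illustrative sketch only: a user-space pattern test, not a replacement
# for a dedicated memory tester. All names here are hypothetical.

PATTERNS = (0x00, 0xFF, 0x55, 0xAA)  # all zeros, all ones, alternating bits


def walking_ones():
    """Yield byte patterns with a single bit set: 0x01, 0x02, ... 0x80."""
    for bit in range(8):
        yield 1 << bit


def check_pattern(buf, pattern):
    """Fill buf with pattern, read it back, return the offsets that differ."""
    size = len(buf)
    buf[:] = bytes([pattern]) * size
    return [i for i in range(size) if buf[i] != pattern]


def pattern_test(size=1 << 20):
    """Run every pattern over a buffer of `size` bytes.

    Returns a dict mapping each failing pattern to its bad offsets;
    an empty dict means no mismatch was seen.
    """
    buf = bytearray(size)
    failures = {}
    for pattern in list(PATTERNS) + list(walking_ones()):
        bad = check_pattern(buf, pattern)
        if bad:
            failures[pattern] = bad
    return failures


if __name__ == "__main__":
    result = pattern_test()
    print("no mismatches" if not result else result)
```

A test that writes only one pattern can pass on a SIMM that fails for
another, which is why several patterns are tried in turn.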

I hope this helps!
-- 
Ken

Are you interested in   |
byte-sized education    |   http://www.play-hookey.com
over the Internet?      |