*BSD News Article 48755


Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!simtel!news.sprintlink.net!in2.uu.net!noc.near.net!news3.near.net!news-server.bos.locus.com!orchard.la.locus.com!janus.la.locus.com!not-for-mail
From: fmayhar@janus.la.locus.com (Frank Mayhar)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Show-stopper crashes on 2.1-STABLE and 2.1.0-072695-SNAP.
Date: 17 Aug 1995 14:15:00 -0700
Organization: Locus Computing Corporation, Los Angeles, California
Lines: 306
Message-ID: <410bgk$3vij@janus.la.locus.com>
Reply-To: fmayhar@locus.com
NNTP-Posting-Host: janus.la.locus.com

I sent this to hackers, but since my home uucp provider has decided to
not accept any email from freefall.cdrom.com, it's doing me little good.
At the risk of injecting some hard-core technical discussion, I'm posting
this here to see if I can finally get this problem resolved.

The (slightly modified) message as sent to hackers follows:

My less immediately critical problem is with the XFree86 3.1.1 server; I
run the S3 server against an ISA Actix Graphics Engine Ultra +, and after
a short time, it appears to hang.  A ps (from an async terminal) shows it
racking up CPU; sometimes there are visual flashes or short-lived bits of
garbage on the screen shortly before it crashes.  Is there a known hardware
incompatibility between the ASUS motherboard and the Actix card?  Note
that I've ordered a Number Nine GXE64Pro, so this should shortly become
a non-issue, but I thought I would mention it anyway.

Now the critical problem:

I just upgraded my hardware to a P100/PCI motherboard (configuration is
below), and needed to upgrade to FreeBSD 2.0.5 or better since 1.1.5.1
(which I had been running) doesn't support my new hardware.  I've had
nothing but headaches.  The real problem is that I can't run News; email
works okay, but INN pounds the system hard enough that it crashes almost
immediately (if I don't run News, it crashes eventually, but it takes a
lot longer -- although I've had it crash immediately after boot).

My configuration:

Pentium 100, ASUS P55TP4XE motherboard, pipeline burst SRAM, 32 MB 60ns
memory, Adaptec 2940W, Maxtor LXT340sy + 2 Toshiba MK538FBs (340 MB, and
two 1.2 GB), Archive Viper 2150S tape, Actix GraphicsEngine Ultra +.

Crash:

Running 2.1.0-072695-SNAP, and INN 1.4 compiled with MMAP.  I get a
compressed uucp newsfeed; while unpacking news, the system quickly
crashes with:

Fatal trap 12: page fault while in kernel mode
fault virtual address		= 0xf81ef050
fault code			= supervisor write, page not present
instruction pointer		= 0x8:0x4018679c
code segment			= base 0x0, limit 0xfffff, type 0x1b
				= DPL 0, pres 1, def32 1, gran 1
processor eflags		= interrupt enabled, resume, IOPL=0
current process			= 11503 (gzip)
interrupt mask			= net tty bio
panic: page fault
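
As an illustration of how a trap message like the one above can be
sanity-checked, here is a minimal Python sketch that splits it into
name/value fields. The helper names (parse_trap, in_assumed_kernel_range)
are hypothetical and not part of FreeBSD or DDB, and the kernel base of
0xf0000000 is only an assumption inferred from the f01... text addresses
in the stack traces in this post.

```python
# A subset of the trap message, copied from the report above.
TRAP_TEXT = """\
fault virtual address   = 0xf81ef050
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0x4018679c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
current process         = 11503 (gzip)
"""

# Assumption: kernel addresses start here (inferred from the traces,
# not taken from any FreeBSD header).
ASSUMED_KERNEL_BASE = 0xF0000000


def parse_trap(text):
    """Return a dict of 'name = value' fields; unnamed continuation lines are skipped."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            fields[name.strip()] = value.strip()
    return fields


def in_assumed_kernel_range(addr_text):
    """True if the address (optionally 'selector:offset') is at or above the assumed base."""
    offset = addr_text.split(":")[-1]
    return int(offset, 16) >= ASSUMED_KERNEL_BASE


trap = parse_trap(TRAP_TEXT)
print(trap["fault code"])
print(in_assumed_kernel_range(trap["fault virtual address"]))
print(in_assumed_kernel_range(trap["instruction pointer"]))
```

Under that assumption the fault address is in the kernel range but the
instruction pointer is not, which may only mean the value was copied
down incorrectly from the console.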

I finally got a trace courtesy of DDB (not corresponding to the above
message, but the same fault code and under the same circumstances):

_trap_fatal(efbffb4c,c,f0c10500,efbffb4c,f0b43e00) at _trap_fatal+0x277
_trap_pfault(efbffb4c,0,f024fb10,f01cbd60,f26db6f8) at _trap_pfault+0x158
_trap(10,10,f26eb6f8,f01cbd60,efbffb94) at _trap+0x27b
calltrap(f01cbd60,2bfe0000,0,f26db6f8,0) at calltrap+0x15
_vm_hold_load_pages(f26eb6f8,f2bbc000,f2bbe000,f26db6f8) at _vm_hold_load_pages+0x4c
_allocbuf(f26eb6f8,2000,efbffc98,efbffd10,ffffffff) at _allocbuf+0x8a
_getblk(f0c7cd00,b,2000,0,0) at _getblk+0x23a
_bread(f0c7cd00,b,2000,ffffffff,efbffc98) at _bread+0x21
_ffs_blkatoff(efbffd10,f0c7cd00,efbfff0c,efbffef8,f0bf2700) at _ffs_blkatoff+0xc3
_ufs_lookup(efbffd74,0,efbfff0c,efbffee8,1) at _ufs_lookup+0x44a
_lookup(efbffee8,0,efbfff94,602,f0277f2c) at _lookup+0x256
_namei(efbffee8,0,efbfff94,f0c10500,f0c10500) at _namei+0x122
_vn_open(efbffee8,602,1b4,efbfff94,f0c10500) at _vn_open+0x5a
_open(f0c10500,efbfff94,efbfff8c,842006,0) at _open+0x97
_syscall(27,27,287d4,0,efbfd2fc) at _syscall+0x161

That was with 2.1.0-072695-SNAP.  This is with 2.1-STABLE:

I tried INN built with MMAP.  No joy, it looked like it crashed in a page
fault due to an mmap (nothing on the stack below the _trap_pfault()).  I
rebuilt INN with READ instead of MMAP.  It got further this time, but
finally crashed in _cache_lookup():

Fatal trap 12: page fault while in kernel mode
fault virtual address	= 0xfccb074c
fault code		= supervisor write, page not present
instruction pointer	= 0x8:0xf01265f5
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 4580 (innd)
interrupt mask		= 
panic: from debugger

Stack trace:
_cache_lookup(f0b90e00,efbffef8,efbfff0c,f0b90e00,efbfff0c) at _cache_lookup+0x195
_ufs_lookup(efbffd74,0,efbfff0c,efbffee8,1) at _ufs_lookup+0xdb
_lookup(efbffee8,0,efbfff94,602,f02535c4) at _lookup+0x256
_namei(efbffee8,0,efbfff94,f0cd6d00,f0cd6d00) at _namei+0x122
_vn_open(efbffee8,602,1b4,efbfff94,f0cdbd00) at _vn_open+0x5a
_open(f0cd6d00,efbfff94,efbfff8c,878006,0) at _open+0x97
_syscall(26,26,286d4,0,efbfd2f4) at 0x161
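
To make the overlap between the two traces explicit, here is a minimal
Python sketch that extracts the function names from each DDB trace and
lists the frames they share, counted from the system call entry inward.
The helper names (frame_names, common_outer_frames) are hypothetical;
the trace text is copied from the two traces above.

```python
import re

SNAP_TRACE = """\
_trap_fatal(efbffb4c,c,f0c10500,efbffb4c,f0b43e00) at _trap_fatal+0x277
_trap_pfault(efbffb4c,0,f024fb10,f01cbd60,f26db6f8) at _trap_pfault+0x158
_trap(10,10,f26eb6f8,f01cbd60,efbffb94) at _trap+0x27b
calltrap(f01cbd60,2bfe0000,0,f26db6f8,0) at calltrap+0x15
_vm_hold_load_pages(f26eb6f8,f2bbc000,f2bbe000,f26db6f8) at _vm_hold_load_pages+0x4c
_allocbuf(f26eb6f8,2000,efbffc98,efbffd10,ffffffff) at _allocbuf+0x8a
_getblk(f0c7cd00,b,2000,0,0) at _getblk+0x23a
_bread(f0c7cd00,b,2000,ffffffff,efbffc98) at _bread+0x21
_ffs_blkatoff(efbffd10,f0c7cd00,efbfff0c,efbffef8,f0bf2700) at _ffs_blkatoff+0xc3
_ufs_lookup(efbffd74,0,efbfff0c,efbffee8,1) at _ufs_lookup+0x44a
_lookup(efbffee8,0,efbfff94,602,f0277f2c) at _lookup+0x256
_namei(efbffee8,0,efbfff94,f0c10500,f0c10500) at _namei+0x122
_vn_open(efbffee8,602,1b4,efbfff94,f0c10500) at _vn_open+0x5a
_open(f0c10500,efbfff94,efbfff8c,842006,0) at _open+0x97
_syscall(27,27,287d4,0,efbfd2fc) at _syscall+0x161
"""

STABLE_TRACE = """\
_cache_lookup(f0b90e00,efbffef8,efbfff0c,f0b90e00,efbfff0c) at _cache_lookup+0x195
_ufs_lookup(efbffd74,0,efbfff0c,efbffee8,1) at _ufs_lookup+0xdb
_lookup(efbffee8,0,efbfff94,602,f02535c4) at _lookup+0x256
_namei(efbffee8,0,efbfff94,f0cd6d00,f0cd6d00) at _namei+0x122
_vn_open(efbffee8,602,1b4,efbfff94,f0cdbd00) at _vn_open+0x5a
_open(f0cd6d00,efbfff94,efbfff8c,878006,0) at _open+0x97
_syscall(26,26,286d4,0,efbfd2f4) at 0x161
"""

# A frame line starts with the function name followed by "(".
FRAME_RE = re.compile(r"^(\w+)\(")


def frame_names(trace):
    """Return the function names in a trace, innermost frame first."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def common_outer_frames(a, b):
    """Return frames shared by both traces, starting at the outermost frame."""
    shared = []
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        shared.append(x)
    return shared


shared = common_outer_frames(frame_names(SNAP_TRACE), frame_names(STABLE_TRACE))
print(" -> ".join(shared))
# prints: _syscall -> _open -> _vn_open -> _namei -> _lookup -> _ufs_lookup
```

Both crashes happen while open() is resolving a pathname; the traces
only diverge inside ufs_lookup().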

I've poked around in cache_lookup() and ufs_lookup() looking for any obvious
errors, but found nothing I could point to as the culprit; someone more
knowledgeable about the code needs to take a look at it, and maybe give me
some advice as to what to look at.

I can reproduce these pretty much at will (and they quite often happen all
by themselves, on an only slightly loaded system), so if there's anything
else that anyone needs me to look at, let me know.  If it's already solved,
so much the better (but I doubt that it is).  I'm available pretty much
any time for diagnostic work, so if someone wants to get on the phone with
me and work on this thing, send me email (at home at 'frank@exit.com' or
at work at 'fmayhar@locus.com', one will get forwarded to the other
automatically) and I'll give you my number.

Thanks in advance for any help anyone can give me.  I quite understand that
everyone is very busy (join the crowd), but this has been a great aggravation,
particularly since my only other choice is to leave my $2000 worth of nice
new hardware sitting idle while I go back to running 1.1.5.1 on a crufty old
486/33 box.

Here's my config file, and a copy of the 'dmesg' output, just for grins:
#
# GENERIC -- Generic machine with WD/AHx/NCR/BTx family disks
#
#	$Id: GENERIC,v 1.46 1995/06/11 19:31:11 rgrimes Exp $
#

machine		"i386"
cpu		"I586_CPU"
ident		TINKER
maxusers	64

#options		MATH_EMULATE		#Support for x87 emulation
options		INET			#InterNETworking
options		FFS			#Berkeley Fast Filesystem
options		NFS			#Network Filesystem
options		MFS			#Memory File System
options		MSDOSFS			#MSDOS Filesystem
#options		"CD9660"		#ISO 9660 Filesystem
options		PROCFS			#Process filesystem
options		"COMPAT_43"		#Compatible with BSD 4.3
options		"SCSI_DELAY=15"		#Be pessimistic about Joe SCSI device
#options		BOUNCE_BUFFERS		#include support for DMA bounce buffers
options		UCONSOLE		#Allow users to grab the console
options		"NSWAPDEV=4"
options		SYSVSHM
options		SYSVSEM
options		SYSVMSG
options		KTRACE			#kernel tracing

options		DDB			#kernel debugger

config		kernel	root on sd0 

controller	isa0
controller	pci0

controller	fdc0	at isa? port "IO_FD1" bio irq 6 drq 2 vector fdintr
disk		fd0	at fdc0 drive 0
disk		fd1	at fdc0 drive 1
#tape		ft0	at fdc0 drive 2

#controller	wdc0	at isa? port "IO_WD1" bio irq 14 vector wdintr
#disk		wd0	at wdc0 drive 0
#disk		wd1	at wdc0 drive 1

#controller	wdc1	at isa? port "IO_WD2" bio irq 15 vector wdintr
#disk		wd2	at wdc1 drive 0
#disk		wd3	at wdc1 drive 1

#controller	ncr0
#controller	ahc0
controller	ahc0	at pci? bio irq ? vector ahcintr

#controller	bt0	at isa? port "IO_BT0" bio irq ? vector btintr
#controller	uha0	at isa? port "IO_UHA0" bio irq ? drq 5 vector uhaintr
#controller	ahc1	at isa? bio irq ? vector ahcintr
#controller	ahb0	at isa? bio irq ? vector ahbintr
#controller	aha0	at isa? port "IO_AHA0" bio irq ? drq 5 vector ahaintr
#controller	aic0    at isa? port 0x340 bio irq 11 vector aicintr
#controller	nca0	at isa? port 0x1f88 bio irq 10 vector ncaintr
#controller	nca1	at isa? port 0x350 bio irq 5 vector ncaintr
#controller	sea0	at isa? bio irq 5 iomem 0xc8000 iosiz 0x2000 vector seaintr

controller	scbus0

device		sd0

device		st0

device		cd0	#Only need one of these, the code dynamically grows

device		wt0	at isa? port 0x300 bio irq 5 drq 1 vector wtintr
#device		mcd0	at isa? port 0x300 bio irq 10 vector mcdintr
#device		mcd1	at isa? port 0x340 bio irq 11 vector mcdintr

#controller	matcd0	at isa? port ? bio

#device		scd0	at isa? port 0x230 bio

# syscons is the default console driver, resembling an SCO console
device		sc0	at isa? port "IO_KBD" tty irq 1 vector scintr
# Enable this and PCVT_FREEBSD for pcvt vt220 compatible console driver
#device		vt0	at isa? port "IO_KBD" tty irq 1 vector pcrint
#options		"PCVT_FREEBSD=210"	# pcvt running on FreeBSD 2.1
options		XSERVER			# include code for XFree86

device		npx0	at isa? port "IO_NPX" irq 13 vector npxintr

device		sio0	at isa? port "IO_COM1" tty irq 4 vector siointr
device		sio1	at isa? port "IO_COM2" tty irq 3 vector siointr
device		sio2	at isa? port 0x338 tty irq 12 vector siointr
#device		sio2	at isa? port "IO_COM3" tty irq 5 vector siointr
#device		sio3	at isa? port "IO_COM4" tty irq 9 vector siointr

device		lpt0	at isa? port? tty irq 7 vector lptintr
#device		lpt1	at isa? port? tty
#device		lpt2	at isa? port? tty

# Order is important here due to intrusive probes, do *not* alphabetize
# this list of network interfaces until the probes have been fixed.
# Right now it appears that the ie0 must be probed before ep0. See
# revision 1.20 of this file.
#device de0
#device ed0 at isa? port 0x280 net irq  5 iomem 0xd8000 vector edintr
#device ed1 at isa? port 0x300 net irq  5 iomem 0xd8000 vector edintr
#device ie0 at isa? port 0x360 net irq  7 iomem 0xd0000 vector ieintr
#device ep0 at isa? port 0x300 net irq 10 vector epintr
#device ix0 at isa? port 0x300 net irq 10 iomem 0xd0000 iosiz 32768 vector ixintr
#device le0 at isa? port 0x300 net irq 5 iomem 0xd0000 vector le_intr
#device lnc0 at isa? port 0x280 net irq 10 drq 0 vector lncintr
#device lnc1 at isa? port 0x300 net irq 10 drq 0 vector lncintr
#device ze0 at isa? port 0x300 net irq 5 iomem 0xd8000 vector zeintr
#device zp0 at isa? port 0x300 net irq 10 iomem 0xd8000 vector zpintr

controller	snd0
device sb0      at isa? port 0x220 irq 7 conflicts drq 1 vector sbintr
device pas0     at isa? port 0x388 irq 10 drq 6 vector pasintr
device opl0     at isa? port 0x388

pseudo-device	loop
pseudo-device	ether
pseudo-device	log
#pseudo-device	sl	1
# ijppp uses tun instead of ppp device
#pseudo-device	ppp	1
pseudo-device	tun	2
pseudo-device	pty	16
pseudo-device	gzip		# Exec gzipped a.out's
pseudo-device	bpfilter	4	#Berkeley packet filter


FreeBSD 2.1-STABLE #0: Wed Aug 16 08:34:38  1995
    root@exit.com:/usr/src/sys/compile/TINKER
CPU: 99-MHz Pentium 735\90 or 815\100 (Pentium-class CPU)
  Origin = "GenuineIntel"  Id = 0x525  Stepping=5
  Features=0x1bf<FPU,VME,PSE,MCE,CX8,APIC>
real memory  = 33161216 (8096 pages)
avail memory = 31178752 (7612 pages)
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
sio2 at 0x338-0x33f irq 12 on isa
sio2: type 16450
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: NEC 72065B
fd0: 1.44MB 3.5in
fd1: 1.2MB 5.25in
wt0 not found at 0x300
npx0 on motherboard
npx0: INT 16 interface
sb0 at 0x220 irq 7 drq 1 on isa
sb0: <SoundBlaster 2.0>
pas0 at 0x388 irq 10 drq 6 on isa
pas0: <CDPC rev 0>
opl0 not probed due to I/O address conflict with pas0 at 0x388
Probing for devices on the pci0 bus:
	configuration mode 1 allows 32 devices.
chip0 <CPU-PCI bridge> rev 2 on pci0:0
chip1 <PCI-ISA bridge> rev 2 on pci0:7
ahc0 <Adaptec 2940 SCSI host adapter> rev 0 int a irq 11 on pci0:12
ahc0: reading board settings
ahc0: Reading SEEPROM...done.
ahc0: 2940 Wide Channel, SCSI Id=7, aic7870, 16 SCBs
ahc0: Downloading Sequencer Program...Done
ahc0: Probing channel A
ahc0 waiting for scsi devices to settle
ahc0: target 0 synchronous at 4.4MB/s, offset = 0xf
(ahc0:0:0): "MAXTOR LXT-340S 6.74" type 0 fixed SCSI 1
sd0(ahc0:0:0): Direct-Access 324MB (665154 512 byte sectors)
ahc0: target 1 synchronous at 6.67MB/s, offset = 0xf
(ahc0:1:0): "TOSHIBA MK538FB 6061" type 0 fixed SCSI 2
sd1(ahc0:1:0): Direct-Access 1170MB (2396970 512 byte sectors)
ahc0: target 2 synchronous at 6.67MB/s, offset = 0xf
(ahc0:2:0): "TOSHIBA MK538FB 6030" type 0 fixed SCSI 2
sd2(ahc0:2:0): Direct-Access 1172MB (2400302 512 byte sectors)
pci0: uses 4096 bytes of memory from fbff7000 upto fbff7fff.
pci0: uses 256 bytes of I/O space from e400 upto e4ff.
bpf: lo0 attached
bpf: tun0 attached
bpf: tun1 attached
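
As a quick consistency check on the figures in the dmesg output above,
the following Python sketch recomputes the page counts and disk sizes.
It assumes 4096-byte pages (the i386 page size) and that the "MB" values
printed for the disks are 2^20-byte units rounded down; both are
assumptions, though they match the numbers shown.

```python
PAGE_SIZE = 4096      # assumed i386 page size in bytes
SECTOR_SIZE = 512     # bytes per sector, as printed by the sd driver
MIB = 1024 * 1024     # assumed meaning of "MB" in the sd lines

# (bytes, pages) as reported on the "real memory" and "avail memory" lines.
memory = {"real": (33161216, 8096), "avail": (31178752, 7612)}

# (sectors, MB) as reported for sd0, sd1 and sd2.
disks = {"sd0": (665154, 324), "sd1": (2396970, 1170), "sd2": (2400302, 1172)}

for name, (nbytes, pages) in memory.items():
    # Each memory figure should be an exact multiple of the page size.
    print(name, nbytes // PAGE_SIZE == pages, nbytes % PAGE_SIZE == 0)

for name, (sectors, reported_mb) in disks.items():
    # Sector count times sector size, rounded down to whole MB.
    print(name, sectors * SECTOR_SIZE // MIB == reported_mb)
```

Every comparison comes out true, so the reported memory and disk sizes
are at least internally consistent.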
-- 
Frank Mayhar frank@exit.com