*BSD News Article 4971


Path: sserve!manuel!munnari.oz.au!spool.mu.edu!agate!soda.berkeley.edu!wjolitz
From: wjolitz@soda.berkeley.edu (William F. Jolitz)
Newsgroups: comp.unix.bsd
Subject: Re: Fixed: Runs at 8MHz, Crashes at 33MHz, 386bsd
Date: 13 Sep 1992 19:13:10 GMT
Organization: U.C. Berkeley, CS Undergraduate Association
Lines: 44
Message-ID: <1903s6INNr9u@agate.berkeley.edu>
References: <1992Sep11.200736.20247@qualcomm.com> <1992Sep12.134712.17755@news.cs.indiana.edu> <j=xnk_a.alm@netcom.com>
NNTP-Posting-Host: soda.berkeley.edu
Keywords: bug

In article <j=xnk_a.alm@netcom.com> alm@netcom.com (Andrew Moore) writes:
>In article <1992Sep12.134712.17755@news.cs.indiana.edu> "Michael Squires" <mikes@moose.cs.indiana.edu> writes:
>
>    QAPlus (DOS) long memory test uncovered an intermittent RAM error
                  ^^^^
If this means 32-bit wide memory test, then I think you are correctly isolating
the problem. Note that most PC memory checkers only do 16-bit loads/stores.

When I was originally doing 386bsd, I had real stability problems with my
386SX noname laptop. I eventually traced it to memory corruption. Yet, in
running various PC memory checkers for *days* without a peep, I could not
find the problem. I wrote a trivial checker in the bootstrap that ran in 32-bit
protected mode, and it failed in less than a second.
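
A minimal sketch of the kind of 32-bit test described above, written in C for
illustration. It is not the original bootstrap code; the names memtest32 and
memtest32_all and the choice of patterns are assumptions. It also assumes the
compiler turns each volatile uint32_t access into a single 32-bit load or
store, which is typical on a 386-class target but not guaranteed by the C
standard.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write one 32-bit value per word, then read each
 * word back and count mismatches. The word index is XORed into the
 * pattern so that neighbouring words hold different values.
 */
size_t memtest32(volatile uint32_t *base, size_t words, uint32_t pattern)
{
    size_t i, bad = 0;

    /* Pass 1: one full-width store per word. */
    for (i = 0; i < words; i++)
        base[i] = pattern ^ (uint32_t)i;

    /* Pass 2: one full-width load per word, compared with what was written. */
    for (i = 0; i < words; i++)
        if (base[i] != (pattern ^ (uint32_t)i))
            bad++;

    return bad;
}

/* Run the test with a few illustrative patterns; returns total mismatches. */
size_t memtest32_all(volatile uint32_t *base, size_t words)
{
    static const uint32_t patterns[] = {
        0x00000000u, 0xFFFFFFFFu, 0xAAAAAAAAu, 0x55555555u
    };
    size_t p, bad = 0;

    for (p = 0; p < sizeof(patterns) / sizeof(patterns[0]); p++)
        bad += memtest32(base, words, patterns[p]);

    return bad;
}
```

In a bootstrap, base would point at the physical memory under test; in a user
program it can only exercise an ordinary buffer, where a nonzero return would
be just as suspicious.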

The problem turned out to be two problems: 1) a mechanical problem with the
extended memory card's socket (cured with a pair of pliers), and 2) the
"extended setup" for the chipset did not set the "no interleave" bit for my
odd-banked 386SX. BTW, Windows-3 also tended to be screwed by the same
feature. Two I/O instructions and no more problems.

This problem recurred when I got an Ethernet card later. I found that it was
due to the I/O extension "backplane" in the laptop missing a few connections
to the Toshiba slot. The company ECO'ed the laptop when the trouble was
isolated, and mumbled something about expecting customers only ever to use
the slot for modems.

The moral of the story here is that for many PCs, 32-bit operation can
still be the "undiscovered country".

When I was working on 2.8BSD, the PDP 11/40 in the virus lab suddenly began
doing "impossible" things. Immediately I ran diagnostics. Nothing. Got DEC
out, and they found nothing. In frustration, I keyed in the following
machine program through the front panel:
1:	cmp	$0,$0
	beq	1b
	halt
It ran for about 30 seconds and stopped. Starting it again, it ran for a
minute and stopped. After much work swapping boards, backplanes, power
supplies, and phone calls, the DEC CE and I carefully cleaned the processor's
edge connectors, and it succeeded in running the "diagnostic" all night.


Bill.