*BSD News Article 17808



Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!haven.umd.edu!uunet!mcsun!sunic!isgate!veda.is!adam
From: adam@veda.is (Adam David)
Newsgroups: comp.os.386bsd.bugs
Subject: pk-0.2.4 triggers hardware cache bug
Message-ID: <C9Huv6.II@veda.is>
Date: 1 Jul 93 16:23:15 GMT
Organization: Veda Systems, Iceland
Lines: 29

I have been having problems with the new 0.1.2.4 kernel (0.1 plus patchkit
0.2.4). It now crashes as often as it did before I modified aha1542.c to
invalidate the CPU cache after any SCSI DMA read. Yes, the chipset is a bogus
design.

I'm wondering what changed in the patchkit that would cause this problem
to reappear. I checked and double-checked: the modification is still in place
and enabled. I also tried inserting an extra INVALIDATE_CACHE at the start
of the aha interrupt routine, but that did not help.
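
A minimal sketch of where such an invalidation might sit in a completion
routine, for readers who have not seen the modification. Every name here
(invalidate_cache, xfer_done, struct xfer, the direction enum) is hypothetical
and is not taken from aha1542.c; the real INVALIDATE_CACHE would be a
privileged or chipset-specific operation, so this stand-in only counts calls
to stay runnable in user space. Invalidating only for device-to-memory
transfers is an assumption of the sketch, not something the driver is known
to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's cache invalidation. It only counts
 * calls; a real kernel would issue a privileged instruction or a
 * chipset-specific sequence here. */
static unsigned long invalidate_calls;

static void invalidate_cache(void)
{
    invalidate_calls++;
}

/* Hypothetical transfer directions, not the driver's real definitions. */
enum xfer_dir { XFER_NONE, XFER_DATA_IN, XFER_DATA_OUT };

struct xfer {
    enum xfer_dir dir;  /* direction of the DMA transfer */
    void *buf;          /* buffer the controller wrote to or read from */
    size_t len;         /* transfer length in bytes */
    int done;           /* set once completion processing has run */
};

/* Completion hook: invalidate before any code looks at the buffer, so the
 * CPU cannot be served stale lines for memory the controller just wrote. */
static void xfer_done(struct xfer *x)
{
    assert(x != NULL);
    if (x->dir == XFER_DATA_IN && x->len > 0)
        invalidate_cache();
    x->done = 1;
}
```

The point of putting the call first is ordering: if anything in the
completion path touches the buffer before the invalidation, the stale data
has already been read.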

For the last 3 days this machine has been crashing anywhere between 16 and 32
times a day; it crashed 3 times while I was writing this message. With
patchkit 0.2.3 and the INVALIDATE_CACHE at the start of aha_done() I was
seeing uptimes of 3 to 30 days (usually no more than 10 days).

As far as I can tell the only card to use DMA is the aha1542b controller.
There is no floppy inserted and no floppy access except during boot probing,
so it seems to be only the SCSI that is causing these frequent crashes at
the moment. The changes to sd.c do not seem to have any bearing on the matter.
Since I am not aware of any other relevant changes, it appears to be something
in the new spl or irq code that is triggering this. Any further ideas?

I agree that people should not cater to companies that make broken hardware,
but then why are we using i386-isa systems anyway? Surprisingly many
motherboard (chipset) designs expect the system software to be responsible
for keeping the cache coherent during DMA access. 386bsd 0.1 expects the
hardware to do the right thing instead. Obviously these two assumptions are
not compatible.
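
To illustrate the mismatch, here is a toy user-space model of a cache that is
not told about DMA writes. It is only a sketch under simplifying assumptions:
the cache is tracked per byte rather than per line, and all names (ram,
cache, cpu_read, dma_write, invalidate_cache) are made up for the example and
do not come from 386bsd.

```c
#include <assert.h>
#include <string.h>

#define MEM_SIZE 64

/* Toy model: a small RAM and a CPU cache that can shadow every byte of it.
 * Real caches work on lines and sets; this is deliberately simplified. */
static unsigned char ram[MEM_SIZE];
static unsigned char cache[MEM_SIZE];
static int cache_valid[MEM_SIZE];

/* CPU read: served from the cache when valid, otherwise filled from RAM. */
static unsigned char cpu_read(int addr)
{
    assert(addr >= 0 && addr < MEM_SIZE);
    if (!cache_valid[addr]) {
        cache[addr] = ram[addr];
        cache_valid[addr] = 1;
    }
    return cache[addr];
}

/* DMA write from a device: goes straight to RAM and the cache is never
 * notified, which is the behaviour of the chipsets complained about. */
static void dma_write(int addr, const unsigned char *src, int len)
{
    assert(addr >= 0 && len >= 0 && addr + len <= MEM_SIZE);
    memcpy(&ram[addr], src, (size_t)len);
}

/* Software-side fix: drop every cached entry so the next read hits RAM. */
static void invalidate_cache(void)
{
    memset(cache_valid, 0, sizeof cache_valid);
}
```

In this model, a cpu_read() of an address, followed by a dma_write() over it,
followed by another cpu_read(), returns the old value until
invalidate_cache() is called. That is the failure a kernel hits when it
assumes the hardware keeps the cache coherent and the hardware assumes the
kernel will.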

--
adam@veda.is