*BSD News Article 15939


Newsgroups: comp.os.386bsd.questions
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!haven.umd.edu!uunet!news.smith.edu!sophia.smith.edu!jfieber
From: jfieber@sophia.smith.edu (J Fieber)
Subject: Elusive SCSI problems (Adaptec 1542C)  <snarl!>
Message-ID: <1993May12.023959.20121@sophia.smith.edu>
Keywords: Adaptec 1542C, Quantum, SCSI, 386bsd
Sender: root@sophia.smith.edu (Operator)
Organization: Smith College
Date: Wed, 12 May 1993 02:39:59 GMT
Lines: 134

All ye out in netland, I hope some kind soul can help me
with a little problem...

Supposedly the Adaptec 1542C works fine with 386bsd with
the 0.2.3 patchkit.  Others have testified here that it does.

For me, it does not.  For a week I have been tearing my hair
out trying to find the source of the problem, and I don't
feel a bit closer to the solution at this point.

Here is the scoop thus far:


SETTING:
  A UM486V AIO motherboard --
   - 33MHz 80486dx
   - 6 ISA, 2 VL-bus slots
   - 256k external cache
   - local-bus IDE controller w/ Conner 240 meg drive
   - floppy controller
   - 2 serial, 1 parallel, 1 game port
   - AMI bios
  Orchid Fahrenheit 1280 VLB video card
  Adaptec 1542C SCSI
   - Quantum P105s hard drive
   - Archive Viper tape drive (2150s)
  386bsd 0.1 with 0.2.3 patchkit installed.
   

PROBLEM:
  Reading and writing from SCSI devices is unreliable and writing
  frequently results in a trashed filesystem.  There are also occasional
  crashes when reading/writing SCSI devices.
  
  It does not always fail.  For example, for several hours last night
  everything worked without a single error.  However, the *only* pattern
  I have been able to detect is that it works consistently well or
  consistently badly in any one session (i.e. it will work okay, but if
  I reboot, then it will not work at all until I reboot again, at which
  point who knows what it will do?!).  It seems as though something that
  happens at boot time determines the reliability of the SCSI system.
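
  One way to put a number on "good session" versus "bad session"
  would be a write-and-read-back check run a few times after each
  boot.  The following is only an illustrative sketch, not what I
  actually ran: the function names, sizes and scratch directory are
  made up, and a read-back may be served from the buffer cache
  rather than the disk, so a large file or a remount in between
  would make it more meaningful.

```python
import hashlib
import os
import tempfile


def write_and_verify(directory, size_bytes=1 << 20, chunk=64 * 1024):
    """Write random data to a scratch file in `directory`, read it back,
    and return True if the checksums of both passes match."""
    written = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                block = os.urandom(min(chunk, remaining))
                written.update(block)
                f.write(block)
                remaining -= len(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                read_back.update(block)
        return written.hexdigest() == read_back.hexdigest()
    finally:
        os.remove(path)  # always clean up the scratch file


def run_session(directory, passes=10):
    """Run `passes` write/verify cycles; return how many of them failed."""
    return sum(0 if write_and_verify(directory) else 1 for _ in range(passes))
```

  Calling run_session() on a directory on the SCSI disk once per
  boot and noting the failure count next to the boot would show
  whether sessions really split cleanly into all-good and all-bad.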

  The *only* things I have manipulated that have any observable 
  consequences are things that make it not work at all, such as removing
  termination from both ends of the bus.  Nothing else appears to 
  have any impact at all.  
  

These are some of the things I have tried:

DMA:
  I've tried the test built into the Adaptec card that does DMA
  to memory and checks the results.  All tests have passed.

CACHE:
  I've tried disabling the external cache, then the internal 
  as well.  This had no impact on the problem.

TERMINATION:
  Everything seems to work the same whether one end of the bus
  is terminated, it does not matter which, or both ends are
  terminated.  The card can't even find any devices if neither
  end is terminated.

CABLES:
  I've tried three different cables, one of which was only a
  couple inches long that I made just for testing.  All three
  cables work fine in my Amiga.  Likewise, they all have identical
  behaviour on the 486 box.

MEMORY:
  I've had problems with memory before; I had to replace one 
  SIMM shortly after I got the machine.  I also on rare occasion
  get NMI errors which I gather can be related to "cheap" memory.
  Thus I set out on a mission to find some flakey chips.

  The memory is 1 meg simms arranged in 2 banks of 4.  I put 
  together a series of tests ranging from simply copying a file to
  the SCSI disk to much more cpu and disk intensive tasks.  Then 
  I pulled one bank out and tried a bunch of different groups out 
  of the 70 possible groups of four.  I later discovered through
  retesting that I could not get even remotely consistent results
  from using the *same* set of simms.  Also, if there were more 
  than 4 bad chips it would be impossible to find them.  
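
  For the record, the arithmetic behind those numbers: 8 SIMMs
  taken 4 at a time gives 70 groups, and a group can only be clean
  if at least 4 SIMMs are good.  A small illustrative sketch (the
  SIMM numbering 0-7 and the helper name are arbitrary, not
  anything from my test scripts):

```python
from itertools import combinations
from math import comb

SIMMS = range(8)  # eight 1 meg SIMMs, numbered 0..7 (numbering is arbitrary)


def clean_groups(bad, group_size=4):
    """List every group of `group_size` SIMMs that contains none of `bad`."""
    bad = set(bad)
    return [g for g in combinations(SIMMS, group_size) if not bad & set(g)]


print(comb(8, 4))                          # 70 possible groups of four
print(len(clean_groups({2})))              # 35 clean groups with one bad SIMM
print(len(clean_groups({0, 1, 2, 3})))     # 1 clean group with four bad SIMMs
print(len(clean_groups({0, 1, 2, 3, 4})))  # 0: no clean group can exist
```

  So with exactly four bad SIMMs only one of the 70 groups would
  test clean, and with five or more none would.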

  Relatedly, it was suggested that the bus-on time of the
  Adaptec board might be too long.  I found commands in Julian's
  SCSI drivers to set these values on the board and got default 
  values from a very skeptical Adaptec technical support person.  
  I diddled with these values but it didn't fix the problem.  
  (Though it is entirely possible that I diddled incorrectly...
  Would some kernel hacker tell me how to adjust this properly?)

SHADOWING:
  Someone suggested that I may need to turn various "shadowing" 
  features off in BIOS.  I tried and the situation didn't improve. 
  In fact, one thing I turned off caused 386bsd to not even boot!

ADAPTEC CONFIG:
  I've tried about every configurable thing on the Adaptec board. 
  Some things made matters worse, but nothing made them better.  
  Recently I have been using the default configuration.

MS-DOS/WINDOWS:
  Everything appears to work fine in this environment.  I see this
  as meaning that either (a) 386bsd is broken or (b) that 386bsd 
  is exposing something else (hardware) that is broken that MS-DOS
  doesn't use or care about.


What I would like to do is stick the Adaptec board, my scsi drive
and my IDE drive (with 386bsd all installed) in another machine
and see what happens.  However, this will likely be quite difficult
if not impossible.  It *might* be possible to scare up some other
SIMMs so I could test the bad memory thesis some more.  Does
anybody know of any more things I could/should try that I can 
do with my machine alone that might lead me closer to a solution?

What is the possibility that there is a bug in the scsi driver?

If it is a hardware problem, I need some solid evidence as to 
whether the problem is with the computer or with the adaptec board
before I go demanding replacements for either.  It would be a lot
easier to convince dealers if this would break under ms-dog.

Thanks in advance.

-john

[NEWSFLASH: I decided to try out my Quantum PD210s and couldn't
 even disklabel it!?!?]
-- 
=== jfieber@sophia.smith.edu ================================================
======================================= Come up and be a kite!  --K. Bush ===