*BSD News Article 22974



Newsgroups: comp.os.386bsd.bugs
Path: sserve!newshost.anu.edu.au!munnari.oz.au!constellation!osuunx.ucc.okstate.edu!moe.ksu.ksu.edu!vixen.cso.uiuc.edu!howland.reston.ans.net!usc!cs.utexas.edu!utnut!torn!nott!cunews!revcan!micor!latour!diana!db
From: db@diana.ocunix.on.ca (Dyane Bruce)
Subject: Re: SCSI disk I/O error
Message-ID: <1993Oct28.011738.16789@diana.ocunix.on.ca>
Organization: db Software
References: <1993Oct23.203652.4718@diana.ocunix.on.ca> <CFK6Fu.Grw@world.std.com>
Date: Thu, 28 Oct 1993 01:17:38 GMT
Lines: 81

In article <CFK6Fu.Grw@world.std.com> hd@world.std.com (HD Associates) writes:
>In article <1993Oct23.203652.4718@diana.ocunix.on.ca>,
>Dyane Bruce <db@diana.ocunix.on.ca> wrote:
...
>>I sometimes get a "sd0:reset" console error with subsequent consistent
>>"I/O error" on any command from the shell. Extracting gsrc
...
>valid" flag and disallows further I/O to the disk until it is fully
>closed and reopened.  Thus the big red switch.

  Ok. I hadn't looked at my old out of date copy of the draft SCSI II
specs yet. :-) 

>I think the sd driver could be extended to look at the additional sense
>code.  ASC=0x28 is "Not ready to ready transition, medium may have
>changed" and ASC=0x29 is "Power on, reset, or bus device reset
>occurred".  We could ignore ASC=0x29 and treat ASC=0x28 the same way as
>we are now, that is, no more I/O to an open device that someone may
>have changed.

  I haven't found this in my old copy of the spec. I suspect
my copy may be too ancient to include it. :-(

...
>Two points:
>
>1. I just looked through the source and don't see the "sd0:reset"
>message anywhere in any of the revs I have.  Netbsd is packed up right
>now, though, so it could be changed to say that in there.  You want to
>look around for the SDVALID flag.

  umm. For NetBSD 0.9, look in /usr/src/sys/scsi/sd.c around line 1200.
It's in the function "sd_interpret_sense".

(your mileage will vary depending on local patches)

		case 0x6:
			/*
			 * If we are not open, then this is not an error
			 * as we don't have state yet. Either way, make
			 * sure that we don't have any residual state
			 */
			if(!silent)
				printf("sd%d: reset\n", unit); 
			sd->flags &= ~(SDVALID | SDHAVELABEL);
			if (sd->openparts)
				return EIO;
			return ESUCCESS;	/* not an error if nothing's open */


I think what might work is this...

...
				printf("sd%d: reset\n", unit); 
			sd->flags &= ~(SDVALID | SDHAVELABEL);
			if (sense->ext.extended.info[0] == 0x29)
				return ESUCCESS;	/* reset only: let I/O continue */
			if (sd->openparts)
				return EIO;
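
The decision being proposed can be tried outside the kernel with a small
stand-alone sketch. It assumes SCSI-2 fixed-format sense data, where the
additional sense code (ASC) sits at byte 12 and byte 7 holds the additional
sense length. The function and macro names are made up for illustration and
are not the identifiers used in sd.c. Whether info[0] in the patch above
actually maps to byte 12 depends on the struct layout in your tree and is
worth checking before trusting it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical names for this sketch; not the identifiers used in sd.c. */
#define ASC_MEDIUM_MAY_HAVE_CHANGED 0x28
#define ASC_POWER_ON_OR_RESET       0x29

/*
 * Return the additional sense code from fixed-format sense data,
 * or -1 if the buffer is too short to contain it.
 * Byte 7 is the additional sense length (bytes following byte 7),
 * so byte 12 is present only when that length is at least 5.
 */
int sense_asc(const unsigned char *sense, size_t len)
{
	assert(sense != NULL);
	if (len < 13 || sense[7] < 5)
		return -1;
	return sense[12];
}

/*
 * Decide what to do with a UNIT ATTENTION.
 * Returns 0 to let I/O continue, EIO to refuse it.
 *  - ASC 0x29 (power on / reset): ignored, as proposed in the post.
 *  - anything else (including 0x28 or an unknown ASC): refuse I/O
 *    if any partition is open, since the medium may have changed.
 */
int unit_attention_action(const unsigned char *sense, size_t len,
    int open_parts)
{
	int asc = sense_asc(sense, len);

	if (asc == ASC_POWER_ON_OR_RESET)
		return 0;
	return open_parts ? EIO : 0;
}
```

Note that when the ASC cannot be read, the sketch falls back to the current
conservative behaviour rather than guessing.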

>2. Is your disk really giving back a UNIT ATTENTION?  If so, why?  It
>would be interesting to dump the full sense information when you get
>that condition and see what your drive is telling you.

  I will hack the driver to dump this info to the screen at least.
What is frustrating is that once the driver "hangs", it won't
log to the system log. (I might be able to get it to log
to a floppy :-). There is _only_ SCSI in this system,
no IDE etc.) It's a very nasty condition.
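
A rough idea of what such a dump could look like, written as a user-space
sketch so it can be tested away from the driver. The function name
format_sense is hypothetical. In the driver itself one would presumably
print each byte with the kernel printf instead of formatting into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format raw sense bytes as space-separated hex ("70 00 06 ...").
 * Writes a NUL-terminated string into out and returns its length.
 * If out is too small, the output stops at the last byte that fits.
 */
size_t format_sense(const unsigned char *sense, size_t len,
    char *out, size_t outlen)
{
	size_t used = 0;
	size_t i;

	assert(sense != NULL && out != NULL && outlen > 0);
	out[0] = '\0';
	for (i = 0; i < len; i++) {
		int n = snprintf(out + used, outlen - used, "%s%02x",
		    i ? " " : "", sense[i]);
		if (n < 0 || (size_t)n >= outlen - used) {
			out[used] = '\0';	/* drop the partial byte */
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

With fixed-format sense data, byte 2 of the dump carries the sense key in
its low four bits and bytes 12 and 13 carry the ASC and its qualifier, which
is what would answer the "why" above.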

  What has been doubly frustrating is that I have gone through the exercise
of double checking _all_ terminations, removing all external
devices and running with only the internal drive (resetting the
terminations again etc.), making sure the covers are on and
the monitor is away from the cables :-). And I still had _some_ problems.
Go figure. Maybe the drive just likes to reset....

-- 
Dyane Bruce				db@diana.ocunix.on.ca
29 Vanson Ave. Nepean On, K2E 6A9	So who first started the tradition of
613-225-9920				putting witty sayings in sigs anyway?