*BSD News Article 71921


Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!nntp.coast.net!zombie.ncsc.mil!news.mathworks.com!news.PBI.net!ns2.mainstreet.net!sloth.swcp.com!news.swcp.com!russo
From: russo@camel.swcp.com (Thomas Russo)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: An odd behavior when re-installing to new disk [long(winded)]
Date: 24 Jun 1996 20:22:35 GMT
Organization: Southwest Cyberport
Lines: 109
Message-ID: <RUSSO.96Jun24142236@camel.swcp.com>
NNTP-Posting-Host: camel.swcp.com


I've been upgrading bits and pieces of my old PC over the past week, and this
weekend had a fairly odd experience.  I would like to share it with you in the
hopes that someone out there can give me insight into what might have been
going on, and what I might do in the future to guard myself against
catastrophic disk problems.

The task:  replace my working 820MB IDE drive with a 2GB IDE drive and get my
           FreeBSD system running again with the new disk.

The intended procedure: make a full level 0 dump of all filesystems onto my
       4mm DDS SCSI tape drive immediately before shutting the system down
       for its surgery, install the new disk, install FreeBSD from the CD-ROM
       onto it, then boot it and load the backups from tape to recover the
       system I started with, only with lots more space available.  (Please,
       no comments on why I would bother putting a new IDE drive into a
       system that already has a SCSI controller... I simply found the
       premium for a SCSI drive not to be worth it in my case.)
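
As a rough sketch of the dump half of that plan: the script below only prints
the commands, it does not run them.  The filesystem list (/, /usr, /var) is an
assumption for illustration and is not stated in this post; /dev/nrst0 is the
no-rewind tape device mentioned further down.

```shell
#!/bin/sh
# Dry run: print one level 0 dump command per filesystem, nothing is executed.
# TAPE and the filesystem list are assumptions for illustration only.
TAPE=${TAPE:-/dev/nrst0}   # no-rewind device, so dumps land one after another

plan_dumps() {
    for fs in "$@"; do
        # -0: full (level 0) dump, -u: update /etc/dumpdates, -f: output device
        echo "dump -0u -f $TAPE $fs"
    done
    # rewind once all dumps have been written
    echo "mt -f $TAPE rewind"
}

plan_dumps / /usr /var
```

Using the no-rewind device means each dump is written as a separate file on
the same tape, in the order given.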

What actually happened: 

First off, as I expected, the atapi.flp boot floppy wouldn't work for me,
since its kernel has the wd2 and wd3 lines configured, and for my setup
(CD-ROM on the second IDE controller, configured as master) it seems I must
build a kernel without those devices in order for the CD-ROM to be probed
properly.  I see that some folks here have the same problem, and others
don't.  I couldn't figure
out a way to disable them without building a kernel, which would have meant
actually getting the system installed first, which would have meant that I'd
already accomplished my task, so I wrote off a CD-ROM installation and just
installed from a DOS partition after loading in all of the dists from the
CD-ROM.  No problems, and no complaints, since I know the ATAPI support is
only experimental, and also that if the wd2 and wd3 devices were disabled that
would cause someone else's installation to fail.  C'est la vie.  Installing
from the DOS partition was only a tiny bit more work, and it was slightly
gross to get DOS all over my hands for 10 minutes.  Still, it worked, and at
this point I had a minimal binary installation of FreeBSD.

That done, I booted, used the -c option to let the kernel know that my SCSI
controller is at an alternate address (so as not to conflict with another
card), and sure enough, the system came up, probed for SCSI devices and found
aha0 with all the right addresses, IRQs and DRQs, and st0, correctly
identified as a Wangtek 3100.

Great, now all that I need to do is to log in and do a restore from
/dev/nrst0.

No go.  No activity from the drive, and eventually I started getting st(0:2:0)
timeout messages on the console.  I tried lots of stuff, including disabling
every imaginable uninstalled device using -c on bootup, removing every card
that wouldn't be used during the re-install, and anything else I could think
of which might have caused the device to stop working since I put the new disk
in.  

To make a long story a little less long, I ultimately (at about 4:00 in the
morning) had to re-install the old disk as IDE master, the new disk as slave
on the same controller, boot my previously working version of FreeBSD (damn
good thing the drive was still completely functional!), mount each individual
partition of the new drive under /mnt, restore rf /dev/nrst0 for each one,
take the old drive out, etc. etc. etc.  Worked like a charm without a single
device timeout and no problems whatsoever, and as of today my system is
happily running away with all sorts of extra space on its brand spanking new
Cyrix 6x86-P120+, after years of life as a 386-40.  All of the devices
function, and it is clear that the problem was not one of hardware, but of
software configuration, the only thing that changed between the procedure that
worked and the one that didn't.
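
The recovery that finally worked can be sketched roughly as follows.  Again
this is a dry run that only prints commands.  The partition names (wd1a, wd1e,
wd1f) are hypothetical stand-ins for the new drive's partitions as seen when
it is the slave on the first controller; the order has to match the order in
which the dumps were written to the tape.

```shell
#!/bin/sh
# Dry run: print the mount/restore/umount sequence for each new partition.
# Partition names passed in are hypothetical; nothing is executed.
TAPE=${TAPE:-/dev/nrst0}   # no-rewind tape device

plan_restores() {
    for part in "$@"; do
        echo "mount /dev/$part /mnt"
        # restore rebuilds the filesystem into the current directory
        echo "(cd /mnt && restore rf $TAPE)"
        echo "umount /mnt"
    done
}

plan_restores wd1a wd1e wd1f
```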

This process leaves me a bit nervous.  I've been backing up onto that SCSI
tape drive for months now, often testing that the backups are readable,
assuming that I am safe because when my disk finally shuffles off this mortal
coil I can recover everything using the technique I outlined in "the intended
procedure".  And here I am, totally unable to use the tapes except under my
own kernel.  I should add that my own kernel is not much different from the
generic kernel --- I removed a bunch of devices I never plan to own (i.e. all
the SCSI controllers other than the one I actually have, the Adaptec 1542CP;
the network cards other than the one I have; proprietary interface CD-ROMs;
etc.), added ATAPI support so I can use the IDE CD-ROM, removed wd2 and wd3
support, and added all of the sound card lines (I tried adding only the lines
for my own sound card, a Soundblaster16, but that didn't work because
apparently there is interconnection between the drivers).  That's it.  I
really don't see what it is about my custom kernel which could let it see and
use my tape drive, when the generic kernel sees it, recognizes it, but won't
ever talk to it.  If anyone might have a hint or two for me, I'd love to hear
it.
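
For reference, the usual FreeBSD 2.x sequence for building such a custom
kernel looks roughly like the dry-run sketch below.  The config file name
MYKERNEL is hypothetical, and the script only prints the steps.

```shell
#!/bin/sh
# Dry run: print the customary custom-kernel build steps, nothing is executed.
# The config name (MYKERNEL below) is a hypothetical placeholder.
plan_kernel_build() {
    name=$1
    echo "cd /usr/src/sys/i386/conf"
    echo "cp GENERIC $name"
    # edit step: drop unused devices (e.g. wd2/wd3), add ATAPI and sound lines
    echo "vi $name"
    echo "config $name"
    echo "cd ../../compile/$name"
    echo "make depend"
    echo "make"
    echo "make install"
}

plan_kernel_build MYKERNEL
```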

So what's a man to do?  I'd sorta like to build my own boot floppy along the
lines of atapi.flp so I can use all of my devices during an emergency
re-install.  How does one do that?  I took a quick look in /usr/src/release,
but it wasn't immediately obvious how one goes about building the boot floppy
alone, although I could see that the makefile had some useful hints.  Any
offer of explicit instructions will be graciously accepted.

Anyhow, to cook all this down to something digestible:
  1) Is there a simple explanation for the st0 timeouts described?
  2) Has anyone else ever seen this kind of thing, or is my machine just a
     mutant? 
  3) Is there a way to make atapi.flp recognize an IDE CD-ROM on the second
     controller, given that the only way I've ever gotten the system to see it
     has been by disabling wd2 and wd3 as well as enabling ATAPI and wcd0?
  4) How hard is it to generate a custom boot/install floppy?  I've looked in 
     /usr/src/release and I think I might need a bit more info to get it done.
     I'm guessing that if I were to use a custom floppy which contained my own
     kernel to do the installation I'd be able to re-install onto my current 
     system using my own cd-rom and tape drives in the event that my disk
     dies and needs to be replaced.

--
Tom Russo                                   WWW: http://www.swcp.com/~russo/
Never put off until tomorrow what you can do today, because if you like it
today you can do it again tomorrow.