*BSD News Article 91842



Newsgroups: comp.unix.bsd.bsdi.misc
Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!news.mira.net.au!news.vbc.net!vbcnet-west!knews.uk0.vbc.net!vbcnet-gb!azure.xara.net!xara.net!uknet!usenet1.news.uk.psi.net!uknet!EU.net!cpk-news-hub1.bbnplanet.com!news.bbnplanet.com!news.maxwell.syr.edu!news.bc.net!info.ucla.edu!nnrp.info.ucla.edu!nntp.club.cc.cmu.edu!goldenapple.srv.cs.cmu.edu!das-news2.harvard.edu!spdcc!dyer
From: dyer@spdcc.com (Steve Dyer)
Subject: BSDI 2.1, file systems > 4gb and off_t
Message-ID: <E7JAKK.B7p@spdcc.com>
Organization: S.P. Dyer Computer Consulting, Cambridge MA
Date: Mon, 24 Mar 1997 06:12:20 GMT
Lines: 58
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.bsdi.misc:6447

I've been having a LOT of trouble installing a Maxtor 5.1gb EIDE drive
into a system running BSDI 2.1 (I described this in more detail a
few weeks ago).  I partitioned the drive using disksetup, with the
usual small root (a) and swap (b) partitions, with the remaining 'h'
partition using the rest of the drive's sectors.  This 'h' partition is
9772560 512-byte sectors in length.  A "newfs" followed by a mount
works fine, but once the file system starts getting written to, it begins
to develop errors (directories changing to garbage inodes, etc.)
and the system inevitably crashes.

I was a bit worried about the EIDE installation, but double-checked
everything, and it seems OK.  So, I then worried whether I was seeing
4gb wraparound within the filesystem code.  I wrote a user-level test
program which writes over each block in the unmounted 'h' partition
with an array of 64 quad_t integers, with each array element containing
the ordinal sector number of the sector.  So, sector 0 contains an array
of 64 zeroes (each 8 bytes long), sector 100 contains 64 100's, and so on.

for each sector b from 0 until the end of the partition (sector 9772560)

	fill a buffer representing the sector with its ordinal position
	in each 8-byte longlong integer (fill in the buffer with 64
	copies of the integer b)

	set the write byte offset into the disk partition to b * 512
	bytes (this is an 8 byte long-long quantity, as supposedly is the
	BSDI internal representation of the write offset)

	write the sector buffer into the partition (using the
	just determined write offset)

endfor
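
The loop above can be sketched in C roughly as follows.  This is a
reconstruction under stated assumptions, not the original test program:
the names fill_sector, write_pattern and first_bad_sector are
hypothetical, int64_t stands in for quad_t, and the read-back check is
an addition that the description above does not mention.  It also
assumes off_t is 64 bits wide on the system where it is built.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_SIZE 512
#define WORDS_PER_SECTOR (SECTOR_SIZE / sizeof(int64_t))   /* 64 */

/* Fill one sector buffer with 64 copies of the sector number. */
static void fill_sector(int64_t *buf, int64_t sector)
{
    size_t i;

    for (i = 0; i < WORDS_PER_SECTOR; i++)
        buf[i] = sector;
}

/*
 * Write the pattern to sectors [0, nsectors) of fd.
 * Returns 0 on success, -1 on an lseek or write failure.
 */
static int write_pattern(int fd, off_t nsectors)
{
    int64_t buf[WORDS_PER_SECTOR];
    off_t b;

    assert(sizeof(buf) == SECTOR_SIZE);
    for (b = 0; b < nsectors; b++) {
        fill_sector(buf, (int64_t)b);
        /* The multiplication is done in off_t, so it is 64-bit here. */
        if (lseek(fd, b * (off_t)SECTOR_SIZE, SEEK_SET) == (off_t)-1) {
            perror("lseek");
            return -1;
        }
        if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
            perror("write");
            return -1;
        }
    }
    return 0;
}

/*
 * Read the sectors back (an added check, not part of the description
 * above).  Returns the first sector whose contents do not match its
 * own number or that cannot be read, or -1 if all sectors match.
 */
static off_t first_bad_sector(int fd, off_t nsectors)
{
    int64_t buf[WORDS_PER_SECTOR];
    off_t b;
    size_t i;

    for (b = 0; b < nsectors; b++) {
        if (lseek(fd, b * (off_t)SECTOR_SIZE, SEEK_SET) == (off_t)-1)
            return b;
        if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
            return b;
        for (i = 0; i < WORDS_PER_SECTOR; i++)
            if (buf[i] != (int64_t)b)
                return b;
    }
    return -1;
}
```

Calling write_pattern(fd, 9772560) on a descriptor opened on the raw
'h' partition, then first_bad_sector(fd, 9772560), would show which
low-numbered sectors were overwritten if the offset wraps.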

I'm using lseek(fh, b*512, SEEK_SET), and 'b' is of type off_t.
sizeof(off_t) is 8.  It appears that the lseek works fine (that is,
I'm not accidentally passing a 32-bit long quantity to lseek),
because everything works fine until b reaches 8388608.
8388608*512 is (da-da!) 2^32.  And, in fact, we then start
seeing wraparound, with blocks 0, 1, 2 and so forth getting
overwritten.  Why should this be, given that off_t is
a quad_t (8-byte 64-bit integer) in BSDI 2.1, and that internally,
the kernel calculates all file offsets in terms of off_t/quad_t
data types?
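
To make the arithmetic concrete, here is a small C model of the
symptom.  It assumes, purely for illustration, that somewhere along the
path only the low 32 bits of the byte offset survive; where (or whether)
that happens in BSDI 2.1 is exactly the open question, and the function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/*
 * Sector that would actually be hit if the 64-bit byte offset were
 * truncated to 32 bits.  A model of the observed behaviour only, not
 * BSDI's actual code path.
 */
static uint64_t sector_after_truncation(uint64_t sector)
{
    uint64_t byte_offset = sector * SECTOR_SIZE;   /* full 64-bit offset */
    uint32_t truncated = (uint32_t)byte_offset;    /* low 32 bits only */

    return truncated / SECTOR_SIZE;
}

/* Sanity checks around the 2^32-byte boundary (sector 8388608). */
static void check_wrap_model(void)
{
    assert(sector_after_truncation(8388607) == 8388607);  /* last safe sector */
    assert(sector_after_truncation(8388608) == 0);        /* wraps to sector 0 */
    assert(sector_after_truncation(8388609) == 1);        /* then sector 1 */
}
```

Under this model every sector from 8388608 up to the end of the
partition lands 8388608 sectors too low, which matches blocks 0, 1, 2
and so forth getting overwritten.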

I also don't know whether this test program is a red herring, but
it's sure consistent with the corruption I see when this file
system is mounted and allowed to be written for 10-15 minutes
during a restore.

Somehow I suspect I'm misunderstanding something fundamental
about BSDI 2.1's handling of disk partitions and file systems
larger than 8388608 sectors, and the use of lseek with off_t offsets.

Can anyone comment?

-- 
Steve Dyer
dyer@ursa-major.spdcc.com