*BSD News Article 9474


Received: by minnie.vk1xwt.ampr.org with NNTP
	id AA5823 ; Fri, 01 Jan 93 01:57:01 EST
Xref: sserve comp.unix.wizards:28116 comp.unix.bsd:9531
Newsgroups: comp.unix.wizards,comp.unix.bsd
Path: sserve!manuel.anu.edu.au!munnari.oz.au!metro!ipso!runxtsa!bde
From: bde@runx.oz.au (Bruce Evans)
Subject: Disk tuning (was Re: file system layout)
Message-ID: <1993Jan1.102538.4626@runx.oz.au>
Organization: RUNX Un*x Timeshare.  Sydney, Australia.
References: <1992Dec30.163131.17280@ll.mit.edu> <lk6u47INN47f@appserv.Eng.Sun.COM> <28188@dog.ee.lbl.gov>
Date: Fri, 1 Jan 93 10:25:38 GMT
Lines: 76

In article <28188@dog.ee.lbl.gov> torek@horse.ee.lbl.gov (Chris Torek) writes:
>Incidentally, Margo Seltzer found the following bug in the original BSD
>block allocator:
>
>[ffs_alloc.c: your line numbers will differ; in fact your code might
>even differ a bit, and/or be found in ufs_alloc.c.  Presumably this is
>fixed in SunOS.]
>***************
>*** 458,464 ****
>	 */
>	nextblk = bap[indx - 1] + fs->fs_frag;
>! 	if (indx > fs->fs_maxcontig &&
>! 	    bap[indx - fs->fs_maxcontig] + blkstofrags(fs, fs->fs_maxcontig)
>! 	    != nextblk)
>		return (nextblk);
>	if (fs->fs_rotdelay != 0)
>--- 464,469 ----
>	 */
>	nextblk = bap[indx - 1] + fs->fs_frag;
>! 	if (indx < fs->fs_maxcontig || bap[indx - fs->fs_maxcontig] +
>! 	    blkstofrags(fs, fs->fs_maxcontig) != nextblk)
>		return (nextblk);
>	if (fs->fs_rotdelay != 0)
>

>Kirk got the test backwards originally, so the tunefs -a parameter
>never mattered....  I no longer have VAXen, but I am curious as to
>whether this fix might improve performance on the old RA81 disks.  I
>was never able to tune those at all; the performance was always
>miserable, never more than a few hundred K per second.

386BSD-0.1 still has the old version (assuming that the above patch is
not backwards :-).
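
The two tests in the quoted patch can be compared with a toy model. This is only a sketch, not the kernel code: FRAG, GAP, old_test, new_test and layout are hypothetical names, the rotational gap is assumed to be exactly one block, and cylinder-group boundaries and the indx == 0 case are ignored.

```c
#include <stddef.h>

/* Toy model only: both values are assumptions, not taken from a real fs. */
enum { FRAG = 8, GAP = 8 };   /* frags per block; rotdelay gap in frags */

/* Old test from the quoted patch: nonzero means "allocate contiguously". */
static int old_test(const long *bap, int indx, int maxcontig)
{
    long nextblk = bap[indx - 1] + FRAG;
    return indx > maxcontig &&
        bap[indx - maxcontig] + (long)maxcontig * FRAG != nextblk;
}

/* New test from the quoted patch. */
static int new_test(const long *bap, int indx, int maxcontig)
{
    long nextblk = bap[indx - 1] + FRAG;
    return indx < maxcontig ||
        bap[indx - maxcontig] + (long)maxcontig * FRAG != nextblk;
}

/*
 * Lay out n blocks starting at frag 0 (maxcontig must be >= 1).
 * Returns the number of rotational gaps that were inserted.
 */
static int layout(long *bap, int n, int maxcontig,
                  int (*test)(const long *, int, int))
{
    int gaps = 0;

    bap[0] = 0;
    for (int indx = 1; indx < n; indx++) {
        long nextblk = bap[indx - 1] + FRAG;
        if (!test(bap, indx, maxcontig)) {
            nextblk += GAP;   /* skip ahead by the rotational delay */
            gaps++;
        }
        bap[indx] = nextblk;
    }
    return gaps;
}
```

With maxcontig = 2 and five blocks, this toy model places blocks at frags 0, 8, 24, 32, 48 under the new test and at 0, 16, 32, 40, 56 under the old one. It says nothing about how a real disk behaves.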

I tried tuning a Seagate 330MB ESDI disk under 386BSD yesterday.  The
tunefs -a parameter worked as I expected: increasing it increased read
performance and reduced write performance.  This is because my disk
controller (a WD1007V-SE1) takes too long in between separate write
commands and the 386BSD driver does not coalesce contiguous blocks
into a single write command (nor does ufs give it enough opportunities
to do so).  Separate read commands are not so much of a problem because
the slow controller is disguised by caching in the controller.

I get best results with tunefs -a 1 -d 1 and a block size of 4K.  The
default rotational delay is 4 msec, which corresponds to 14 sector times
or 2 blocks or at best 33% efficiency :-(.  A rotational delay of 0
works poorly for writing.  Rotational delays of 1 msec and 2 msec both
correspond to 1 block and work OK (for at best 50% efficiency).  A larger
block size wouldn't help because _some_ delay between blocks is required,
and the minimum delay of 1 block time always reduces the maximum
efficiency to 50%.  (Actually, for large files I get very close to 25%
efficiency on a 15Mb/sec drive and not so close to 50% efficiency on a
10Mb/sec drive, because the faster drive has to be formatted at 2:1
interleave for the controller's caching to keep up.  At 2:1 interleave
the timing is not very critical so it is easier to get close to the
maximum possible efficiency.)
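
The efficiency figures above follow from simple arithmetic, sketched here in C. The helper names are hypothetical. Two things are assumed: 512-byte sectors (so a 4K block is 8 sectors), and 3.5 sector times per msec, which is just the "4 msec = 14 sector times" figure scaled down. Interleave is modelled as a plain divisor, which is a simplification.

```c
/* Assumed: 512-byte sectors, so one 4K block spans 8 sectors. */
enum { SECTORS_PER_BLOCK = 8 };

/* Rotational delay in msec -> gap in whole blocks, rounded up. */
static int gap_blocks(int rotdelay_ms)
{
    /* 3.5 sector times per msec, kept in half-sector units to stay integral */
    int half_sectors = rotdelay_ms * 7;
    int per_block = 2 * SECTORS_PER_BLOCK;

    return (half_sectors + per_block - 1) / per_block;
}

/*
 * Best-case efficiency in percent (truncated): one block transferred
 * out of every (1 + gap) block times, divided again by the interleave.
 */
static int best_efficiency_pct(int gap, int interleave)
{
    return 100 / ((1 + gap) * interleave);
}
```

For example, best_efficiency_pct(gap_blocks(4), 1) gives 33, a 1 or 2 msec delay gives 50, and a one-block gap at 2:1 interleave gives 25. A gap of 0 yields 100 in this model, which ignores the controller overhead that makes a zero delay work poorly for writing.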

Minix and Linux (both using the minix-1.5 file system) get much better
performance from this controller.  E.g., under my version of Minix
(which is more or less standard except for the disk driver), reading and
writing a 2MB file on a recently built 64MB Minix file system with only
4MB free was about 50% faster than reading and writing a 2MB file on an
empty 166MB 386BSD file system.  The minix-1.5 fs uses a block size of
only 1K and attempts to lay out the blocks contiguously.  My version
also reads ahead up to 17K and writes all dirty blocks whenever any
dirty block has to be written, so that the driver usually has a large
number of contiguous blocks to work with.  The
driver collects disk-contiguous blocks and makes the largest possible
i/o requests to the controller.  Directory blocks are _not_ written
immediately.  This goes well with writing all dirty blocks whenever one
dirty block has to be written and makes operations like `ar xo libc.a'
and `rm -rf junk' 10 to 20 times faster than with the 386BSD ufs.  It
increases the chance of a crash messing up the file system, but I prefer
the reduced chance of a crash because of less i/o.
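
Collecting disk-contiguous blocks into the largest possible requests can be sketched as follows. This is not the Minix driver code: struct io_run and coalesce are hypothetical names, and the input is assumed to be a list of block numbers already sorted in ascending order with no duplicates.

```c
#include <stddef.h>

/* Hypothetical request descriptor: count blocks starting at start. */
struct io_run {
    long   start;
    size_t count;
};

/*
 * Merge sorted block numbers into runs of consecutive blocks.
 * runs[] must have room for n entries.  Returns the number of runs.
 */
static size_t coalesce(const long *blocks, size_t n, struct io_run *runs)
{
    size_t nruns = 0;

    for (size_t i = 0; i < n; i++) {
        if (nruns > 0 &&
            runs[nruns - 1].start + (long)runs[nruns - 1].count == blocks[i]) {
            runs[nruns - 1].count++;        /* extends the current run */
        } else {
            runs[nruns].start = blocks[i];  /* starts a new run */
            runs[nruns].count = 1;
            nruns++;
        }
    }
    return nruns;
}
```

Given blocks 10, 11, 12, 20, 21 and 30, this produces three requests (10 for 3 blocks, 20 for 2, 30 for 1) instead of six. A real driver would also have to cap each run at whatever the controller can accept in one command.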
-- 
Bruce Evans  (bde@runx.oz.au)