*BSD News Article 19673

Newsgroups: comp.os.386bsd.development
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!howland.reston.ans.net!europa.eng.gtefsd.com!uunet!pipex!uknet!gdt!aber!fronta.aber.ac.uk!pcg
From: pcg@aber.ac.uk (Piercarlo Grandi)
Subject: Re: Hard disk geometry translation (was V86 mode ...)
In-Reply-To: torvalds@klaava.Helsinki.FI's message of 13 Aug 1993 23:21:02 +0300
Message-ID: <PCG.93Aug18183353@decb.aber.ac.uk>
Sender: news@aber.ac.uk (USENET news service)
Nntp-Posting-Host: decb.aber.ac.uk
Reply-To: pcg@aber.ac.uk (Piercarlo Grandi)
Organization: Prifysgol Cymru, Aberystwyth
References: <107725@hydra.gatech.EDU> <1993Aug9.224939.19834@fcom.cc.utah.edu>
	<24cc1hINNo8@kralizec.zeta.org.au> <CBo9C6.9ED@sugar.neosoft.com>
	<24gt3e$gg7@klaava.Helsinki.FI>
Date: Wed, 18 Aug 1993 17:33:53 GMT
Lines: 45

>>> On 13 Aug 1993 23:21:02 +0300, torvalds@klaava.Helsinki.FI (Linus
>>> Torvalds) said:

Linus>  - I personally think the "translation overhead" mentioned by some folks
Linus>    as a source of inefficiency for the filesystems (either due to the
Linus>    controller getting slower due to translation or due to the fs not
Linus>    knowing about the real geometry) is mostly a load of bull-sh*t.  It
Linus>    may have made sense 10-20 years ago, but I doubt the FFS disk
Linus>    geometry optimizations are really worth it these days with
Linus>    controllers that do sector mapping etc (the BSD 4kB blocks are
Linus>    probably a *much* larger win when compared to linux' 1kB blocks). 

I tend to agree with this. In practice the biggest wins, by far, come
from keeping blocks physically clustered, and from issuing multisector
reads and writes for sequential access.
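
To make the second point concrete, here is a minimal sketch of how a
driver or buffer cache might merge a run of block numbers into
multisector transfers. The function name `coalesce` and the `max_run`
limit are illustrative assumptions, not taken from any particular
kernel.

```python
def coalesce(blocks, max_run=16):
    """Merge a sequence of block numbers into (start, count) transfers.

    Consecutive block numbers are merged into one run, capped at
    max_run blocks (a hypothetical per-request limit).
    """
    runs = []
    for b in blocks:
        if runs and runs[-1][0] + runs[-1][1] == b and runs[-1][1] < max_run:
            # b directly follows the last run: extend it by one block.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # b is not adjacent (or the run is full): start a new transfer.
            runs.append((b, 1))
    return runs


# A physically clustered 8-block file needs a single transfer.
clustered = coalesce(range(100, 108))
# The same 8 blocks scattered over the disk need one transfer each.
scattered = coalesce([100, 731, 12, 406, 55, 9, 250, 88])
```

The merging only pays off when the blocks are adjacent on disk in the
first place, which is why the two wins go together.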

The first is done by switching from a time-ordered free list as in V7 to
a space-ordered free list (typically a bitmap); the second by better
adaptive read/write clustering, or, lamely, by raising the block size.
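
The difference between the two kinds of free list can be sketched as
follows. This is a toy model only: `alloc_near` and `alloc_lifo` are
made-up names, and the time-ordered list is approximated here as
"last freed, first reused", which is an assumption rather than a
faithful copy of the V7 code.

```python
def alloc_near(bitmap, goal):
    """Allocate the free block closest to `goal` in a space-ordered bitmap.

    bitmap[i] is True when block i is free. Returns the block number,
    or None when no block is free.
    """
    n = len(bitmap)
    goal = min(max(goal, 0), n - 1)   # keep the goal inside the disk
    for dist in range(n):
        for cand in (goal + dist, goal - dist):
            if 0 <= cand < n and bitmap[cand]:
                bitmap[cand] = False  # mark it allocated
                return cand
    return None


def alloc_lifo(free_list):
    """Allocate from a time-ordered free list: most recently freed first."""
    return free_list.pop() if free_list else None


# Same five free blocks, seen through both structures.
bitmap = [False] * 64
for b in (10, 11, 12, 40, 41):
    bitmap[b] = True
free_list = [11, 40, 10, 41, 12]      # order in which they were freed

near, goal = [], 10
for _ in range(3):
    blk = alloc_near(bitmap, goal)
    near.append(blk)
    goal = blk + 1                    # next block should follow this one

lifo = [alloc_lifo(free_list) for _ in range(3)]
```

In this example the bitmap hands out blocks 10, 11, 12 for a three-block
file, while the time-ordered list hands out 12, 41, 10. Neither
allocator needs to know anything about cylinders or sectors.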

Both these optimizations work irrespective of the geometry.

One reason the BSD FFS needs to know the physical geometry is that some
versions of it have a very dubious optimization that is a big lose on
disks with an onboard cache: when allocating a block, if a free one is
not found scanning forward in the current cylinder, then before going on
to look in the next cylinder the FFS tries to find one anywhere (in a
suitable rotational position) in the current one, *going back*.
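
The two search orders can be compared in a toy model. This is not the
FFS allocator: the function `alloc`, its `backtrack` flag and the
list-of-lists disk are all assumptions made for illustration, and
rotational position is ignored entirely.

```python
def alloc(cyls, cyl, pos, backtrack):
    """Pick a free block at or after (cyl, pos) in a toy disk model.

    cyls[c][i] is True when block i of cylinder c is free.
    With backtrack=True, the earlier part of the current cylinder is
    searched before moving on to the next cylinder.
    Returns (cylinder, index) or None.
    """
    # 1. Scan forward in the current cylinder.
    for i in range(pos, len(cyls[cyl])):
        if cyls[cyl][i]:
            return (cyl, i)
    # 2. Optionally go back and take anything earlier in the same cylinder.
    if backtrack:
        for i in range(0, pos):
            if cyls[cyl][i]:
                return (cyl, i)
    # 3. Move on to the following cylinders.
    for c in range(cyl + 1, len(cyls)):
        for i, free in enumerate(cyls[c]):
            if free:
                return (c, i)
    return None


# Cylinder 0 has one free block, behind the current position;
# cylinder 1 is entirely free.
disk = [[True, False, False, False],
        [True, True, True, True]]

going_back = alloc(disk, 0, 2, backtrack=True)     # picks (0, 0)
forward_only = alloc(disk, 0, 2, backtrack=False)  # picks (1, 0)
```

With backtracking the next block of the file lands at a lower address
than the one before it, so a sequential read has to go backwards, and a
drive that reads ahead will presumably not have that block in its cache.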

The other reason for knowing the physical geometry is to do rotational
latency optimizations; but most modern disks have a read-ahead/track
read cache, which makes interleaving/rotational optimization pointless.
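
For a sense of scale, here is a back-of-the-envelope calculation. The
3600 rpm figure and the 2:1 interleave are example values chosen for
illustration, and the model is deliberately crude: it assumes a drive
with no buffer that must wait for each logical sector to come around.

```python
def rotation_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def unbuffered_track_read_ms(rpm, interleave):
    """Time to read one whole track in logical order when consecutive
    sectors are `interleave` physical slots apart and the drive has no
    track buffer: one revolution per interleave step."""
    return interleave * rotation_ms(rpm)


rev = rotation_ms(3600)                       # about 16.7 ms
avg_latency = rev / 2                         # about 8.3 ms for a random sector
spaced = unbuffered_track_read_ms(3600, 2)    # two revolutions, about 33.3 ms
contiguous = unbuffered_track_read_ms(3600, 1)
```

Under these assumptions, a drive with a track buffer picks up the whole
track in roughly one revolution whatever order the host asks for it, so
spacing the sectors out only makes a sequential read cover more tracks.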

Indeed when using the FFS with a drive that has a cache/read ahead
buffer I always set the interleaving to 0, the rps to 1, and so on,
precisely to defeat rotational optimization, which conflicts with the
good working of the cache.

As far as I know the highest performing filesystems around, the Veritas
extent based one and the SCO/ISC bitmap based ones, just do a good job of
static (disk layout) and dynamic (multiblock read/write) clustering.

The ext2fs currently lacks only the latter -- going to FFS-like 4/8KB
blocks and fragment management is not necessary.