*BSD News Article 34213

Xref: sserve comp.os.386bsd.questions:12295 comp.os.386bsd.misc:3165
Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!msuinfo!agate!howland.reston.ans.net!europa.eng.gtefsd.com!news.umbc.edu!eff!cs.umd.edu!newsfeed.gsfc.nasa.gov!cesdis1.gsfc.nasa.gov!not-for-mail
From: becker@cesdis.gsfc.nasa.gov (Donald Becker)
Newsgroups: comp.os.386bsd.questions,comp.os.386bsd.misc
Subject: Re: Whats wrong with Linux networking ???
Date: 12 Aug 1994 15:55:57 -0400
Organization: NASA Goddard Space Flight Center -- Greenbelt, Maryland USA
Lines: 41
Message-ID: <32gk4d$ee@cesdis1.gsfc.nasa.gov>
References: <327nj0$sfq@sundog.tiac.net> <328fn2$i9p@news.panix.com> <32bflj$lig@cesdis1.gsfc.nasa.gov> <CuDJox.HE2@calcite.rhyolite.com>
NNTP-Posting-Host: cesdis1.gsfc.nasa.gov
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

[[ Summary: we continue our networking, uhmmm, discussion.  It's now focusing
on NFS semantics and implementation choices.  Hopefully our readers will
find this more informative than the average flamefest. ]]

In article <CuDJox.HE2@calcite.rhyolite.com>,
Vernon Schryver <vjs@calcite.rhyolite.com> wrote:
>In article <32bflj$lig@cesdis1.gsfc.nasa.gov> becker@cesdis.gsfc.nasa.gov (Donald Becker) writes:
>>The NFS protocol assures the client that when the write-RPC returns, the
>>data block has been committed to persistent storage.  For common
>>implementations that means the block has been physically queued for writing,
>>not just put in the buffer cache. ...
>
>An NFS server that only queues the block for writing before responding
>instead of waiting for the disk controller to say that the write has
>been completed does not meet the NFS "stable storage" rules.  Such a

Yes, Vernon, I deliberately used the word "queue" there.  (I was going to
explain it, but felt it would detract from the main point of the article.)
It's not the operating system buffer cache I'm referring to, but the disk
controller queue.  Most modern disk controllers, both IDE and SCSI, actually
just queue write requests and return immediately.  Sure, the vulnerability
window is limited to tens of milliseconds, but I suspect most systems
technically violate the "committed to stable storage" rule.  Not that I
think this is particularly bad or dangerous...
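
To make the distinction concrete, here is a minimal sketch in Python of a
server-side write path that replies only after asking the OS to flush the
data.  The name handle_write and its arguments are hypothetical, not taken
from any real NFS server.  Note that fsync() returning only means the OS has
handed the data to the device; whether a controller that queues writes has
actually put it on the platters is exactly the window discussed above.

```python
import os

def handle_write(path, offset, data):
    """Hypothetical write handler: return (i.e. "reply") only after fsync."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        written = 0
        # pwrite may write fewer bytes than requested, so loop until done.
        while written < len(data):
            written += os.pwrite(fd, data[written:], offset + written)
        # Ask the OS to push the data out of the buffer cache to the device.
        # Assumption: the device may still hold it in its own write queue.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

A server that skipped the os.fsync() call would reply sooner, but the block
would be sitting only in the buffer cache when the client saw the reply.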

>>                          ...   You can get around this by writing a client
>>implementation that allows multiple outstanding write requests for each
>>writing thread, at the expense of write order inconsistency.
>> ...
>me as a odd consideration for an NFS client, given that one cannot know
>about the NFS requests of other clients and given the chaos caused by
>retransmissions.

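A simple, common example: 'tail -f logfile', where "logfile" is written by an
NFS client.  With multi-threaded writes a later block can reach the server
before an earlier one, so the reader could see spurious zeroed blocks where
the earlier data has not yet landed, while a single-threaded client would
produce the expected results.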
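
The effect can be illustrated with a small simulation in Python.  This is
only a sketch: the "file" is a byte array in memory, the 4-byte block size
is arbitrary, and apply_write and reader_views are hypothetical names.  It
assumes the server zero-fills any gap when a write lands past the current
end of file, which is how a reader would see the usual Unix file "hole".

```python
def apply_write(file_bytes, offset, data):
    """Apply one write to the in-memory file, zero-filling any gap."""
    end = offset + len(data)
    if len(file_bytes) < end:
        file_bytes.extend(b"\0" * (end - len(file_bytes)))
    file_bytes[offset:end] = data

def reader_views(writes):
    """Return the file contents a reader would see after each write."""
    contents = bytearray()
    views = []
    for offset, data in writes:
        apply_write(contents, offset, data)
        views.append(bytes(contents))
    return views

# Two log blocks, arriving in order and then in swapped order.
in_order = [(0, b"AAAA"), (4, b"BBBB")]
reordered = [(4, b"BBBB"), (0, b"AAAA")]

for label, writes in (("in order", in_order), ("reordered", reordered)):
    for view in reader_views(writes):
        print(label, view)
```

In the reordered case the first view has four zero bytes ahead of the second
block, which is what a reader polling the file at that moment would get.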
-- 
Donald Becker					  becker@cesdis.gsfc.nasa.gov
USRA-CESDIS, Center of Excellence in Space Data and Information Sciences.
Code 930.5, Goddard Space Flight Center,  Greenbelt, MD.  20771
301-286-0882	     http://cesdis.gsfc.nasa.gov/pub/people/becker/whoiam.html