*BSD News Article 62279



Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!newsroom.utas.edu.au!munnari.OZ.AU!spool.mu.edu!news.sol.net!news.moneng.mei.com!uwm.edu!vixen.cso.uiuc.edu!howland.reston.ans.net!psinntp!psinntp!psinntp!spunky.RedBrick.COM!nntp.et.byu.edu!cwis.isu.edu!news.cc.utah.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.unix.bsd.freebsd.misc,comp.os.linux.development.system
Subject: Re: async or sync metadata [was: FreeBSD v. Linux]
Date: 14 Feb 1996 03:06:41 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 44
Message-ID: <4frjk1$19t@park.uvsc.edu>
References: <4er9hp$5ng@orb.direct.ca> <DMI5Mt.768@pe1chl.ampr.org> <4fi6gq$3tr@dyson.iquest.net> <4fjodc$o8j@venger.snds.com> <jlemonDMMDq5.1yF@netcom.com> <DMnow5.43G@pe1chl.ampr.org>
NNTP-Posting-Host: hecate.artisoft.com
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:14452 comp.os.linux.development.system:18127

rob@pe1chl.ampr.org (Rob Janssen) wrote:
] Essentially, with sync metadata you will have to remove all recently
] modified files off your disk after a crash, because you cannot be sure
] of the validity of their contents.

Bullpucky.  Only if you have bad applications.

] With ext2fs you won't lose the entire filesystem just because some
] metadata is not in a consistent state.  fsck will fix that, and you will
] know about it (instead of having some false trust that everything is OK).
] I don't know how FFS behaves in this regard, it may be true that you
] lose your entire filesystem if some inode or free space map has not been
] properly updated, but that would not be a clever design IMHO.

Yet more bullpucky.  Async means async, which means that the
file system may reorder the writes as it sees fit.  Ext2fs *may*
actually get the writes in exactly the same order UFS did, with
the same results.

More likely, Ext2fs will get the writes in a different order,
making deterministic recovery impossible.  You will get a
consistent file system state after the cleaner runs, but you
won't necessarily get the state intended by the async writes
that were lost.

For a database and an index file for the database, this means
that the index state and the data file state may not be mutually
consistent.

This is exactly the same inconsistency that UFS introduces with
async data writes, except you lose only one state back on
UFS (remember my N*(N-1) calculation for determinism?).

To combat this, it is up to the program either to specify O_SYNC
(O_FSYNC on BSD) on opens, so it can do deterministic state
recovery (i.e., know it needs to reindex), or to use a transaction
log and/or multistage commit process (which is exactly what
databases have always done).
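
A minimal sketch of the first option, assuming a toy record format
(one "key<TAB>value" line per record) and hypothetical helper names
append_record() and reindex(); none of this comes from a real
database.  It uses fsync() after each write rather than a sync flag
on open, which for this purpose has roughly the same effect: the
call does not return until the kernel reports the data written.
After a crash the program does not trust any index on disk; it
rebuilds one from the data file and ignores a torn final record.

```python
import os

def append_record(data_path, key, value):
    """Append 'key<TAB>value' and block until the kernel reports it written."""
    fd = os.open(data_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, f"{key}\t{value}\n".encode())
        os.fsync(fd)  # synchronous data write: do not return before this completes
    finally:
        os.close(fd)

def reindex(data_path):
    """Recovery step: rebuild key -> byte offset by scanning the data file."""
    index = {}
    offset = 0
    with open(data_path, "rb") as f:
        for line in f:
            # A record counts only if it is complete; a torn tail is skipped.
            if line.endswith(b"\n") and b"\t" in line:
                key = line.split(b"\t", 1)[0].decode()
                index[key] = offset
            offset += len(line)
    return index
```

Because the index is derived entirely from the data file, the two
cannot end up mutually inconsistent; the cost is a full scan on
recovery, which is the "know it needs to reindex" case.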


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.