*BSD News Article 65035


Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!newshost.telstra.net!act.news.telstra.net!psgrain!usenet.eel.ufl.edu!gatech!newsfeed.internetmci.com!inet-nntp-gw-1.us.oracle.com!news.caldera.com!news.cc.utah.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.os.linux.development.system,comp.unix.bsd.freebsd.misc
Subject: Re: Ideal filesystem
Date: 4 Apr 1996 04:10:58 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 126
Message-ID: <4jvi4i$oim@park.uvsc.edu>
References: <4hptj4$cf4@cville-srv.wam.umd.edu> <3140C968.20699696@netcom.com> <4istou$ri9@floyd.sw.oz.au> <4j0bmo$ftv@park.uvsc.edu> <jlemonDoqBq5.1Bx@netcom.com> <4jerrj$f12@park.uvsc.edu> <4joiil$r75@narses.hrz.tu-chemnitz.de>
NNTP-Posting-Host: hecate.artisoft.com
Xref: euryale.cc.adfa.oz.au comp.os.linux.development.system:20645 comp.unix.bsd.freebsd.misc:16614

fachat@physik.tu-chemnitz.de (Andre Fachat) wrote:
]
] Terry Lambert (terry@lambert.org) wrote:
] : jlemon@netcom.com (Jonathan Lemon) wrote:
] : ] Hm.  Shells just look at the inode to see if the file is executable,
] : ] in order to add it to the hash list.  So what if there's some (unspecified)
] : 
] : For a "binary" that was a directory, there is a need to search
] : every directory in every directory in the path for "a.out" or
] : whatever you call your actual non-fork binary.
] 
] You need to search only _one_ directory for a.out - the one directory
] with the name of the program you want to start.
] Only for each program execution, there is One more directory search,
] in a (assumption) small bundle directory.

A modern shell will search its path as a result of an event
(startup, user command, path change, etc.) and produce a
name-to-full path hash list for executables in the path.

So when I type "ls", instead of statting or trying to run an
"ls" command in every possible path in path search order, it
knows that "ls" means "/bin/ls".

Even if "/bin" is at the end of my path.

The search being discussed is the path search to build the hash.
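
To make the hash concrete, here is a minimal sketch of the kind of
name-to-full-path table described above. It is not taken from any
particular shell; the function name build_command_hash and the
throwaway directories standing in for a real path are assumptions
made for illustration.

```python
import os
import tempfile

def build_command_hash(path_dirs):
    """Map each command name to the first executable found along path_dirs."""
    table = {}
    for directory in path_dirs:
        try:
            entries = sorted(os.listdir(directory))
        except OSError:
            continue  # skip missing or unreadable path components
        for name in entries:
            full = os.path.join(directory, name)
            # Earlier path components win, as in an ordinary path search.
            if name not in table and os.path.isfile(full) and os.access(full, os.X_OK):
                table[name] = full
    return table

# Demo on temporary directories (hypothetical stand-ins for a real path).
root = tempfile.mkdtemp()
path_dirs = []
for sub in ("usr_local_bin", "bin"):
    d = os.path.join(root, sub)
    os.mkdir(d)
    path_dirs.append(d)

# Put "ls" in the LAST path component only.
ls_path = os.path.join(path_dirs[-1], "ls")
with open(ls_path, "w") as f:
    f.write("#!/bin/sh\n")
os.chmod(ls_path, 0o755)

command_hash = build_command_hash(path_dirs)
print(command_hash["ls"] == ls_path)
```

Once the table exists, finding "ls" is a single dictionary lookup,
no matter where in the path its directory sits.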

The current design that allows this trades startup time for
execution time, on the assumption that you start a shell once
per some number of commands, so it's a good trade.

If you have to search each potential executable as a directory
to see if it has an "a.out" (executable file) "fork", then
you have increased the search time by one order of magnitude on
an exponential curve (directories rarely contain one file).

Which means you need to now go back and question the time
assumptions that made you build the hash in the first place.
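
A rough way to see the extra cost is to count directory searches
while building the hash. The model below is only a sketch: both
function names are made up, it ignores caching, and the numbers in
the demo (6 path components, 100 entries each) are illustrative
assumptions, not measurements.

```python
def flat_scan_cost(num_path_dirs):
    """Directory searches when executability is read from each entry's inode."""
    # One pass over each path component is enough.
    return num_path_dirs

def bundle_scan_cost(num_path_dirs, entries_per_dir):
    """Directory searches when every entry may be a directory holding an a.out."""
    # One pass over each path component, plus one more directory
    # search inside every entry to look for "a.out".
    return num_path_dirs + num_path_dirs * entries_per_dir

# Illustrative numbers only.
flat = flat_scan_cost(6)
bundled = bundle_scan_cost(6, 100)
print(flat, bundled)
```

Under these assumptions the bundle scheme costs one additional
directory search per candidate entry, which is the cost the hash
was supposed to make affordable in the first place.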


Rob's suggested optimization, unfortunately, adds a failure
mode that an exhaustive search for possible executables doesn't
have.  Using Rob's suggestion, you add possibly bad hash
entries for directories that are marked as executable but
which have no a.out "fork".  Further, because these are
directories, not attribute lists, there's no way to make
the attribute on the directory "go away" when the a.out
is deleted, or "magically appear" when the a.out is added
the first time.  The result is a lot of mucking around
with the programs that create a.out files to make them do
the job for you.
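
The failure mode can be sketched as follows. This assumes the
"executable" mark is simply the execute bit on the directory (the
thread does not specify the mechanism), and the helper names are
hypothetical. The hash builder trusts the mark and never looks
inside the directory, so deleting the a.out leaves a bad entry
even after the hash is rebuilt.

```python
import os
import tempfile

def hash_marked_bundles(path_dirs):
    """Hash every directory carrying the execute mark, without opening it."""
    table = {}
    for directory in path_dirs:
        for name in sorted(os.listdir(directory)):
            full = os.path.join(directory, name)
            if name not in table and os.path.isdir(full) and os.access(full, os.X_OK):
                table[name] = full
    return table

def is_runnable(bundle):
    """A bundle can only run if its a.out member actually exists."""
    return os.path.isfile(os.path.join(bundle, "a.out"))

root = tempfile.mkdtemp()
bundle = os.path.join(root, "ls")
os.mkdir(bundle)
os.chmod(bundle, 0o755)          # the assumed "executable" mark
a_out = os.path.join(bundle, "a.out")
open(a_out, "w").close()

runnable_before = is_runnable(hash_marked_bundles([root])["ls"])

os.remove(a_out)                 # the a.out goes away...
rebuilt = hash_marked_bundles([root])
stale = "ls" in rebuilt and not is_runnable(rebuilt["ls"])
print(runnable_before, stale)    # ...but the mark, and so the hash entry, stays
```

Keeping the mark honest would require whatever creates or deletes
the a.out to also update the directory, which is the extra work
described above.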

] But then the linear search for a file in a directory is,
] IMHO a hack and should be replaced by something fast like binary tree
] or so... (when talking about performance problems.)

Agreed.  I believe Rob's position is "change as little as
possible to make things work on a case-by-case basis" (he
can correct me if he wants).  Changing UFS flexname support
into HPFS style btree storage would contradict that philosophy.
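
As a small illustration of the lookup difference, the sketch below
compares a linear scan with a binary search over sorted names. The
binary search is only a stand-in for a real btree directory, and
the file names are invented for the demo.

```python
import bisect

def linear_lookup(names, target):
    """Scan entries in stored order; return (found, entries examined)."""
    for examined, name in enumerate(names, start=1):
        if name == target:
            return True, examined
    return False, len(names)

def sorted_lookup(sorted_names, target):
    """Binary search over sorted names; needs about log2(n) comparisons."""
    i = bisect.bisect_left(sorted_names, target)
    return i < len(sorted_names) and sorted_names[i] == target

# Zero-padded names, so list order is already sorted order.
names = ["file%04d" % i for i in range(1000)]
found, examined = linear_lookup(names, "file0999")
print(found, examined, sorted_lookup(names, "file0999"))
```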

IMO, we should not be content to "just get by with
directories", even if we can solve half of the problems we
want to solve by doing that.

[ ... ]

] This is a point to think about. But then, most filesystem corruption comes
] from not written (meta)data when writing something to disk. Normal
] operation should not write something to a binary...

Or intentional mucking around with the subfiles of these
"executables".  You don't need a crash to cause a failure.

I'd point out, though, that directory entries are metadata,
so you are doing exactly what exposes you to risk when you
manipulate the fork namespace, if it's a directory.

] How do you handle the "single focal file system object" in a Unix
] filesystem. Is it a "Unix-File" with a builtin structure that contains
] fields for EAs, binaries...? You then also have to search the file
] for the binary field when executing (see above). 
]
] Are the EAs stored in a special block, that is pointed to by something in 
] the standard inode? Then this special block can get lost like
] any other block.

Both good questions.

The magic here is that by default, a linker will create a file
with one or more forks (it might, for instance, put in default
icon information as part of the "link").

But since the operations have to go through the user/kernel
interface, you can track the file system events and make sure
that the updates to what's actually on disk are idempotent.

That is, all changes will be recovered, or none of them will.

A good example of why you might want this is the Windows95
version of the "4DOS" program, which has a text version string
as part of its icon.

If you are in the middle of an update of your 4DOS program
from one version to another, you want both the "a.out" and
"default_icon" forks updated, or you want none of them updated.

You just can't make the guarantee about multiple files in a
directory.
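
The contrast can be sketched in user space, with the caveat that
this is only an analogy for the in-kernel association argued for
here. The all-or-nothing property (called idempotent in this post,
more commonly called atomic) is imitated by keeping both forks in
one container file and swapping it in with a single rename. The
JSON container, the function names, and the "v1"/"v2" contents are
all assumptions made for the example.

```python
import json
import os
import tempfile

def update_files_separately(bundle, forks, fail_after=None):
    """Write each fork as its own file; a crash between writes mixes versions."""
    for count, (name, data) in enumerate(forks.items()):
        if fail_after is not None and count == fail_after:
            raise RuntimeError("simulated crash between writes")
        with open(os.path.join(bundle, name), "w") as f:
            f.write(data)

def update_forks_together(container, forks):
    """Write all forks to a temp file, then swap it in with one rename."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(container))
    with os.fdopen(fd, "w") as f:
        json.dump(forks, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, container)   # readers see all old forks or all new ones

def read_file(path):
    with open(path) as f:
        return f.read()

v1 = {"a.out": "v1 code", "default_icon": "v1 icon"}
v2 = {"a.out": "v2 code", "default_icon": "v2 icon"}

# Directory of separate files: interrupted upgrade leaves a mixed state.
bundle = tempfile.mkdtemp()
update_files_separately(bundle, v1)
try:
    update_files_separately(bundle, v2, fail_after=1)
except RuntimeError:
    pass
mixed = (read_file(os.path.join(bundle, "a.out")),
         read_file(os.path.join(bundle, "default_icon")))

# Single container: the upgrade lands as a unit.
container = os.path.join(tempfile.mkdtemp(), "4dos")
update_forks_together(container, v1)
update_forks_together(container, v2)
together = json.loads(read_file(container))
print(mixed, together)
```

In the directory case the new a.out ends up next to the old icon.
In the container case there is no point at which a reader can see
one fork from each version.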


So it boils down to "it's irrelevant (only for answering these
particular questions) how the association is actually made in
the kernel -- what is important is that it is made in the
kernel (not in user space, and not in an exposed transaction
space, like a directory, where the fork manipulation isn't
guaranteeably idempotent)".


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.