*BSD News Article 14312


Newsgroups: comp.os.386bsd.development
Path: sserve!newshost.anu.edu.au!munnari.oz.au!constellation!osuunx.ucc.okstate.edu!moe.ksu.ksu.edu!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!csn!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: File Truncation Philosophy
Message-ID: <1993Apr11.035322.19610@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University  (Ogden, UT)
References: <1993Apr2.072443.790@cm.cf.ac.uk> <1993Apr8.002028.2376@fcom.cc.utah.edu> <1993Apr8.025858.22137@uvm.edu>
Date: Sun, 11 Apr 93 03:53:22 GMT
Lines: 136

In article <1993Apr8.025858.22137@uvm.edu> wollman@sadye.emba.uvm.edu (Garrett Wollman) writes:
>In article <1993Apr8.002028.2376@fcom.cc.utah.edu> terry@cs.weber.edu (A Wizard of Earth C) writes:
>>I can live with the canonical fix, but want the prettier fix, since
>>the file would act like it *wasn't* a swap store... the same actions
>>would be required on any inode write attempt, not just truncation.
>
>I can't live with the prettier fix, unless you have a way to make
>other memory-mapped files work correctly.  Remember that, as far as
>the VM system is concerned, an executable is just another
>memory-mapped file.  This is the reason why the ``obvious'' fix could
>not be made in the vnodepager: it doesn't know that the file it's
>paging is an executable---only execve() knows that.  I would argue
>that that is as it should be.

Well, I've promised some people to elaborate on "the pretty fix" in some
private mail, and it's kinda been misinterpreted here.  I am *not*
suggesting going back to the previous VM code (even though this will work)
because it would mean losing the "instant start" benefits, as well as the
non-modified text page benefits (non-modified text pages take vm cache
but do not take swap).


First, let's consider the issues, boiled down to the essentials, based on
0.1 PL0.2.2 with none of the recently suggested patches installed:

o	Processes page text in from the executable file instead of from swap

	- startup is faster because the copy-to-swap is avoided
	- this saves on swap
	- due to the lack of a unified VM/buffer cache, each page-in is
	  slower, since a copy from the FS buffer cache to the vm cache
	  is required

o	The VTEXT flag is not correctly set on files during exec to
	indicate an EBUSY or ETXTBSY should result from an attempt to
	open or truncate the file.

	- Setting VTEXT on the vnode in exec returns the proper error
	  codes on attempts to truncate or write a running program's
	  original image.
	- If the image is already open before VTEXT is set by running it,
	  however, write and truncate operations made after the exec
	  do *NOT* correctly return errors.
	- Returning an ETXTBSY is *NOT* Posix compliant; the error
	  ETXTBSY is *NOT* supported in Posix.
	- Images open for write or in such a way as to allow truncation
	  should not be allowed to execute (EBUSY?).
	- Disallowing the running of images which may be potentially
	  modified is also *NOT* Posix compliant.
	- NFS does not provide a way of sharing current vnode flags
	  across exported/imported file systems; there is no way to
	  solve this problem in the current implementation of NFS.  We
	  are lucky that this is unlikely to be a problem given "normal"
	  usage of NFS does not export executable directories as
	  writable, nor is concurrent access of user images by writing
	  and execution on different hosts likely (although it is possible).
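
To make the ordering problem in the second VTEXT point concrete, here is a
minimal user-space sketch.  It is a toy model, not kernel code: `toy_vnode`,
`TOY_VTEXT`, `open_writers` and the three functions are names made up for
illustration, and the only assumption is that the check happens at open time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_VTEXT 0x1	/* stand-in for VTEXT; the value is arbitrary */

/* Hypothetical toy vnode, not the kernel's struct vnode. */
struct toy_vnode {
	int flags;		/* TOY_VTEXT once the file is being executed */
	int open_writers;	/* descriptors opened for write or truncate */
};

/* The open-time check: refuse writers once the file is marked as text. */
int
toy_open_write(struct toy_vnode *vp)
{
	assert(vp != NULL);
	if (vp->flags & TOY_VTEXT)
		return ETXTBSY;
	vp->open_writers++;
	return 0;
}

/* The "obvious" exec fix: mark the vnode and look at nothing else. */
void
toy_exec_naive(struct toy_vnode *vp)
{
	assert(vp != NULL);
	vp->flags |= TOY_VTEXT;
}

/* A write on an already open descriptor never revisits TOY_VTEXT. */
int
toy_write(struct toy_vnode *vp)
{
	assert(vp != NULL);
	return vp->open_writers > 0 ? 0 : EBADF;
}
```

Exec first and the later open is refused; open first and the later write
goes straight through, which is the hole.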


Implementing the "pretty" solution on top of the "EBUSY/ETXTBSY" fix will
need the following:

o	Posix compliance.

	- The EBUSY/ETXTBSY returns have to be hidden (preferably in the
	  VFS/vncalls layer) to ensure the presentation of a Posix
	  compliant interface to the user.  Basically, corrective action
	  is taken at the hiding layer, and the operation is retried.
	  The resulting EBUSY/ETXTBSY are *internal* and not exposed to
	  the services consumer, who expects Posix compliance.
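
The hide-and-retry idea might look roughly like this.  It is a user-space
sketch under stated assumptions: every name here (`toy_vnode`,
`toy_lowlevel_truncate`, `toy_detach_text`, `toy_vfs_truncate`) is
hypothetical, and the "corrective action" is reduced to flipping a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_VTEXT 0x1	/* stand-in for VTEXT; the value is arbitrary */

/* Hypothetical toy vnode, not the kernel's struct vnode. */
struct toy_vnode {
	int flags;		/* TOY_VTEXT while the file backs a running image */
	int text_on_swap;	/* nonzero once the image no longer pages from the file */
};

/* Lower layer: reports the internal error when the file is in use as text. */
int
toy_lowlevel_truncate(struct toy_vnode *vp)
{
	assert(vp != NULL);
	if (vp->flags & TOY_VTEXT)
		return ETXTBSY;
	return 0;
}

/* Corrective action (assumed): move the image's text off the file. */
int
toy_detach_text(struct toy_vnode *vp)
{
	assert(vp != NULL);
	vp->text_on_swap = 1;
	vp->flags &= ~TOY_VTEXT;
	return 0;
}

/* Hiding layer: the caller never sees EBUSY/ETXTBSY from the first try. */
int
toy_vfs_truncate(struct toy_vnode *vp)
{
	int error = toy_lowlevel_truncate(vp);

	if (error == ETXTBSY || error == EBUSY) {
		error = toy_detach_text(vp);
		if (error == 0)
			error = toy_lowlevel_truncate(vp);	/* retry once */
	}
	return error;
}
```

The retry happens exactly once; if the corrective action itself fails, its
error is what the caller gets.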

o	Writing to files cannot be allowed to crash the system.  As an
	alternative to the *UGLY* (non-Posix-compliant) solution *and*
	an alternative to copying the program to swap on start up as in
	the old VM system, the following can be done:

	- On open for write/truncation, the text pages from the file
	  belonging to the executable image are copied to swap or are
	  copied to memory pages marked as swappable and dirty.  This
	  will result in protection of the image from overwrite of its
	  swap store (since it will no longer be using the file as the
	  swap store).  A file can be determined to be in use as a swap
	  store for an image by examining its flags to determine if
	  VTEXT is set *before* the vnode reference count is bumped.
	- Potentially, an additional flag indicating the process image
	  was copied to swap could be used to allow subsequent invocations
	  even during or following modification of the file.
	- Files already open for text access must be assumed to be open
	  for writing (unless we add another flag to the in core vnode
	  to indicate whether or not the vnode is considered writable),
	  and running the process must require one of:
	  + refuse to run an image undergoing changes.
	  + copy the current file image to swap as per the old VM approach
	    and *then* run as a process.
	  The first solution is, to my mind, superior, although it again
	  raises the spectre of Posix compliance.
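
The open-for-write step above can be sketched as follows.  This is a toy
user-space model, not kernel code: a malloc'ed buffer stands in for swap, and
`toy_vnode`, `TOY_VTEXT`, `TOY_VCOPIED` (the possible "copied to swap" flag)
and both functions are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_VTEXT	0x1	/* file is the paging store of a running image */
#define TOY_VCOPIED	0x2	/* hypothetical: image text now lives in "swap" */

/* Hypothetical toy vnode, not the kernel's struct vnode. */
struct toy_vnode {
	int flags;
	int usecount;			/* reference count */
	unsigned char *file_data;	/* stands in for the on-disk file */
	size_t file_len;
	unsigned char *swap_copy;	/* stands in for swap-backed text pages */
};

/* Open for write/truncate: examine VTEXT *before* bumping the count. */
int
toy_open_for_write(struct toy_vnode *vp)
{
	assert(vp != NULL);
	if ((vp->flags & TOY_VTEXT) && !(vp->flags & TOY_VCOPIED)) {
		unsigned char *copy = malloc(vp->file_len ? vp->file_len : 1);

		if (copy == NULL)
			return ENOMEM;
		memcpy(copy, vp->file_data, vp->file_len);
		vp->swap_copy = copy;
		vp->flags |= TOY_VCOPIED;	/* image is off the file now */
	}
	vp->usecount++;
	return 0;
}

/* Where the running image pages its text from. */
const unsigned char *
toy_text_page_source(const struct toy_vnode *vp)
{
	assert(vp != NULL);
	return (vp->flags & TOY_VCOPIED) ? vp->swap_copy : vp->file_data;
}
```

After the open succeeds, scribbling on `file_data` no longer changes what
`toy_text_page_source()` hands to the running image.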

o	Speedups.

	- The VM and buffer cache must be unified to minimize the swap
	  overhead on swap-from-file for the file-as-swapping-store case
	  during normal use.

>I haven't tried out the execve implementation of the ``obvious'' fix
>to see if it works yet.  (I'm so reluctant to reboot my machine when
>my NTP is doing 
>	xntpd[78]: offset 0.006774 freq -62.73834 comp 4
>so well.)

Good numbers -- and the "obvious" fix fails in the "file already open
for writing/truncation when execution takes place" case.  A write to
an executing image is not prevented provided the open occurred prior
to VTEXT being set.  The setting of VTEXT needs to respect a non-zero
reference count when VTEXT is not already set.  None of the currently
posted fixes resolves this issue.  Mark Tinguely is currently researching
the issue (he's one of the people I promised this writeup to).
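
A minimal sketch of what "respect a non-zero reference count" could mean at
exec time.  This is a toy user-space model with made-up names (`toy_vnode`,
`TOY_VTEXT`, `toy_exec_set_vtext`), and it takes the conservative line that
any existing reference might be a writer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_VTEXT 0x1	/* stand-in for VTEXT; the value is arbitrary */

/* Hypothetical toy vnode, not the kernel's struct vnode. */
struct toy_vnode {
	int flags;
	int usecount;	/* references held by anyone other than this exec */
};

/* Set VTEXT only if nobody could already be holding the file open. */
int
toy_exec_set_vtext(struct toy_vnode *vp)
{
	assert(vp != NULL);
	if (vp->flags & TOY_VTEXT)
		return 0;	/* already text: nothing new to decide */
	if (vp->usecount != 0)
		return EBUSY;	/* may be open for write/truncate */
	vp->flags |= TOY_VTEXT;
	return 0;
}
```

A second exec of an image that is already running still succeeds, since the
existing references are then known to have been taken with VTEXT set.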


PS to Garrett:  I have been unable to contact you via email -- please
contact me with any information or preferences on inclusion of your
loadable module interface in the 0.1.5 release, and any information you
feel relevant to Sun-style shared libs in 0.1.5.  I have some stuff, but
I'd have to remove a lot of "not ready for public consumption" code to
use them instead (plus it reduces friction if I concede beforehand 8-).


					Terry Lambert
					terry@icarus.weber.edu
					terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of my present
or previous employers.
-- 
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------