*BSD News Article 14185

Newsgroups: comp.os.386bsd.development
Path: sserve!newshost.anu.edu.au!munnari.oz.au!spool.mu.edu!howland.reston.ans.net!zaphod.mps.ohio-state.edu!magnus.acs.ohio-state.edu!csn!hellgate.utah.edu!fcom.cc.utah.edu!cs.weber.edu!terry
From: terry@cs.weber.edu (A Wizard of Earth C)
Subject: Re: File Truncation Philosophy
Message-ID: <1993Apr7.234429.1714@fcom.cc.utah.edu>
Sender: news@fcom.cc.utah.edu
Organization: Weber State University  (Ogden, UT)
References: <C4tJ6C.C17@ns1.nodak.edu>
Date: Wed, 7 Apr 93 23:44:29 GMT
Lines: 86

In article <C4tJ6C.C17@ns1.nodak.edu> tinguely@plains.NoDak.edu (Mark Tinguely) writes:
>		******** Request for Comments ********
>
> As most of you know, with 386bsd, installing a new copy of a running
> program causes the running program to crash and core. This happens due to
> a couple of things. First, with the Mach VM used in 386bsd, executing
> instructions are paged directly from the executable in the filesystem
> (whereas old BSD VMs copy the executable to swap and page from there).
> Second, programs like "cp" TRUNCATE the existing file when copying in
> the new copy of the program. When the executing program takes its next page
> fault, the fault will fail and the program will crash/core.

How does Mach fix this?  After all, the problem is introduced by not having a
process local copy of the image in swap to page from, and Mach either has
the same behaviour or has fixed it.

My personal take on this would be to have the pages marked as in use but not
in core, and simply mark them copy on write.  In theory, this would cause
the O_TRUNC to work "correctly", but it's kind of iffy.  Any future reference
to the same file would be required to use the same PTDI's (the iffy part,
since direct page references for all pages in a file aren't made in itrunc).

Another alternative would be to return EBUSY from itrunc() in ufs_inode.c if
the file is being used as a swap store (sys/errno.h defines ETXTBSY, but for
some reason this is only if _POSIX_SOURCE isn't defined).  This is what
Xenix and SVR4 have traditionally done (I can't count how many times I've
gotten "text file busy" on Xenix systems when trying to overwrite an
executable).  An "rm" of the file wouldn't get rid of the blocks as long as
there was an open vp for the file.  The cp after the rm would work fine and
the currently executing image wouldn't fail.  This may mean we would need
an additional tag on the vnode (it may already be there; I haven't checked
it out and don't have time to right now) to tell that the file is opened
as a swap store by a process.
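
The "rm, then cp" sequence can be sketched from user space.  This is only
an illustration, assuming a POSIX system; the helper name
read_old_after_replace() and the file contents are made up for the example,
and an ordinary open descriptor stands in for the kernel's reference to the
running image.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace `path` the "rm, then cp" way while an old
 * descriptor is still open, then read the old image back through that
 * descriptor.  Returns the number of bytes read into `buf`, or -1 on error.
 */
ssize_t read_old_after_replace(const char *path, char *buf, size_t buflen)
{
    static const char old_image[] = "old image";
    static const char new_image[] = "new";
    ssize_t n = -1;
    int newfd;

    /* Install the "running" image and keep it open, as exec would. */
    int oldfd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (oldfd < 0)
        return -1;
    if (write(oldfd, old_image, sizeof old_image - 1)
        != (ssize_t)(sizeof old_image - 1))
        goto out;

    /* "rm": the name goes away, but the open reference keeps the blocks. */
    if (unlink(path) != 0)
        goto out;

    /* "cp": a brand-new inode appears under the same name. */
    newfd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (newfd < 0)
        goto out;
    if (write(newfd, new_image, sizeof new_image - 1)
        != (ssize_t)(sizeof new_image - 1)) {
        close(newfd);
        goto out;
    }
    close(newfd);

    /* The old descriptor still reads the old contents, untouched. */
    n = pread(oldfd, buf, buflen, 0);
out:
    close(oldfd);
    return n;
}
```

Nothing is truncated here: the old inode is never opened with O_TRUNC, so
whoever holds it keeps a complete image until the last reference is dropped.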

I *don't* advocate changing "cp", unless it's to take advantage of mmap()
to avoid data copies across the user/kernel boundary (an optimization used
in most modern implementations).
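
A minimal sketch of that optimization, assuming a POSIX mmap(); the helper
name copy_with_mmap() is hypothetical and this is not the 386bsd "cp"
source.  The source file is mapped and handed straight to write(), so there
is no read() into a user buffer first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst, reading src through a mapping.
 * Returns 0 on success, -1 on error. */
int copy_with_mmap(const char *src, const char *dst)
{
    int rc = -1, in = -1, out = -1;
    struct stat st;
    void *map = MAP_FAILED;

    in = open(src, O_RDONLY);
    if (in < 0 || fstat(in, &st) != 0)
        goto done;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0)
        goto done;

    if (st.st_size > 0) {            /* a zero-length mmap() is an error */
        size_t len = (size_t)st.st_size, off = 0;

        map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, in, 0);
        if (map == MAP_FAILED)
            goto done;
        while (off < len) {          /* write() may be short; loop */
            ssize_t n = write(out, (const char *)map + off, len - off);
            if (n <= 0)              /* EINTR retry omitted for brevity */
                goto done;
            off += (size_t)n;
        }
    }
    rc = 0;
done:
    if (map != MAP_FAILED)
        munmap(map, (size_t)st.st_size);
    if (out >= 0)
        close(out);
    if (in >= 0)
        close(in);
    return rc;
}
```

Note that the destination is still opened with O_TRUNC, so this changes how
the data is moved and nothing else; the truncation problem is exactly where
it was.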

> The easy fix is to move or remove the file before installing the new
> program. The filesystem does work correctly and keeps a copy of the
> executable, and the VM still finds and uses this copy.

Right; a user action is required.  As long as we bring text pages from the
file instead of swap, I think this is the only method that should be allowed
(meaning we mod the fs to disallow truncation of "busy" files).  The big
issue is disallowing corruption of a text page swap store.  Whether this is
done by inhibiting the corruption at the point it would take place (itrunc)
or preempting it (marking the pages as file swap pages, ie: in use, and
copy on write), is irrelevant.
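
The "inhibit it in itrunc" option can be modelled in a few lines.  This is
a user-space toy, not kernel code: struct model_vnode, MODEL_VTEXT and
model_itrunc() are all hypothetical stand-ins, and the real check would
have to use whatever tag the vnode actually carries.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for kernel structures; not the 386bsd ones. */
#define MODEL_VTEXT 0x01        /* file is backing a running program */

struct model_vnode {
    int  flags;                 /* MODEL_VTEXT or 0 */
    long size;                  /* current length in bytes */
};

/*
 * Toy truncate: returns 0 or an errno value, kernel style.  Shrinking a
 * file that is in use as program text is refused and the file is left
 * alone; everything else just sets the new length.
 */
int model_itrunc(struct model_vnode *vp, long length)
{
    if (length < 0)
        return EINVAL;
    if ((vp->flags & MODEL_VTEXT) && length < vp->size)
        return ETXTBSY;         /* "text file busy" */
    vp->size = length;
    return 0;
}
```

With a check of this shape an open() with O_TRUNC on a busy executable
fails cleanly, which is the "disallowed but non-fatal" behaviour, and the
unlink-then-copy route keeps working because it never truncates the old
inode.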

> It would be NICE to not have to worry about unlinking the file associated
> with running programs before making our copies. Nate and all the others
> working on the patchkit are interested in this. It is also very important
> if the user is restoring from a backup (as I learned once).

Definitely!  The image replacement should be disallowed but non-fatal (ie:
the constant warnings about replacing "tar" on Xenix package installations).

> The philosophy question is: should we change "cp" and "cat" to unlink
> (remove) the file before opening? Or even lower in the filesystem (as would
> need to be the case in the restore example)?
>
> I can think of several reasons to not do this:
>	1) won't have the same inode.
>	2) won't cover all cases -- using open(2) and O_TRUNC will still 
>	   cause the same problem.

So will a creat(), which is an open with O_WRONLY|O_CREAT|O_TRUNC.
Definitely, definitely, definitely don't "hack" a "fix" into cp, cat, and
others which, while it won't dork UFS, will dork other file systems.  The
problem is in the kernel and either needs a VM fix (inconvenient and
expensive) or an FS fix (inconvenient in that all FS's would be required to
support it).
One of these two must be our answer.
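
To make the creat() point concrete, here is a small check, assuming a POSIX
system; the helper name size_after_reopen() is made up for the example.  It
writes five bytes, reopens the file one way or the other, and reports the
length seen afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: put 5 bytes in `path`, reopen it either with
 * creat() or with the equivalent open() flags, and return the size seen
 * through the new descriptor.  Returns -1 on error.
 */
long size_after_reopen(const char *path, int use_creat)
{
    struct stat st;
    long size = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, "12345", 5) != 5) {
        close(fd);
        return -1;
    }
    close(fd);

    /* Both calls truncate the existing file as a side effect of opening. */
    fd = use_creat ? creat(path, 0600)
                   : open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) == 0)
        size = (long)st.st_size;
    close(fd);
    return size;
}
```

Either path leaves a zero-length file behind, so any program that calls
creat() on a running image hits the same failure whether or not cp and cat
have been patched.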


					Terry Lambert
					terry@icarus.weber.edu
					terry_lambert@novell.com
---
Any opinions in this posting are my own and not those of my present
or previous employers.
-- 
-------------------------------------------------------------------------------
                                        "I have an 8 user poetic license" - me
 Get the 386bsd FAQ from agate.berkeley.edu:/pub/386BSD/386bsd-0.1/unofficial
-------------------------------------------------------------------------------