*BSD News Article 60060



Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.mel.connect.com.au!munnari.OZ.AU!news.hawaii.edu!ames!lll-winken.llnl.gov!sol.ctr.columbia.edu!hamblin.math.byu.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.unix.bsd.netbsd.misc,comp.unix.bsd.bsdi.misc,comp.unix.solaris,comp.unix.aix
Subject: Re: ISP hardware/software choices (performance comparison)
Date: 19 Jan 1996 20:16:32 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 116
Distribution: inet
Message-ID: <4dou70$8v8@park.uvsc.edu>
References: <4cmopu$d35@vixen.cso.uiuc.edu> <4dh52u$1uk@park.uvsc.edu> <4digah$a7r@durban.vector.co.za> <4dklfv$27e@park.uvsc.edu> <4dlrag$fmn@nntpb.cb.att.com>
NNTP-Posting-Host: hecate.artisoft.com
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.netbsd.misc:2051 comp.unix.bsd.bsdi.misc:2208 comp.unix.solaris:57848 comp.unix.aix:69193

dyson@inuxs.inh.att.com (John S. Dyson) wrote:
]
] In article <4dklfv$27e@park.uvsc.edu>,
] Terry Lambert  <terry@lambert.org> wrote:
] >
] >There are certain undesirable features of BSD that I will
] >(grudgingly, being a BSD advocate) admit.  I haven't brought
] >up memory overcommit, for instance, because it's a failing of
] >almost every modern OS -- even though it is mostly correctable.
] >Even BSD 4.4 and its derivatives.
] >
] >Since everyone has it, it's hard to use memory overcommit as
] >an argument for or against any OS as an ISP platform.
] 
] OhOh.. A place where Terry and I disagree.  I had a
] customer using SVR4 where they would have given their first-born
] for swap space overcommit.  Running out of swap-space caused them no-end
] of problems (and of course, they weren't using it all.)  Properly designed
] overcommit would have helped them out (BTW, the overcommit in FreeBSD is NOT
] properly designed -- I know, I did the code, and it was partly left over
] from 4.4Lite.)  They could NOT add additional disk -- they had maxed out
] their configuration.  They are also a good (major) customer...
] 
] I think that allowing properly designed overcommit can give breathing room in
] many applications.

My problem with overcommit stems primarily from the EBUSY
return code, which I believe to be an abomination before God.

The problem is that in all cases, it is possible (though no one
implements this) to resolve the problem without returning an
EBUSY.  At the same time, you won't necessarily lose most of
the benefits of overcommit for anything other than diskless or
dataless systems.



What is required for the general case of a file you want to
modify or install or write over, etc., is to look and see if
the VTEXT bit is set on the in core vnode, which indicates
the vnode is being referenced as a swap store.  Note that some
releases of SVR4 systems, even today, do not set the VTEXT bit
on shared library images when they are mapped as text.  Big
mistake in the mmap interface on these systems.

If the VTEXT bit is present, rather than returning EBUSY and
failing the operation, you force the file contents to swap
and establish a handle (anonymous vnode) to the pages.  This
is the moral equivalent of a copy on write fault for all pages
referred to by the inode pointed to by the vnode.

This makes EBUSY an internal return code only, since the condition
that would trigger its return is handled instead of bailed on.
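
A minimal user-space sketch of the idea above, not kernel code.
The struct layouts and the names anon_store, vn_detach_text and
vn_writechk_sketch are hypothetical stand-ins invented for the
illustration; only the VTEXT flag comes from the discussion.  The
point is that the write check copies the text to an anonymous
store and lets the write proceed instead of failing it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define VTEXT 0x0001	/* vnode is backing a running program's text */

/* Hypothetical: anonymous, swap-backed copy of the text pages. */
struct anon_store {
	char	*pages;
	size_t	 len;
};

/* Hypothetical, heavily simplified in-core vnode. */
struct vnode {
	int			 v_flag;
	char			*v_data;	/* stand-in for the file's blocks */
	size_t			 v_len;
	struct anon_store	*v_textcopy;	/* text users page from here once detached */
};

/*
 * Force the file contents to "swap" and hand the running image a
 * handle to the copy: the moral equivalent of a copy on write
 * fault for every page of the file.
 */
static int
vn_detach_text(struct vnode *vp)
{
	struct anon_store *as;

	assert(vp->v_flag & VTEXT);
	if ((as = malloc(sizeof(*as))) == NULL)
		return (-1);
	if ((as->pages = malloc(vp->v_len ? vp->v_len : 1)) == NULL) {
		free(as);
		return (-1);
	}
	memcpy(as->pages, vp->v_data, vp->v_len);
	as->len = vp->v_len;
	vp->v_textcopy = as;
	vp->v_flag &= ~VTEXT;	/* the file itself is no longer swap store */
	return (0);
}

/*
 * Write check.  Returns 0 if the write may proceed.  The busy
 * condition is handled here rather than reported; the only
 * failure left is running out of room for the copy.
 */
int
vn_writechk_sketch(struct vnode *vp)
{
	if (vp->v_flag & VTEXT)
		return (vn_detach_text(vp) == 0 ? 0 : ENOMEM);
	return (0);
}
```

A caller would run vn_writechk_sketch() before modifying the
file; after it returns 0 the running image keeps seeing the old
contents through v_textcopy while the file is rewritten.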


Now we still have the problem of the VTEXT bit's reference
locality.

Since the VTEXT bit is set on the vnode, it is not associated
directly with the file as an attribute -- it is a vnode instance
attribute.  What this means is that an NFS client or NFS server
will not see a VTEXT bit set on a vnode by another NFS client.
It also means that the NFS server won't report a locally set
VTEXT bit to an NFS client until the client attempts an operation.

The net result of the bit being local is that file modifications
are only disallowed when the VTEXT bit on the server vnode can
prevent it.

So it is possible for a client or server to modify a file out
from under another client, causing that client to behave badly
or crash.
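
The locality problem can be shown with a toy model.  This is only
a sketch under assumed names (file_object, write_allowed and
try_modify are made up for the example): each machine holds its
own vnode instance for the same file and consults only its own
flag word.

```c
#include <assert.h>
#include <stdbool.h>

#define VTEXT 0x0001

/* Hypothetical: the one real file, as stored on the server. */
struct file_object {
	int generation;		/* bumped on every modification */
};

/* Hypothetical: one in-core vnode instance per machine. */
struct vnode {
	int			 v_flag;
	struct file_object	*v_file;
};

/* A machine can only check the VTEXT bit on its own instance. */
bool
write_allowed(const struct vnode *local_vp)
{
	return ((local_vp->v_flag & VTEXT) == 0);
}

/* Returns true if the file was modified. */
bool
try_modify(struct vnode *local_vp)
{
	if (!write_allowed(local_vp))
		return (false);
	local_vp->v_file->generation++;
	return (true);
}
```

With one vnode marked VTEXT (the client executing the file) and a
second, unmarked vnode for the same file_object (another client
or the server), try_modify() refuses through the first and
succeeds through the second, so the image changes underneath the
machine that is running it.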


A simple way to resolve this problem without altering the NFS
protocol is to add a flag on a per file system type basis for
NFS clients.  The flag would indicate that the file system was
"remote".

Then, you modify the execution class loader (or the exec system
call for systems that don't support multiple ABIs) such that a
load of an executable from such a file system will force the
entire image into local swap.

This will have the effect of replacing the file being used as
swap store with actual pages in swap (doing this would necessitate
an additional bit to indicate "clean but in swap").  You would
probably close the local alias vnode for the NFS server file
after doing this.
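
A rough sketch of the loader change, again in user space.
FS_REMOTE, PG_CLEAN_SWAP, the struct layouts and exec_map_text
are all assumptions made up for the illustration, not names from
any real kernel.  A load from a file system type flagged "remote"
copies the whole image into local swap and marks it "clean but in
swap"; a local load keeps using the file as swap store.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FS_REMOTE	0x01	/* hypothetical per file system type flag */
#define PG_CLEAN_SWAP	0x01	/* hypothetical "clean but in swap" state */

struct fstype {
	const char	*name;
	int		 flags;
};

/* Hypothetical: an executable as seen by the loader. */
struct image {
	const char		*file_data;	/* contents reached via the vnode */
	size_t			 len;
	const struct fstype	*fs;
};

/* Hypothetical: where the process's text is paged in from. */
struct text_map {
	const char	*backing;
	char		*swap_copy;	/* non-NULL if forced to local swap */
	int		 pg_flags;
};

/* Returns 0 on success, -1 if local swap could not hold the image. */
int
exec_map_text(const struct image *img, struct text_map *tm)
{
	tm->backing = img->file_data;	/* default: the file is the swap store */
	tm->swap_copy = NULL;
	tm->pg_flags = 0;

	if (img->fs->flags & FS_REMOTE) {
		tm->swap_copy = malloc(img->len ? img->len : 1);
		if (tm->swap_copy == NULL)
			return (-1);
		memcpy(tm->swap_copy, img->file_data, img->len);
		tm->backing = tm->swap_copy;
		tm->pg_flags |= PG_CLEAN_SWAP;
		/* The alias vnode for the server file could be closed here. */
	}
	return (0);
}
```

After a remote load, later changes to the server's copy of the
file no longer reach the running image, because every page-in is
satisfied from swap_copy.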

You will note that the ability to force "clean" pages (that are
about to be modified) to swap is a necessary part of resolving
the EBUSY issue, above.


This would have the side effect that a dataless machine (or a
diskless machine with no swap) will no longer hang when paging
in from an image that was loaded from the NFS server -- a
perennial problem with Sun's diskless/dataless 4.x and 5.x
systems (less of a problem for OSes other than Sun's, since
other OSes typically do not have nearly the support for network
centric machine configurations).


Getting back to John's posting, this may in fact be what he
means by "properly designed overcommit", since this issue has
come up on the BSD and MACH lists in the past (386BSD used the
MACH memory allocator, which is an overcommit-based design).


					Regards,
                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.