*BSD News Article 6885


Newsgroups: comp.unix.bsd
Path: sserve!manuel.anu.edu.au!munnari.oz.au!sgiblab!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!ames!sun-barr!cs.utexas.edu!hellgate.utah.edu!jaguar.cs.utah.edu!mike
From: mike%jaguar.cs.utah.edu@cs.utah.edu (Mike Hibler)
Subject: VM limits and adding swap space
Date: 21 Oct 92 22:28:10 MDT
Message-ID: <1992Oct21.222810.2807@hellgate.utah.edu>
Originator: mike@jaguar.cs.utah.edu
Organization: University of Utah CS Dept
Lines: 71

You have a fundamental limit on how much VM you can use:

        memory size + backing store (swap) size

Because the Mach-based VM does "lazy allocation" of swap space for objects
and because the implementor of the prototype VM system didn't put in checks
to enforce this limit (aka "lazy implementation" :-) you can screw yourself
Big Time.  It's actually a little more complicated because, once allocated,
swap space remains associated with the object until the object is deallocated
and you wind up with pages in both memory and swap.  Between that and swap
fragmentation, the mem+swap figure winds up being an upper bound.
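
To make the failure mode concrete, here is a toy Python model.  It is not
the kernel code; the class, method and function names are invented for
illustration.  Reserving address space always succeeds because nothing checks
the total against memory + swap, so the failure only shows up later, when a
page is touched and there is neither a free frame nor a free swap slot.  The
model ignores fragmentation and pages held in both memory and swap, so it
reaches the full mem+swap figure, which a real system would not.

```python
class OutOfBackingStore(Exception):
    """Raised when a newly touched page has nowhere to live."""


class ToyVM:
    """Toy model of lazy swap allocation; all sizes are in pages."""

    def __init__(self, mem_pages, swap_pages):
        self.mem_pages = mem_pages
        self.swap_pages = swap_pages
        self.reserved = 0   # address space handed out, never checked
        self.resident = 0   # pages currently in memory
        self.swapped = 0    # swap slots in use

    def upper_bound(self):
        # The fundamental limit from the text: memory + swap.
        return self.mem_pages + self.swap_pages

    def allocate(self, npages):
        # Lazy allocation: no check against upper_bound(), always succeeds.
        self.reserved += npages

    def touch_new_page(self):
        # The first touch of a page needs a frame.
        if self.resident < self.mem_pages:
            self.resident += 1
        elif self.swapped < self.swap_pages:
            # Memory is full: one resident page goes out to swap and
            # the new page takes over its frame.
            self.swapped += 1
        else:
            raise OutOfBackingStore("memory and swap are both full")


def touch_until_failure(vm, npages):
    """Touch up to npages new pages; return how many succeeded."""
    touched = 0
    try:
        for _ in range(npages):
            vm.touch_new_page()
            touched += 1
    except OutOfBackingStore:
        pass
    return touched
```

With ToyVM(mem_pages=4, swap_pages=2), allocate(10) is accepted without
complaint, but touch_until_failure(vm, 10) stops after 6 pages, which is
exactly upper_bound().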

Anyway, you are left with two choices: 1) use less VM or 2) get more
memory/swap.  In the former category lie such things as:

        implement shared libraries (admirable),
        run fewer background processes (easily doable),
        avoid swap space fragmentation (dubiously profitable),
        use vi instead of emacs (unacceptable),
        put your kernel on a diet (heretical),
        run DOS (unconscionable)

In the latter you are faced with either:

        get more memory (long a favorite solution of ours),
        add more swap space

Since adding swap space seems to be on people's minds, I'll make some
serious comments.  Given that you can't add another disk or re-partition,
there are two straightforward ways I can think of offhand to get swapping
to files:

1. Use a "file (vnode) disk" driver that makes a regular file look like a
   disk.  Essentially you create a special file, say /dev/vn0c, which is
   associated with a regular file, say /var/swapfile.  Reading/writing
   the special file will access the contents of the regular file.  The
   advantage here is that BSD already knows how to swap to a device so
   you don't have to hack the rest of the kernel.  You just form the
   association between /dev/rvn0c and /var/swapfile, do swapon for vn0c
   and it's off and running.  If you are brave, you can even swap over NFS
   this way.  The main disadvantages are an extra level of indirection in
   swapping and potential pollution of your buffer cache.  We have used
   this approach from time to time (our BSD driver is in the net-2
   release in the sys/hp300/dev directory).

2. Make the vnode pager the "default" pager in place of the swap pager.
   In theory the interface is structured such that this would work.
   However, there is some code to be written.  First you need to
   initialize swap files either at boot time or by adding a hook in the
   pager for swapon.  Second you need code in vnode_pager_alloc to assign
   an object to a swap file.  There are also some semantic issues
   associated with how to treat requests for non-existent pages as well
   as policy issues when dealing with multiple swap files.  The big
   disadvantage of this approach is that it is wasted effort since the
   existing pager interface is going to go away (Real Soon Now in 4.4).
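
The idea behind option 1 can be sketched in a few lines of Python.  This is
only a user-level illustration of the mapping a file (vnode) disk performs,
block number to byte offset in a regular file; it is not the hp300 driver.
The class name, the method names and the 512-byte block size are assumptions
made for the example.

```python
import os

BLOCK_SIZE = 512  # assumed block size, for illustration only


class FileBackedDisk:
    """Presents a regular file as an array of fixed-size blocks."""

    def __init__(self, path, nblocks):
        self.nblocks = nblocks
        # Open (creating if necessary) the backing file and set its
        # length to exactly nblocks blocks.
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        os.ftruncate(self.fd, nblocks * BLOCK_SIZE)

    def _offset(self, blkno):
        # Translate a block number into a byte offset in the file.
        if not 0 <= blkno < self.nblocks:
            raise IndexError("block number out of range")
        return blkno * BLOCK_SIZE

    def write_block(self, blkno, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        os.pwrite(self.fd, data, self._offset(blkno))

    def read_block(self, blkno):
        return os.pread(self.fd, BLOCK_SIZE, self._offset(blkno))

    def close(self):
        os.close(self.fd)
```

Every block access here goes through the filesystem underneath the backing
file, which is the extra level of indirection mentioned under option 1.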

The current plan in 4.4 is to merge the buffer cache (vnode) and page
cache (vm_object) and in the resulting homogeneous society ("I'm a vnode,
you're a vnode, we're all a vnode...") introduce some sort of special
"swap filesystem" which has all the desired traits (instantaneous access,
infinite capacity).

Finally, while on the subject of VM, let me dispel one misconception.
The net-2, 4.4BSD, 386BSD, BSD/386 VM system is "Mach based", but it is
based on the old 2.0 release, not 2.5 or the current 3.0.  In particular
this means that there is no external (user-level) pager support.  Also, the
VM system in "the pure kernel" (3.0) is radically different from what was
in 2.0 or is in BSD so you should not make generalizations about "Mach VM"
based on what is in BSD.  There are some similarities (e.g. some data
structures, the pmap interface, the pageout strategy) but the Mach
VM system is considerably more powerful.