*BSD News Article 25018


Xref: sserve comp.os.386bsd.bugs:1915 comp.unix.internals:6594 comp.unix.bsd:13083
Newsgroups: comp.os.386bsd.bugs,comp.unix.internals,comp.unix.bsd
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!spool.mu.edu!uwm.edu!msuinfo!netnews.upenn.edu!dsinc!jabber!candle!root
From: root@candle.uucp (Bruce Momjian)
Subject: Swap overallocation in i386 BSD's
Organization: a consultant's basement
Date: Wed, 15 Dec 1993 04:38:11 GMT
X-Newsreader: TIN [version 1.2 PL2]
Message-ID: <CI27Jo.DGy@candle.uucp>
Lines: 146

I am running BSD/386 from BSDI.  When running with 5MB of RAM, I found
that the system locked up about once every week.  In researching the
problem with Mike Karels of BSDI, I think we have found a bug that
exists on BSD/386 and most free 386-based *BSD systems.  Here are the
details.

First, let me define copy-on-write (COW):  When a process forks, the OS
maps the address space of both the parent and child to the same memory
pages, and both processes start running.  If either process writes to
one of its shared memory pages, the OS makes a copy of that page.  One
process gets the original, the other gets the copy.
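
To make the definition concrete, here is a minimal sketch of what a
program can actually observe.  It is not part of the test program
attached below, and the function name is made up for illustration.  The
copying itself is invisible to user code; all a process can see is that a
write made by the parent after fork() never shows up in the child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Forks, has the parent overwrite a byte in a page shared with the
 * child, and returns the byte the child still sees in its own view of
 * that page, or -1 on error.
 */
int child_view_after_parent_write(void)
{
	static char page[4096];		/* data page shared after fork */
	int sync_fd[2], result_fd[2];
	char seen = 0, go = 'g';
	ssize_t n;
	pid_t pid;

	page[0] = 'a';
	if (pipe(sync_fd) == -1 || pipe(result_fd) == -1)
		return -1;

	if ((pid = fork()) == -1)
		return -1;

	if (pid == 0)
	{
		/* child: wait until the parent has written, then report */
		close(sync_fd[1]);
		close(result_fd[0]);
		if (read(sync_fd[0], &go, 1) != 1)
			_exit(1);
		if (write(result_fd[1], &page[0], 1) != 1)
			_exit(1);
		_exit(0);
	}

	/* parent: this write is what makes the OS copy the page */
	close(sync_fd[0]);
	close(result_fd[1]);
	page[0] = 'b';
	if (write(sync_fd[1], &go, 1) != 1)
		seen = 0;	/* child will see EOF and exit */

	n = read(result_fd[0], &seen, 1);
	waitpid(pid, NULL, 0);
	close(sync_fd[1]);
	close(result_fd[0]);

	return (n == 1) ? (int)seen : -1;
}
```

Calling child_view_after_parent_write() should return 'a': the parent
has already stored 'b', but the child is still looking at the original
page.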

Ok, here is the bug we have found:  If a process forks a child, and the
parent writes to its memory pages (forcing a COW), and those pages are
paged out to swap before the child exec's or exits, the parent's and
child's (!) swap space is not released until the parent exits.

The ramification of this is that if you have a long-running process
that forks a lot, like a shell, and your system does a lot of paging,
that long-running process will allocate more and more swap until it
exits.  It is particularly a problem with non-csh shells (csh uses
vfork and exec), because they often run scripts by forking themselves,
and the child running the script may exist for quite some time without
exec'ing or exit'ing.
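
For comparison, here is a minimal sketch of the vfork-and-exec pattern
mentioned above.  It is an illustration only, not csh's actual code; the
function name is hypothetical, and it assumes a system that still
provides vfork().  The point is that the child exec's immediately, so
there is no long-lived child sharing copy-on-write pages with the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: run "cmd" through
 * /bin/sh using vfork + exec and return its exit status, or -1 on
 * error.
 */
int run_with_vfork(const char *cmd)
{
	pid_t pid;
	int status;

	if ((pid = vfork()) == -1)
		return -1;

	if (pid == 0)
	{
		/* child: exec at once; after vfork only exec or _exit is safe */
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);		/* exec failed */
	}

	/* parent resumes only after the child has exec'ed or exited */
	if (waitpid(pid, &status, 0) == -1)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_with_vfork("exit 7") returns 7.  A shell that instead
fork()s and lets the child interpret the script itself keeps that child
around, without an exec, for as long as the script runs.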

Here are Mike Karels' more detailed words on the subject:

---------------------------------------------------------------------------

... The problem here is that if the process forks, and the parent modifies
data pages while the child exists, it must make copies of those pages
(copy-on-write after fork).  If those copies are paged out, then both
the copies and the originals will occupy space until the parent exits,
even if the child exits.

I think I described the chains of shadow objects that were accumulating,
and the fact that those are supposed to get coalesced.  It turns out
that the code to coalesce does not work if an object has been paged out.
This is the scenario that causes problems:

	- a long-lived program forks repeatedly,
	- the parent modifies data space before the child does exec or exit,
	  and
	- the parent's modified pages get paged out before child does exec
	  or exit.

The only situation in which this seems to be a problem is if a login
shell (or any long-running interactive shell) runs scripts by forking
and running them directly.  This will not happen with csh; I don't
know about ksh or bash.  (It does not happen with csh because it uses
vfork, and re-exec's itself if running a csh script).  It also does
not happen if the scripts are "executable" scripts, i.e. those that start
with #!/bin/sh.  It is also a problem only if the script or other system
activity uses enough memory for the shell to be paged out while the
script is running.

The bad news is that this problem is not easy to solve... However, I
think there are some workarounds that can be used for the moment.

---------------------------------------------------------------------------

My experience with 5MB of RAM and 20MB of swap running several screens
(no X, no networking) was that because I never logged out, my shell
accumulated swap space until it ran out.  About every 7 days the system
had to be rebooted (everything had stopped running).

I hope this helps explain some lockup problems some people may be
having.  Has anyone solved this problem?  I don't know the specifics of
why it is occurring, or why it is hard to solve, but if someone has
already solved it, I would love to hear about it.

Attached is a program that illustrates the problem.  With MAKE_CHILD
undefined, swap space is allocated the first time through the loop, and
stays pretty constant.  With MAKE_CHILD defined, free swap space
decreases rapidly each time through the loop until the system runs out
of swap space and locks up.  Note that each child is killed before the
loop is restarted, yet the free swap space continues to decline rapidly.
You will need to define some things at the top before you compile,
including your system's program for monitoring swap space.

---------------------------------------------------------------------------


/* show swap overallocation bug in child processes */
/* Bruce Momjian, root@candle.uucp */

/* tabs = 4 */

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* comment this out to see the baseline case (no child is forked) */
#define MAKE_CHILD

/* MB to allocate; make this higher if you have more than 8 MB of RAM */
#define SYSTEM_RAM	4

/* program to show remaining swap space, vmstat? */
#define SHOWSWAP	"swaptotal"

int k = 1024;

int main(void)
{
	char *y;
	pid_t c_pid;
	int j;
	char *t;

	/* make my address space big */
	if ( (y=malloc(SYSTEM_RAM*k*k)) == NULL)
	{
		perror("Malloc");
		exit(1);
	}

	while (1)
	{
#ifdef MAKE_CHILD
		if ((c_pid = fork()) == -1)
		{
			/* never pass -1 to kill(): it would signal every process */
			perror("Fork");
			exit(1);
		}
		if (c_pid == 0)
		{
			/* child just holds the shared pages until it is killed */
			sleep(1000);
			_exit(0);
		}
#endif

		/* parent touches memory to force COW copy */
		for (j=0,t=y; j < SYSTEM_RAM*k*k; j+=k)
		{
			*t = 'x';
			t += k;
		}

#ifdef MAKE_CHILD
		/* kill the child and reap it so zombies do not pile up */
		kill(c_pid,SIGHUP);
		waitpid(c_pid, NULL, 0);
#endif

		puts("done ");
		system(SHOWSWAP);
	}
	/* NOT REACHED */
}

-- 
Bruce Momjian                          |  830 Blythe Avenue
root%candle.uucp@bts.com               |  Drexel Hill, Pennsylvania 19026 
  +  If your life is a hard drive,     |  (215) 353-9879(w) 
  +  Christ can be your backup.        |  (215) 853-3000(h)