*BSD News Article 58422

Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.rmit.EDU.AU!news.unimelb.EDU.AU!munnari.OZ.AU!spool.mu.edu!howland.reston.ans.net!newsfeed.internetmci.com!newsxfer2.itd.umich.edu!newsxfer.itd.umich.edu!nntp.cs.ubc.ca!cs.ubc.ca!keats.ugrad.cs.ubc.ca!not-for-mail
From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
Newsgroups: comp.unix.bsd.bsdi.misc,comp.unix.advocacy
Subject: Re: multiple httpds vs threads vs ... (was BSDI Vs. NT...)
Date: 28 Dec 1995 17:00:57 -0800
Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
Lines: 46
Message-ID: <4bvek9INNc3i@keats.ugrad.cs.ubc.ca>
References: <taxfree.3.00C439A1@primenet.com> <4bri4a$q2r@noao.edu> <4brlt5$8un@sungy.germany.sun.com> <4bsaa6$8l8@elf.bsdi.com>
NNTP-Posting-Host: keats.ugrad.cs.ubc.ca
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.bsdi.misc:1875 comp.unix.advocacy:12733

In article <4bsaa6$8l8@elf.bsdi.com>, Chris Torek <torek@bsdi.com> wrote:
>The actual cost difference is, as I said, machine-dependent, and
>usually moderate.  For a while the major part of the cost was cache
>and TLB flushing (or more precisely, recovery therefrom), and was
>scaling (worsening) with processor speed, so many modern CPUs have
>multiple `contexts' and label cache entries (lines and TLB slots)
>with a `context' or `process' or `address-space' ID.  This makes
>a thread-and-address-space switch be about the same work as a thread
>switch alone, except that the cache size is, in effect, divided by
>the number of active address spaces.  (These machines also take an
>extra hit when you exceed the number of IDs, whether it be 8, 16,
>64, 1024, 4096, or what-have-you.)

Yes. 

Let's not forget that the major architecture designers all provide operating
systems. The RISC machines out there are implicitly designed with C and UNIX in
mind. I'm talking HP-PA, MIPS, Alpha, SPARC, POWER...


By the way, those context switches aren't all that expensive, and don't happen
all that often. It is a complete myth that using fork() instead of
"lightweight" threading is a pig. There isn't one shred of evidence to support
that. Forking has a slightly higher start-up cost, because paging information
has to be cloned, and pages that are later written to have to be copied
(copy-on-write). Thereafter, the extra context-switching overhead compared to
LWPs is negligible.

If you want shared-memory IPC, you can use shared memory maps that are
optionally backed by file storage: BSD or SYSV style---take your pick. And the
nice thing is that you still have your private data space; sharing is not an
all-or-nothing proposition.

>These differences do indeed tend to get overwhelmed by other factors,
>at least on `balanced' architectures.  The Pentium is not particularly
>well balanced, however---it is quite fast and needs good cache and
>TLB performance, but lacks context identifiers.  Relatively speaking,
>switches are more expensive on a Pentium than, well, pretty much
>anything else modern.

It's not all that bad on a Pentium either. I jacked up the clock tick rate of
Linux to 1000 Hz on a P90, and lowered the time slice quantum to a few
milliseconds. It doesn't seem to affect CPU-intensive tasks very much at all.
Just slightly. Boy, did my polled-IO terminal kick ass, though...
-- 
I got my BSD degree from the U of NIX.
Better dead than Redmond.