*BSD News Article 59023


Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.mel.connect.com.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.ysu.edu!usenet.ins.cwru.edu!gatech!newsfeed.internetmci.com!uwm.edu!lll-winken.llnl.gov!osi-east2.es.net!oracle.pnl.gov!mica.inel.gov!cwis.isu.edu!news.cc.utah.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.unix.bsd.netbsd.misc,comp.unix.bsd.bsdi.misc,comp.unix.solaris,comp.unix.aix
Subject: Re: ISP hardware/software choices (performance comparison)
Date: 14 Jan 1996 22:05:20 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 368
Distribution: inet
Message-ID: <4dbun0$j2f@park.uvsc.edu>
References: <4cmopu$d35@vixen.cso.uiuc.edu> <4crnbe$8a@olympus.nwnet.net> <4cs2kn$kfg@cynic.portal.ca> <4cu7t0$mg5@engnews2.Eng.Sun.COM> <4cv8j1$59k@park.uvsc.edu> <4cvjpk$rpf@durban.vector.co.za> <4d43bt$es8@park.uvsc.edu> <4d5vhg$38p@mail.fwi.uva.nl>
NNTP-Posting-Host: hecate.artisoft.com
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.netbsd.misc:1825 comp.unix.bsd.bsdi.misc:1959 comp.unix.solaris:56537 comp.unix.aix:68124

casper@fwi.uva.nl (Casper H.S. Dik) wrote:
>
> Terry Lambert <terry@lambert.org> writes:
> 
> >I could do the same going the other direction.  I might even be
> >able to drag certain former high level Sun employees screaming
> >into the debate.  8-).
> 
> I'd be interested in such a list..  When I was asked
> some time ago "If you like Solaris so much, you must know of some
> advantages SunOS has over Solaris".
> I drew blank.

1)	Smaller footprint.
2)	More robust.  Uptimes in months or years.
3)	Non-STREAMS TCP/IP
4)	Available for Motorola-based hardware so a heterogeneous
	environment can provide the user with a near-identical
	interface across platforms (Intel isn't an argument here;
	remember the 386i?).
5)	Compiles most net sources "out of the box" without
	modification or use of a compatibility environment.
6)	Large amount of research materials are directly
	applicable (papers, code, etc.).
7)	Interface for FS writers is usable without source
	license.
8)	NFS is reliable, does not violate protocol specification.
9)	System clock accurate to 4uS; useful for non-statistical
	profiling.
10)	Select timeout resolution is sufficient for finite
	state automaton based parallel service engines without
	requiring buzz loops.

That's a start.


] >The SunOS 4.x LWP mechanism is better than the threading
] >mechanism in SunOS 5.x because it optimizes the benefits
] >threads are intended to have over processes: minimization
] >of overhead associated with IPC and context switching.
]  
] But it doesn't do preemptive thread scheduling, doesn't do
] priorities well and blocks in page faults and "fast" I/O.

These issues are the result of the failure to maintain the
code, not a result of basic flaws in the model such as exist
in the 5.x implementation regarding utilization of process
quantum.


] >Specifically, the n:m mapping of user space to kernel threads
] >in SunOS 5.x means that you *must* have a kernel thread for
] >each potential blocking operation initiated by a user thread.
] 
] Incorrect.  You need to have a kernel thread for each
] blocking operation, not for each *potential* blocking operation.
] If all threads are blocked, the OS gives you a chance to start a new one.

Subtle distinction.  OK.  I'll buy it.  Please amend my statement:

"the n:m mapping of user space to kernel threads in SunOS
 5.x means that you *must* have a kernel thread for each
 blocking operation initiated by a user thread, OR you
 must block, giving up the processor with quantum still
 remaining, but unusable (will happen on a per thread
 basis anyway)".

[ ... The LWP scheduling mechanism ... ]

] How did you handle multiple threads on multiple
] processors?

I didn't.  You must have missed the start of my previous
article, where I already ceded MP to Solaris because of
lack of continuing maintenance of the SunOS 4.x code.

] How did you handle page faults and disk/tape I/O?

You mean how did I establish a dedicated signal handling
thread, or are you talking about implementing Mach-style
non-resident pagers (why would I have to do that?).

Disk and tape I/O other than via read/write so that it
could be converted to the aioread/aiowrite/aiowait/aiocancel
interface + a context switch?

I didn't handle blocking events not caused by read/write
operations.  The general mechanism for doing so would be
to establish an alternate trap vector, yielding an "aiosyscall"
and using the "wait/cancel" mechanism for this in general
instead of the aioread/aiowrite calls in specific.
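
A rough sketch of the "blocking read becomes an async request plus a
context switch" idea described above. It is written in Python, not
against the SunOS 4.x LWP library: generators stand in for user
threads, and select() stands in for aiowait(). The names run, reader,
worker, writer and demo are hypothetical and come from nowhere in the
original code. The only point illustrated is that a thread which
would block hands the processor to another ready thread, and the
process as a whole blocks only when nothing is runnable.

```python
import os
import select
from collections import deque


def reader(fd, log):
    """User thread: ask the scheduler to wait on fd, then read it."""
    # Stands in for "issue the async read and context switch away".
    yield ("wait_read", fd)
    log.append(("read", os.read(fd, 1024)))


def worker(name, steps, log):
    """User thread doing CPU work, yielding between steps."""
    for i in range(steps):
        log.append((name, i))
        yield ("yield", None)


def writer(fd, delay_steps, payload, log):
    """User thread that makes the reader's data available later."""
    for _ in range(delay_steps):
        yield ("yield", None)
    os.write(fd, payload)
    log.append(("wrote", payload))


def wake_ready(waiting, ready, timeout):
    """Move threads whose descriptors are readable to the ready queue."""
    readable, _, _ = select.select(list(waiting), [], [], timeout)
    for fd in readable:
        ready.append(waiting.pop(fd))


def run(threads):
    """Round-robin scheduler for the generator-based user threads."""
    ready = deque(threads)
    waiting = {}  # fd -> thread parked on a pending read
    while ready or waiting:
        if not ready:
            # Every thread is waiting on I/O: only now does the
            # process itself block (timeout None = wait forever).
            wake_ready(waiting, ready, None)
            continue
        thread = ready.popleft()
        try:
            op, arg = next(thread)
        except StopIteration:
            continue  # thread finished
        if op == "wait_read":
            waiting[arg] = thread
        else:
            ready.append(thread)
        if waiting:
            # Non-blocking poll so completed I/O wakes its thread.
            wake_ready(waiting, ready, 0)


def demo():
    r, w = os.pipe()
    log = []
    try:
        run([reader(r, log), worker("cpu", 3, log),
             writer(w, 2, b"hello", log)])
    finally:
        os.close(r)
        os.close(w)
    return log
```

In demo(), the reader parks immediately, the "cpu" worker keeps using
the processor, and the reader is resumed only after the writer has
made data available.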

] >The SunOS 4.x NFS is slower than the 5.x because the 4.x
] >implementation did not violate the reliability guarantees with
] >regard to not doing client or server caching without a commit
] >roll-forward facility (such as you get from PrestoServe or some
] >other NVRAM facility, and such a facility would be equally
] >applicable to 4.x as well as 5.x).
] 
] What do you mean by this?  In Solaris 2.x as well
] as SunOS 4.x all server update operations are synchronous.
] Clients employ write behind in both OSes.

Server caching, which is not on by default for 5.x.

] >This leaves networking, which I think we can both agree is hurt
] >by the streams implementation replacing the monolithic TCP/IP
] >implementation of traditional BSD.  The 3Mbit/S limit on the
] >network performance under SunOS 4.x was a limitation imposed
] >by the driver's misuse of the AMD Lance chipset buffers, not
] >an inherent "SunOS 4.x problem".
] 
] 3Mbit/s in SunOS 4?  SunOS 4 can drive Ethernet at full speed
] so that isn't true.  (Ethernet is 10Mbit/second.)
] Not to mention other types of networking (loopback ATM, FDDI),
] all of which are faster for bulk transfers than SunOS 4.x.
] Not to say that this couldn't have been improved in 4.x.

It can't receive at that speed because the Lance driver sucked.
This is discussed in the 5.x release notes for the 4.x->5.x
"upgrade".

] >As to the VM mechanism, the SLAB allocator has some significant
] >drawbacks with regard to MP.  As you point out, the implementation
] >is largely shared between 4.x and 5.x, so a claim of superiority
] >on that basis is broken.  Both would, in the MP case, be better
] >off with a per processor preallocation page pool (like Sequent
] >uses).  See:
] 
] The Solaris 2.5 memory allocator uses a per-processor cache.

Well, then that's new.  Are there any papers describing this?

Since we are not considering the MP case for 5.x over 4.x
(I gave that one to you), this is somewhat irrelevant.  My
comment regarding the MP case was just a general annoyance
with 5.x after reading the slab allocation papers.  I shouldn't
have introduced the tangent, sorry.


] >There are also significant problems with traversal of mmap'ed
] >files thrashing the buffer cache (the infamous SVR4 "linker
] >causes X to go all funny" problem).
] 
] >The correct fix to the problem isn't to use alternate scheduling
] >classes (cf. UnixWare).  It's to prevent the thrashing.  You
] >could do that rather trivially by implementing per vnode working
] >set quotas that could be overridden with an madvise() call, or
] >you could have a slightly better fix with a lot more work by
] >imposing per process working set restrictions.
] 
] 
] Solaris 2.x isn't cast in stone.  Many traditional Unix algorithms
] don't scale well or at all.  As systems grow bigger more of these problems
] come to light.  And as time passes more research is done and better
] algorithms are implemented.  There's still a lot to be learned
] from mainframes when it comes to large systems and to I/O.
] But the Unix crowd is learning. It's now possible in many Unixes
] not to cache the file just read.  (when you read lots of
] files sequentially you don't want to cache them at all)

Well, except for read-ahead, which adds a copy but subtracts an
I/O latency.
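
For reference, the existing madvise() hint on a sequential traversal
of a mapped file looks roughly like the Python sketch below. It does
not implement the per vnode working set quota proposed in the quoted
text (that would be a kernel change); it only shows where an
application-side hint would go. Assumptions: Python 3.8 or later for
mmap.madvise(), a platform that defines MADV_SEQUENTIAL, and the
hypothetical names sum_mapped_file and demo.

```python
import mmap
import os
import tempfile


def sum_mapped_file(path, chunk=1 << 16):
    """Sum the bytes of a file by walking an mmap of it front to back."""
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The hint is optional and platform dependent, so it is
            # applied only when both the method and constant exist.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            for off in range(0, size, chunk):
                total += sum(m[off:off + chunk])
    return total


def demo():
    """Write a small scratch file, traverse it, and clean up."""
    payload = bytes(range(256)) * 10
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        path = f.name
    try:
        return sum_mapped_file(path)
    finally:
        os.unlink(path)
```

Whether the kernel actually limits the cache footprint of the
traversal in response to the hint is up to the implementation; the
call is advisory only.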

But the point here should not be "why 5.x can be corrected" but
"why 4.x can't be corrected at the same time 5.x is corrected".

Your argument is that people prefer 5.x.  Well, I'm people, and
I prefer 4.x.

The "we can change it" argument is what resulted in ~50% of SVR4
being BSD code.  Chopping the top on an Edsel doesn't make it
a Mustang convertible, even if the Edsel has features the
Mustang does not.  8-).

] >Bzzt.  I "hang out" with kernel hackers who don't necessarily
] >modify their world views to conform to policy statements.  You
] >can pass as many laws as you want, but PI will never be 3.  8-).
] 
] The bulk of the problems you mention here exists in SunOS 4 and
] Solaris 2.x.  But SunOS 4 will see no further changes, Solaris 2
] will.  

Note: the bulk of the problems were 5.x specific.  Not that 4.x
does not have problems, but since I am arguing for it instead of
5.x, I'd be silly to torpedo myself that way.

I'd like to point out that SunOS 4.x seeing no further changes
is a policy decision, not a technical one.


] >In order to get a high resolution select(), where the timer
] >resolution matches that implied by its parameters in the man
] >page, you have to go to a select(2) rather than a select(3)
] >implementation -- meaning you have to use the 4.x compatibility
] >libraries, or suffer with a 10ms timeout resolution (the same
] >as poll(2), which select(3) is implemented on).
] 
] 
] There is only one select in Solaris 2.x: the libc select.
] That's the one that is called wherever you call select
] from.  It is bolted on top of poll().  Neither SunOS 4 select
] nor Solaris 2 select honors timeouts smaller than 1/HZ.

There were no guarantees because of the process quantum
utilization potentially causing a program to utilize the
full quantum (10mS) in SunOS 4.x.  But there was the
probability, since the timers were serviced at their call
resolution to the best of the system's ability, that on
a system not loaded with full quanta-consuming processes,
you would get a better response time.

Average on a lightly loaded 4.x system is better than 200uS.
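
One way to check a claim like this on a given system is to time
select() used as a pure timer. A minimal Python sketch, assuming a
Unix-like system where select() with no descriptors is allowed;
select_overshoot is a hypothetical helper. The figures it reports
depend entirely on the kernel, the hardware and the load, so none
are quoted here.

```python
import select
import time


def select_overshoot(timeout_s, samples=20):
    """Return, per sample, how far select() overshot the requested timeout.

    Values are in seconds. A value near the clock tick (1/HZ) suggests
    the timeout is being rounded to tick granularity.
    """
    overshoots = []
    for _ in range(samples):
        start = time.monotonic()
        # No descriptors at all: select() just sleeps for the timeout.
        select.select([], [], [], timeout_s)
        elapsed = time.monotonic() - start
        overshoots.append(elapsed - timeout_s)
    return overshoots
```

Calling select_overshoot(0.0002) asks for a 200 microsecond timeout;
comparing the results against the request shows what resolution the
system actually delivers.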


I'd have to argue that the 4.x binary compatibility with
static binaries required support for a select(2), at least
in the ABI environment, and that said facility was implemented
in the 2.3 release.

] However, the older select() had a bug: the sleep time was rounded
] down, not up.

I'm well aware of that in SVR4 select(3)->poll(2) mapping.

] >Can we say "double click" and "mpeg_play"?  8-).
] 
] Not sure what you mean here.

Operations that frequently require a better than 10ms (100 events
per second) resolution (double click is a cheap shot at driver
debouncing).

The only method of assuring better than 100 events per second
on a system with a bad system timer architecture (like SVR4)
is to use idle loops for timing.
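
For readers unfamiliar with the technique, an idle (buzz) loop used
for timing can be sketched as follows. This is Python for
illustration only; buzz_wait is a hypothetical name, and the loop
burns the processor for the whole interval, which is exactly the
cost being complained about.

```python
import time


def buzz_wait(interval_s):
    """Spin on the clock until interval_s has passed; return the time spent.

    No sleep or select() call is made, so the wait is not rounded to
    the scheduler tick, at the price of consuming CPU throughout.
    """
    start = time.perf_counter()
    deadline = start + interval_s
    while time.perf_counter() < deadline:
        pass  # busy-wait: nothing else runs in this thread
    return time.perf_counter() - start
```

A call such as buzz_wait(0.002) waits about 2ms regardless of a 10ms
tick, provided the process is not preempted in the meantime.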

] >I'll give you credit, since the select(2) *was* added in Solaris
] >2.3 compatibility ABI for 4.x to allow binary compatibility
] >with statically linked 4.x applications (a feature missing before).
] 
] Not so, see above.

How do statically linked binaries that call select that are
from 4.x run on 5.x?

I think you'll find my name on the bug report.

] The one thing people tend to forget when they talk about putting the
] Solaris 2.x stuff they like into the SunOS 4.x kernel is the simple
] thing that they may end up with similar trouble as the early Solaris 2
] releases.  (For MP machines you need to do kernel multitasking
] or you won't scale well as we saw in 4.x, add fine-grained locking
] and there's the slowdown and bugs you wanted to avoid).
] The Solaris 2.x developers have the hang of it now and Solaris 2.5
] is much improved.

We aren't caring about MP issues, because we know that unless
the problem structure is such that it is parallelizable, you
can't add a bunch of PC's together and get a Cray.

As to non-MP related problems, it depends on who you hire to
write the code, doesn't it?  I mean, you *could* hire some of
the Solaris 2.x developers who already "have the hang of it"
to do the work.


] >I don't see a single *big* feature, other than marketability on
] >government contracts because of the additional checkbox items
] >that don't affect real product functionality.  That makes it
] >change for change sake, in my book.
] 
] But the change is there.  There's no going back.

Policy statement.  No technical merit expressed.

] >That may be true of some people.  On the other hand, there are
] >others (like me) who feel that the academic origin of the BSD
] >code as opposed to the SVR4 code means that it had a 2-4 year
] >development horizon, whereas the SVR4 code was limited to 6
] >month to 1 year cycles because commercial performance review
] >scheduling caused people's intellectual integrity to be compromised
] >when it came to making a choice between money and doing the right
] >thing the right way.
] 
] The academic origin of the BSD code didn't mean making a choice
] between doing the right and the wrong thing.

It meant the difference between research and expediency, said
expediency being brought about by financial pressures.  In such
an environment, expediency wins unless an engineer both risks
his job *and* succeeds.

] BSD introduced a lot of APIs that *all* suck bigtime.
] (non-extensible, ill-thought-out).
] 
] They did well inside the kernel, but IMHO they flunked
] "API design 101".

I agree.  And you'll note that I'm doing my best in the
BSD4.4-Lite derived code bases to fix this.

Elsewhere in this thread, there have been comments about the
5.x API being nice.  I won't disagree (though I think it is
larger than it needs to be).  It is nice.

But API's are fluff.  The interesting stuff is the engine,
and all the arguments in my "top ten list" at the top of
this article still apply despite the condition of the fluff.

> 
> >The end result is that we think BSD "hangs together" better than
> >SVR4.  Some of us (me included) have tried to fix SVR4 at various
> >times in our careers.
> 
> What do you think is so bad in SV?  The APIs?  The kernel?
> The programs/administrative code?  And which parts of it?

See the list at the top.


] >What is Sun now?  The big fish in the shrinking proprietary
] >hardware pond.  A politically incorrect statement, I admit.
]  
] Proprietary hardware?  Sun will gladly sell you the board designs and
] specs of their latest hardware.  You can license their hardware and their
] software.

I meant "proprietary" as "non-commodity".  Sorry if this was
confusing.

] What amuses me most about this SunOS vs Solaris
] SV vs BSD debate is that people assume SunOS 4 == BSD.

Not me.  I *know* that there is a lot of innovation in 4.x
that came from Sun, only some of which was made public or
duplicated for BSD.

] It was much better than any BSD OS of its era.
] I mean, you could repartition your disks without having
] to recompile your kernel.

Not true of the 4.4-derived code.

] Much of what people like about SunOS 4 was Sun value-add.
] Lots of that still is in Solaris 2.x.

The philosophy has changed.


] The other thing people dislike about Solaris 2 is change.

Only when there is no overriding technical justification.

] I found that in many things the change was for the better
] (init.d, jumpstart) Yet it's that change (what, no rc.local!)
] that upsets people most.

Doesn't bother me.  I've been arguing for run states (as opposed
to the slightly different concept, run levels) for BSD since
early 1994.


                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.