*BSD News Article 59146


Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.mel.connect.com.au!munnari.OZ.AU!spool.mu.edu!howland.reston.ans.net!swrinde!newsfeed.internetmci.com!news.msfc.nasa.gov!sol.ctr.columbia.edu!hamblin.math.byu.edu!park.uvsc.edu!usenet
From: Terry Lambert <terry@lambert.org>
Newsgroups: comp.unix.bsd.netbsd.misc,comp.unix.bsd.bsdi.misc,comp.unix.solaris,comp.unix.aix
Subject: Re: ISP hardware/software choices (performance comparison)
Date: 16 Jan 1996 21:07:43 GMT
Organization: Utah Valley State College, Orem, Utah
Lines: 286
Distribution: inet
Message-ID: <4dh42v$rnv@park.uvsc.edu>
References: <4cmopu$d35@vixen.cso.uiuc.edu> <4crnbe$8a@olympus.nwnet.net> <4cs2kn$kfg@cynic.portal.ca> <4cu7t0$mg5@engnews2.Eng.Sun.COM> <4cv8j1$59k@park.uvsc.edu> <4cvjpk$rpf@durban.vector.co.za> <4d43bt$es8@park.uvsc.edu> <4d5vhg$38p@mail.fwi.uva.nl> <4dbun0$j2f@park.uvsc.edu> <4dg90i$6le@mail.fwi.uva.nl>
NNTP-Posting-Host: hecate.artisoft.com
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.netbsd.misc:1855 comp.unix.bsd.bsdi.misc:1986 comp.unix.solaris:56697 comp.unix.aix:68243

casper@fwi.uva.nl (Casper H.S. Dik) wrote:
] >3)	Non-STREAMS TCP/IP
] 
] Explain why this is an advantage.  Usually this doesn't make
] much of a difference.  Solaris 2.x TCP/IP has some more advanced
] features.  The sockmod/libsocket dichotomy doesn't do the
] maintainability of the socket code much favours though.

Solaris TCP/IP does have some more advanced features, like
multicast and RFC 1323 and RFC 1644 (T/TCP support).  Many
routers and terminal servers puke over T/TCP, as does Linux,
because they don't expect a SYN or ACK packet to carry data.
Those are non-conforming TCP implementations, but the fact
remains that T/TCP does you little good today.  For RFC 1323,
the PPP Predictor-1 compression is inhibited because of the
timer.  Take out the timer, and Predictor-1 revives (see
the recent discussion on the current@FreeBSD.ORG list).

On the whole, these are features that were introduced after
enhancements were no longer being added to the 4.x code: they
are strawmen, even if they were not of questionable value.

A non-Streams TCP/IP is advantageous because of when Streams is
run.  Specifically, the advantage is for UDP and other IP datagram
services.

In a client/server request/response architecture, the problem is
stack latency, not overall throughput.  With a sliding window
protocol, like TCP, you get one round trip latency averaged
over the entire packet run.  In a request/response implementation
(NetWare, SQL, SMB, etc.), the latency is per packet.
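
To make that arithmetic concrete, here is a minimal sketch under a
deliberately simplified model: a windowed transfer pays the round trip
latency once for the whole run, while a request/response exchange pays
it for every packet.  The function names and the model itself are
assumptions for illustration, not measurements of any real stack.

```c
/* Simplified latency model.  All names and formulas here are
 * assumptions made for illustration; times are in microseconds. */

/* Sliding window: one round trip is paid for the whole run of
 * packets, plus the per-packet transmit time. */
double windowed_time_us(int packets, double rtt_us, double tx_us)
{
    return rtt_us + packets * tx_us;
}

/* Request/response: every packet waits out a full round trip
 * before the next one can be sent. */
double request_response_time_us(int packets, double rtt_us, double tx_us)
{
    return packets * (rtt_us + tx_us);
}
```

With 10 packets, a 1000 uS round trip and 100 uS per packet, the
windowed case comes to 2000 uS and the request/response case to
11000 uS.  Any latency added inside the stack is multiplied by the
packet count in the second case.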

The problem with the latency is when Streams actually runs:
on the way out of a system call, and at several designated
preemption points within the kernel itself.

The NWU product beat the Native NetWare server on the same hardware
by a small amount in many cases, and really beat it in "PacketBurst"
mode (fixed window, shared ACK), mostly because the latency was
amortized over several transmissions.  This was even after the MP
version's kernel synchronization added 10% overhead to the measured
rate and the use of ODI drivers (because of the three layer Streams
glue needed to make them work) added 15%.


] >4)	Available for Motorola based hardware so a heterogeneous
] >	environment can provide the user with a near-identical
] >	interface across platforms (Intel isn't an argument here;
] >	remember the 386i?).
] 
] Oh, so you can call running it on Sun proprietary 68K hardware
] an advantage, but running Solaris 2.x on x86 and PowerPC and SPARC
] doesn't count as running on multiple platforms (Ah, I get it:
] you now have a choice of multiple hardware vendors to run Solaris 2
] on, so you're worse off)

4.x ran on SPARC and 68k and x86.
5.x runs on SPARC and PPC and x86, not 68k, and I have yet to
see the released version for the Motorola Ultra 603/604 board
because of the continued lack of OpenBoot support.

So effectively, you have lost one platform and gained none.

] >5)	Compiles most net sources "out of the box" without
] >	modification or use of a compatibility environment.
] 
] Is true for Solaris 2.x for most non-ancient net software.
] It's even more true in Solaris 2.5.

Because of the compatibility environment.

] >6)	Large amount of research materials are directly
] >	applicable (papers, code, etc.).
] 
] If you're into OS research, use a research OS.

I do.  It would be nice if the research were convertible to
practice, don't you think?

] >7)	Interfaces for FS writers are usable without source
] >	license.
] 
] I didn't know that it was that easy under SunOS 4/difficult
] under SunOS 5.

You didn't work for Novell trying to drag the documentation
for the synchronization model and APIs out of Sun for the
NWFS attributable FS.  I did.  Sun convinced us to go with
5.x for the Sun reference port, then didn't give us the
necessary documentation to duplicate what we already had
running on 4.x.  Despite numerous requests.

] >8)	NFS is reliable, does not violate protocol specification.
] 
] Neither does Solaris 2.x NFS.  Where did you get this idea?

Someone said it was faster, with no other justification.  Plus
I had to turn server-side write caching off on a 2.3 box -- but I
admit the possibility that someone else had turned it on.  If
that's the case, it's change for change's sake, and that's not good.

] >9)	System clock accurate to 4uS; useful for non-statistical
] >	profiling.
] 
] Solaris 2.x system clock accurate to 1us.
] Perhaps you can elaborate on this point.

What is the guaranteed time to run (as opposed to scheduled to run)
after the expiration of a 20uS interval timer?  A 20uS select
timeout?  I won't ask about poll; I know it is limited to 10mS
resolution, and thus useless for anything but statistical profiling
on fast hardware.

Before you suggest HRT's and a RT scheduling class, consider
that the customer will be running the end result in a timeshare
scheduling class.
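
One way to answer the question empirically is to time a select
timeout against gettimeofday().  What follows is only a minimal
sketch, assuming a POSIX system; the function name is made up for
the example, and the number it returns depends entirely on the
kernel, the load and the scheduling class it is run in.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep in select() for timeout_us and return the elapsed
 * wall-clock time in microseconds, as seen by gettimeofday().
 * The name measure_select_us is hypothetical. */
long measure_select_us(long timeout_us)
{
    struct timeval start, end, tv;

    tv.tv_sec = timeout_us / 1000000L;
    tv.tv_usec = timeout_us % 1000000L;

    gettimeofday(&start, NULL);
    select(0, NULL, NULL, NULL, &tv);   /* no descriptors: pure timeout */
    gettimeofday(&end, NULL);

    return (long)(end.tv_sec - start.tv_sec) * 1000000L
         + (long)(end.tv_usec - start.tv_usec);
}
```

Calling it with 20 and comparing the result to 20 gives the
overshoot for a 20uS request on the machine at hand.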

BTW: you know and I know that the clock uses a probabilistic
offset; it is not *really* "accurate to 1uS".  To do that, it
would have to read the system time, an expensive operation.
Instead, it keeps a count of how long since the last update and
adds a "fudge factor" to get the additional resolution.  That
is, the accuracy is to 10mS, the system clock update frequency.
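
The scheme described above can be sketched roughly as follows.  This
is not Sun's code; the structure, its fields and the free-running
counter are all hypothetical, and the only thing taken from the text
is the idea of a base time updated every 10mS plus an interpolated
offset.

```c
#include <stdint.h>

#define SOFT_TICK_US 10000u   /* 10 mS clock update period, as above */

/* Hypothetical software clock state; every field is an assumption. */
struct soft_clock {
    uint64_t base_us;          /* time recorded at the last clock tick */
    uint32_t counter_at_tick;  /* free-running counter sampled at that tick */
    uint32_t counts_per_us;    /* counter frequency in counts per uS */
};

/* Return the tick time plus an interpolated offset (the "fudge
 * factor").  The offset is clamped so the result never runs past
 * the next tick. */
uint64_t soft_clock_read(const struct soft_clock *c, uint32_t counter_now)
{
    uint32_t delta = counter_now - c->counter_at_tick; /* wraps safely */
    uint64_t offset_us = delta / c->counts_per_us;

    if (offset_us >= SOFT_TICK_US)
        offset_us = SOFT_TICK_US - 1;
    return c->base_us + offset_us;
}
```

The value returned has microsecond granularity, but the base it is
built on is only corrected once per tick.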


] >10)	Select timeout resolution is sufficient for finite
] >	state automaton based parallel service engines without
] >	requiring buzz loops.
] 
] The select timeout resolution in SunOS 4 was 1/HZ.
] There's one difference though: SunOS 5 select rounds down to
] no sleeping whereas SunOS 4 select with a non-zero timeout will
] always sleep at least 1/HZ (Sun bug 1159865)
] (SunOS 4 rounds the sleep time up to the nearest 1/HZ,
] Solaris 2.x rounds down to the nearest ms, calls poll which
] rounds up to the nearest 1/HZ)

That is the guarantee, and it assumes that every process scheduled
between the timer expiring and the requesting process running uses
its full quantum.

Generally, this will not be the case... that is, you can expect,
but not rely upon, better resolution in the 4.x case.

For profiling, etc., this means that on an unloaded machine,
you will get better numbers.
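
The two rounding rules quoted above can be modelled as below.  This
is a sketch of the quoted description only, not of either kernel's
source; it assumes HZ is 100 (a 10mS tick), and the function names
are invented for the example.  It gives the guaranteed sleep, not
what an unloaded machine will actually deliver.

```c
#define CLOCK_HZ 100                     /* assumed: 10 mS clock tick */
#define TICK_US (1000000L / CLOCK_HZ)

/* Round a microsecond count up to a whole number of clock ticks. */
static long round_up_to_tick(long us)
{
    return ((us + TICK_US - 1) / TICK_US) * TICK_US;
}

/* SunOS 4 as quoted: a non-zero timeout is rounded up to the
 * nearest 1/HZ, so it always sleeps at least one tick. */
long sunos4_select_sleep_us(long timeout_us)
{
    if (timeout_us <= 0)
        return 0;
    return round_up_to_tick(timeout_us);
}

/* Solaris 2.x as quoted: round down to whole milliseconds, then
 * poll rounds up to the nearest 1/HZ.  Under 1 mS, no sleep. */
long solaris2_select_sleep_us(long timeout_us)
{
    long ms = timeout_us / 1000;

    if (ms <= 0)
        return 0;
    return round_up_to_tick(ms * 1000);
}
```

Under this model a 1uS timeout comes out as 10000uS on SunOS 4 and
as no sleep at all on Solaris 2.x.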

] >] How did you handle page faults and disk/tape I/O?
] 
] >You mean how did I establish a dedicated signal handling
] >thread, or are you talking about implementing Mach-style
] >non-resident pagers (why would I have to do that?).
] 
] No, I wondered how you prevented *all* your LWPs from blocking
] when one of them incurred a pagefault.

I didn't.  I relied on locality of reference.  If I'd needed to
worry about this type of thing, I could have gone to wuarchive
and pulled the loadable system call to cause the pages to be
locked in core, and thus prevented paging.

You are, of course, referring to the I/O latency in servicing a
not-present page.  Multiple kernel threads would mean I could
interleave those requests instead of servicing them serially.
Is this your point?

] >] What do you mean by this?  In Solaris 2.x as well
] >] as SunOS 4.x all server update operations are synchronous.
] >] Clients employ write behind in both OSes.
] 
] >Server caching, which is not on by default for 5.x.
] 
] You mean "async writes" as SGI does?  I don't think Solaris 2.x
] has an option for that, I haven't found it.  It's just synchronous
] writes to disk or "buy prestoserve".

Prestoserve relies on the ability to tell it to do async.  As
I've said before, I've had to turn this off on a non-NVRAM
machine, though it could have been someone else (besides Sun)
that turned it on.

] >It can't receive at that speed because the Lance driver sucked.
] >This is discussed in the 5.x release notes for the 4.x->5.x
] >"upgrade".
] 
] Hm, I distinctly remember running near wire speed TTCP between SunOS 4.x
] le boxes.  You're not confused with the ie (Intel) ethernet chip
] as found in the older VME based servers?  That one didn't do
] too well.

No.  I'm talking about the double-buffering of received packets
on the Lance.  Refer to the discussion of latency above.  The
problem is that the pool retention time required by the 4.x
implementation was higher than allowed by the available Lance
buffers.

On the other hand, if I conceded this point (I don't), then the claims
of 5.x being faster than 4.x are specious.

] >Well, except for read-ahead, which adds a copy but subtracts an
] >I/O latency.
] 
] You could "page flip" instead.

Better to read ahead and save the I/O latency.  Most modern
machines are I/O bound, not CPU bound (which is why clock
doubled Intel chips are so funny).

] >But the point here should not be "why 5.x can be corrected" but
] >"why 4.x can't be corrected at the same time 5.x is corrected".
] 
] What do you think SunOS 4.x would have looked like after being
] corrected to include all the features people miss in SunOS 4?

A lot like BSD-4.4.

] I'm pretty sure people would have complained it was buggy and
] slow. (Look at SunOS 3 vs SunOS 4 debates back then)

I doubt it.  Look at the FreeBSD 1.x vs. 2.x benchmarks.

] And that adding SMP to 4.x wasn't all that easy.  They basically
] gave up when doing 4.1.2 and punted to basically one single lock.

Hey, that's broken.  I won't argue that it isn't.  But it's not
an issue of total architecture incompatibility, it's an issue
of the implementation choices the engineers made.

] How would 4.x with Sun SMP have worked?  It's the high end machines
] and the concurrency that gave the bulk of the trouble in the early
] Solaris releases.  I don't believe that those problems could have
] been avoided just by building on a SunOS 4 code base.

Nor were they avoided in SVR4.  Eventually you have to stop
pushing off the work onto different subsystems and code the
thing, no matter what base you come from.  I know that much
of the ES/MP work (the Unisys/USL SVR4 MP effort) was parallel
to the Sun MP development.

] >Average on a lightly loaded 4.x system is better than 200uS.
] 
] On an unloaded system I still get 10000 us responses from a 1 us
] select call, if no selectable events occur.

I don't know what you are running, but it's not 4.1.3_U1.  I
wrote several animation packages that used X and a select
timeout for load independent cell-based frame rates.  I could
definitely perceive a difference between a NULL timeval * poll,
a 100uS, a 200uS, a 400uS and a 1mS timeout.


*Clearly* this depended on a load average < 1 to operate
correctly.


In UnixWare and Solaris (SVR4 derivatives) I had to go to buzz
loops because 10mS was too damn slow.
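
For anyone who hasn't had to write one, a buzz loop is nothing more
than spinning on the clock until the deadline passes.  A minimal
sketch, assuming gettimeofday() is available and fine-grained enough;
the function names are made up for the example.

```c
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in microseconds. */
static long long now_us(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
}

/* Burn CPU until at least wait_us microseconds have passed, then
 * return the time actually spent.  Nothing else runs on this CPU
 * on our behalf while the loop spins. */
long buzz_wait_us(long wait_us)
{
    long long start = now_us();
    long long elapsed;

    do {
        elapsed = now_us() - start;
    } while (elapsed < wait_us);

    return (long)elapsed;
}
```

It gets the sub-tick delay, at the price of eating the whole quantum
to do it, which is exactly why nobody wants to be forced into it.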

If you are correct, then you are arguing that process scheduling
overhead on Solaris is 50 times more perceptible (the difference
in resolution vs. a NULL poll) than it was on 4.x.  And that's
very bad (not that I think you are actually right).


] gettimeofday() works fine if you want to have microsecond accuracy.
] Select doesn't have us accuracy in SunOS 4.


] >I think you'll find my name on the bug report.
] 
] Id?

I have no idea.  It was more than 2 years ago.

] It's been a long way from the top of this article, but
] I think I only gave you 1 or 2 out of 10.

I think that ratio is inverted.


					Regards,
                                        Terry Lambert
                                        terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.