*BSD News Article 72881


Return to BSD News archive

Newsgroups: comp.os.linux.networking,comp.unix.bsd.netbsd.misc,comp.unix.bsd.freebsd.misc
Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.eng.convex.com!newshost.convex.com!newsgate.duke.edu!news.mathworks.com!news-res.gsl.net!news.gsl.net!uwm.edu!news.nap.net!news1!not-for-mail
From: "John S. Dyson" <toor@dyson.iquest.net>
Subject: Re: TCP latency
X-Nntp-Posting-Host: dyson.iquest.net
Content-Type: text/plain; charset=us-ascii
Message-ID: <31DC8EBA.41C67EA6@dyson.iquest.net>
Sender: root@dyson.iquest.net
Bcc: toor@dyson.iquest.net
Content-Transfer-Encoding: 7bit
Fcc: /usr6/root/nsmail/Sent
Organization: John S. Dyson's home machine
References: <4paedl$4bm@engnews2.Eng.Sun.COM> <4qaui4$o5k@fido.asd.sgi.com> <4qc60n$d8m@verdi.nethelp.no> <31D2F0C6.167EB0E7@inuxs.att.com> <4rfkje$am5@linux.cs.Helsinki.FI>
X-Mozilla-News-Host: snews2.zippo.com
Mime-Version: 1.0
Date: Fri, 5 Jul 1996 03:56:17 GMT
X-Mailer: Mozilla 3.0b5Gold (X11; I; FreeBSD 2.2-CURRENT i386)
Lines: 130
Xref: euryale.cc.adfa.oz.au comp.os.linux.networking:44078 comp.unix.bsd.netbsd.misc:3930 comp.unix.bsd.freebsd.misc:22868

Linus Torvalds wrote:
> 
> In article <31D2F0C6.167EB0E7@inuxs.att.com>,
> John S. Dyson <dyson@inuxs.att.com> wrote:
> >Steinar Haug wrote:
> >
> >> Pentium local           250 usec
> >> AMD Linux local         330 usec
> >> AMD FreeBSD local       350 usec
> >> AMD Linux -> Pentium    420 usec
> >> AMD FreeBSD -> Pentium  520 usec
> >>
> >> So the difference is quite noticeable. Wish I had another P133 here to
> >> test with, but unfortunately I don't.
> >>
> >All this TCP latency discussion is interesting, but how does this
> >significantly impact performance when streaming data through the
> >connection?  Isn't TCP a streaming protocol?
> 
> No. TCP is a _stream_ protocol, but that doesn't mean that it is
> necessarily a _streamING_ protocol.
> 
Okay, you CAN kind of misuse it by using TCP for a single transaction,
like simple HTTP transactions.  That is the reason for the implementation
of the so-far-little-used protocol extension TTCP (transaction TCP);
FreeBSD has it, for example.  Also, there are advanced features in www
browsers/servers like Netscape where the connection is kept up for more
than one transaction.  (Why re-establish a connection when you could have
kept the previous one up?)
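
For what it's worth, here is a rough sketch of what a TTCP-style client
transaction can look like on FreeBSD (my own illustration, not code from
any real client; the port and the dotted-quad address argument are made
up).  Instead of connect()/write()/shutdown(), the request goes out with
a single sendto() carrying MSG_EOF, so the SYN, the data and the FIN can
ride together and the handshake round trip is avoided:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>
    #include <unistd.h>

    int
    ttcp_request(const char *dotquad, const char *req, size_t len)
    {
        struct sockaddr_in sin;
        int s;

        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0)
            return (-1);

        memset(&sin, 0, sizeof sin);
        sin.sin_family = AF_INET;
        sin.sin_port = htons(80);               /* made-up port */
        sin.sin_addr.s_addr = inet_addr(dotquad);

        /*
         * No connect(): on a T/TCP kernel, sendto() with MSG_EOF on
         * an unconnected TCP socket sends SYN + data + FIN in one
         * shot, which is the whole point for short transactions.
         */
        if (sendto(s, req, len, MSG_EOF,
            (struct sockaddr *)&sin, sizeof sin) < 0) {
            close(s);
            return (-1);
        }
        /* ... read() the reply here, then close(s) ... */
        return (s);
    }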
 
> But many applications don't really care about bandwith past a certain
> point (they might need only a few kB/s), but latency can be supremely
> important. The TCP connection might be used for some kind of interactive
> protocol, where you send lots of small request/reply packets back and
> forth (*).
>
With many/most web pages being 1-2K, the transfer rate starts to
overcome the latency, doesn't it?  For very small transactions, maybe
100 bytes, the latency is very important.  How many web pages are that
small???

Now I can understand that there might be specific applications where
only a few hundred bytes are transferred, but those appear to be in the
minority.  (Especially ones where a latency that is 100usecs worse
actually hurts in a SINGLE THREADED environment.)  Note -- in most
single threaded environments, 100usecs is in the noise.
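
To put rough numbers on that (my own back-of-the-envelope figures,
assuming plain 10Mbit ethernet and the ~350usec localhost latency
quoted above -- nothing measured here), a throwaway program like this
shows where the crossover is:

    #include <stdio.h>

    int
    main(void)
    {
        double bytes_per_sec = 10e6 / 8;        /* 10Mbit ethernet   */
        double latency_us = 350.0;              /* assumed latency   */
        int sizes[] = { 100, 1500, 10000 };     /* transaction sizes */
        int i;

        for (i = 0; i < 3; i++) {
            double xfer_us = sizes[i] / bytes_per_sec * 1e6;
            printf("%5d bytes: %7.0f usec on the wire, %3.0f usec latency\n",
                sizes[i], xfer_us, latency_us);
        }
        return (0);
    }

For 100 bytes the wire time is only ~80usec and the connection latency
dominates; at 1.5K the transfer is already ~1.2msec, and by 10K a
100-200usec latency difference is down in the noise.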
 
> (*) Now somebody is bound to bring up UDP, but no, UDP is _not_ a good
> protocol for many of these things. If you need good performance over
> wildly different network connections, and require in-order and reliable
> connections, UDP sucks. You'd end up doing all the work TCP does in user
> space instead, and TCP would be a lot more efficient. UDP is fine for
> _certain_ applications, but if you think you should always use UDP for
> request/reply, you're wrong.
>
I never mentioned UDP -- TTCP is better, though, and it gives some of
the advantages of UDP.

> >                                                       Data
> >througput starts overshadowing connection latency quickly.
> 
> Definitely not.  It depends on the application, and neither is "more
> important".  However, getting good throughput is usually a lot easier
> than getting good latency - often you can just increase buffer sizes
> (increase the TCP window, increase the MTU).  Getting lower latency is a
> lot harder: you have to actually fix things.  That's why I personally
> think latency numbers are a lot more indicative of system performance.
>
There are a few applications that need very low latency, but remember
that latency != CPU usage.  You might have 100usecs of additional latency,
but that might be buried by another concurrent connection...  As long
as the latency doesn't tie up the CPU, and you have many concurrent
streams, it isn't very important, is it?  (Unless you have realtime
requirements in the region of 100usecs.)  I guess it is possible that an
application has a 100usec realtime requirement, isn't it?  :-).

Rhetorical question: are all of the pipelined CPUs "low quality" because
their latency is long, but their execution rate is fast???  They do
things in parallel, don't they?  (Scheduling 101 :-)).
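
To make the "buried by another concurrent connection" point concrete,
here is a minimal sketch of the usual select() service loop (my own, not
from any real server; handle_ready() is a hypothetical placeholder).
While one connection is sitting out its round-trip latency, the process
is busy servicing whichever descriptors are ready, so per-connection
latency does not turn into idle CPU:

    #include <sys/types.h>
    #include <sys/time.h>
    #include <unistd.h>

    extern void handle_ready(int fd);   /* hypothetical request handler */

    void
    service_loop(int *fds, int nfds)
    {
        fd_set rset;
        int i, maxfd;

        for (;;) {
            FD_ZERO(&rset);
            maxfd = -1;
            for (i = 0; i < nfds; i++) {
                FD_SET(fds[i], &rset);
                if (fds[i] > maxfd)
                    maxfd = fds[i];
            }
            /*
             * Block until *some* connection is ready; a slow one just
             * means we work on the others in the meantime.
             */
            if (select(maxfd + 1, &rset, NULL, NULL, NULL) <= 0)
                continue;
            for (i = 0; i < nfds; i++)
                if (FD_ISSET(fds[i], &rset))
                    handle_ready(fds[i]);
        }
    }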

> 
> Wrong. TCP latency is very important indeed. If you think otherwise,
> you're probably using TCP just for ftp.
>
I guess FreeBSD-current makes up for it by being faster with the
fork/execs done by simple www servers.  (About 1.1msecs on a properly
configured P5-166.)
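
By "simple www servers" I mean the fork-per-request style -- sketched
below from memory, not any particular server's code, and the handler
path is made up.  Every hit costs an accept(), a fork() and usually an
exec(), so that 1.1msec matters at least as much as a couple hundred
usecs of TCP connection latency:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <unistd.h>

    int
    serve(int port)
    {
        struct sockaddr_in sin;
        int s, c;

        s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0)
            return (-1);
        memset(&sin, 0, sizeof sin);
        sin.sin_family = AF_INET;
        sin.sin_port = htons(port);
        sin.sin_addr.s_addr = htonl(INADDR_ANY);
        if (bind(s, (struct sockaddr *)&sin, sizeof sin) < 0 ||
            listen(s, 128) < 0)
            return (-1);

        for (;;) {                      /* child reaping omitted */
            c = accept(s, NULL, NULL);
            if (c < 0)
                continue;
            if (fork() == 0) {
                dup2(c, 0);             /* connection -> stdin  */
                dup2(c, 1);             /* connection -> stdout */
                execl("/usr/local/libexec/httpd-child",
                    "httpd-child", (char *)NULL);
                _exit(1);
            }
            close(c);                   /* parent: next accept() */
        }
    }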

> 
> Think Quality (latency) vs Quantity (throughput).  Both are important,
> and depending on what you need, you may want to prioritize one or the
> other (and you obviously want both, but that can sometimes be
> prohibitively expensive).
> 
The quality vs. quantity point is interesting, since I consider that for
certain applications, slower transfer rates *significantly* impact quality.
The ability to handle many concurrent connections seriously impacts
quality.  Some very low quality trivial algorithms might work well in
single threaded trivial cases such as lmbench.  (Remember the 1.2.x
scheduling algorithm that had really bad scalability?)  One connection
with a 100usec latency difference makes little difference.  I guess that
I am thinking in terms of large scale servers, where the perf is needed,
and not single user NT-clones...  IMO, the most valid measurement is to
measure the quality (latency) with 1000's of connections...

Remember, the algorithms that are more efficient under load (when you
need them) are often a bit slower in the degenerate case (trivial
benchmarks).  We have done the same thing with our pageout algorithm
in FreeBSD-current.  We have TWO algorithms that each work better under
different conditions.  Unfortunately, our default algorithm, which works
best under load, is slower than the algorithm that is fastest under
light load.  Of course, in order to support our user base the best, we
sacrifice a bit of our benchmark perf to make our system run better
under load, when the perf is really needed.  If we enabled the low quality
LRU algorithm, the perf would look better on certain TRIVIAL benchmarks.

Like I implied before, it doesn't appear that the benchmark measures
what users will see as "performance."  The FreeBSD team consciously
sacrifices benchmark perf for "quality."  (That is, real world perf.)

I guess what I am saying is that the results would look more credible
with a real load, where the networking code would be exercised more.

One last comment -- notice that FreeBSD isn't that much slower in the
localhost case, so it appears that the difference is that the driver
could use some work.  We could be seeing the performance difference of
the specific DRIVER under LIGHT load :-), NOT of the networking code
itself.

John