*BSD News Article 86821


Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.Hawaii.Edu!ames!enews.sgi.com!news.sgi.com!news.bbnplanet.com!su-news-hub1.bbnplanet.com!arclight.uoregon.edu!newsfeed.direct.ca!nntp.portal.ca!cynic.portal.ca!not-for-mail
From: cjs@cynic.portal.ca (Curt Sampson)
Newsgroups: comp.os.linux.misc,comp.os.linux.networking,comp.os.linux.setup,comp.unix.bsd.bsdi.misc,comp.unix.bsd.misc,comp.os.linux.advocacy,comp.unix.advocacy
Subject: Re: Linux vs BSD
Followup-To: comp.os.linux.networking,comp.os.linux.advocacy,comp.unix.bsd.misc,comp.unix.advocacy
Date: 26 Jan 1997 12:40:33 -0800
Organization: Internet Portal Services, Inc.
Lines: 84
Message-ID: <5cgfg1$jgf@cynic.portal.ca>
References: <32DFFEAB.7704@usa.net> <5c39sk$ddl@troma.rv.tis.com> <5c8jlm$50u@cynic.portal.ca> <m23evrulla.fsf@desk.crynwr.com>
NNTP-Posting-Host: cynic.portal.ca
Xref: euryale.cc.adfa.oz.au comp.os.linux.misc:152550 comp.os.linux.networking:65080 comp.os.linux.setup:92336 comp.unix.bsd.bsdi.misc:5614 comp.unix.bsd.misc:1924 comp.os.linux.advocacy:79911 comp.unix.advocacy:33748

In article <m23evrulla.fsf@desk.crynwr.com>,
Russell Nelson  <nelson@crynwr.com> wrote:

>cjs@cynic.portal.ca (Curt Sampson) writes:
>
>> 1. Design and source code quality. The quality of the design and
>> source code in the BSD kernels is far, far above that of Linux.
>
>Not clear about that.  For example, BSD uses mbufs, while Linux uses
>sk_buffs.  With an sk_buff, you have a linear buffer, which can be
>copied in one loop.  With mbufs, you need to copy chunk to chunk to
>chunk.  The setup time is not insignificant.

Yes, this makes copying from (or occasionally to) a chain of mbufs
slightly slower than doing it with a single sk_buff. However, the
hit is not nearly as bad as it seems, since if the data are large
enough to fill more than two mbufs they are put into a 2K mbuf
cluster instead. But linking chains of buffers has several
advantages over the Linux way of doing things.
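
To make the copying cost concrete, here is a minimal sketch in C of
copying a chain out to a linear buffer. The names `struct buf` and
`chain_copy_out` are hypothetical stand-ins invented for illustration;
this is not the real BSD `struct mbuf` or any kernel routine. The
extra cost over a single linear buffer is one loop iteration and one
`memcpy` call per link in the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an mbuf (not the real struct). */
struct buf {
    struct buf *next;   /* next buffer in this chain, or NULL */
    const char *data;   /* start of the valid data in this buffer */
    size_t      len;    /* number of valid bytes at data */
};

/*
 * Copy at most cap bytes from a chain into the linear buffer dst.
 * One memcpy is issued per link; returns the number of bytes copied.
 */
size_t chain_copy_out(const struct buf *b, char *dst, size_t cap)
{
    size_t done = 0;

    assert(dst != NULL);
    for (; b != NULL && done < cap; b = b->next) {
        size_t n = b->len;

        if (n > cap - done)     /* clip the last piece to fit dst */
            n = cap - done;
        memcpy(dst + done, b->data, n);
        done += n;
    }
    return done;
}
```

With large data held in a few clusters rather than many small
buffers, the loop runs only a handful of times per packet.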

1. When reassembling fragments, you just link your chains together
to get your packet. Linux has to copy all of the fragment data to
do the reassembly. Doing 5 copies of about 1.5K each every time an
NFS packet comes in is a pretty big loss. (This may in part explain
Linux's poor NFS performance.)

2. It's easy to prepend data to an mbuf chain, merely by linking
another mbuf to the front. (You might have to do this if you have
a protocol stack that has a particularly long--or variable--set of
headers.) With sk_buffs, you have to recopy the entire packet if
you didn't preallocate enough empty space at the front of the buffer
when you initially set up the packet.

3. Since datagrams are built up by linking together as many mbufs
and clusters as you need, it's possible to pre-allocate a pool of
them so that they are ready-to-hand in time-critical areas. This
is used in the Etherlink III chipset driver, for example, so that
when a packet is received the requisite mbufs are there, rather
than having to be allocated in the interrupt service routine. When
the pool falls below a low-water mark, a separate routine outside
of the interrupt handler refills it.

You could do this with sk_buffs as well, except you've got the
problem of deciding how big you want them before you know the packet
size. If you choose a single large size, you waste a lot of memory.
So more likely you'd want to keep two pools of two sizes, which is
more work, and you'd still waste more memory than you do with mbufs
anyway.

4. Mbufs and clusters, each being of a single fixed size, should
be cheaper to allocate and track than the variable sized chunks
that are allocated for the data areas of sk_buffs. In fact, in the
BSD implementation clusters are allocated from a separate pool,
and are faster to allocate than memory from the regular kernel
malloc pool. Mbufs come from the regular kernel malloc pool, but
this isn't a problem when you pre-allocate them as in point 3 above.
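
Points 1 through 3 can be sketched in a few lines of C. This is only
an illustration under simplifying assumptions: `struct buf`, `struct
pool` and every function name below are hypothetical, the buffers
carry a length but no payload, and none of this is the actual BSD
mbuf code. It shows reassembly and header prepending done purely by
relinking, and a preallocated free list with a low-water check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an mbuf (not the real struct). */
struct buf {
    struct buf *next;   /* next buffer in the chain or free list */
    size_t      len;    /* bytes of payload this buffer would hold */
};

/* Total number of payload bytes in a chain. */
size_t chain_len(const struct buf *b)
{
    size_t n = 0;

    for (; b != NULL; b = b->next)
        n += b->len;
    return n;
}

/* Point 1: join two fragment chains by relinking; no payload is copied. */
struct buf *chain_append(struct buf *head, struct buf *tail)
{
    struct buf *b = head;

    if (head == NULL)
        return tail;
    while (b->next != NULL)     /* walk to the last link of head */
        b = b->next;
    b->next = tail;
    return head;
}

/* Point 2: prepend a header by linking one more buffer in front. */
struct buf *chain_prepend(struct buf *hdr, struct buf *chain)
{
    assert(hdr != NULL);
    hdr->next = chain;
    return hdr;
}

/* Point 3: a free list of preallocated buffers with a low-water mark. */
struct pool {
    struct buf *free;       /* singly linked list of spare buffers */
    size_t      count;      /* number of buffers on the free list */
    size_t      low_water;  /* refill when count drops below this */
};

/* Time-critical path: pop a spare buffer; never allocates. NULL if empty. */
struct buf *pool_get(struct pool *p)
{
    struct buf *b = p->free;

    if (b != NULL) {
        p->free = b->next;
        p->count--;
        b->next = NULL;
        b->len = 0;
    }
    return b;
}

/* Return a buffer to the pool (also used by the refill routine). */
void pool_put(struct pool *p, struct buf *b)
{
    b->next = p->free;
    p->free = b;
    p->count++;
}

/* Checked outside the interrupt handler to decide whether to refill. */
int pool_needs_refill(const struct pool *p)
{
    return p->count < p->low_water;
}
```

Because every buffer is the same size, the pool needs no size classes;
a linear-buffer design would have to guess a size, or keep several
pools, before the packet length is known.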

In addition, the general implementation of the networking code in
BSD is far, far cleaner than that in Linux. An sk_buff is full of
various structures relating to networking protocols (mostly from
the TCP/IP suite). This is not only a gross abstraction violation
(of a sort common throughout the Linux networking code) but also
has a good chance of breaking binary compatibility (of things like
LKMs) every time you add or remove a network protocol.

It really looks as if someone wrote a TCP/IP over Ethernet stack
and since then other people have been gluing other protocols on to
it however they can. It's far from elegant.

>Also, the BSD development has fragmented.  You've got OpenBSD,
>FreeBSD, NetBSD, and BSDI.  So you can't talk about "BSD", you have to
>talk about "the BSDs".

In the same way you have to talk about `the Linuxes.' Since we're
discussing the complete OS here, not just the kernel, there are
far more Linuxes than BSDs.

I've set followups to go to slightly more appropriate newsgroups.

cjs
-- 
Curt Sampson    cjs@portal.ca		Info at http://www.portal.ca/
Internet Portal Services, Inc.	
Vancouver, BC   (604) 257-9400		De gustibus, aut bene aut nihil.