*BSD News Article 35897



Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!msuinfo!agate!howland.reston.ans.net!swrinde!news.uh.edu!news.sccsi.com!nuchat!! ()
From: ()
Newsgroups: comp.os.386bsd.questions
Subject: Heavy SLIP load = Lost IP Packets?
Date: 16 Sep 1994 21:47:35 GMT
Organization: South Coast Computer Services (sccsi.com)
Lines: 42
Distribution: world
Message-ID: <35d3pn$icb@tattoo.sccsi.com>
NNTP-Posting-Host: wort.appsmiths.com

Hi,

I'm running FreeBSD 1.1.5.1, operating as a gateway.  We are connected at 38400 
to our service provider, and are running ethernet locally through a 3c509.  For 
the most part, things work fine.

When we get a lot of open sockets running simultaneously over the SLIP
connection (e.g. 5 sendmails scratching) we start losing packets in a curious
way.  There are a lot of pieces to the puzzle, but this is what we have seen
so far.  I'm hoping someone else has witnessed comparable madness and can
comment on its source, or suggest further tests to run.

1) Things boogie along well for an indeterminate amount of time (5 minutes to
several hours).  Lots of modem traffic, lots of stuff moving.  All of a sudden,
communication ceases, and eventually sendmail backs off.  The situation never
corrects itself.

2) We can ping across the link, but it behaves strangely.  We see the packet go
out the modem, and the reply come in.  Ping does not see it.  After enough
pings go out (and I suppose come back in as well), we then see the first
ping packet (iseq=1) come in, some 10-15 seconds late.  tcpdump shows about the
same scenario.  This leads me away from pointing fingers at serial boards
and modems.  Where would a valid iseq=1 packet have come from?

3) If we ping quickly after the failure, our pinging seems to "push" some of
the other (missing) packets from the sendmails back through the machine, which
keeps them talking, albeit slowly.

4) Killing slattach (with -HUP) and letting it redial corrects the problem,
until the next time it happens.

5) We can't seem to generate a similar failure with a single FTP.  It will run
all night, making me think it's the *number* of sockets, not the traffic per se.

In my mind, it almost looks like packets are getting "stuck" in some incoming 
buffer, and are only getting handed out to processes when the buffer fills back 
up, but I'm just guessing here.
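
To make that guess concrete, here is a toy Python model of the behaviour I
have in mind.  It is only a sketch of the hypothesis: the names StuckBuffer
and simulate_pings, and the figure of 12 held packets, are made up for
illustration, and nothing here is taken from the actual FreeBSD SLIP code.

```python
from collections import deque


class StuckBuffer:
    """Toy input queue that holds back its newest `held` packets.

    Hypothetical model only: a packet is handed to the reader solely
    when a later arrival pushes the queue above `held` entries.
    """

    def __init__(self, held):
        self.held = held
        self.queue = deque()

    def arrive(self, packet):
        """Queue a packet and return the packets pushed out to the reader."""
        self.queue.append(packet)
        delivered = []
        while len(self.queue) > self.held:
            delivered.append(self.queue.popleft())
        return delivered


def simulate_pings(count, held):
    """Send one ping per second; return (seq, delay_seconds) per reply seen."""
    buf = StuckBuffer(held)
    seen = []
    for now in range(1, count + 1):
        # Assume the reply to ping `now` reaches the buffer in the same second.
        for seq in buf.arrive(now):
            seen.append((seq, now - seq))
    return seen


if __name__ == "__main__":
    for seq, delay in simulate_pings(count=15, held=12):
        print(f"seq={seq} seen {delay}s late")
```

With 12 packets held back, the first reply only shows up once the 13th ping
has gone out, i.e. 12 seconds late, which is the same flavour of delay as in
(2), and extra pings "push" older packets out as in (3).  That the model fits
does not prove this is where the problem actually lies.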

Any help would be appreciated.