*BSD News Article 20371



Newsgroups: comp.os.386bsd.bugs
Path: sserve!newshost.anu.edu.au!munnari.oz.au!news.Hawaii.Edu!ames!agate!doc.ic.ac.uk!uknet!mcsun!sun4nl!izfcs!tom
From: tom@izf.tno.nl (Tom Vijlbrief)
Subject: Re: Peculiar SLIP/serial line hang + FIX (patch)
Message-ID: <CCs29y.DEu@izf.tno.nl>
Organization: TNO Institute for Perception
References: <CCqqxG.3Cs@bernina.ethz.ch>
Date: Fri, 3 Sep 1993 12:20:21 GMT
Lines: 113

torda@igc.ethz.ch (Andrew Torda) writes:

>I have a small problem with using slip. It was more common
>with 386BSD, but still happens with NetBSD 0.9.

>If I send a boatload of data over the slip line (like an ftp
>transfer) eventually I will get a com0 silo overflow. No
>surprise.
>Some time afterwards, packets will stop moving between my
>machine and the other end. This is quite clear from the lack
>of activity and the lack of flashing lights on the modem.

>Now, the interesting and reproducible part.
>If I then, open the serial port with another process (like
>tip directly to the port), there is a flurry of activity and
>slip packets move again.

>I am so baffled, I don't even know where to start looking
>for an explanation for this behaviour.
>-Andrew
>-- 
>Andrew Torda, Computational Chemistry, ETH, Zurich, torda@igc.ethz.ch

I had the same problem on my 486DX2/50 notebook with Cirrus chipset.

Somehow a serial line transmitter ready interrupt is lost and the
serial driver waits forever for the lost interrupt.

I created and posted a patch a few months ago, which either got lost
or was not included because it is dirty and only works around a
problem that is really hardware related. Anyway, here is the original
posting.


Hi Patchlist maintainer,

I experienced trouble with slip on both a 486 Notebook
and a Compaq 386sx deskpro. After sending a few megabytes
of data, slip hangs: netstat shows a rising oerr count
and ping reports 'no more buffer space'.

This is probably caused by a missed transmitter queue empty interrupt.
The driver then never drains its output, so the queue fills to its max
length and all further packets are dropped.
(I experienced the same lost interrupt problem when debugging a serial line
driver routine for a Messy DOS slip router.) This patch detects the situation
(the tty output queue is non-empty, has the same length as when the previous
packet was queued, and more than one clock tick has passed) and restarts
output with a call to slstart.

Tom Vijlbrief

*** org/if_slvar.h	Sun Apr 18 14:35:52 1993
--- if_slvar.h	Sun Apr 18 14:37:26 1993
***************
*** 62,67 ****
--- 62,69 ----
  	long	sc_lasttime;		/* last time a char arrived */
  	long	sc_starttime;		/* last time a char arrived */
  	long	sc_abortcount;		/* number of abort esacpe chars */
+ 	/* PATCH, tom@izf.tno.nl */
+ 	u_int	sc_prevqueue;		/* previous queue length (detects deadlock) */
  #ifdef INET				/* XXX */
  	struct	slcompress sc_comp;	/* tcp compression data */
  #endif
*** org/if_sl.c	Sun Apr 18 14:39:51 1993
--- if_sl.c	Sun Apr 18 14:42:00 1993
***************
*** 89,94 ****
--- 89,96 ----
  #include "kernel.h"
  #include "conf.h"
  
+ #include "syslog.h"
+ 
  #include "if.h"
  #include "if_types.h"
  #include "netisr.h"
***************
*** 388,393 ****
--- 390,405 ----
  		return (0);
  	}
  	s = splimp();
+ 
+ 	/* PATCH: Detect hanging output queue, tom@izf.tno.nl */
+ 	if (sc->sc_prevqueue
+ 	   && RB_LEN(&sc->sc_ttyp->t_out) == sc->sc_prevqueue
+ 	   && (((time.tv_sec - sc->sc_if.if_lastchange.tv_sec) * hz +
+ 	   (time.tv_usec - sc->sc_if.if_lastchange.tv_usec) * hz / 1000000) > 1)) {
+ 		log(LOG_NOTICE, "slip queue hangs, restarting\n");
+ 		slstart(sc->sc_ttyp);
+ 	}
+ 
  	if (IF_QFULL(ifq)) {
  		IF_DROP(ifq);
  		m_freem(m);
***************
*** 397,403 ****
  	}
  	IF_ENQUEUE(ifq, m);
  	sc->sc_if.if_lastchange = time;
! 	if (RB_LEN(&sc->sc_ttyp->t_out) == 0)
  		slstart(sc->sc_ttyp);
  	splx(s);
  	return (0);
--- 409,416 ----
  	}
  	IF_ENQUEUE(ifq, m);
  	sc->sc_if.if_lastchange = time;
! 	/* PATCH: tom@izf.tno.nl */
! 	if ((sc->sc_prevqueue= RB_LEN(&sc->sc_ttyp->t_out)) == 0)
  		slstart(sc->sc_ttyp);
  	splx(s);
  	return (0);
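
For anyone who wants to try the stall test outside the kernel, here is
a minimal user-space sketch of the condition the patch adds in if_sl.c.
It is not the kernel code: struct tv, struct sl_state, HZ,
elapsed_ticks() and queue_stalled() are hypothetical stand-ins for
time, sc, hz and the inline test in the patch, and HZ = 100 is an
assumed tick rate.

```c
/* Hypothetical stand-in for the kernel's struct timeval. */
struct tv {
	long tv_sec;
	long tv_usec;
};

/* Hypothetical stand-in for the two sl_softc fields the test reads. */
struct sl_state {
	unsigned int prevqueue;	/* tty output length at the previous enqueue */
	struct tv lastchange;	/* time of the previous enqueue */
};

enum { HZ = 100 };		/* assumed clock tick rate, not from the patch */

/* Clock ticks between two timestamps, same arithmetic as the patch. */
static long
elapsed_ticks(struct tv now, struct tv then)
{
	return (now.tv_sec - then.tv_sec) * HZ +
	    (now.tv_usec - then.tv_usec) * HZ / 1000000;
}

/*
 * Return 1 when output looks stuck: the queue was non-empty at the
 * previous enqueue, it still has exactly that length, and more than
 * one tick has passed since then.  The kernel patch reacts to this
 * by logging a notice and calling slstart().
 */
static int
queue_stalled(const struct sl_state *sc, unsigned int outq_len, struct tv now)
{
	return sc->prevqueue != 0 &&
	    outq_len == sc->prevqueue &&
	    elapsed_ticks(now, sc->lastchange) > 1;
}
```

Note that the test only runs when another packet is handed to the
output routine, so a completely idle line is never restarted; and an
unchanged length is a heuristic, since a queue that drained and
refilled to the same length would look the same.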