*BSD News Article 58305

Newsgroups: comp.unix.bsd.bsdi.misc,comp.unix.advocacy
Path: euryale.cc.adfa.oz.au!newshost.anu.edu.au!harbinger.cc.monash.edu.au!news.mel.connect.com.au!munnari.OZ.AU!news.ecn.uoknor.edu!paladin.american.edu!zombie.ncsc.mil!news.mathworks.com!newsfeed.internetmci.com!howland.reston.ans.net!ix.netcom.com!netcom.com!bakul
From: bakul@netcom.com (Bakul Shah)
Subject: Re: multiple httpds vs threads vs ... (was BSDI Vs. NT...)
Message-ID: <bakulDK9GrA.B2p@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
References: <taxfree.3.00C439A1@primenet.com> <4be592$6tb@madeline.ins.cwru.edu> <4bhfmp$gei@Mars.mcs.com> <DK5Crs.I77@metrics.com> <4bmsjp$7lv@elf.bsdi.com> <bakulDK7u6M.LrM@netcom.com> <4brlt5$8un@sungy.Germany.Sun.COM>
Date: Wed, 27 Dec 1995 19:57:09 GMT
Lines: 61
Sender: bakul@netcom22.netcom.com
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.bsdi.misc:1860 comp.unix.advocacy:12683

Casper.Dik@Holland.Sun.COM writes:
>That's not correct.  Most modern implementations allow you to redefine
>FDSET_SIZE to anything you please by defining FDSET_SIZE before including
>the header that usually defines it (the header defines FDSET_SIZE only
>if not previously defined).

You can redefine FDSET_SIZE, but the problem is that on some
implementations the *kernel* routine implementing select() uses
FDSET_SIZE or some such _compile time_ parameter to allocate
space on the kernel stack for the select args.  So if your program
is compiled with a bigger FDSET_SIZE than the one used by the
kernel, you lose on fds beyond the kernel limit.  If the kernel
dynamically allocated space for the select fdset args, a user
could use an fdset as large as his fd limit.

You should be able to use select() on any fd you can get but this
is not true for most select() implementations.

>System V Poll() is much better in that respect, as it gives the application
>full control over how many fds you use.  The set-size is yours to specify.
>(And poll allows to select for more than just read/write/exception).

It _is_ better but portable code can't rely on it exclusively.
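
For comparison, a small sketch of what "the set-size is yours to
specify" looks like with poll(), assuming <poll.h> is available.  The
helper name count_readable is hypothetical.  The array is sized by the
caller's nfds at run time, with no compile-time constant involved.

```c
#include <poll.h>
#include <stdlib.h>

/*
 * Hypothetical helper: poll nfds descriptors for input and return how
 * many have events pending (0 on timeout, -1 on error).
 */
int
count_readable(const int *fds, int nfds, int timeout_ms)
{
	struct pollfd *pfds;
	int i, n;

	if (nfds < 0)
		return -1;
	/* allocate at least one entry so a zero-length set is harmless */
	pfds = calloc(nfds > 0 ? nfds : 1, sizeof(*pfds));
	if (pfds == NULL)
		return -1;
	for (i = 0; i < nfds; i++) {
		pfds[i].fd = fds[i];
		pfds[i].events = POLLIN;
	}
	n = poll(pfds, (nfds_t)nfds, timeout_ms);
	free(pfds);
	return n;
}
```

Code that must also run where poll() is missing would need a select()
fallback behind the same interface.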

[Speculation mode on]
Ideally I'd like to break up poll into a set of more efficient
syscalls.  Something like add_pollset(), remove_pollset(),
get_pollset() and check_pollset().  add_pollset adds new fds to
the set of fds we are interested in.  remove_pollset removes
them.  get_pollset tells us about the current set of interesting
fds.  check_pollset returns the set of fds on which interesting
events happened.  I'd also add a signal so that I don't have to
call check_pollset until I am sure there is something waiting.
Given these, the current poll() interface can be implemented
something like:

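int
poll(struct pollfd * fds, int nfds, int timeout) {
	int n;
	/* keep a copy for removal; check_pollset() stores results in fds */
	struct pollfd fds_copy[nfds];
	/* timeout is in milliseconds, tv_usec is in microseconds */
	struct timeval timeval = {timeout/1000, (timeout%1000)*1000};
	memcpy(fds_copy, fds, nfds*sizeof(*fds));
	add_pollset(nfds, fds_copy);
	/* a null timeout pointer means wait until an event arrives */
	n = check_pollset(nfds, fds, timeout == -1 ? 0 : &timeval);
	remove_pollset(nfds, fds_copy);
	return n;
}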

[Of course, it would be silly to make three syscalls for poll(), so
it would remain a syscall, but something similar does happen in
the kernel.  Also, struct pollfd is not the ideal type for
{add,remove,get,check}_pollset but I wanted to keep the example
simple]

Typically the fd set of interest changes far less often than you
call poll/select (you add/remove fds to the set as clients
arrive/leave).  At present the kernel has to process every fd in
the set every time poll/select is called, which is quite wasteful
for large fd sets.
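
To show the shape of the proposed interface, here is a user-space
sketch layered on plain poll().  Everything in it is an assumption for
illustration: the argument types, the POLLSET_MAX cap, and the return
conventions are mine, since no such syscalls exist.  Being an emulation,
it still hands the whole set to poll() on every check, so it shows only
the calling pattern; the saving would come from the kernel keeping the
set between calls.

```c
#include <poll.h>
#include <string.h>
#include <sys/time.h>

#define POLLSET_MAX 256		/* arbitrary cap for this sketch */

static struct pollfd pollset[POLLSET_MAX];	/* persistent interest set */
static int pollset_len;

/* Add nfds entries to the set; returns 0, or -1 if they do not fit. */
int
add_pollset(int nfds, const struct pollfd *fds)
{
	if (nfds < 0 || pollset_len + nfds > POLLSET_MAX)
		return -1;
	memcpy(&pollset[pollset_len], fds, nfds * sizeof(*fds));
	pollset_len += nfds;
	return 0;
}

/* Remove every entry whose fd appears in fds; returns the number removed. */
int
remove_pollset(int nfds, const struct pollfd *fds)
{
	int i, j, kept = 0, removed;

	for (i = 0; i < pollset_len; i++) {
		for (j = 0; j < nfds; j++)
			if (pollset[i].fd == fds[j].fd)
				break;
		if (j >= nfds)		/* not listed for removal: keep it */
			pollset[kept++] = pollset[i];
	}
	removed = pollset_len - kept;
	pollset_len = kept;
	return removed;
}

/* Copy up to nfds entries of the set into fds; returns the set size. */
int
get_pollset(int nfds, struct pollfd *fds)
{
	int n = pollset_len < nfds ? pollset_len : nfds;

	if (n < 0)
		n = 0;
	memcpy(fds, pollset, n * sizeof(*fds));
	return pollset_len;
}

/*
 * Wait for events on the set (tv == NULL blocks until one arrives).
 * Copies up to nfds ready entries into fds and returns how many were
 * copied, 0 on timeout, -1 on error.
 */
int
check_pollset(int nfds, struct pollfd *fds, const struct timeval *tv)
{
	int i, n = 0, ms = -1;

	if (tv != NULL)
		ms = (int)(tv->tv_sec * 1000 + tv->tv_usec / 1000);
	if (poll(pollset, (nfds_t)pollset_len, ms) < 0)
		return -1;
	for (i = 0; i < pollset_len && n < nfds; i++)
		if (pollset[i].revents != 0)
			fds[n++] = pollset[i];
	return n;
}
```

A server would call add_pollset() once when a client connects,
remove_pollset() once when it leaves, and check_pollset() in its main
loop, so the per-iteration call no longer carries the whole fd list.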

Bakul Shah <bakul@netcom.com>