*BSD News Article 7028


Newsgroups: comp.unix.bsd
From: jes@grendel.demon.co.uk (Jim Segrave)
Path: sserve!manuel.anu.edu.au!munnari.oz.au!uunet!pipex!demon!grendel.demon.co.uk!jes
ReplyTo: jes@grendel.demon.co.uk
Subject: Re: ISA/DMA in 386BSD
References: <BILL.92Oct24005633@kepler.ucsd.edu>
Distribution: world
X-Mailer: cppnews $Revision: 1.19 $
Organization: None
Lines: 119
Date: Sun, 25 Oct 1992 12:24:47 +0000
Message-ID: <720041087snx@grendel.demon.co.uk>
Sender: usenet@gate.demon.co.uk


In article <BILL.92Oct24005633@kepler.ucsd.edu> bill@kepler.ucsd.edu (Bill Reynolds) writes:
> My first question is regarding the nature of the problem. I'm not a
> unix/cs guru. My understanding of exactly what all these terms mean is
> sketchy, so I'd appreciate some clarification. Currently, my
> understanding is the following: There is a facility whereby a device
> can directly access the system memory, without having to tie up the
> cpu (DMA). Unfortunately, if the memory that the device wants to get
> at is above some limit (16M), the (24 bit ISA) address overflows, and
> there is hell to pay. So if a device wants to perform DMA, the
> operating system must first copy the relevant region of memory
> someplace below the ISA ceiling, whereupon the DMA can proceed
> normally. So, on to the first question: Is this right?
Two answers:
If you use the DMA controller chip on an ISA motherboard, there is no
standard method (standard here meaning: the way the IBM AT does it) of
generating an address greater than 0xffffff. I am told, though I don't
know the details, that some motherboards have an extension register
which can be used to set higher address bits so that DMA can be
performed to addresses >= 0x1000000.
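To make that concrete, here is a rough sketch (the function name and
shape are mine, this is not 386BSD code) of where an AT-compatible DMA
setup has to put the pieces of a physical address: the 8237 itself only
holds address bits 0-15, an 8 bit page register supplies bits 16-23,
and bits 24 and up have nowhere to go.

	/*
	 * Sketch only (not 386BSD code): can a transfer starting at
	 * physical address "phys" be expressed on an AT-compatible DMA
	 * channel?  Returns the page register value (bits 16-23), or
	 * -1 if the transfer can't be done directly.
	 */
	int
	at_dma_page(unsigned long phys, unsigned long nbytes)
	{
		if (phys + nbytes > 0x1000000)		/* would need address bits >= 24 */
			return -1;
		if ((phys >> 16) != ((phys + nbytes - 1) >> 16))
			return -1;			/* 8237's 16 bit counter would wrap at 64Kb */
		return (int) ((phys >> 16) & 0xff);	/* bits 16-23 -> page register */
	}

The motherboard extension register mentioned above would, in effect,
supply those missing upper bits on machines that have it.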
If you have a bus-mastering DMA card - say a SCSI or network card -
which contains its own DMA controller, then on an ISA bus it is
restricted to DMA below 0x1000000, since there are only 24 address
lines on the ISA bus.
Hence a general solution for DMA requires using a buffer in the first
16Mb and copying if the source/target address for DMA writes/reads is
above 16Mb.
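In outline the bounce-buffer approach looks something like this (a
sketch only - the helper names and low_buf are invented; the real code
is in isa.c and part of it is quoted further down):

	#include <strings.h>			/* bcopy() */

	#define B_READ	0x1			/* device -> memory; value arbitrary here */

	extern int  needs_bounce(char *addr, int nbytes);	/* any byte above 16Mb? */
	extern void start_dma(int chan, char *addr, int nbytes, int flags);
	extern void wait_dma(int chan);
	extern char *low_buf;			/* buffer allocated below 16Mb */

	void
	dma_with_bounce(int chan, char *addr, int nbytes, int flags)
	{
		char *dma_addr = addr;

		if (needs_bounce(addr, nbytes)) {
			if (!(flags & B_READ))
				bcopy(addr, low_buf, nbytes);	/* write: pre-fill bounce buffer */
			dma_addr = low_buf;
		}

		start_dma(chan, dma_addr, nbytes, flags);
		wait_dma(chan);

		if ((flags & B_READ) && dma_addr == low_buf)
			bcopy(low_buf, addr, nbytes);		/* read: copy back up to the caller */
	}
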
> If so, how much
> of a hit is the memory copy/DMA process over a straight DMA? How great
> are the advantages of a device (e.g. a disk controller) that does DMA?
Not an enormous amount for devices which provide data in blocks. DMA
controllers relieve the CPU of the overhead of an interrupt acknowledge
and a block copy to move data, which sounds like it would save a lot. On
the other hand, to get good transfer rates, a DMA controller will seize
the bus for the duration of a transfer. The CPU will be unlikely to get
much processing done during this time, since the first time it needs to
make a memory reference, it will have to wait until the DMA is complete.
For medium speed devices, where data arrives faster than an interrupt
service routine can process it but substantially slower than the CPU can
move data from I/O ports to memory, DMA is a big win. Otherwise the CPU
would have to wait from the time the first data arrived until the end of
a block in a tight polling loop. Discs and network cards buffer data
(sectors/tracks or messages), so the entire data block is available on
the card before any transfer begins. For these, it's much of a muchness
whether DMA or CPU I/O<->memory transfers are used.
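For what it's worth, the non-DMA alternative for such a card is just a
tight transfer loop on the data port - something like the following
(port number, sector size and the inw() helper are illustrative only,
and of course it needs I/O privilege to run):

	/* Sketch only: programmed I/O read of one buffered sector. */
	static unsigned short
	inw(unsigned short port)
	{
		unsigned short data;

		__asm__ __volatile__ ("inw %1, %0" : "=a" (data) : "Nd" (port));
		return data;
	}

	void
	pio_read_sector(unsigned short data_port, unsigned short *buf)
	{
		int i;

		/* 512 byte sector = 256 word-sized port reads; the CPU does
		   the copying a DMA controller would otherwise have done */
		for (i = 0; i < 256; i++)
			buf[i] = inw(data_port);
	}
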
> Enough to justify buying an EISA system, which does not have this
> memory addressing limit (as I understand it)? Do the other advantages
> of EISA justify the expense?
The cost of copying blocks between memory above 16Mb and a bounce
buffer below it is probably not large enough to justify EISA on its
own. When combined with peripheral cards
with 32 bit data busses, the performance increase is likely to be
substantial. At the moment, there aren't that many EISA cards with
driver support, so unless raw performance is a critical issue, I
wouldn't bother.
> The next question is an extension of the previous one - if a device is
> sitting on a VL bus, which is (as I understand) 32 bits wide, will it
> be subject to the same limitation on DMA? I just read a post that says
> that while a local bus is busy, the CPU will wait, leading to a big
> lose for a multitasking OS like unix. Is this true? If so, how bad
> will it impact overall performance?
A device on the local bus which contains its own DMA logic (or on a
motherboard which provides extensions to its DMA controllers to allow
addressing beyond 24 bits) will not suffer from these limitations -
e.g. the performance will be as good as or better than an EISA device.
Whenever the local bus is busy servicing some other device, the CPU will
have to wait. The same is true if the CPU is using an ISA bus and DMA is
going on. I fail to see how this will have any major impact on
multi-tasking OSes. Certainly it is possible to build a machine with
multiple data busses so that, for example, DMA could be using one bus to
transfer data to memory below 16Mb while the CPU is fetching
instructions from memory above 16Mb, but this gets *real* expensive.
> My next question regards the following code fragment from
> /usr/src/sys.386bsd/i386/isa/isa.c, which seems to perform the copy
> described above.
> 
> 	if (isa_dmarangecheck(addr, nbytes)) {
> 		if (dma_bounce[chan] == 0)
> 			dma_bounce[chan] =
> 				/*(caddr_t)malloc(MAXDMASZ, M_TEMP
> 							, M_WAITOK);*/
> 				(caddr_t) isaphysmem + NBPG*chan;
> 		bounced[chan] = 1;
> 		newaddr = dma_bounce[chan];
> 		*(int *) newaddr = 0;	/* XXX */
> 
> 		/* copy bounce buffer on write */
> 		if (!(flags & B_READ))
> 			bcopy(addr, newaddr, nbytes);
> 		addr = newaddr;
> 	}
> 
> 
> This code seems to do exactly what was just described: it checks to
> see if the requested piece of memory is reachable, if it is not, it
> decides on a new address to which it will bcopy the data to. However,
> the line marked /* XXX */ then sets this adress to 0. My impression is
No, it sets an int-sized piece of memory at this new address to zero -
note the * in front of the cast.
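A trivial standalone example of the difference (nothing to do with the
driver itself):

	int
	main(void)
	{
		int  storage[4] = { -1, -1, -1, -1 };
		char *newaddr   = (char *) storage;

		*(int *) newaddr = 0;	/* stores a zero int at the address held
					   in newaddr, i.e. zeroes storage[0];
					   newaddr itself still points at storage */
		/* newaddr = 0;  would instead null the pointer - a different
		   thing entirely, and not what the isa.c line does */
		return 0;
	}
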
> that this code is incomplete. This leads to my next question: What is
> going on here? If it's broken, what needs to be done to fix it? Is it
> a lot of work?  If possible, how much work would be involved to extend
> this to a VL-Bus?
> 
> Sorry for all the questions, but from reading stuff on the net, it
> seems that I'm not the only one confused about the limitations (and
> terminology) of the current pc technology, so if I get some good
> explanations, I'll be sure to summarize. Thanks for the help.
> 
> 
> 
> --
> _______________________________________________________________________
> Bill Reynolds        |            **** HE'S LYING ****
> bill@kepler.ucsd.edu |	
> 
> 
--
Jim Segrave (Segrave Software Services)     jes@grendel.demon.co.uk