*BSD News Article 2415


Newsgroups: comp.unix.bsd
Path: sserve!manuel!munnari.oz.au!uunet!cis.ohio-state.edu!zaphod.mps.ohio-state.edu!wupost!darwin.sura.net!uvaarpa!cv3.cv.nrao.edu!laphroaig!cflatter
From: cflatter@nrao.edu (Chris Flatters)
Subject: Re: Jolitz 386BSD-0.1 -- floating point perform
Message-ID: <1992Jul24.161646.22896@nrao.edu>
Sender: news@nrao.edu
Reply-To: cflatter@nrao.edu
Organization: NRAO
References: <l6qc51INN1gu@neuro.usc.edu>
Date: Fri, 24 Jul 1992 16:16:46 GMT
Lines: 50

In article l6qc51INN1gu@neuro.usc.edu, merlin@neuro.usc.edu (merlin) writes:
>I have most of the US Army BRLCAD three dimensional CSG modeling and
>distributed ray tracing system ported to the Jolitz 386BSD-0.1.  But,
>I am getting only about one fifth of the floating point performance
>previously measured using AT&T pcc and GNU gcc 1.4x on ATT UNIX SYSV.
>
>Does the compiler default to '387 emulation?  Is there some flag which
>needs to be set to actually use the coprocessor?  Or are there reasons
>386BSD-0.1 would exhibit relatively poor floating point performance?

I ran some checks last night and 386BSD is certainly exploiting the coprocessor.
These are the results from the Plum2 benchmark (see section 8.2 of "C++
Programming Guidelines" by Thomas Plum and Dan Saks).  The results are
the average times for a register int, auto short, auto long and auto double
operation and the average time to call and return from an empty function.
Times are in nominal microseconds (CLOCKS_PER_SEC was missing from <time.h>,
so I guessed a value of 100; I now think that it should have been 60).
The tests were performed on a CompuAdd 325s (25MHz 80386SX CPU) with a
Cyrix 83S87 FasMath coprocessor.

                       register      auto      auto  function      auto
                            int     short      long  call+ret    double
            386BSD gcc    0.178     0.448     0.474      1.62      4.94  
         386BSD gcc -O    0.159     0.207     0.159      1.75      3.37  

The ratio of floating-point time to auto long time is 21.2 (with
optimization), which is in the right ballpark for a 386SX/387SX system but
a little on the high side.
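
The ratio quoted above is simple arithmetic on the table, and it does not
depend on the guessed CLOCKS_PER_SEC value.  The Python sketch below
recomputes it for both rows.  It assumes the benchmark divides clock() ticks
by CLOCKS_PER_SEC; the names rescale and fp_to_long_ratio are illustrative
only and are not part of the benchmark.

```python
# Times copied from the 386BSD table (25 MHz 386SX with coprocessor).
TIMES = {
    "gcc":    {"auto long": 0.474, "auto double": 4.94},
    "gcc -O": {"auto long": 0.159, "auto double": 3.37},
}

def fp_to_long_ratio(row):
    """Ratio of auto double time to auto long time for one table row."""
    return row["auto double"] / row["auto long"]

def rescale(t, guessed_hz=100, actual_hz=60):
    """Assumption: time = ticks / CLOCKS_PER_SEC, so a wrong guess
    scales every reported time by guessed_hz / actual_hz."""
    return t * guessed_hz / actual_hz

for flags, row in TIMES.items():
    rescaled = {name: rescale(t) for name, t in row.items()}
    # The ratio is printed before and after rescaling; the two agree.
    print(flags, round(fp_to_long_ratio(row), 1),
          round(fp_to_long_ratio(rescaled), 1))
```

If the tick rate really is 60, every absolute time in the table grows by a
factor of 100/60, but the ratios between columns are unchanged.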

As a control, I made a copy of the dist.fs disk with a compiled version of
bench2 on it and booted it on my portable: a 16MHz 80386SX system without
a coprocessor.  The results were:

                       register      auto      auto  function      auto
                            int     short      long  call+ret    double
         386BSD gcc -O    0.240     0.317     0.242      2.32       346  

Note that the ratio of f-p time to auto long is now 1429.8; in other words,
emulation is more than 60 times slower than the coprocessor.  A fivefold
slowdown is far smaller than that, so unless BRLCAD uses very little
floating-point I believe that the coprocessor is active on Alexander-James
Annala's machine too.  (If Alexander-James wants to try these tests, I'll
send him the source code if he drops me a line.)
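
To make the "more than 60 times" figure concrete, the Python sketch below
divides the two f-p/auto long ratios.  The variable names are illustrative.
Normalising by the auto long time is what allows a 16 MHz machine to be
compared with a 25 MHz one; the raw double times alone also include the
difference in clock speed.

```python
# "gcc -O" rows from the two tables above.
with_fpu = {"auto long": 0.159, "auto double": 3.37}   # 25 MHz, coprocessor
emulated = {"auto long": 0.242, "auto double": 346.0}  # 16 MHz, no coprocessor

def fp_ratio(row):
    """Floating-point time expressed in units of auto long time."""
    return row["auto double"] / row["auto long"]

# Slowdown after normalising away the difference in CPU clock speed.
emulation_penalty = fp_ratio(emulated) / fp_ratio(with_fpu)

# Raw slowdown, which also includes the 16 MHz vs 25 MHz difference.
raw_penalty = emulated["auto double"] / with_fpu["auto double"]

print(f"f-p/long with coprocessor: {fp_ratio(with_fpu):.1f}")
print(f"f-p/long with emulation:   {fp_ratio(emulated):.1f}")
print(f"normalised penalty:        {emulation_penalty:.1f}")
print(f"raw penalty:               {raw_penalty:.1f}")
```

The normalised penalty comes out a little above 67, and the raw one a little
above 100.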

For a final comparison, I have some old figures from Linux with gcc 2.1.
Using the register int time to place the results on the same scale as
the 25MHz results above, the mean time for an f-p operation was 2.09 usec
without optimization and 0.936 usec at -O1 and above.
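
For a rough sense of scale, the Linux figures can be set beside the 386BSD
auto double column.  The Python sketch below does only that division.  It
assumes, as stated above, that both sets of times are on the same nominal
scale, and it says nothing about why the two differ.

```python
# Mean time per floating-point operation, in the post's nominal units.
bsd = {"unoptimized": 4.94, "optimized": 3.37}      # 386BSD gcc, auto double
linux = {"unoptimized": 2.09, "optimized": 0.936}   # Linux gcc 2.1, rescaled

def slowdown(case):
    """How many times longer an f-p operation takes under 386BSD."""
    return bsd[case] / linux[case]

for case in ("unoptimized", "optimized"):
    print(f"{case}: 386BSD takes {slowdown(case):.2f}x the Linux time")
```

The factors come out at roughly 2.4 without optimization and 3.6 with it.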

	Chris Flatters
	cflatter@nrao.edu