*BSD News Article 99188

Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!feed1.news.erols.com!howland.erols.net!news.mathworks.com!enews.sgi.com!fido.asd.sgi.com!neteng!lm
From: lm@neteng.engr.sgi.com (Larry McVoy)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: dd `benchmark'
Date: 8 Jul 1997 07:00:09 GMT
Organization: Silicon Graphics Inc., Mountain View, CA
Lines: 26
Message-ID: <5psohp$a3e@fido.asd.sgi.com>
References: <u7wwn5jrs0.fsf_-_@japonica.csl.sri.com> <5pm7gn$ds6$1@flea.best.net> <5pmmm3$8fa$1@godzilla.zeta.org.au>
Reply-To: lm@slovax.engr.sgi.com
NNTP-Posting-Host: neteng.engr.sgi.com
X-Newsreader: TIN [version 1.2 PL2]
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:44086

Bruce Evans (bde@zeta.org.au) wrote:
: Nope.   Under FreeBSD, this measures something closely related to the
: main memory write bandwidth if bs > size of L2 cache, and something not
: so closely related to L2 memory write bandwidth if bs is about half the
: size of the L2 cache.  Under other systems, it is meaningless unless
: you know the implementation of /dev/zero and /dev/null.

Yupper.

: P5's have about twice the effective main memory bandwidth as P6's at
: the same (memory) clock speed, because P6's do write allocation.

This is only sort of true.  The write allocation comment is correct,
but that's not why the P6 is slow on writes.  The P6 has a substantially
better memory subsystem - look at read perf - but write perf is crippled
because every P6 write takes part in an MP coherency transaction, even
on a uniprocessor.

The next release of lmbench has a graph that shows all this junk -
one graph with read, write, read + write (forces allocates), bcopy,
bzero, etc.  In addition, it shows partial read/write/copy numbers -
these do one 4-byte load or store every 32 bytes.  You can tell a lot by
comparing these.
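
For anyone who wants to poke at the full vs. partial idea before the
release, here is a rough sketch of the write case.  This is not lmbench
code; the names write_stride, covered_mb_per_sec, WORD_BYTES and
LINE_BYTES are made up for illustration, and the 32 is just the stride
mentioned above, assumed here to match the cache line size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define WORD_BYTES 4    /* size of each store, per the post */
#define LINE_BYTES 32   /* stride of the "partial" pattern, per the post */

/* Store one 4-byte word every `stride` bytes across `bytes` bytes.
 * stride == WORD_BYTES is the full write; stride == LINE_BYTES is the
 * partial write.  Returns the number of stores issued.  The volatile
 * qualifier keeps the compiler from dropping the stores. */
size_t write_stride(volatile uint32_t *buf, size_t bytes, size_t stride)
{
    size_t step = stride / sizeof(uint32_t);
    size_t words = bytes / sizeof(uint32_t);
    size_t stores = 0;

    assert(step > 0);
    for (size_t i = 0; i < words; i += step) {
        buf[i] = 1;
        stores++;
    }
    return stores;
}

/* Time `passes` sweeps over a freshly allocated buffer and return MB/s
 * of memory covered (not bytes actually stored).  Uses clock(), so this
 * is CPU time.  Returns 0.0 if the allocation fails or the run is too
 * short for clock() to resolve. */
double covered_mb_per_sec(size_t bytes, size_t stride, int passes)
{
    uint32_t *buf = malloc(bytes);
    if (buf == NULL)
        return 0.0;
    write_stride(buf, bytes, WORD_BYTES);   /* touch every page first */

    clock_t start = clock();
    for (int p = 0; p < passes; p++)
        write_stride(buf, bytes, stride);
    clock_t end = clock();
    free(buf);

    double secs = (double)(end - start) / CLOCKS_PER_SEC;
    return secs > 0.0 ? ((double)bytes * passes) / secs / 1e6 : 0.0;
}
```

Call covered_mb_per_sec() once with WORD_BYTES and once with LINE_BYTES
on a buffer well past the L2 size.  If the two numbers come out about
the same, the cost is probably per cache line touched rather than per
store; if the partial number is several times higher, the stores
themselves are likely the limit.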
--
---
Larry McVoy                lm@sgi.com                 http://reality.sgi.com/lm