*BSD News Article 79201



Newsgroups: comp.unix.bsd.freebsd.misc,comp.unix.bsd.bsdi.misc
Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!news.ecn.uoknor.edu!news.wildstar.net!cancer.vividnet.com!hunter.premier.net!www.nntp.primenet.com!nntp.primenet.com!howland.erols.net!newsxfer2.itd.umich.edu!news.sprintlink.net!news-chi-8.sprintlink.net!rockyd!dnn.rockefeller.edu!dan
From: dan@dnn.rockefeller.edu (Dan Ts'o)
Subject: Re: Why one should buy parity memory for reliability?
X-Nntp-Posting-Host: dnn.rockefeller.edu
Message-ID: <DyApK7.DL8@rockyd.rockefeller.edu>
Followup-To: comp.unix.bsd.freebsd.misc,comp.unix.bsd.bsdi.misc
Sender: notes@rockyd.rockefeller.edu (News Administrator)
Organization: Rockefeller University
X-Newsreader: TIN [version 1.2 PL2]
References: <32485B0D.41C6@austin.ibm.com>
Date: Wed, 25 Sep 1996 15:55:19 GMT
Lines: 45
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:27987 comp.unix.bsd.bsdi.misc:4983

Tushar Patel (tpatel@austin.ibm.com) wrote:
: If the board supports parity memory and an error occurs, then in
: theory the OS should be notified and the access should be repeated.

	I don't know if the FreeBSD kernel attempts any repeats. That would be
nice, but in most UNIX systems you just get an error message, and sometimes
just a panic. In most real-world cases a repeat won't do any good anyway,
because the SIMM is dead. In addition, I believe that parity is only checked
on reads, in which case a location that reads back wrong once will generally
read back wrong again.
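
	To make the point about repeats concrete, here is a minimal sketch in
Python of one even-parity bit per byte. It is only an illustration: the names
(parity_bit, store, read_ok) are made up for this example, and real parity
logic lives in the memory controller hardware, not in software.

```python
def parity_bit(byte: int) -> int:
    """Even parity: the bit that makes the total number of 1s even."""
    return bin(byte & 0xFF).count("1") & 1


def store(byte: int) -> tuple[int, int]:
    """Model a memory cell as (data byte, parity bit computed at write time)."""
    return (byte & 0xFF, parity_bit(byte))


def read_ok(cell: tuple[int, int]) -> bool:
    """Recompute parity on read and compare it with the stored parity bit."""
    data, stored_parity = cell
    return parity_bit(data) == stored_parity


good = store(0b10110010)

# Simulate a failed chip: one data bit has changed since the write.
one_bit_bad = (good[0] ^ 0b00000100, good[1])

# Two bits changed: the count of 1s is even again, so parity cannot see it.
two_bits_bad = (good[0] ^ 0b00000110, good[1])

print(read_ok(good))          # clean cell passes the check
print(read_ok(one_bit_bad))   # single-bit error is detected...
print(read_ok(one_bit_bad))   # ...and reading again gives the same answer
print(read_ok(two_bits_bad))  # double-bit error slips through
```

	The check says only that something is wrong with the byte, not which
bit, so there is nothing to correct, and a hard fault fails the same way on
every read.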

	What you really want of course is ECC memory so that the memory values
get corrected on the fly, software continues to run, data is intact and you
get the warning that hardware needs replacing. No serious mission-critical
computer should be without ECC.
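
	As a rough idea of how correction on the fly can work, here is a
sketch of a Hamming(7,4) code in Python: 4 data bits plus 3 check bits, enough
to locate and flip back any single bad bit. This is a toy for illustration, with
made-up function names. ECC memory typically uses a wider code of the same
family (commonly 8 check bits per 64 data bits, which also detects double-bit
errors), and does it in hardware.

```python
def encode(nibble: int) -> list[int]:
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    code = [0] * 8  # index 0 unused so indices match bit positions
    # Data bits go in the positions that are not powers of two.
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    # Each check bit covers the positions whose index has that bit set.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    return code[1:]


def decode(bits: list[int]) -> tuple[int, int]:
    """Return (corrected nibble, syndrome). Syndrome 0 means no error seen;
    otherwise it is the 1-based position of the bit that was flipped back."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1  # correct the single bad bit
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, syndrome


word = encode(0b1011)
damaged = list(word)
damaged[4] ^= 1  # flip the bit at position 5 (list index 4)

value, where = decode(damaged)
print(value == 0b1011, where)  # data recovered, and we know which bit failed
```

	The nonzero syndrome is what would drive the "hardware needs replacing"
warning: the data is still delivered intact, but the event can be logged.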

: What happens in the case of a DMA transfer from the disk to
: memory or from memory to disk? If a memory error occurs then, the
: processor is not looking at the data bus, so does that mean that the
: DMA master (SCSI controller) will detect the parity error and
: retransfer the data?

	The memory controller does the parity/ECC check. The processor is not
involved unless there is an error, in which case the CPU usually gets an NMI
(non-maskable interrupt). Everything going into and out of memory gets
checked on the fly, DMA transfers included.
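
	A toy model of that arrangement, again in Python and again only a
sketch: the class and function names are invented for this example, and the
exception merely stands in for the NMI. The point is that the check sits in
the path to memory, so a DMA master reading memory trips it just as the CPU
would.

```python
class ParityError(Exception):
    """Stands in for the NMI the CPU would receive on a parity error."""


class MemoryController:
    """Hypothetical controller keeping one even-parity bit per byte."""

    def __init__(self, size: int) -> None:
        self.data = [0] * size
        self.parity = [0] * size

    @staticmethod
    def _parity(byte: int) -> int:
        return bin(byte & 0xFF).count("1") & 1

    def write(self, addr: int, byte: int) -> None:
        # Parity is generated on every write, whoever issues it.
        self.data[addr] = byte & 0xFF
        self.parity[addr] = self._parity(byte)

    def read(self, addr: int) -> int:
        # Parity is checked on every read, whoever issues it.
        byte = self.data[addr]
        if self._parity(byte) != self.parity[addr]:
            raise ParityError(f"parity error at address {addr:#x}")
        return byte


def dma_to_disk(mem: MemoryController, start: int, length: int) -> bytes:
    """Hypothetical DMA master: pulls bytes through the controller, no CPU."""
    return bytes(mem.read(addr) for addr in range(start, start + length))


mem = MemoryController(16)
for i, b in enumerate(b"payroll"):
    mem.write(i, b)
```

	Calling dma_to_disk(mem, 0, 7) returns the stored bytes. If a bit in
mem.data is flipped behind the controller's back, the same call raises
ParityError rather than handing corrupted data to the disk.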

: There is a big difference in the price between the parity and non
: parity memory so I am trying to justify the parity memory purchase.

	The danger with non-parity or fake-parity memory is that memory errors
will go undetected. You could be computing a payroll or some other important
transaction and writing corrupted data to disk, a printer or the screen, and
you may *never* know it. If you base your computing work on non-parity memory,
either you don't care whether the results are accurate (say, you are playing
games) or you are gambling that any memory failure will be catastrophic
enough to take down the machine or exhibit some other very obvious behavior,
which isn't necessarily the case. What the statistics are on this gamble I
don't know. Serious computing must have at least true parity, if not ECC.
--
			Cheers,
			Dan Ts'o			212-327-7671
                        Dept. of Neurobiology   	FAX: 212-327-7671
                        The Rockefeller University
                        1230 York Ave.  Box 138		dantso@cris.com
                        New York, NY  10021     	dan@dna.rockefeller.edu