*BSD News Article 79105


Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!nntp.coast.net!news.sgi.com!www.nntp.primenet.com!nntp.primenet.com!cpk-news-hub1.bbnplanet.com!newsfeed.internetmci.com!in2.uu.net!munnari.OZ.AU!metro!metro!asstdc.scgt.oz.au!nsw.news.telstra.net!elausrv2.att.net.au!surya
From: Richard Laxton <richard@real.net.au>
Newsgroups: comp.unix.bsd.freebsd.misc,comp.unix.bsd.bsdi.misc
Subject: Re: Why one should buy parity memory for reliability?
Date: Wed, 25 Sep 1996 15:54:28 +1000
Organization: RealNet Access
Lines: 53
Message-ID: <3248C914.3C17@real.net.au>
References: <32485B0D.41C6@austin.ibm.com>
Reply-To: richard@real.net.au
NNTP-Posting-Host: 203.17.240.113
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 3.0 (WinNT; I)
Xref: euryale.cc.adfa.oz.au comp.unix.bsd.freebsd.misc:27929 comp.unix.bsd.bsdi.misc:4969

Given that parity checking can only *detect* errors, not correct them,
one might say that there is no point. However, memory corruption can
occur for many reasons. Even something like radioactive decay can change
the state of a memory cell, and if the change lands in the wrong place
(e.g. a jump table) the damage can cascade through the system. With
parity, the error generally raises an NMI (non-maskable interrupt) that
may halt the system, rather than letting it carry on with bad data.
Given that most UNIX servers stay up for between 7 and 700 days at a
time, the chances of some memory corruption occurring during that
uptime are quite good.
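
To make the "detect only" point concrete, here is a minimal sketch of
even parity over a single byte. The function names and the sample value
are my own, purely for illustration; real parity memory does this in
hardware, one parity bit per byte.

```python
def even_parity_bit(byte):
    """Return the bit that makes the total count of 1 bits (data + parity) even."""
    return bin(byte & 0xFF).count("1") % 2


def parity_ok(byte, stored_parity):
    """True if the byte still matches the parity bit stored alongside it."""
    return even_parity_bit(byte) == stored_parity


data = 0b10110010                 # illustrative value with four 1 bits
parity = even_parity_bit(data)    # stored next to the data when it is written

one_flip = data ^ 0b00001000      # a single bit changes state
two_flips = data ^ 0b00011000     # two bits change state

print(parity_ok(data, parity))        # intact byte passes the check
print(parity_ok(one_flip, parity))    # single-bit error is detected
print(parity_ok(two_flips, parity))   # double-bit error slips through
```

Note that the check says only that *something* in the byte changed, not
which bit, so there is nothing to repair; and an even number of flipped
bits goes unnoticed.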

There are newer motherboards made by DFI and others that do ECC with
normal parity-checked SIMMs. This could be useful in that the system
would be able to recover transparently from single-bit memory errors
and potentially stay up much longer.
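
For comparison, here is a rough sketch of how an error-correcting code
can locate and repair a flipped bit. It uses the textbook Hamming(7,4)
code because it is small enough to read; I am not claiming this is the
scheme any particular motherboard uses, and the function names are made
up for the example. The reason ordinary parity SIMMs can be reused is
that a 64-bit memory path with parity carries 8 extra bits, which is
enough check bits for a single-error-correcting, double-error-detecting
code over the 64 data bits.

```python
def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data bits, error position); position 0 means no error seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome = 1-based index of the bad bit
    if position:
        c[position - 1] ^= 1          # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], position


word = [1, 0, 1, 1]
stored = hamming74_encode(word)

corrupted = list(stored)
corrupted[4] ^= 1                     # flip the bit at position 5

recovered, position = hamming74_decode(corrupted)
print(recovered == word, position)    # data repaired, bad position reported
```

Unlike the plain parity check, the three check bits together point at
the damaged position, so a single-bit error can be fixed on the fly
without halting the machine.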

Just a few thoughts...

Richard.

Tushar Patel wrote:
> 
> Hi,
> 
> I was going through the FreeBSD handbook and one of the things suggested
> in the book is to buy "parity memory".
> 
> I was involved in designing one of the microcontrollers for Motorola.
> One of the things we left out of the future controller was parity
> support (lower pin count). The reasoning was that under normal
> conditions the error should not happen; if it does, then there is
> something seriously wrong, like high temperature or excess noise
> in the system. So the board designer had better design the system
> correctly. (They had some hard numbers to back the theory, and this
> is just a brief explanation.)
> 
> If the board supports parity memory and an error occurs, then in
> theory the OS should be notified and the access should be repeated.
> 
> Does FreeBSD support such error conditions?
> 
> What happens in the case of a DMA transfer from disk to memory or
> from memory to disk? If a memory error occurs then, the processor
> is not looking at the data bus, so does that mean that the DMA master
> (SCSI controller) will detect the parity error and retransfer the
> data?
> 
> There is a big difference in price between parity and non-parity
> memory, so I am trying to justify the parity memory purchase.
> 
> Please make comments.
> 
> Thanks,
> Tushar