*BSD News Article 81244

Path: euryale.cc.adfa.oz.au!newshost.carno.net.au!harbinger.cc.monash.edu.au!munnari.OZ.AU!spool.mu.edu!howland.erols.net!math.ohio-state.edu!jussieu.fr!univ-lyon1.fr!in2p3.fr!swidir.switch.ch!serra.unipi.it!labinfo.iet.unipi.it!luigi
From: luigi@labinfo.iet.unipi.it (Luigi Rizzo)
Newsgroups: comp.unix.bsd.freebsd.misc
Subject: Re: RAID sw?
Date: 21 Oct 1996 14:20:17 GMT
Organization: Dip. di Ingegneria dell'Informazione, Univ. di Pisa
Lines: 57
Distribution: world
Message-ID: <54g0r1$trg@serra.unipi.it>
References: <chad-3009960810030001@sverige.pengar.com> <325018D3.1131EC4C@lambert.org> <54d5nk$3ou@zwei.siemens.at>
NNTP-Posting-Host: labinfo.iet.unipi.it

In article <54d5nk$3ou@zwei.siemens.at>, Ingo Molnar <mingo@pc5829.hil.siemens.at> writes:
|> [wading through a bunch of unread articles:]
|> 
|> Terry Lambert <terry@lambert.org> wrote:
|> 
|> : RAID calculation in software is extremely expensive, especially
|> : for the hamming codes.  [...]
|> 
|> why?
|> 
|> RAID 3 Theory of Operation [greatly simplified]:
|> ================================================
|> 
|>   4 disks with real data
|>   1 dedicated disk with XOR-ed data.
|>   basic data unit: 'logical sector' of 4 ordinary sectors

This is a 'special' case in which you tolerate 1 failure out of 5 disks.
Perhaps Terry had in mind a more general case where you tolerate M
failures on N+M disks. In that case there is a bit more overhead in
computing the remaining M-1 redundant blocks, which are generally
computed as sums or polynomial evaluations over a finite field such as
a prime field GF(p).
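
To make the "M failures on N+M disks" case concrete, here is a rough
sketch of one way to do it, assuming p = 257 and one symbol per disk.
The N data symbols are treated as the values of a polynomial of degree
less than N at positions 0..N-1, and the M redundant symbols are its
values at positions N..N+M-1, so any N survivors determine the rest.
The names P, lagrange_eval, encode and recover are made up for this
illustration and are not taken from any RAID implementation.

```python
P = 257  # assumed prime modulus, chosen only because it exceeds the byte range


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through points [(xi, yi), ...], mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # Division in GF(P) is multiplication by the inverse (Fermat's little theorem).
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total


def encode(data, m):
    """Return the m redundant symbols for the n data symbols in `data`."""
    n = len(data)
    points = list(enumerate(data))
    return [lagrange_eval(points, x) for x in range(n, n + m)]


def recover(survivors, n):
    """Rebuild the n data symbols from any n surviving (position, value) pairs."""
    if len(survivors) < n:
        raise ValueError("too many failures to recover")
    points = survivors[:n]
    return [lagrange_eval(points, x) for x in range(n)]


# Usage: 4 data disks, 2 redundant disks, then lose disks 1 and 3.
data = [10, 200, 33, 47]
stripe = list(enumerate(data + encode(data, 2)))
survivors = [(pos, val) for pos, val in stripe if pos not in (1, 3)]
rebuilt = recover(survivors, 4)
```

The extra cost over plain XOR is visible here: every redundant symbol
needs multiplications and an inversion mod p rather than a single XOR
pass. Note also that with p = 257 a redundant symbol can take the value
256, which does not fit in a byte; practical codes usually work over
GF(2^8) instead to avoid this.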

(Although, to be honest, I don't know whether any RAID specification
allows for more than 1 failure.)

|> a RAID 3 write operation:
|> 
|>   a transaction of:
|>      parallel write to the 4 disks
|>      and a write to the XOR-ed sector. [the XOR has to be recalculated]
|> 
|> a RAID 3 read operation:
|> 
|>   parallel read from all 4 disks

Actually, it's a parallel read from the 4 non-failed disks. After a
disk failure, it also involves XORing the surviving blocks (data plus
parity) to rebuild the missing one.
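
As a rough illustration of the single-parity scheme discussed here, the
sketch below keeps each "disk" as a bytes object in memory. The parity
block is the byte-wise XOR of the 4 data blocks, and a missing block is
the XOR of everything that survived. The function names (xor_blocks,
write_stripe, read_stripe) are hypothetical and only serve this
example; a real driver would of course work on sectors, not Python
lists.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks):
    """Return the data blocks followed by their XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def read_stripe(stripe, failed=None):
    """Return the concatenated data; rebuild block `failed` if a data disk is gone."""
    data = list(stripe[:-1])
    if failed is not None and failed < len(data):
        survivors = [blk for i, blk in enumerate(stripe) if i != failed]
        data[failed] = xor_blocks(survivors)
    return b"".join(data)


# Usage: write one logical sector across 4 data disks, then lose disk 2.
stripe = write_stripe([b"AAAA", b"BBBB", b"CCCC", b"DDDD"])
damaged = list(stripe)
damaged[2] = None
recovered = read_stripe(damaged, failed=2)
```

In the normal case the read path touches no parity at all; the XOR work
shows up only on writes and on reads after a failure.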

|> :              [...] It is so expensive that no one has really
|> : bothered to implement code to do it.
|> 
|> Such RAID boxes are usually a dedicated 486 board with a better
|> SCSI adapter [possibly mirrored], using tagged queueing, ensuring
|> hot plugging, etc. IMHO, no magic there, really.

The key issue is whether the work is done by a dedicated CPU or by the
main processor. The same argument comes up in the IDE vs. SCSI debate.

	Luigi
-- 
====================================================================
Luigi Rizzo                     Dip. di Ingegneria dell'Informazione
email: luigi@iet.unipi.it       Universita' di Pisa
tel: +39-50-568533              via Diotisalvi 2, 56126 PISA (Italy)
fax: +39-50-568522              http://www.iet.unipi.it/~luigi/
====================================================================