[Klug-general] Software Vs Hardware Raid

Mike mike at csits.net
Mon Jul 7 13:09:59 BST 2008


On Fri, Jul 04, 2008 at 12:04:48PM +0100, J D Freeman wrote:
> On Fri, Jul 04, 2008 at 08:40:57AM +0100, Dan Attwood wrote:
> > Surely if you're talking about RAID and servers, the fact that it's
> > portable is immaterial?
> 
> Portable means more than just easy to move.
> 
> It means ability to move stuff about.
> 
> Simple example, you have a rack with 3 servers in, each bought a year
> apart. Each has a hardware raid device. When the oldest one dies, the
> PFY pulls the disks from the machine and puts them in the empty hot swap
> bays of the machine above. Which having a RAID card made one year later
> has a different firmware, and thus different metadata, thus your drives
> are unreadable.
> 
> Or, if you just use a hotswap controller in them, you hot add the
> drives, and do an mdadm -A magic, and away you go, not even skipping a
> beat.
> 
> When it comes to users of hardware raid there are two types, those who
> have been bitten, and those who will be bitten.

A team leader of mine once told me that it's a trait of all young people
to look at things in black and white.  He was probably correct.  As I've
grown older, I've started to see shades of grey and it gives a much
clearer picture.  You should try to remove the blinkers and
take a look at something in a clear, impartial and objective way.  Just
because you are confronted with two technologies does not mean that it's
necessary to pick one to love and one to hate.  Your assassination of
hardware RAID is somewhat unbalanced and frankly, a little off the mark.

The example of the three servers is an interesting one.  There are some
points to bear in mind.  The first is that the average lifetime of a
server is three years.  The chances of a hardware failure in that time
are fairly low.  Most likely, anything that can go wrong is fairly
fixable.  The biggest threat to a RAID array is either multiple disk
failure or RAID card failure.  The latter will likely total the array
anyway.  The former will definitely total the array.  RAID is not a
replacement for a backup solution.  If the data is at all valuable then
it needs a separate backup system.

As you kindly pointed out, I have been working in the IT industry since
the last century.  In that time I have seen a handful of multiple disk
failures and a couple of instances where a RAID card failed and
corrupted the array on its way down.  I have also seen a system with a
mirror disk where the server crashed and corrupted the system disk.  The
system then dutifully mirrored the corruption to the mirror disk and
rendered both disks unusable.  In each of these cases, this is where
you need to restore from backup.  I don't think I've ever seen a
catastrophic server failure, i.e. one that has been unfixable.  I have
also never seen a case where it has been necessary to remove disks from
a server and insert them into a new one.

Again, just because you choose RAID 1 for a particular implementation
does not mean that you are required to hold all other RAID technologies
in utter contempt.  RAID 1 is a hellish waste of disk space.  If you
really are concerned that RAID 5 could be totaled by a multiple disk
failure then consider using hot spares, RAID-6, RAID-DP or
$RAID5+spareofchoice.  RAID 1 is still prone to multiple disk failure,
controller failure or host destruction.
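For illustration, and purely as a sketch (the device names are made up),
a software RAID 5 array with a hot spare is a single mdadm command:

  # mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      --spare-devices=1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

If one of the three active disks dies, the spare gets pulled in and the
array rebuilds itself without anyone touching the box.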

Regarding the original question, a few things to bear in mind are:

What are you planning to buy?  Real RAID cards are hellishly expensive.
There are a lot of consumer-level "RAID" cards out there that are in
fact plain IDE/SATA PCI cards with a Windows driver that mirrors data
between the disks.  The giveaway is when you install one in a Linux box
and see multiple drives rather than the single logical drive that a real
RAID controller presents.  If you end up with one of these cards, you
end up doing software RAID anyway.
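A quick way to tell (the device names below are just for illustration)
is to look at what the kernel actually sees once the card is in:

  # cat /proc/partitions
  major minor  #blocks  name
     8     0  488386584 sda
     8    16  488386584 sdb

Two separate drives like that means the mirroring lives in a Windows
driver you don't have; a real hardware controller would show up as one
logical drive.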

Point two.  With software RAID you have to be clever about how you set
things up.  As the RAID is handled by the kernel, you have to bear in
mind that it needs a separate /boot partition to hold the kernel and
initramfs.  It *is* possible to set up mirroring of a single boot
partition and boot off either disk, it just takes care.  There are some
docs describing how to do it.  Alternatively, what I do is use a CF-to-IDE
converter and store /boot on a small CF card.  I image the CF card as a
backup.  The chances of the CF card failing are remote as there are no
moving parts.  If it does decide to die, I can just dd the image onto a
new card and replace it (a sketch of the commands is below).  You can
even get CF-to-IDE converters with a PCI backplane so the CF card slots
into the back of the machine.  I store my backup CF image on the RAID
array so that if a disk fails it's still safe.  NO WAIT!.............
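For what it's worth, taking and restoring the CF image is a one-liner
each way.  The device name is whatever the converter shows up as on your
box; /dev/hdc here is just an assumption:

  # dd if=/dev/hdc of=boot-cf.img bs=1M     (take the backup image)
  # dd if=boot-cf.img of=/dev/hdc bs=1M     (write it to the new card)

Just keep the image somewhere other than the array it's meant to help
you rebuild.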

Mike.

> > Being a decently specced RAID 5 system means that the chances of total
> > failure are greatly reduced, and even if it does fail totally you will
> > have full backups to restore from.
> 
> I personally don't trust RAID5, as my experience is that murphy really
> likes to play with hard disks. I use RAID1, Lots of raid1.
> 
> # ls /dev/md* | wc -l
> 158
> 
> 
> > As such I would say that hardware is the way to go. Let the board do all
> > the work rather than the PC, and if the OS does stuff up the data should
> > still be safe on the data partition where it can be recovered.
> 
> As such I would say let software raid do the work. Removes the risks of
> hardware fault, and moves all the risk to a highly portable set of
> software. Hard disks fail, raid cards fail. The fuckup fairy will come
> visit you. Prepare for it as best you can, don't lock yourself to one
> card.
> 
> Julia
> 
