[sclug] Mirror disks across machines like distributed RAID?
Tom Carbert-Allen
tom at randominter.net
Mon Jan 4 11:39:35 UTC 2010
On 02/01/2010 12:20, Dickon Hood wrote:
> On Fri, Jan 01, 2010 at 23:48:24 +0000, John Stumbles wrote:
> doesn't help if/when the host
> : system goes bad,
>
> iSCSI. Export the block device from the server to the client, use it as
> any other block device (so add it to LVM / md / whathaveyou) on the
> client. Job's a goodun. And it's standard.
>
Dickon,
iSCSI is great; I use it for my diskless machines, VMs and so on, but it
doesn't meet the OP's requirements. The client/server disk system you
suggest does not provide the failover of services he wanted.
BTW, DRBD is now in the mainline kernel, so it is "standard" too. I have
never understood why DRBD had so many haters in the past. Any ideas?
John,
To overcome the same problem you describe I use DRBD, which is pretty
easy to set up and works great for me. I have been using it for years
without issue and have found it copes well with all the failure scenarios
I've been through. I also found the throughput is good when you use
a dedicated disk network: just stick an extra gigabit card in each
machine and link them back to back. This removes any worry of the disks
getting out of sync unless you are regularly writing to the disk at more
than gigabit speeds (e.g. SSD, in which case add cards to suit).
This does increase your power footprint, as both machines have to be on
to allow the disk replication, but it is worth it if you need the
reliability. If you went down any other route, like an external
iSCSI/SAN/NAS box which feeds both servers, you would only end up having
to get two of them as well to avoid failure at that layer anyway, so
having the dual storage system and dual CPU both in the same pair of
boxes saves money and power.
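As a rough sanity check on the dedicated replication link described above,
here is a minimal Python sketch of the arithmetic. The 0.9 efficiency factor
and the example write rates are my own illustrative assumptions, not
measurements, and the function names are made up for this example.

```python
import math

# Raw line rate of one gigabit Ethernet card, in bits per second.
GIGABIT_BITS_PER_SEC = 1_000_000_000


def link_capacity_mb_per_sec(links=1, efficiency=0.9):
    """Usable MB/s over `links` gigabit cards.

    `efficiency` is an assumed allowance for protocol overhead,
    not a measured figure.
    """
    return links * GIGABIT_BITS_PER_SEC * efficiency / 8 / 1_000_000


def links_needed(write_mb_per_sec, efficiency=0.9):
    """Smallest number of gigabit cards that covers a sustained write rate."""
    per_link = link_capacity_mb_per_sec(1, efficiency)
    return max(1, math.ceil(write_mb_per_sec / per_link))
```

With these assumptions one card carries about 112 MB/s, so a disk writing a
sustained 80 MB/s fits on a single link, while something writing 250 MB/s
would want three.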
Not everyone is mad enough to build a rack-mount server room in their
back garden like I have, so I find a great solution to the power and
space problem for a redundant home setup is to use two laptops,
modified to take 3.5" disks, with a second network port added via the card slot.
This way you have total hardware fault tolerance (including TWO UPS!) on
a power budget of under 50W and normally a smaller cubic space
requirement than a single ATX machine. Great use for laptops with broken LCDs.
The next issue will be your switch or router. I have seen a fair number
of no-name switches die in 24/7 use in the last year (in fact
probably more than I have seen disk failures, if you weight the data by
number of units). You can find a decent Cisco or HP rack-mount unit on
eBay for less than £40 which will be much more reliable, but larger,
more power hungry and a lot noisier! Or just wire up two entire cheap
no-name networks with Ethernet bonding for failover.
The most important part of any redundant setup is always monitoring and
reporting, though. If a system works well you will have no idea, from a
user perspective, that the primary has failed and you are now running in
non-fault-tolerant mode. So it is critical you set up and test systems to
let you know, so you can fix the problem and get back to two-node mode as
soon as possible. Recently I saw a big fuss over a hosted redundant
system going offline despite having a RAID 5 SAN with a spare disk. Upon
investigation it turned out one disk from the array died in 2007 and the hot
spare died shortly after it joined the array, two years ago, without
anyone noticing a thing!
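To make the monitoring point concrete, here is a minimal Python sketch that
flags any DRBD device which is not fully connected and up to date. It assumes
the DRBD 8.x layout of /proc/drbd (one line per device carrying cs: and ds:
fields); the function names and the sample text are made up for illustration,
so check them against the output on your own machines.

```python
import re

# Illustrative sample in the DRBD 8.x /proc/drbd layout (assumed format).
SAMPLE = """version: 8.3.0 (api:88/proto:86-89)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r---
    ns:0 nr:0 dw:12 dr:0 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:4096
"""


def degraded_devices(proc_drbd_text):
    """Return (minor, connection_state, disk_states) for each unhealthy device."""
    problems = []
    for line in proc_drbd_text.splitlines():
        # Device lines start with the minor number followed by cs:<state>.
        m = re.match(r"\s*(\d+):\s+cs:(\S+)", line)
        if not m:
            continue
        minor, cstate = int(m.group(1)), m.group(2)
        ds = re.search(r"\bds:(\S+)", line)
        dstate = ds.group(1) if ds else "unknown"
        # Anything other than a connected, fully synced pair counts as degraded.
        if cstate != "Connected" or dstate != "UpToDate/UpToDate":
            problems.append((minor, cstate, dstate))
    return problems


def check_host(path="/proc/drbd"):
    """Read the live status file and return the list of degraded devices."""
    with open(path) as f:
        return degraded_devices(f.read())
```

Run check_host() from cron on both nodes and mail yourself whenever the list
is not empty. A device that is still resyncing is reported too, which is what
you want, since you are not redundant until the sync finishes.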
When you get comfortable with DRBD to share your application and user
data disk, you can start to look at some advanced topics too, like
booting both machines from the same OS, multiple online primaries with
load balancing and things like that. Be careful you don't get ahead of
yourself and end up with your brain split in two though, boom boom.....
TCA