[Wylug-help] Inactive RAID 10 Array
Chris Davies
Chris.Davies at bcs.org.uk
Tue Apr 14 22:58:27 UTC 2009
Dave Fisher wrote:
> 1. How to correctly backup the affected array before I do anything else.
Using dd for each partition of the array will work. The MD superblock is
usually stored at the end of each physical partition (not sure how big
it is, though).
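For example, something like this (a rough sketch; the device names and
the /backup path are placeholders to adjust):

# for p in sdb4 sdc4 sdd4 sde4 sdf4; do dd if=/dev/$p of=/backup/$p.img conv=noerror,sync; done

conv=noerror,sync makes dd carry on past unreadable sectors, padding
them with zeros, which matters if one of the disks really is on the
way out.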
The manual page clarifies this a little:
<<
The different sub-versions store the superblock at different locations
on the device, either at the end (for 1.0), at the start (for 1.1) or
4K from the start (for 1.2).
>>
But it then goes on to muddy the waters by saying that the default (on
Debian, at least) is to create superblocks with version 0.90, which are
also stored at the end of the partition.
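You can check which version your superblocks actually carry with
something like this (substitute one of your own component partitions):

# mdadm --examine /dev/sdb4 | grep -i version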
> 2. How to diagnose the fault with a high degree of certainty.
Have you considered simply trying to re-assemble the array?
If you "mdadm --examine --scan -v", and the superblocks are still
intact, you'll get a dump of something like this (from my RAID 1 setup):
# mdadm --examine --scan -v
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=...
devices=/dev/hdc1,/dev/hda1
ARRAY /dev/md5 level=raid1 num-devices=2 UUID=...
devices=/dev/hdc5,/dev/hda5
ARRAY /dev/md6 level=raid0 num-devices=2 UUID=...
devices=/dev/hdc6,/dev/hda6
ARRAY /dev/md9 level=raid10 num-devices=4 UUID=...
devices=/dev/dm-8,/dev/dm-3,/dev/dm-2,/dev/dm-1
If you see the right devices, you should simply be able to restart the array:
# mdadm --assemble /dev/md1 /dev/sd{b,c,d,e,f}4
Provided you don't use --force, mdadm will refuse to assemble the array
if there are too many errors on the devices.
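Either way, /proc/mdstat will show whether the array came up and how
many devices it's running with:

# cat /proc/mdstat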
> 5. sdd4 appears to be faulty and sdf4 is supposed to be a spare
You can mark the suspect partition as failed by hand:
# mdadm /dev/md1 --fail /dev/sdd4
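Then, if the spare doesn't get pulled in automatically, something like
this should swap it in (a sketch, assuming sdf4 really is the spare):

# mdadm /dev/md1 --remove /dev/sdd4
# mdadm /dev/md1 --add /dev/sdf4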
> My two big fears are that:
>
> 1. Some of the RAID metadata is stored elsewhere, e.g. on a different partition
> or superblock.
>
> If so, how do I back that up and restore it?
The superblock is stored at the end of each corresponding physical RAID
partition. If you created the array on this machine, its definition
(UUID and device list) should also be in /etc/mdadm/mdadm.conf.
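If that entry has gone missing, the scan output above is in the right
format to recreate it; check it by eye before appending:

# mdadm --examine --scan >> /etc/mdadm/mdadm.conf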
> 2. There may be hardware constraints that I've forgotten or never knew about.
>
> For example, I remember that the partitions in an array have to be identically
> sized, but I am guessing that they don't have to be physically identical, i.e.
> they don't have to occupy identically positioned blocks on identical models of HDD.
Partitions /should/ be identically sized; if they're not, the size of
the smallest is used for all of them. They don't have any other
constraints (I've just built a RAID10 device from four LVM volumes
allocated from a two-disk RAID0).
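A quick way to compare component sizes, in bytes (assuming your
components are sd[b-f]4):

# for p in /dev/sd[b-f]4; do echo "$p $(blockdev --getsize64 $p)"; done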
> So I should be able to treat raw images of the partitions just like the originals.
I'm not sure you can use a raw image as a component of a RAID device,
but if you attached it via the loop device I guess it's possible. (Now
there's an interesting train of thought!)
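Something along these lines might do it, using the images taken earlier
(untested, so treat it as a thought experiment):

# losetup /dev/loop0 /backup/sdd4.img
# mdadm --assemble /dev/md1 /dev/sdb4 /dev/sdc4 /dev/loop0 /dev/sde4 /dev/sdf4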
> What about block sizes for the dd'd copies?
As big as possible. bs=10240k often works for me.
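For example:

# dd if=/dev/sdb4 of=/backup/sdb4.img bs=10240k

The large block size just cuts down on the number of read/write calls;
the copy is byte-for-byte identical either way.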
> One thing I don't understand is why mdadm -E /dev/sdb2 reports one
> failed partition, but mdadm -E /dev/sd{c,d,e,f} show no such error?
(Un)fortunately, just because a partition has failed doesn't
automatically mean that the remainder of the disk is seen as failed.
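Each component's superblock holds its own copy of the array state, and
a component that has dropped out stops being updated, so the copies can
disagree. Comparing them may tell you more:

# for p in /dev/sd[b-f]4; do echo $p; mdadm --examine $p | grep -i state; done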
Chris