[sclug] RAID 5

Peter Brewer p.w.brewer at reading.ac.uk
Thu Sep 22 10:50:55 UTC 2005


Err... yeah perhaps I should have added the messages I got when booting...

**********************************
mdadm: /dev/md1 has been started with 7 drives (out of 8).
done.
Checking all file systems...
fsck 1.37 (21-Mar-2005)
/boot: clean, 31/490560 files, 82063/979965 blocks
fsck.ext3: Invalid argument while trying to open /dev/md2
/dev/md2:
The superblock could not be read or does not describe a correct ext2 
filesystem.  If the device is valid and it really contains an ext2 
filesystem (and not swap or ufs or something else), then the superblock 
is corrupt, and you might try running e2fsck with an alternate superblock:
   e2fsck -b 8193 (device)

raid5: switching cache buffer size, 4096 --> 1024
/home: clean, 35/37175296 files, 1174835/74346608 blocks

fsck failed: Please repair manually

CONTROL-D will exit from this shell and continue system startup

Give root password for maintenance
(or type Control-D to continue):
**********************************
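
For what it's worth, if the ext3 superblock on md2 really were corrupt, my
understanding is that the backup superblock locations can be listed without
writing anything, and e2fsck pointed at one of them read-only - the 1024 block
size below is only a guess based on the "-b 8193" hint in the fsck output:

   mke2fs -n -b 1024 /dev/md2    # -n = dry run, just prints where the backups would go
   e2fsck -n -b 8193 /dev/md2    # -n = read-only check against the first backup

I haven't tried that yet, because I suspect the problem is the array rather
than the filesystem.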

If I continue and log in as normal and do mount /dev/md2 I get:

mount: wrong fs type, bad option, bad superblock on /dev/md2,
       missing codepage or other error
       (could this be the IDE device where you in fact use
       ide-scsi so that sr0 or sda or so is needed?)
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
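
Before blaming the filesystem I should probably confirm whether md2 actually
assembled degraded like md1 did, or whether it never started at all.
Something like the following ought to show it - /dev/sda3 is just a
placeholder for one of the real member partitions:

   cat /proc/mdstat             # which arrays are running, and with how many members
   mdadm --detail /dev/md2      # array state, plus any failed or missing devices
   mdadm --examine /dev/sda3    # the RAID superblock as seen from one member disk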

If I do dmesg | tail I get:

pci_hotplug: PCI Hot Plug PCI Core version: 0.5
shpchp: acpi_shpchprm:get_device PCI ROOT HID fail=0x1001
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
pciehp: acpi_pciehprm:get_device PCI ROOT HID fail=0x1001
eth0: Media Link On 100mbps full-duplex
NET4: AppleTalk 0.18a for Linux NET4.0
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16)
EXT3-fs: unable to read superblock
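
That last line would make sense if md2 simply isn't running at all, rather
than the filesystem being damaged. If /proc/mdstat shows it inactive, I
believe a degraded array can be started by hand and then mounted - assuming
md2 is indeed the /data array:

   mdadm --run /dev/md2      # try to start the array even with one member missing
   mount /dev/md2 /data      # then retry the mount

but I'd rather hear whether that's sensible before poking it further.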


This isn't a major problem at the moment, but I wanted to check that if
we DO have an error in the future we won't lose valuable data.  I
just find it a bit odd that two of the three md arrays mount happily.
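
Once md2 is sorted, I'd also like to re-run the failure test in a more
controlled way, i.e. failing a member while the machine is up rather than
booting with a disk missing. As I understand it that goes roughly like this,
where /dev/sdh3 stands in for whichever member partition we pick:

   mdadm /dev/md2 --fail /dev/sdh3      # mark one member faulty
   mdadm /dev/md2 --remove /dev/sdh3    # remove it from the array
   mdadm /dev/md2 --add /dev/sdh3       # add it back and let the array rebuild
   watch cat /proc/mdstat               # watch the resync progress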

Cheers

Pete






Alex Butcher wrote:

> On Thu, 22 Sep 2005, Keith Edmunds wrote:
>
>> Peter Brewer wrote:
>>
>>> We are running Debian and when we set up the RAID we set all 8 disks 
>>> to active and none to inactive (not entirely sure whether we needed 
>>> an inactive disk or not).  All seems to be working fine except when 
>>> we simulated a disk failure by booting up with just 7 disks.  Debian 
>>> detects the missing disk and continues to boot happily.  After I log 
>>> in, both root and /home seem to be intact however /data is empty.  
>>> Why would it successfully recover just 2 of the 3 RAID partitions 
>>> when they've all been set up in the same way?
>>
>>
>> "Recover" is a bit of a misnomer here - it hasn't recovered the 
>> partition, merely reconstructed the data from what is left. But maybe 
>> I'm being picky, sorry! Anyway, my first thought would be "is the 
>> /data partition mounted"? It seems unlikely that all the data would 
>> just disappear.
>
>
> Seconded. I would expect an array failure to result in the filesystem
> module bitching about incorrect superblocks and the like.
>
>> Keith
>
>
> Best Regards,
> Alex.



