[Gllug] Problems rebuilding a RAID5 array after a failed disc on Centos 5
Oliver Howe
ojhowe at gmail.com
Wed Sep 22 08:41:15 UTC 2010
I have a backup server configured with the following three RAID arrays:
//backup2> /c0 show
Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK      -       -       -       74.4951   OFF    OFF
u1    RAID-5    OK      -       -       64K     3259.56   OFF    OFF
u2    RAID-5    OK      -       -       64K     5122.17   OFF    OFF
The first array (RAID-1), /c0/u0, contains 2 x 500GB discs which hold the
operating system.
The second array (RAID-5), /c0/u1, contains 8 x 500GB discs and provides just
over 3TB of space to hold my backup files.
This is /dev/sdb2, mounted as /disk01.
The third array (RAID-5), /c0/u2, contains 12 x 500GB discs and provides just
over 5TB of space to hold my backup files.
This is /dev/sdb1, mounted as /disk02.
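For context, the two data filesystems are mounted via /etc/fstab with entries
along these lines (reconstructed from memory, so the filesystem type and
options are approximate; /disk02 is at least an ext2/3 filesystem, given the
e2fsck output further down):

/dev/sdb2    /disk01    ext3    defaults    1 2
/dev/sdb1    /disk02    ext3    defaults    1 2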
The third array, /c0/u2, started having problems after one of its discs
failed.
This first showed up as I/O errors on /disk02, the partition it is mounted
on.
So I rebooted and went into the RAID controller's BIOS utility (Alt-3), which
showed me the failed disc.
I replaced the failed disc and told the controller to rebuild the array,
which it did. I then ran a verify on it via the controller.
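For what it's worth, the unit can also be checked from the running OS with the
same tw_cli client whose full output is at the end of this mail; a per-unit
query is just something like:

# /opt/3ware/CLI/tw_cli
//backup2> /c0/u2 show

and that reports u2 as OK, the same as the summary at the bottom.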
However, when I booted the system back up it didn't mount /disk02.
So I mounted it manually, but it didn't show any data and it didn't let me
write to it.
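For the record, the manual check was roughly the following (exact mount
options not to hand):

# mount /dev/sdb1 /disk02
# ls /disk02             # came back empty
# touch /disk02/test     # failed -- the filesystem wouldn't take writes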
Then I ran fsck on it, and it said the following:
# fsck /dev/sdb1
fsck 1.39 (29-May-2006)
e2fsck 1.39 (29-May-2006)
The filesystem size (according to the superblock) is 854473244 blocks
The physical size of the device is 317602332 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>?
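For what it's worth, converting those block counts to sizes, assuming the
default 4KiB ext3 block size (which I haven't confirmed with dumpe2fs):

# echo $(( 854473244 * 4 / 1024 / 1024 ))    # superblock's idea of the size -> ~3259 GiB
# echo $(( 317602332 * 4 / 1024 / 1024 ))    # device size seen by e2fsck    -> ~1211 GiB

If that block size is right, the superblock figure is suspiciously close to
the size of the u1 array (3259.56 GB) rather than u2, which only adds to my
confusion.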
I'm not sure what has happened here or why the array hasn't been recovered.
Does anyone have any suggestions?
I'm not too bothered about losing the data from /disk02 as I have it backed
up elsewhere, but I'd rather
recover everything to its previous state than have to reinstall the entire
server.
Thanks,
Oliver
Below is the full output from the RAID controller, using the tw_cli Linux
client from 3ware.
Note that slots 2 and 11 (ports p2 and p11) have always been empty.
# /opt/3ware/CLI/tw_cli
//backup2> /c0 show
Unit  UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
------------------------------------------------------------------------------
u0    RAID-1    OK      -       -       -       74.4951   OFF    OFF
u1    RAID-5    OK      -       -       64K     3259.56   OFF    OFF
u2    RAID-5    OK      -       -       64K     5122.17   OFF    OFF
Port Status Unit Size Blocks Serial
---------------------------------------------------------------
p0 OK u0 74.53 GB 156301488 6QZ3D8E8
p1 OK u0 74.53 GB 156301488 6QZ3D04Z
p2 NOT-PRESENT - - - -
p3 OK u1 465.76 GB 976773168 9QM0G1SF
p4 OK u1 465.76 GB 976773168 9QM0L1N3
p5 OK u1 465.76 GB 976773168 9QM0EEP9
p6 OK u1 465.76 GB 976773168 9QM0HXMH
p7 OK u1 465.76 GB 976773168 9QM0GZ6H
p8 OK u1 465.76 GB 976773168 9QM0G16T
p9 OK u1 465.76 GB 976773168 9QM0LK3C
p10 OK u1 465.76 GB 976773168 9QM0EVP7
p11 NOT-PRESENT - - - -
p12 OK u2 465.76 GB 976773168 WD-WCASU6421435
p13 OK u2 465.76 GB 976773168 WD-WCASU6548401
p14 OK u2 465.76 GB 976773168 WD-WCASU6548327
p15 OK u2 465.76 GB 976773168 WD-WCASU6421420
p16 OK u2 465.76 GB 976773168 WD-WCASU6421342
p17 OK u2 465.76 GB 976773168 WD-WCASU6519440
p18 OK u2 465.76 GB 976773168 WD-WCASU6542386
p19 OK u2 465.76 GB 976773168 WD-WCASU6418851
p20 OK u2 465.76 GB 976773168 WD-WCASU6548441
p21 OK u2 465.76 GB 976773168 WD-WCASU6548301
p22 OK u2 465.76 GB 976773168 WD-WCASU6541089
p23 OK u2 465.76 GB 976773168 WD-WCASU6041935
//backup2>