[Sussex] LVM and disk failure - how to recover?

Karl E. Jorgensen karl at jorgensen.org.uk
Mon Jul 23 21:54:29 UTC 2007


I've got a problem. My MythBox is broken. Rather than being forced to 
watch live TV (5 channels only) without the pause button, advert 
flagging, fast-forward over adverts etc, I'm trying to fix it.

The bad news: I've got a volume group of about 800GB, where one disk has 
now failed miserably. No mirroring or striping. Just a bunch of disks. 
The failed disk is a 500GB SATA disk (/dev/sda), which now reports 
nothing but SCSI errors in the system log.  It's as dead as Monty 
Python's parrot.

The good news: The failed disk only contained a single logical volume: 
my mythtv partition. The remaining file systems are all on the 
still-alive disks.  And I managed to back up most [1] of the files from 
the now-dead partition onto DVDs, other machines, other partitions and 
whatever I could find before it gave up completely. (Thank God for 
SMART!!) [2]

The machine trundles on OK (but without MythTV) and works fine.  The 
existing mount points are all reachable.

Commands like "vgdisplay" report 'Volume group "vgbig" not found' - 
despite the fact that I've got /usr as an LV in vgbig and /usr works 
fine. Go figure...

Even "vgchange -a y" reports the same thing - in both cases after a 
slew of /dev/sda errors and the message:

    Couldn't find all physical volumes for volume group vgbig.

Other commands, like pvdisplay, successfully report all the other 
physical volumes and the "good" logical volumes, but (obviously) still 
complain about the missing disk, and list its PV as "unknown 
device".
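
As a cross-check that does not touch the failing disk at all, the 
PV-to-device mapping can be read straight from the plain-text metadata 
backup. This is only a sketch: it assumes the backup in 
/etc/lvm/backup/vgbig is current and follows the usual LVM2 text layout 
(a "physical_volumes" section with pv0, pv1, ... blocks, each carrying a 
"device = ..." hint). The function name list_pvs is made up for 
illustration.

```shell
#!/bin/sh
# Print "<pv alias> <device hint>" for every PV recorded in an LVM2
# text metadata backup. Read-only: it never opens the block devices.
list_pvs() {
    awk '
        /^[ \t]*physical_volumes[ \t]*[{]/  { in_pv = 1; next }
        /^[ \t]*logical_volumes[ \t]*[{]/   { in_pv = 0 }
        in_pv && /^[ \t]*pv[0-9]+[ \t]*[{]/ { name = $1 }
        in_pv && /^[ \t]*device[ \t]*=/     { gsub(/"/, "", $3); print name, $3 }
    ' "$1"
}

# Path taken from this post; override it with the first argument.
backup=${1:-/etc/lvm/backup/vgbig}
if [ -r "$backup" ]; then
    list_pvs "$backup"
fi
```

The device path in the backup is only a hint recorded when the metadata 
was last written, so treat the output as a starting point, not as proof 
of the current layout.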

Annoyingly, I cannot remove the PV either:
    braun:~# pvremove /dev/sda1
      /dev/sda1: read failed after 0 of 2048 at 0: Input/output error
      No physical volume label read from /dev/sda1

    braun:~# pvremove --force /dev/sda1
      /dev/sda1: read failed after 0 of 2048 at 0: Input/output error
      No physical volume label read from /dev/sda1
      Can't open /dev/sda1 exclusively.  Mounted filesystem?

It's lying!!! - or easily confused.

    braun:~# pvremove --force --pretty-please /dev/sda1
    pvremove: unrecognised option `--pretty-please'
      Error during parsing of command line.

And utterly uncooperative!  Gah!

Obviously, the failed disk has to go back (it's 3 months old and still 
under warranty), but that means shutting the machine down. And with the 
errors displayed, I'm not sure that LVM will like what it sees on a 
fresh boot...

I *do* have backups of everything else already, but I'd rather not 
resort to them - and since the data is still there, I shouldn't have 
to... If disaster strikes, the backup of /etc/lvm should be handy...
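
Keeping a dated copy of /etc/lvm somewhere outside vgbig before the 
reboot could look like the sketch below. The function name 
archive_lvm_meta and the destination directory are illustrative 
assumptions; the only requirement is that the destination does not live 
on the affected volume group.

```shell
#!/bin/sh
# Archive a metadata directory (normally /etc/lvm) into a dated tarball
# and print the path of the archive on success.
archive_lvm_meta() {
    src=$1        # directory to archive, e.g. /etc/lvm
    dest_dir=$2   # existing directory that is NOT on the broken VG
    out="$dest_dir/lvm-meta-$(date +%Y%m%d-%H%M%S).tar.gz"
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" && echo "$out"
}

# Example call (destination is hypothetical):
#   archive_lvm_meta /etc/lvm /mnt/usbstick
```

Listing the archive afterwards with "tar -tzf" confirms that 
backup/vgbig made it in.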

So, where do I go from here?  Ideas are welcome and may be rewarded by 
beverages (!)

[1] The MythTV partition was only 460GB according to 
    /etc/lvm/backup/vgbig. And I managed to yank about 338GB 
    (compressed) from it. Even if I've lost a few recordings, I'm happy 
    with that. Can't do anything about it now anyway...

[2] Don't tell me what's on TV now. Or yesterday. I'll be *weeks* behind 
    on the good stuff (a few GB) before this is sorted...

PS: Random fortunes are really uncanny sometimes...
-- 
Karl E. Jorgensen
karl at jorgensen.org.uk  http://www.jorgensen.org.uk/
karl at jorgensen.com     http://karl.jorgensen.com
==== Today's fortune:
Due to lack of disk space, this fortune database has been discontinued.