[Gllug] Badblocks in LVM

Mohan mk at nerdplanet.co.uk
Sun Nov 23 19:14:54 UTC 2008


Hi,

I have an LVM setup spanning 4 hard disks, 2 TB in total, with ReiserFS
as the file system. Now to the problem.

One of the hard disks has around 130 bad blocks. I ran badblocks against
the disk, /dev/sda, and it wrote the numbers of the corrupted blocks to
a file.

What is the best way to fix this? I have a huge database running on that
server, and it refuses to start because of this issue. After a little
searching I found that, if the problem is on a single hard drive with an
ext2 file system, then

fsck -t ext2 -l badblocks-logfile /dev/sda4

should fix the issue (in my case the bad blocks are on the sda4
partition). If it is ReiserFS, then

reiserfsck -B badblocks-logfile /dev/sda4

should fix it.

Should I do the same with LVM, and will it work? What worries me is
whether the steps above will disturb the current LVM setup, or whether
there is a better way to do it.

I did come across this document:
http://smartmontools.sourceforge.net/BadBlockHowTo.txt

I carefully followed each step by Frederic BOITEUX in the document
above, and here are the results:

*Step 1*
smartctl -l selftest /dev/sda

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed: read failure  90%        13365            *227328439*

*Step 2*

Disk /dev/sda: 60801 cylinders, 255 heads, 63 sectors/track
Units = sectors of 512 bytes, counting from 0

Device     Boot  Start       End        #sectors   Id  System
/dev/sda1  *     63          401624     401562     83  Linux
/dev/sda2        401625      12691349   12289725   82  Linux swap / Solaris
/dev/sda3        12691350    24981074   12289725   83  Linux
/dev/sda4        *24981075*  976768064  951786990  8e  Linux LVM

*Step 3*

Offset of the bad sector from the start of /dev/sda4:

(227328439 - 24981075) = *202347364*
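The subtraction in this step can be wrapped in a small shell helper. This is only a sketch: the function name partition_offset is made up for illustration, and both arguments are assumed to be in 512-byte sectors, as in the smartctl and partition-table output above.

```sh
# Hypothetical helper: offset of a disk LBA inside a partition.
# $1 = LBA reported by smartctl, $2 = first sector of the partition
partition_offset() {
    echo $(( $1 - $2 ))
}

# Values from Step 1 and Step 2
partition_offset 227328439 24981075
```

With the values from this post it prints 202347364.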

*Step 4*

pvdisplay -c /dev/sda4 | awk -F: '{print $8}'
*4096*

This is the PE size in KB. To express it in LBA sectors (512 bytes, or
0.5 KB, each) we multiply by 2: 4096 * 2 = 8192 sectors per PE.

The offset of the first PE from the start of the partition (pe_start, in
sectors) can be read from /etc/lvm/backup:
# grep pe_start $(grep -l /dev/sda4 /etc/lvm/backup/*)
pe_start = *384*
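The unit conversion in this step, sketched as a shell function. The name pe_size_sectors is hypothetical; the input is assumed to be the PE size in KB, as printed by the pvdisplay command above.

```sh
# Hypothetical helper: convert a PE size in KB to a count of
# 512-byte sectors (1 KB = 2 sectors).
pe_size_sectors() {
    echo $(( $1 * 1024 / 512 ))
}

# PE size reported for /dev/sda4 in this post
pe_size_sectors 4096
```

For the 4096 KB extents here it prints 8192.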

*Step 5*
Next we find which PE contains the bad block, by dividing the bad
block's offset within the partition by the PE size:
physical partition's bad block number / sizeof(PE) =

202347364 / 8192 = *24700.6059*

so the bad block lies in PE 24700.
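Integer division gives the PE number directly, and the remainder gives the position inside that PE. The sketch below is an illustration only. Unlike the division above, it subtracts pe_start first, on the assumption that PE 0 begins pe_start sectors into the partition; for these numbers the PE number comes out the same. The function name is hypothetical.

```sh
# Hypothetical helper: which PE holds a given partition-relative sector.
# $1 = sector offset in the partition, $2 = pe_start, $3 = sectors per PE
pe_of_sector() {
    rel=$(( $1 - $2 ))   # sectors counted from the start of PE 0
    echo "PE $(( rel / $3 )), sector $(( rel % $3 )) within that PE"
}

pe_of_sector 202347364 384 8192
```

For the values in this post it prints "PE 24700, sector 4580 within that PE".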


*Step 6*

server# lvdisplay --maps
--- Logical volume ---
*LV Name               /dev/vg/lv*
VG Name                vg
LV UUID                zkcUSw-Dpum-aIXr-jE2Y-ob8Z-eqlF-AKQWnP
LV Write Access        read/write
LV Status              available
# open                 2
LV Size                1.81 TB
Current LE             473886
Segments               4
Allocation             inherit
Read ahead sectors     0
Block device           253:0

--- Segments ---
Logical extent 0 to 119233:
  Type                 linear
  Physical volume      /dev/sdb1
  Physical extents     0 to 119233

Logical extent 119234 to 238467:
  Type                 linear
  Physical volume      /dev/sdd1
  Physical extents     0 to 119233

*Logical extent 238468 to 354651:*
  Type                 linear
  Physical volume      /dev/sda4
  Physical extents     0 to 116183

Logical extent 354652 to 473885:
  Type                 linear
  Physical volume      /dev/sde1
  Physical extents     0 to 119233


*Step 7*

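* Bad block number for the filesystem:
--------------------------------------

The physical extents of the segment on /dev/sda4 start at 0, so the
offset of the start of the segment within the partition is

(first PE of segment * sizeof(PE)) + pe_start = (0 * 8192) + 384 = 384

Subtracting this from the bad block's offset within the partition and
dividing by the number of sectors per filesystem block gives

(202347364 - 384) / (sizeof(fs block) / 512)
  = 202346980 / (4096 / 512)
  = 202346980 / 8
  = *25293372.5*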
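The Step 7 arithmetic, written out as a shell function so the inputs are explicit. This is a sketch of the formula as used above, not a tested recovery tool; the function name is hypothetical and the 4096-byte filesystem block size is the value assumed in this post.

```sh
# Hypothetical helper following the formula used in Step 7.
# $1 = bad sector offset within the partition
# $2 = first PE of the LV segment on this PV
# $3 = sectors per PE, $4 = pe_start, $5 = filesystem block size in bytes
fs_block_in_segment() {
    seg_start=$(( $2 * $3 + $4 ))          # segment start, in sectors
    echo $(( ($1 - seg_start) / ($5 / 512) ))
}

fs_block_in_segment 202347364 0 8192 384 4096
```

Shell arithmetic is integer-only, so the .5 is dropped and this prints 25293372, the same block number used in the dd test in Step 8.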

As the lvdisplay output shows, the physical extents of every hard disk
start from 0, so the formula based on physical extents does not work
out. I need a formula that gives the right block using the logical
extent range of the segment on /dev/sda4, *Logical extent 238468 to
354651*; the PE calculated in *Step 5*, 24700.6059, falls within the
physical extents of that segment. I tried the same formula as in
*Step 7* with the logical extent values instead, but the results are
negative.

Logical extent 238468 to 354651:

(238468 * 8192) + 384 = 1953529856 + 384 = 1953530240

(202347364 - 1953530240) / 8 = -1751182876 / 8 = -218897859.5
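One possible reading of the lvdisplay --maps output is that, for a linear segment, the sectors of the segment appear in the logical volume starting at (first LE of the segment * sizeof(PE)). Under that reading the logical extent offset would be added to the position inside the segment, rather than subtracted from the partition offset as above. The sketch below shows only that arithmetic; it is an assumption about the mapping and is not taken from the how-to. The function name is hypothetical and the 4096-byte block size is the value assumed in this post.

```sh
# Hypothetical helper: filesystem block on the LV for a bad sector on a PV,
# assuming a linear segment as shown by "lvdisplay --maps".
# $1 = bad sector offset within the partition   $2 = pe_start
# $3 = first PE of the segment                  $4 = first LE of the segment
# $5 = sectors per PE                           $6 = fs block size in bytes
lv_fs_block() {
    in_segment=$(( $1 - $2 - $3 * $5 ))     # sectors from start of segment
    lv_sector=$(( $4 * $5 + in_segment ))   # sectors from start of the LV
    echo $(( lv_sector / ($6 / 512) ))
}

lv_fs_block 202347364 384 0 238468 8192 4096
```

In a shell with 64-bit arithmetic this prints 269484604 for the numbers in this post. Whether that is really the affected block would still need to be confirmed, for example with a dd read like the one in Step 8.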


*Step 8*

* Test of the fs bad block:

dd if=/dev/vg/lv of=block25293372 bs=4096 count=1 skip=25293372

This read succeeds, which means the calculated block is the wrong one
for the LVM volume.
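A possible explanation for the successful read: dd counts blocks from the start of /dev/vg/lv, so it may help to check which logical extent block 25293372 belongs to. The sketch below does that conversion; the function name is hypothetical and the 4096-byte block size is the value assumed in this post.

```sh
# Hypothetical helper: logical extent that holds a filesystem block.
# $1 = fs block number, $2 = fs block size in bytes, $3 = sectors per PE
le_of_fs_block() {
    echo $(( $1 * ($2 / 512) / $3 ))
}

le_of_fs_block 25293372 4096 8192
```

This prints 24700. In the lvdisplay output, logical extent 24700 belongs to "Logical extent 0 to 119233", which is mapped to /dev/sdb1 rather than /dev/sda4, so the dd command above would not have touched the failing disk.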

# smartctl -A /dev/sda
smartctl version 5.37 [i386-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000f  200   200   051    Pre-fail Always  -           0
   3 Spin_Up_Time            0x0003  218   217   021    Pre-fail Always  -           6100
   4 Start_Stop_Count        0x0032  100   100   000    Old_age  Always  -           98
   5 Reallocated_Sector_Ct   0x0033  200   200   140    Pre-fail Always  -           0
   7 Seek_Error_Rate         0x000f  200   200   051    Pre-fail Always  -           0
   9 Power_On_Hours          0x0032  081   081   000    Old_age  Always  -           14034
  10 Spin_Retry_Count        0x0013  100   253   051    Pre-fail Always  -           0
  11 Calibration_Retry_Count 0x0012  100   253   051    Old_age  Always  -           0
  12 Power_Cycle_Count       0x0032  100   100   000    Old_age  Always  -           97
 194 Temperature_Celsius     0x0022  253   253   000    Old_age  Always  -           44
 196 Reallocated_Event_Count 0x0032  200   200   000    Old_age  Always  -           0
*197 Current_Pending_Sector  0x0012  200   200   000    Old_age  Always  -           3
 198 Offline_Uncorrectable   0x0010  200   200   000    Old_age  Offline -           1*
 199 UDMA_CRC_Error_Count    0x003e  200   200   000    Old_age  Always  -           0
 200 Multi_Zone_Error_Rate   0x0009  200   200   051    Pre-fail Offline -           0

As the output above shows, there are 3 currently pending sectors and
1 offline uncorrectable sector. badblocks sees around 100 bad blocks on
the /dev/sda hard disk.
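The two counters mentioned here could be pulled out of the smartctl -A output with awk. This is a sketch only: the function name is hypothetical, and it assumes the usual layout in which the ID is the first field and RAW_VALUE is the last field on each attribute line.

```sh
# Hypothetical helper: print the name and raw value of attributes 197
# (pending sectors) and 198 (offline uncorrectable) from "smartctl -A"
# output read on stdin.
pending_sectors() {
    awk '$1 == 197 || $1 == 198 { print $2, $NF }'
}

# Two rows copied from the output above, used as sample input.
pending_sectors <<'EOF'
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 1
EOF
```

On a live system the input would come from "smartctl -A /dev/sda | pending_sectors" instead of the sample rows.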

Mutt also shows two mails with the following messages, which confirm
that there are bad blocks on the /dev/sda hard disk.

The following warning/error was logged by the smartd daemon:

Device: /dev/sda, 3 Currently unreadable (pending) sectors

For details see host's SYSLOG (default: /var/log/messages).

The following warning/error was logged by the smartd daemon:

Device: /dev/sda, 1 Offline uncorrectable sectors

For details see host's SYSLOG (default: /var/log/messages).



Apart from the steps above, I have tried reiserfsck --check /dev/vg/lv,
which reported no corruption.

debugreiserfs -B badblock.log /dev/vg/lv reported no blocks currently
marked bad in the file system. This step is described here:
http://chichkin_i.zelnet.ru/bad-block-handling.html

I really don't know how to fix this. Any help would be much appreciated.


Regards,

Mohan.



