<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

</head>

<body bgcolor="#ffffff" text="#000000">

Hi,<br>

<br>

I have an LVM setup spanning 4 hard disks, about 2 TB in total. ReiserFS is the

file system used. Now to the problem.<br>

<br>

One of the hard disks has around 130 bad blocks. I ran badblocks against

the disk, which is /dev/sda, and it wrote the corrupted block numbers

to a file.<br>

<br>

What is the best way to fix this issue? I have a huge database

running on that server, and it refuses to start because of this issue.

After a little searching I found the following for a problem on a single

hard drive with an ext2 file system.<br>

<br>

The bad blocks are on the sda4 partition in my case.<br>

<br>

fsck -t ext2 -l badblocks-logfile /dev/sda4<br>

<br>

should fix the issue.<br>

<br>

If it is ReiserFS, then<br>

<br>

reiserfsck -B badblocks-logfile /dev/sda4<br>

<br>

should do it. Should I do the same with LVM, and will it work? My worry

is whether the above step will disturb the current LVM setup,

or whether there is a better way to do it.<br>

<br>

I did come across a document: <a

 href="http://smartmontools.sourceforge.net/BadBlockHowTo.txt"

 target="_blank">http://smartmontools.sourceforge.net/BadBlockHowTo.txt</a><br>

<br>

I carefully followed each step of the section by Frederic BOITEUX in the

above document, and here are the results:<br>

<br>

<b>Step 1</b><br>

smartctl -l selftest /dev/sda<br>

<br>

=== START OF READ SMART DATA SECTION ===<br>

SMART Self-test log structure revision number 1<br>

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error<br>

<br>

# 1 Short offline Completed: read failure 90% 13365 <b>227328439</b><br>

<br>

<b>Step 2</b><br>

<br>

Disk /dev/sda: 60801 cylinders, 255 heads, 63 sectors/track<br>

Units = sectors of 512 bytes, counting from 0<br>

<br>

Device Boot Start End #sectors Id System<br>

/dev/sda1 * 63 401624 401562 83 Linux<br>

/dev/sda2 401625 12691349 12289725 82 Linux swap / Solaris<br>

/dev/sda3 12691350 24981074 12289725 83 Linux<br>

/dev/sda4 <b>24981075</b> 976768064 951786990 8e Linux LVM<br>

<br>

<b>Step 3</b><br>

<br>

(227328439 - 24981075) = <b>202347364</b><br>

<br>

<b>Step 4</b><br>

<br>

pvdisplay -c /dev/sda4 | awk -F: '{print $8}'<br>

<b>4096</b><br>

<br>

The PE size is reported in KB. To express it in LBA sectors (512 bytes,

or 0.5 KB each), we multiply this number by 2:

4096 * 2 = 8192 sectors for each PE.<br>

<br>

The offset of the first PE can be read from /etc/lvm/backup:<br>

# grep pe_start $(grep -l /dev/sda4 /etc/lvm/backup/*)<br>

pe_start = <b>384</b><br>

<br>

<b>Step 5</b><br>

Next, we find which PE contains the bad block by calculating the PE rank<br>

of the faulty block within the partition:<br>

physical partition's bad block number / sizeof(PE) =<br>

<br>

202347364 / 8192 = <b>24700.6059</b><br>
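<br>
The arithmetic of Steps 3 to 5 can be rechecked with a minimal Python sketch. The numbers are the ones shown above; the variable names are only illustrative.<br>
<pre>
```python
SECTOR_BYTES = 512        # LBA sector size from the partition listing (Step 2)

bad_lba = 227328439       # LBA_of_first_error from the self-test log (Step 1)
part_start = 24981075     # first sector of /dev/sda4 (Step 2)
pe_size_kb = 4096         # PE size in KB reported by pvdisplay -c (Step 4)

# Step 3: offset of the bad sector inside the partition
sector_in_partition = bad_lba - part_start

# Step 4: PE size expressed in 512-byte sectors
pe_sectors = pe_size_kb * 1024 // SECTOR_BYTES

# Step 5: whole PE rank and the remaining sectors inside that PE
pe_rank, sectors_into_pe = divmod(sector_in_partition, pe_sectors)

print(sector_in_partition)        # 202347364
print(pe_sectors)                 # 8192
print(pe_rank, sectors_into_pe)   # 24700 4964
```
</pre>
The integer part, 24700, is the PE rank; the fractional part of 24700.6059 corresponds to the 4964 sectors left over inside that PE.<br>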

<br>

<br>

<b>Step 6</b><br>

<br>

server#lvdisplay --maps<br>

--- Logical volume ---<br>

<b> LV Name /dev/vg/lv</b><br>

VG Name vg<br>

LV UUID zkcUSw-Dpum-aIXr-jE2Y-ob8Z-eqlF-AKQWnP<br>

LV Write Access read/write<br>

LV Status available<br>

# open 2<br>

LV Size 1.81 TB<br>

Current LE 473886<br>

Segments 4<br>

Allocation inherit<br>

Read ahead sectors 0<br>

Block device 253:0<br>

<br>

--- Segments ---<br>

Logical extent 0 to 119233:<br>

Type linear<br>

Physical volume /dev/sdb1<br>

Physical extents 0 to 119233<br>

<br>

Logical extent 119234 to 238467:<br>

Type linear<br>

Physical volume /dev/sdd1<br>

Physical extents 0 to 119233<br>

<br>

<b> Logical extent 238468 to 354651:</b><br>

Type linear<br>

Physical volume /dev/sda4<br>

Physical extents 0 to 116183<br>

<br>

Logical extent 354652 to 473885:<br>

Type linear<br>

Physical volume /dev/sde1<br>

Physical extents 0 to 119233<br>

<br>

<br>

<b>Step 7</b><br>

<br>

* bad block number for the filesystem :<br>

---------------------------------------<br>

<br>

Since the physical extents of the segment on /dev/sda4 start from 0,<br>

the first block of the LV on this partition is:<br>

<br>

<br>

(0 * 8192) + 384 = 384<br>

<br>

fs bad block = (202347364 - 384) / (sizeof(fs block) / 512)<br>

<br>

<br>

= 202346980 / (4096 / 512)<br>

<br>

= 202346980 / 8 = <b>25293372.5</b><br>

<br>

As the lvdisplay output shows, the physical extents on every hard disk start
from 0, so the formula based on the physical extent does not work in my
case. I need a formula that gives the right block using the logical
extent range <b> Logical extent 238468 to 354651:</b> The PE rank calculated
in <b>Step 5</b>, 24700.6059, falls within the physical extents of that
segment (0 to 116183). I tried the same formula as in <b>Step 7</b>, using
the first logical extent instead of the physical extent, but the result is negative. <br>

<br>

Logical extent 238468 to 354651:<br>

<br>

(238468 * 8192) + 384 = 1953529856 + 384 = 1953530240<br>

<br>

(202347364 - 1953530240) / 8 = -1751182876 / 8 = -218897859.5<br>
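<br>
One possible reading, offered only as an unverified sketch: because the segment on /dev/sda4 begins at physical extent 0 but at logical extent 238468 of the LV, the logical extent offset might need to be added to the PE index rather than subtracted from the sector number. The helper name pv_sector_to_lv_block and its parameters are hypothetical, the segment is assumed to be linear, and the file system block size is assumed to be 4096 bytes as in Step 7.<br>
<pre>
```python
PE_SECTORS = 8192                 # sectors per physical extent (Step 4)
PE_START = 384                    # pe_start of /dev/sda4, in sectors (Step 4)
FS_BLOCK_SECTORS = 4096 // 512    # assumed 4096-byte file system block


def pv_sector_to_lv_block(sector_in_pv, first_le, first_pe=0):
    """Map a sector offset inside the PV partition to an fs block in the LV.

    Assumes a single linear segment whose physical extents starting at
    first_pe map one-to-one onto logical extents starting at first_le.
    Hypothetical helper for illustration; it is not taken from the HOWTO.
    """
    sector_in_pe_area = sector_in_pv - PE_START
    pe_index, offset = divmod(sector_in_pe_area, PE_SECTORS)
    le_index = first_le + (pe_index - first_pe)
    sector_in_lv = le_index * PE_SECTORS + offset
    return sector_in_lv // FS_BLOCK_SECTORS


# Segment on /dev/sda4: logical extents 238468.., physical extents 0..
candidate = pv_sector_to_lv_block(202347364, first_le=238468)
print(candidate)   # 269484604
```
</pre>
With first_le=0 the same helper returns 25293372, the block from Step 7, so the only difference is the logical extent offset. Whether 269484604 is really the failing block would still have to be confirmed with a read test like the one in Step 8.<br>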

<br>

<br>

<b>Step 8</b><br>

<br>

* Test of the fs bad block :<br>

<br>

dd if=/dev/vg/lv of=block25293372 bs=4096 count=1 skip=25293372<br>

<br>

This read completes without error, which suggests that the calculated block

is not the bad one on the LV.<br>
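<br>
The same single-block read test can be sketched in Python. This is an illustration, not a replacement for dd: read_fs_block is a hypothetical name, and the default block size of 4096 bytes is the assumption used in Step 7.<br>
<pre>
```python
import os


def read_fs_block(path, block_number, block_size=4096):
    """Try to read one block; return its bytes, or None on an I/O error."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # pread reads block_size bytes at the given byte offset
        return os.pread(fd, block_size, block_number * block_size)
    except OSError:
        return None
    finally:
        os.close(fd)
```
</pre>
For example, read_fs_block("/dev/vg/lv", 25293372) would attempt the read from Step 8 (root privileges are normally needed to open the device).<br>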

<br>

#smartctl -A /dev/sda<br>

smartctl version 5.37 [i386-redhat-linux-gnu] Copyright (C) 2002-6

Bruce Allen<br>

Home page is <a href="http://smartmontools.sourceforge.net/"

 target="_blank">http://smartmontools.sourceforge.net/</a><br>

<br>

=== START OF READ SMART DATA SECTION ===<br>

SMART Attributes Data Structure revision number: 16<br>

Vendor Specific SMART Attributes with Thresholds:<br>

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED

RAW_VALUE<br>

1 Raw_Read_Error_Rate 0x000f 200 200 051 Pre-fail Always - 0<br>

3 Spin_Up_Time 0x0003 218 217 021 Pre-fail Always - 6100<br>

4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 98<br>

5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0<br>

7 Seek_Error_Rate 0x000f 200 200 051 Pre-fail Always - 0<br>

9 Power_On_Hours 0x0032 081 081 000 Old_age Always - 14034<br>

10 Spin_Retry_Count 0x0013 100 253 051 Pre-fail Always - 0<br>

11 Calibration_Retry_Count 0x0012 100 253 051 Old_age Always - 0<br>

12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 97<br>

194 Temperature_Celsius 0x0022 253 253 000 Old_age Always - 44<br>

196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0<br>

<b>197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 3<br>

198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 1</b><br>

199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0<br>

200 Multi_Zone_Error_Rate 0x0009 200 200 051 Pre-fail Offline - 0<br>

<br>

As the above output shows, there are 3 currently pending

sectors and 1 offline uncorrectable sector. badblocks reports around 100

bad blocks on the /dev/sda hard disk.<br>
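<br>
To watch those two counters over time, the raw values can be pulled out of the smartctl -A table with a small parser. This is a sketch that assumes the ten-column layout shown above; parse_smart_attributes is a hypothetical name, and the sample rows are copied from the output.<br>
<pre>
```python
def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` table rows."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                attrs[fields[1]] = fields[9]   # keep non-numeric raw values as text
    return attrs


sample = """\
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 1
"""
attrs = parse_smart_attributes(sample)
print(attrs["Current_Pending_Sector"], attrs["Offline_Uncorrectable"])   # 3 1
```
</pre>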

<br>

Mutt also shows two mails with the following text, which confirms that there

are bad blocks on the /dev/sda hard disk.<br>

<br>

The following warning/error was logged by the smartd daemon:<br>

<br>

Device: /dev/sda, 3 Currently unreadable (pending) sectors<br>

<br>

For details see host's SYSLOG (default: /var/log/messages).<br>

<br>

The following warning/error was logged by the smartd daemon:<br>

<br>

Device: /dev/sda, 1 Offline uncorrectable sectors<br>

<br>

For details see host's SYSLOG (default: /var/log/messages).<br>

<br>

<br>

<br>

Apart from the above steps, I have tried reiserfsck --check /dev/vg/lv,

which reported no corruptions.<br>

<br>

debugreiserfs -B badblock.log /dev/vg/lv reported that no blocks are

currently marked bad in the file system. This step is described here: <a

 href="http://chichkin_i.zelnet.ru/bad-block-handling.html"

 target="_blank">http://chichkin_i.zelnet.ru/bad-block-handling.html</a><br>

<br>

I really don't know how to fix this. Any help would be much appreciated.

<br>

<br>

<br>

Regards,<br>

<br>

Mohan.<br>


</body>

</html>