[sclug] Copy drive problem

Sat Mar 2 13:03:18 UTC 2013

On 2 March 2013 12:00,  <sclug-
> Subject: [sclug] Copy drive problem

> dd if=/dev/sdb1 of=/dev/sdc1
>
> After a long pause the message 'read error' appeared and dd bombed out,
> with only a couple of MB transferred.  Why would this be? Have I done

So dd did tell you that a few MB
were actually transferred.

Look in /var/log/*
(the dmesg output ends up in kern.log and syslog).
If there was a bad block,
the drive would have reported a media error,
and the SCSI subsystem logs it,
including the block/sector number.
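
A rough sketch of pulling the interesting lines out of the kernel log.
The function name disk_errors and the pattern list are my own choice,
not a standard tool, so widen the pattern if your kernel words things
differently.

```shell
# Print kernel log lines that mention disk I/O trouble.
# Reads log text on stdin, so it works with dmesg or a saved log file.
# (disk_errors is a made-up helper name; the patterns are a guess at
# what is worth seeing, not an exhaustive list.)
disk_errors() {
    grep -E -i 'I/O error|medium error|sense key|end_request'
}

# Typical use (may need root; log file names vary by distribution):
#   dmesg | disk_errors
#   disk_errors < /var/log/kern.log
```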

## log files always look scary ##

Feb 15 14:51:22 x3 kernel: [273445.968444] usb 1-5: reset high-speed USB device number 6 using ehci_hcd
Feb 15 14:51:22 x3 kernel: [273446.106023] sd 10:0:0:0: [sdd]  Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 15 14:51:22 x3 kernel: [273446.106037] sd 10:0:0:0: [sdd]  Sense Key : Illegal Request [current]
Feb 15 14:51:22 x3 kernel: [273446.106048] sd 10:0:0:0: [sdd]  Add. Sense: Logical block address out of range
Feb 15 14:51:22 x3 kernel: [273446.106060] sd 10:0:0:0: [sdd] CDB: Write(10): 2a 00 00 19 a5 50 00 00 f0 00
Feb 15 14:51:22 x3 kernel: [273446.106080] end_request: I/O error, dev sdd, sector 1680720
Feb 15 14:51:22 x3 kernel: [273446.106090] quiet_error: 45 callbacks suppressed
Feb 15 14:51:22 x3 kernel: [273446.106097] Buffer I/O error on device sdd, logical block 210090
Feb 15 14:51:22 x3 kernel: [273446.106101] lost page write due to I/O error on sdd
Feb 15 14:51:22 x3 kernel: [273446.106116] Buffer I/O error on device sdd, logical block 210091

## try to find the FIRST error sector ##
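
The numbers in the log above agree with each other, which is a handy
sanity check. A small sketch, with helper names I made up for this
mail (first_error_sector, cdb10_lba, sector_to_block). It assumes the
log lines are unwrapped, one message per line, as they are in a real
kern.log, and that the "logical block" is counted in 4096-byte units
(which is what the ratio of the two numbers in this log suggests).

```shell
# Print the sector number from the first "I/O error, dev X, sector N"
# line found on stdin.
first_error_sector() {
    sed -n 's/.*I\/O error, dev [a-z0-9]*, sector \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Bytes 2..5 of a READ(10)/WRITE(10) CDB are the big-endian start LBA.
# usage: cdb10_lba 00 19 a5 50
cdb10_lba() {
    echo "$((0x$1$2$3$4))"
}

# Convert a 512-byte sector number to a block number of a given size.
# usage: sector_to_block SECTOR BLOCK_SIZE_BYTES
sector_to_block() {
    echo "$(( $1 * 512 / $2 ))"
}

# With the log above:
#   cdb10_lba 00 19 a5 50          prints 1680720 (the sector in end_request)
#   sector_to_block 1680720 4096   prints 210090  (the "logical block")
```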

If it is a bad cable or chip,
it might appear as a stray interrupt,
not a media error.

if there was an end-of-partition EOF
or seek beyond end-of-media,
then it appears as ... (dunno)

TRY:

dd if=/dev/sdb of=/dev/null
i.e. read the entire surface, ignoring partitions

If the sector is CRC corrupted,
the drive will retry and fail,
reporting the sector number, and why.

If you then (gingerly) overwrite that sector
(seek and write) with 512 (?) NUL bytes, either a good CRC goes to
that sector, or the drive remaps that sector (bad block remapping).
Then it all works fine,
except for a corrupted file with NULs
where <something> used to be.
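
The seek-and-write step could look something like this. zero_sector
is a name I made up, and the 512-byte default is the same guess as
above; check the drive's real sector size first. This destroys
whatever was in that sector, so try it on a scratch file before going
anywhere near the real device.

```shell
# Overwrite exactly one sector of TARGET with zero bytes.
# DESTROYS the data in that sector; double-check target and sector number.
# usage: zero_sector TARGET SECTOR [SECTOR_SIZE]
# (zero_sector is a hypothetical helper; SECTOR_SIZE defaults to 512,
# which is an assumption, not something read from the drive.)
zero_sector() {
    # seek= skips SECTOR output blocks before writing; conv=notrunc keeps
    # the rest of a regular file intact (harmless on a block device).
    dd if=/dev/zero of="$1" bs="${3:-512}" seek="$2" count=1 conv=notrunc
}

# Example on a scratch file, not a disk:
#   zero_sector /tmp/scratch.img 2
```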

dd_rescue might be able to retry
until it gets lucky (temperature change)
on that particular block.

Plain dd will show the contents of adjacent blocks, and also confirm
the address of the error.
Maybe it's just an SMTP logfile,
Maybe it's a vital RDBMS index,
Maybe it's a directory's list of inodes,
Maybe ...

It is good to be able to make it report an error,
just by touching that block.
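
A sketch of touching one block on purpose. read_sector is my own
helper name, and 512 is again an assumed sector size. On a good sector
you get the data; on a bad one dd should complain on stderr, and the
kernel log should gain a fresh entry.

```shell
# Read exactly one sector from SOURCE and write it to stdout.
# usage: read_sector SOURCE SECTOR [SECTOR_SIZE]
# (read_sector is a hypothetical helper; SECTOR_SIZE defaults to 512.)
read_sector() {
    # skip= skips SECTOR input blocks before reading the one we want.
    dd if="$1" bs="${3:-512}" skip="$2" count=1
}

# Look at the suspect sector and its neighbours (sector number taken
# from the log; od -c prints the bytes as characters):
#   read_sector /dev/sdb 1680719 | od -c
#   read_sector /dev/sdb 1680720 | od -c
#   read_sector /dev/sdb 1680721 | od -c
```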

fs-db might tell you what file it is.
dd_rescue will tell you how many
other errors there are

S.M.A.R.T. might help, but I never understand it.

> anything wrong or missed out a vital step?
>

/proc/partitions (etc)
will tell you the drive names and sizes,
so you can check that each drive is the one you think it is
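
A sketch for pulling sizes out of /proc/partitions, so the source and
destination can be compared before copying. part_kib is a name I made
up; the column layout (major, minor, #blocks in 1 KiB units, name) is
what Linux prints.

```shell
# Print the size in KiB of a named device, given /proc/partitions-style
# text on stdin (columns: major minor #blocks name).
# usage: part_kib sdb1 < /proc/partitions
# (part_kib is a hypothetical helper name.)
part_kib() {
    awk -v name="$1" '$4 == name { print $3 }'
}

# Compare the two partitions from the original dd command:
#   part_kib sdb1 < /proc/partitions
#   part_kib sdc1 < /proc/partitions
```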

> TIA
> Neil

HTH
Graham