[Gllug] Hard drive problems

Rich Walker rw at shadowrobot.com
Thu May 31 17:26:02 UTC 2007


Geo <caparo.g at gmail.com> writes:

> On Thursday 31 May 2007 16:00, Rich Walker wrote:
>> I don't know if anyone else has them, but yesterday and today I had a
>> Samsung Spinpoint SP2514N fail each day, with lots of IDE busy errors.
>>
>> I've just removed them and now a Maxtor drive is doing the same thing.
>>
>> Kernel version 2.6.20 SMP - any ideas?

> Hi,
>  Some A / B testing with new cables and another IDE based machine would 
> isolate where the error lies 

Original situation: the machine has a bunch of drives and controller
cards. Each drive is on its own cable. Each controller card has two drives.

Drives: hde hdg hdi hdk hdm hdo make up /dev/md1 (One spare, RAID-5)


hde (Spinpoint) starts giving errors. Then I get errors from ext3 on one
partition on /dev/md1.

hde gets pulled from the array, and the spare gets added. When the resync
finishes, I reboot and run fsck on the faulty partition.

Next day: hdi (Spinpoint) gives the same errors, followed by file system
errors. I pull hdi from the array, shut the machine down, remove hde and
hdi from the machine, and put a handy 250GB Seagate in as the new hdi.
Reboot, go through fsck hell again, add hdi to md1 and let it rebuild.
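
For anyone following along, the swap described above can be sketched as
a small Python helper that only builds and prints the mdadm commands
(nothing is executed). The member name /dev/hdi2 is a hypothetical
placeholder: the post does not say which partition of each disk belongs
to md1, so substitute the real one.

```python
def replace_member_cmds(array, failed, replacement):
    """Build the mdadm commands for swapping one member of an md array.

    Returns a list of argument lists; the caller decides whether to run them.
    """
    return [
        # Mark the member faulty (harmless if the kernel already did so).
        ["mdadm", array, "--fail", failed],
        # Detach it from the array so the disk can be unplugged.
        ["mdadm", array, "--remove", failed],
        # After the new disk is fitted and partitioned, add it back;
        # md then starts the rebuild on its own.
        ["mdadm", array, "--add", replacement],
    ]


if __name__ == "__main__":
    # /dev/hdi2 is an assumed member name, used for illustration only.
    for cmd in replace_member_cmds("/dev/md1", "/dev/hdi2", "/dev/hdi2"):
        print(" ".join(cmd))
```

Rebuild progress shows up in /proc/mdstat while the resync runs.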

On a Windows machine, the two Spinpoint disks format without
complaint. We're running chkdsk on them now.

Meanwhile a Maxtor drive (hdk) has started giving the same error that
the other two did:

[10059.640000] hdk: drive_cmd: status=0xd0 { Busy }
[10059.640000] ide: failed opcode was: 0xea

No file system errors *yet*.
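
To make those two log lines easier to read, here is a minimal Python
sketch that decodes the status byte and the opcode. The bit layout is
the standard ATA status register; the flag names are just labels chosen
here, and the opcode table is deliberately tiny, not a complete list.

```python
# ATA status register bits, most significant first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# A few ATA command opcodes (illustrative subset only).
OPCODES = {
    0xE7: "FLUSH CACHE",
    0xEA: "FLUSH CACHE EXT",
    0xEC: "IDENTIFY DEVICE",
}


def decode_status(status):
    """Return the name of every bit set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def decode_opcode(opcode):
    """Return the command name for an opcode, or a marker if not in the table."""
    return OPCODES.get(opcode, "unknown (not in this table)")


if __name__ == "__main__":
    print("status 0xd0:", decode_status(0xD0))
    print("opcode 0xea:", decode_opcode(0xEA))
```

So 0xd0 has Busy set along with two other bits, and the log shows only
"Busy", presumably because the remaining bits are not meaningful while
the drive is busy. Opcode 0xea is the 48-bit cache flush command, i.e.
the drive did not come back from a cache flush in time.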


I have ordered a couple more 250GB disks "in case" for tomorrow, and
have just compiled a 2.6.21.3 kernel to replace the 2.6.20 I was
running.

For reference, the disk controllers are:

[    0.600000] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[    0.600000] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[    0.600000] AMD7441: IDE controller at PCI slot 0000:00:07.1
[    0.600000] AMD7441: chipset revision 4
[    0.600000] AMD7441: not 100% native mode: will probe irqs later
[    0.600000] AMD7441: 0000:00:07.1 (rev 04) UDMA100 controller
[    0.600000]     ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio
[    0.600000]     ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:DMA, hdd:pio
[    0.600000] Probing IDE interface ide0...
[    0.888000] hda: IBM-DTLA-305040, ATA DISK drive
[    1.560000] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[    1.560000] Probing IDE interface ide1...
[    2.296000] hdc: _NEC DVD_RW ND-3500AG, ATAPI CD/DVD-ROM drive
[    2.968000] ide1 at 0x170-0x177,0x376 on irq 15
[    2.968000] CMD649: IDE controller at PCI slot 0000:00:08.0
[    2.968000] ACPI: PCI Interrupt 0000:00:08.0[A] -> GSI 20 (level, low) -> IRQ 17
[    2.968000] CMD649: chipset revision 2
[    2.968000] CMD649: 100% native mode on irq 17
[    2.968000]     ide2: BM-DMA at 0xa000-0xa007, BIOS settings: hde:pio, hdf:pio
[    2.968000]     ide3: BM-DMA at 0xa008-0xa00f, BIOS settings: hdg:pio, hdh:pio
[    2.968000] Probing IDE interface ide2...
[    3.536000] Probing IDE interface ide3...
[    3.824000] hdg: Maxtor 6L250R0, ATA DISK drive
[    4.496000] ide3 at 0xa800-0xa807,0xa402 on irq 17
[    4.496000] SiI680: IDE controller at PCI slot 0000:00:09.0
[    4.496000] ACPI: PCI Interrupt 0000:00:09.0[A] -> GSI 21 (level, low) -> IRQ 18
[    4.496000] SiI680: chipset revision 2
[    4.496000] SiI680: BASE CLOCK == 133
[    4.496000] SiI680: 100% native mode on irq 18
[    4.496000]     ide4: MMIO-DMA , BIOS settings: hdi:pio, hdj:pio
[    4.496000]     ide5: MMIO-DMA , BIOS settings: hdk:pio, hdl:pio
[    4.496000] Probing IDE interface ide4...
[    4.784000] hdi: ST3250820A, ATA DISK drive
[    5.456000] ide4 at 0xf8820080-0xf8820087,0xf882008a on irq 18
[    5.456000] Probing IDE interface ide5...
[    5.744000] hdk: MAXTOR STM3250820A, ATA DISK drive
[    6.416000] ide5 at 0xf88200c0-0xf88200c7,0xf88200ca on irq 18
[    6.416000] PDC20268: IDE controller at PCI slot 0000:02:06.0
[    6.416000] ACPI: PCI Interrupt 0000:02:06.0[A] -> GSI 17 (level, low) -> IRQ 19
[    6.416000] PDC20268: chipset revision 2
[    6.416000] PDC20268: ROM enabled at 0xded20000
[    6.424000] PDC20268: PLL input clock is 16557 kHz
[    6.456000] PDC20268: 100% native mode on irq 19
[    6.456000]     ide6: BM-DMA at 0x5800-0x5807, BIOS settings: hdm:pio, hdn:pio
[    6.456000]     ide7: BM-DMA at 0x5808-0x580f, BIOS settings: hdo:pio, hdp:pio
[    6.456000] Probing IDE interface ide6...
[    6.744000] hdm: Maxtor 6Y160P0, ATA DISK drive
[    7.416000] ide6 at 0x7000-0x7007,0x6802 on irq 19
[    7.416000] Probing IDE interface ide7...
[    7.704000] hdo: Maxtor 6L250R0, ATA DISK drive
[    8.376000] ide7 at 0x6400-0x6407,0x6002 on irq 19
[    8.380000] Probing IDE interface ide2...
[    8.948000] hda: max request size: 128KiB
[    8.968000] hda: 80418240 sectors (41174 MB) w/380KiB Cache, CHS=65535/15/63, UDMA(33)
[    8.968000] hda: cache flushes not supported
[    8.968000]  hda: hda1 hda3 hda4 < hda5 hda6 >
[    9.012000] hdg: max request size: 512KiB
[    9.036000] hdg: 490234752 sectors (251000 MB) w/16384KiB Cache, CHS=30515/255/63, UDMA(100)
[    9.036000] hdg: cache flushes supported
[    9.036000]  hdg: hdg1 hdg2
[    9.044000] hdi: max request size: 64KiB
[    9.088000] hdi: 488397168 sectors (250059 MB) w/8192KiB Cache, CHS=30401/255/63, UDMA(100)
[    9.116000] hdi: cache flushes supported
[    9.116000]  hdi: unknown partition table
[    9.140000] hdk: max request size: 64KiB
[    9.184000] hdk: 488397168 sectors (250059 MB) w/8192KiB Cache, CHS=30401/255/63, UDMA(100)
[    9.208000] hdk: cache flushes supported
[    9.208000]  hdk: hdk1 hdk2
[    9.232000] hdm: max request size: 512KiB
[    9.232000] hdm: 320173056 sectors (163928 MB) w/7936KiB Cache, CHS=19929/255/63, UDMA(100)
[    9.232000] hdm: cache flushes supported
[    9.232000]  hdm: hdm1 hdm2
[    9.240000] hdo: max request size: 512KiB
[    9.260000] hdo: 490234752 sectors (251000 MB) w/16384KiB Cache, CHS=30515/255/63, UDMA(100)
[    9.264000] hdo: cache flushes supported
[    9.264000]  hdo: hdo1 hdo2
[    9.268000] hdc: ATAPI 48X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
[    9.268000] Uniform CD-ROM driver Revision: 3.20
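
Since the A/B question is really "which controller is each drive on",
here is a rough Python sketch that pulls that mapping out of boot
messages like the ones above. It assumes the old IDE driver's layout
(a "NAME: IDE controller at PCI slot" line followed by that
controller's "ideN: ... BIOS settings:" lines) and unwrapped input; the
function name and the embedded sample are made up for illustration.

```python
import re

# "CMD649: IDE controller at PCI slot 0000:00:08.0"
CONTROLLER = re.compile(r"\]\s+(\w+): IDE controller at PCI slot")
# "ide2: BM-DMA at ..., BIOS settings: hde:pio, hdf:pio"
INTERFACE = re.compile(r"\]\s+(ide\d+): .*BIOS settings: (.*)")
# Drive names inside the BIOS settings list.
DRIVE = re.compile(r"(hd[a-z]):")


def drives_by_controller(dmesg_text):
    """Map each hdX slot to the controller whose interface line lists it."""
    mapping = {}
    current = None
    for line in dmesg_text.splitlines():
        match = CONTROLLER.search(line)
        if match:
            current = match.group(1)
            continue
        match = INTERFACE.search(line)
        if match and current is not None:
            for drive in DRIVE.findall(match.group(2)):
                mapping[drive] = current
    return mapping


# Excerpt of the boot log above, with wrapped lines rejoined.
SAMPLE = """\
[    2.968000] CMD649: IDE controller at PCI slot 0000:00:08.0
[    2.968000]     ide2: BM-DMA at 0xa000-0xa007, BIOS settings: hde:pio, hdf:pio
[    2.968000]     ide3: BM-DMA at 0xa008-0xa00f, BIOS settings: hdg:pio, hdh:pio
[    4.496000] SiI680: IDE controller at PCI slot 0000:00:09.0
[    4.496000]     ide4: MMIO-DMA , BIOS settings: hdi:pio, hdj:pio
[    4.496000]     ide5: MMIO-DMA , BIOS settings: hdk:pio, hdl:pio
"""

if __name__ == "__main__":
    for drive, controller in sorted(drives_by_controller(SAMPLE).items()):
        print(drive, controller)
```

On this log that puts hde on the CMD649 card and both hdi and hdk on
the SiI680 card, which is where the three drives that have reported
errors so far are attached.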

It's a dual-CPU motherboard with AMD MPs in it:

00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 04)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 04)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ACPI (rev 03)
00:08.0 RAID bus controller: Silicon Image, Inc. SiI 0649 Ultra ATA/100 PCI to ATA Host Controller (rev 02)
00:09.0 Mass storage controller: Silicon Image, Inc. PCI0680 Ultra ATA-133 Host Controller (rev 02)
00:10.0 PCI bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] PCI (rev 04)
01:05.0 VGA compatible controller: Silicon Integrated Systems [SiS] 86C326 5598/6326 (rev 0b)
02:04.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
02:05.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
02:06.0 Mass storage controller: Promise Technology, Inc. PDC20268 (Ultra100 TX2) (rev 02)


-- 
rich walker         |  Shadow Robot Company | rw at shadow.org.uk
technical director     251 Liverpool Road   |
need a Hand?           London  N1 1LX       | +UK 20 7700 2487
www.shadowrobot.com/hand/overview.shtml
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



