[sclug] the case of the exploding switch

Jason Rivers jason.rivers at gmail.com
Wed Jul 2 15:37:21 UTC 2008


On Wed, Jul 2, 2008 at 4:11 PM, Tom Carbert-Allen <tom at randominter.net>
wrote:

> Ok so I volunteer to look after a local school network. They have been
> happy with their Linux box server and gateway for the last 5 years without
> issue (using a recycled p3 500) but are now having some very weird network
> problems.
>
> The first sign of a problem was 4 months ago: I arrive on site to find the
> central switch is 100% dead. I figure after 5 years in a hot cupboard it just
> gave up (it was only a cheapy ebuyer special). I replace it with a US Robotics
> unit I was given and walk away expecting at least another 5 years of
> trouble-free packet forwarding. One month later I get a call again and find
> this switch is now 100% dead too. I then test every connection to the switch
> end to end with a cable tester and they all show OK. I connect switch number 3
> and then test each socket with my laptop, and they all give me 100Mb full
> duplex and no packet loss. I run out of time, so leave again very confused.
>
> Today I get a call again and find the switch is not dead, but all the lights
> are on (even for sockets with nothing plugged in) and no packets are being
> forwarded. I come armed with an HP ProCurve this time. I plug the ProCurve
> in and it lasts 3 minutes before shutting off with nothing in the logs; it
> just freezes. I reboot it with nothing plugged in and it fails to come up,
> with an error about transceiver VRM failed primary test. I call HP support
> for an RMA and they give me one, but say I must have a shorted cable
> somewhere or a faulty ethernet device. I plug in another £20 ebuyer special
> cheap and cheerful switch and it works fine for today, but for how long?
>
> Can anyone suggest how I might go about finding which cable or piece of
> equipment is faulty and causing the switches to all die? I have a basic
> cable tester, but that just tests for continuity and pinout correctness. All
> the kit is connecting OK and not showing lost packets.
>
> Thanks in advance for your help.
>
> The only idea I have so far is to buy lots of smaller 4-port switches and
> use them instead in a switchy mess of interconnects. That way I can at least
> narrow it down to 3 cables/devices.
>
> TCA
>
>
>

This does sound like it could be a cable problem, but I've not seen one
write off a whole switch before. I would definitely take a cable tester
(ideally one that doesn't only check in pairs) and test each cable. If you
don't find anything, you could put a few 4-port switches in and see which one
blows out; at least you then narrow it down to one of 3 or 4 systems. Or you
could (failing any cable checks) change the NICs in the PCs, though I don't
know how many PCs there are or how much of a job that would be. But first off
I would turn up with a cable tester and check them.
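
If it helps to plan the "few 4 ports" approach on paper first, here is a
rough sketch of the bookkeeping. It assumes one port per small switch is kept
for the uplink (so 3 cable runs per switch), and the socket labels and
function names are made up for illustration; it only tracks which runs hang
off which switch, it does not test anything.

```python
def group_runs(runs, ports_per_switch=4, uplink_ports=1):
    """Split cable runs into groups, one group per small switch.

    Assumes each switch gives up `uplink_ports` ports to the interconnect.
    """
    per_switch = ports_per_switch - uplink_ports
    if per_switch < 1:
        raise ValueError("each switch needs at least one free port")
    return [runs[i:i + per_switch] for i in range(0, len(runs), per_switch)]


def suspects(groups, dead_switches):
    """Return the cable runs attached to the switches that have died."""
    return [run for idx in sorted(dead_switches) for run in groups[idx]]


if __name__ == "__main__":
    # Hypothetical labels for 12 wall sockets.
    runs = [f"socket-{n:02d}" for n in range(1, 13)]
    groups = group_runs(runs)
    for idx, group in enumerate(groups):
        print(f"switch {idx}: {', '.join(group)}")
    # Suppose the third small switch (index 2) is the one that blows.
    print("suspects:", suspects(groups, {2}))
```

Once one small switch dies, the same grouping can be repeated on just its
suspects (one run per switch) to get down to a single cable or device.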

Also, it might be worth checking the mains supply. I don't know if all these
switches use "kettle" leads, but are you using a different lead with
each switch?

It's easy to assume it's the switch that has packed up, and it sounds like
they have; the question is why. Check the cables, see if any of the kids have
poked things into the systems where they might be making contact with a NIC,
and check the power.

Failing that... I'll try thinking again.

/J


