[Gllug] Diagnosing hardware faults

Tethys sta296 at astradyne.co.uk
Tue Nov 30 10:17:15 UTC 2010


--------

Alain Williams writes:

>Can you get a console on it (non GUI), anything appear as it dies ?

Good thinking. The console was full of machine check exception errors.
Unfortunately, I have no idea what they mean, and my Google-fu isn't
strong enough to help.

Running mcelog --ascii on the contents of the console gives me:

	HARDWARE ERROR. This is *NOT* a software problem!
	Please contact your hardware vendor
	CPU 0 BANK 0 TSC 360e20d8aae0 
	MCG status:MCIP 
	MCi status:
	Invalid log
	STATUS 0 MCGSTATUS 4

Which doesn't seem to be much help. Running the same command on the
contents of /var/log/mcelog gives me slightly more clues:

	HARDWARE ERROR. This is *NOT* a software problem!
	Please contact your hardware vendor
	CPU 0 BANK 0 TSC 2a3f69f3fa34 
	ADDR 73e815e0 
	MCG status:
	MCi status:
	Error enabled
	MCA:Generic CACHE Level-2 Generic Error
	STATUS 902000400001010a MCGSTATUS 0
	Resolving address 73e815e0 using SMBIOS
	No matching memory address found for 73e815e0 in SMBIOS

Hmmm... a faulty L2 cache doesn't sound promising :-( But is it
genuine? The liberal use of the word "Generic" makes me suspicious
of a false positive. Even if it is genuine, can I do anything about
it? Replacing the CPU might be an option, but I suspect that's not
going to be as cheap as I'd like. Is it just time for a new
motherboard (which in realistic terms means a new machine)?
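
For reference, the two status words in the output above can be unpicked
by hand, which is one way to check what mcelog is actually reporting.
The following is a minimal sketch, not a diagnosis: it assumes the
Intel architectural layout of the per-bank status register (flag bits
63 down to 57, the MCA error code in bits 15:0, and the cache error
code format 000F 0001 RRRR TTLL). The CPU in this machine isn't stated
in the post, so that layout is an assumption, and the function names
are made up for illustration.

```python
# Sketch: decode machine check status words as printed by mcelog.
# ASSUMPTION: Intel architectural IA32_MCi_STATUS / IA32_MCG_STATUS
# layout. Other vendors and older CPUs may use some bits differently.

MCI_FLAGS = {
    63: "VAL",    # log entry is valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting for this error was enabled
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context corrupt
}
MCG_FLAGS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}
CACHE_LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}  # LL field


def flags(value, table):
    """Return the names of the bits from `table` that are set in `value`."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if (value >> bit) & 1]


def decode_mci_status(status):
    """Split a per-bank status word into flags and error codes."""
    code = status & 0xFFFF
    result = {
        "flags": flags(status, MCI_FLAGS),
        "mca_code": code,
        "model_specific_code": (status >> 16) & 0xFFFF,
    }
    # Cache hierarchy errors have the form 000F 0001 RRRR TTLL,
    # so ignore bit 12 (F) and match on the remaining upper bits.
    if code & 0xEF00 == 0x0100:
        result["cache_level"] = CACHE_LEVELS[code & 0x3]
    return result


if __name__ == "__main__":
    # Values copied from the two mcelog outputs in this post.
    print(flags(0x4, MCG_FLAGS))                  # console: MCGSTATUS 4
    print(decode_mci_status(0x0))                 # console: STATUS 0
    print(decode_mci_status(0x902000400001010A))  # /var/log/mcelog
```

Under those assumptions, the console entry has MCIP set but no VAL
bit, which would match the "Invalid log" line, and the logged entry
decodes to VAL and EN with a level 2 cache error code, matching the
"Generic CACHE Level-2" text. UC is clear in that word, which would
suggest the logged event was reported as corrected, but that is only
a reading of the bits.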

Tet