[GLLUG] How worried should I be ...
Andy Smith
andy at bitfolk.com
Fri May 22 16:18:47 UTC 2020
Hello,
On Fri, May 22, 2020 at 04:37:57PM +0100, Alain D D Williams via GLLUG wrote:
> Should I take this as a warning and look to replace the machine or just shrug my
> shoulders & mutter something about cosmic rays ?
>
> Message from syslogd at mint at May 22 07:27:09 ...
> kernel:[Hardware Error]: MC4 Error (node 0): L3 data cache ECC error.
>
> Message from syslogd at mint at May 22 07:27:09 ...
> kernel:[Hardware Error]: Error Status: Corrected error, no action required.
The L3 cache is inside the CPU, so this could be a faulty CPU. I think
it could possibly also be faulty RAM if you do not have ECC RAM (with
ECC RAM the problem would have been detected in the RAM, not the L3
cache). Either way, it is a single bit flip that was detected by ECC
in the cache and corrected.
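To make "detected and corrected" a bit more concrete, here is a toy
sketch in Python of a Hamming(7,4) code, which can locate and repair
any single flipped bit in a 7-bit codeword. This is only an
illustration of the principle: real cache ECC works on much wider
words, and the log message does not tell us which scheme this CPU
uses. The function names are my own.

```python
def encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def correct(codeword):
    """Return (repaired codeword, 1-based error position, or 0 if clean)."""
    c = codeword
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the bad position
    fixed = list(c)
    if position:
        fixed[position - 1] ^= 1  # flip the faulty bit back
    return fixed, position


def decode(codeword):
    """Extract the 4 data bits from a (repaired) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


if __name__ == "__main__":
    original = [1, 0, 1, 1]
    stored = encode(original)
    stored[4] ^= 1  # simulate a single bit flip at position 5
    repaired, position = correct(stored)
    print("error at position", position)
    print("data recovered:", decode(repaired) == original)
```

A code like this silently fixes one flipped bit, which is what
"Corrected error, no action required" is reporting. Two flips in the
same word are beyond what this toy can repair.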
If you can shut the machine down I would run a few passes of
memtest. That will hopefully spot any RAM problems.
If the RAM comes up clean but it keeps happening, I would really
suspect the CPU and plan for a replacement soon.
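To tell whether it "keeps happening", it helps to count the errors
rather than rely on spotting console messages. Below is a minimal
Python sketch that tallies machine-check lines like the ones quoted
above. The regular expression is based only on those two quoted lines,
so other kernels or CPUs may word things differently, and the function
name is my own.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   kernel:[Hardware Error]: MC4 Error (node 0): L3 data cache ECC error.
# The pattern is an assumption derived from the quoted message only.
HW_ERROR = re.compile(r"\[Hardware Error\]: (MC\d+ Error.*)$")


def count_hardware_errors(lines):
    """Return a Counter mapping each machine-check description to its count."""
    counts = Counter()
    for line in lines:
        match = HW_ERROR.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


if __name__ == "__main__":
    # Sample input: the two kernel lines from the quoted message.
    sample = [
        "kernel:[Hardware Error]: MC4 Error (node 0): L3 data cache ECC error.",
        "kernel:[Hardware Error]: Error Status: Corrected error, no action required.",
    ]
    for description, n in count_hardware_errors(sample).most_common():
        print(f"{n:5d}  {description}")
```

To run it against a real log, pass an open file instead of the sample
list, for example count_hardware_errors(open("/var/log/kern.log")).
That path is an assumption; use wherever your system keeps kernel
messages. A count that climbs from week to week is the "keeps
happening" case.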
If the RAM comes up clean and it never happens again, well, then yes,
it could be cosmic rays or similar. I have seen this sort of thing
only a couple of times in 20 years; only one of those times did it
not soon get worse. It's not really enough data to say whether you
are in for a bad time.
Cheers,
Andy
--
https://bitfolk.com/ -- No-nonsense VPS hosting