[Gllug] Diagnosing hardware faults

John Edwards john at cornerstonelinux.co.uk
Mon Nov 29 11:00:18 UTC 2010


On Mon, Nov 29, 2010 at 12:34:26AM +0000, Steve Parker wrote:
> On 28/11/10 19:09, John Edwards wrote:
> > The other possibilities are CPU, motherboard, RAM (memtest will
> > not catch errors that occur under load)
>    
> Could you elaborate on that at all please?

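As components heat up they expand, and there is a small chance
that the electrical contacts to the RAM sticks stop being 100%
reliable. A memtest run on an otherwise idle machine may never
get the hardware hot enough to show this.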

Also, with the high bus speeds we have today there are going to be
small random errors, which are automatically corrected with ECC
RAM but could cause problems in non-ECC RAM. These are most likely
to land in application memory, where they can cause segfaults, but
if one happens in kernel memory it could lead to a crash.
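
If the box does have ECC RAM, one way to see whether corrections are
actually happening is to read the counters that the Linux kernel's
EDAC subsystem exposes in sysfs. The following is only a rough sketch,
not something from the original thread: it assumes an EDAC driver for
your memory controller is loaded, and the function name
read_edac_counts is made up for illustration. On non-ECC hardware the
directory will normally be missing and the function returns nothing.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller_name: (corrected, uncorrected)} error counts.

    Assumption: the kernel EDAC driver is loaded, so each memory
    controller appears as mc0, mc1, ... with ce_count and ue_count files.
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        # No EDAC driver loaded, or no ECC-capable memory controller.
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text().strip())
            uncorrected = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = (corrected, uncorrected)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A corrected count that keeps climbing suggests a marginal DIMM or
contact even though nothing has crashed yet; any uncorrected errors
are a much stronger sign that the RAM needs attention.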

For desktops which are used during office hours this is usually
not noticed, but when you have a server on 24x7 for several years
you should use ECC RAM to prevent problems. I think someone
(John Hearns?) wrote about this in more detail several months ago.

But as the original post talked about crashes every day and did
not mention any application problems I would suspect this is not
the cause.


-- 
#---------------------------------------------------------#
|    John Edwards   Email: john at cornerstonelinux.co.uk    |
#---------------------------------------------------------#