[Gllug] disk problems
Sean Burlington
sean at burlington.eclipse.co.uk
Tue Mar 14 21:27:30 UTC 2006
Nix wrote:
>
> Between May 1989 (my first PC) and Jan 2006 I had two fan failures.
>
> In late Jan 2006 and early-to-mid Feb 2006 I had
>
> - two disk failures (one whose motor died at spinup, one just from
> old age and bearing wear), leading to the decommissioning of an
> entire machine because Sun disks cost so much to replace
> - one motherboard-and-network-card failure (static, oops)
> - one overheating CPU (on the replacement for the static-death box)
> - and some (very) bad RAM (on that replacement).
How many machines do you have?
It can't be so many that that isn't an appalling failure rate!!!
> Everything seems to be stable now, but RAID it is, and because I want
> actual *robustness* I'm LVM+RAID-5ing everything necessary for normal
> function except for /boot, and RAID-1ing that.
RAID 5 = at least 3 hard disks (plus a controller, unless it's software
RAID)...
I can't really justify the expense of that, even though I have had a
couple of failures (and one or two learning experiences a while back).
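If I did, software RAID with mdadm would at least avoid the controller
cost - roughly this shape (untested, device names are only examples):

    # three partitions into a software RAID-5 array
    mdadm --create /dev/md0 --level=5 --raid-devices=3 \
          /dev/sda1 /dev/sdb1 /dev/sdc1
    # LVM on top, as Nix describes
    pvcreate /dev/md0
    vgcreate vg0 /dev/md0
    lvcreate -L 20G -n root vg0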
Pretty much everything important is under version control at work and
backed up to tape nightly.
I just need a more effective way of separating out stuff that needs to
be regularly backed up from the rest of it...
My backups have been getting less frequent as I keep finding that the
stuff I planned to back up is over DVD size!
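Something like this would at least show where the bulk is, and give a
first stab at a backup set that skips it (paths are just examples):

    # which directories under /home are eating the space?
    du -sk /home/* | sort -rn | head
    # back up the lot minus the bulky, replaceable stuff
    tar czf /tmp/home-backup.tar.gz --exclude='*.iso' \
        --exclude='*/video' /home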
And maybe I'll look into something like MondoRescue - even if I end up
spending more time backing up than I would restoring from scratch - at
least I can plan when I do the work.
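For reference, the basic mondoarchive invocation seems to be something
like this (flags from memory of the docs, so double-check them):

    # write a bootable backup as ISO images under /var/backup,
    # excluding /tmp - straight from the MondoRescue docs, untested here
    mondoarchive -Oi -d /var/backup -E /tmp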
> Thanks to initramfs this really isn't actually all that hard :) my /init
> script in the initramfs is 76 lines, and that includes enough error
> checking that if / can't be mounted, or the RAID arrays are shagged, or
> LVM has eaten itself, I get a terribly primitive shell with access to
> mdadm, the lvm tools, and fsck, on a guaranteed-functioning FS which can
> only go away if the kernel image itself has vanished.
>
> (I'm an early adopter; eventually use of an initramfs will be mandatory,
> and even now everyone running 2.6 has an initramfs of sorts built in,
> although it's empty.)
I try not to be an early adopter - or at least I try to stick with
Debian stable where I can and cherry-pick from more up-to-date stuff.
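Still, out of curiosity, a skeleton /init along the lines he describes
might look roughly like this - my guess at the shape, not his actual
script, with made-up device names:

    #!/bin/sh
    # initramfs /init sketch: assemble RAID, activate LVM, mount root,
    # and drop to a rescue shell if any step fails
    mount -t proc proc /proc
    mdadm --assemble --scan
    vgchange -ay
    mkdir -p /new-root
    if mount -o ro /dev/vg0/root /new-root; then
        exec switch_root /new-root /sbin/init
    fi
    echo "root mount failed; dropping to rescue shell" >&2
    exec /bin/sh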
>
> The CPU in this box is nice and cool for a P3/600, probably because I
> overdid it and slung in a huge fan after the first overheating incident:
>
> CPU Temp: +44.2°C (high = +95°C, hyst = +89°C)
>
> :)
lm_sensors has been on my todo list for a bit...
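From what I've read, getting output like that should only take a couple
of commands once the kernel modules are sorted (untested on my box):

    # probe for sensor chips and suggest modules to load (one-off)
    sensors-detect
    # then read the temperatures
    sensors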
>
>
>>I think I'm going to make more effort to switch off overnight in
>>future - it seems to me that it's boxes that get left on for months
>>which have problems.
>
>
> I've left boxes on for years at a time with no problems at all. Most of
> my failures have happened at poweron time (excepting the old-age disk
> and that was fifteen years old when it died and had been running
> constantly for all that time, except for house moves; it had been a
> huge and expensive disk in its day, 4Gb!)
Now I've got things up and running again I'm much happier, and I don't
want to give up MythTV - so it's likely to get left switched on.
What I will do is make a full backup before shutting down such a machine!
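Probably just tar to a spare disk, something like this (paths made up):

    # full backup of the root filesystem, not crossing other mounts
    tar czf /mnt/spare/full-$(date +%Y%m%d).tar.gz \
        --one-file-system /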
>
>>>What is `the system'? /etc/mtab (as used by df(1)) is maintained by
>>>mount(8) and is terribly unreliable; it's confused by mount --bind,
>>>per-process filesystems, chroot(8), mount --move, mount --rmove, subtree
>>>sharing, you name it, it confuses it.
>>>/proc/mounts is maintained by the kernel and is actually reliable. What
>>
>>this seems bad - there was no /proc/mounts
>
>
> Either you don't have /proc mounted or you're running a pre-2.4 kernel.
> Both these things are generally bad signs.
>
It was a 2.6 kernel compiled by me rather than a distribution kernel.
I'm not sure if proc was mounted or not - it was in /etc/fstab, but
since I didn't trust df and couldn't read /proc ... I re-installed to a
different disk.
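Next time I'll check the mount directly before giving up on df - a
quick test, assuming the fstab entry is there:

    # is proc actually mounted? if not, mount it and compare views
    test -e /proc/mounts || mount -t proc proc /proc
    cat /proc/mounts   # the kernel's view - reliable
    cat /etc/mtab      # mount(8)'s view, which can drift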
--
Sean
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug