[Gllug] Partitioning advice needed

Nix nix at esperi.org.uk
Mon Feb 19 23:14:20 UTC 2007


On 19 Feb 2007, Anthony Newman verbalised:

> Nix wrote:
>>> evaluate on the basis of the extra work and the difficulty added to
>>> system recovery if you have, say, a botched kernel upgrade.
>> If you're using initramfs, no difficulty at all. That's one of the
>> things that's so nice about initramfs: because the rootfs contents are
>> linked into the kernel image, they never get out of synch, and if they
>> worked in your old working kernel, they'll still work later, even if
>> you've installed totally broken LVM tools in the meantime.
>
> I have little choice but to use initramfs at home, where I'm running
> root-on-LVM-on-RAID just because I'd never done it before. If I cock
> up the kernel or initramfs (obviously unintentionally), I have to
> recover it by hand

Keep an old known-working kernel around (I call mine
/boot/vmlinuz.stable: a name that the kernel's `make install' doesn't
overwrite automatically) and you'll never need to do that.
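
Something like the following is all it takes. This is a minimal sketch,
not a tested procedure: the function name keep_stable_kernel is made up
for illustration, and the default source path /boot/vmlinuz is an
assumption about where your known-good image lives.

```shell
# Sketch: copy a kernel image you have already booted successfully to a name
# that `make install' will not touch. The function name and the default
# source path are assumptions; adjust both paths to your own layout.
keep_stable_kernel() {
    src=${1:-/boot/vmlinuz}
    dst=${2:-/boot/vmlinuz.stable}
    [ -f "$src" ] || { echo "no kernel image at $src" >&2; return 1; }
    # Copy to a temporary name first so that a failed copy never clobbers
    # the existing stable image, then rename it into place.
    cp -p "$src" "$dst.new" && mv -f "$dst.new" "$dst"
}
```

You still need a bootloader entry pointing at the stable image, and you
should only refresh it from a kernel you have actually booted.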

Obviously I had a lot of interesting reboots when writing the init
script, but that's past now.

> It's odd that init{rd,ramfs} used to be one of those optional things,
> now people seem to see it as indispensable whether you actually need
> it or not.

I see it as indispensable because it adds so *much* flexibility it's not
true, with an extremely elegant implementation which is pretty much
guaranteed never to break (as breaking it would almost require breaking
the concept of mount(2) entirely). (initrd is quite different, a
half-hearted crock with horrible switch-to-real-/ semantics which I'll
be glad to see die.)

>            I just consider it an extra hassle and a hack to make
> things work that unfortunately became the norm.

Er, the kernel *always* has an initramfs linked into it. You can't avoid
it. The VFS would implode the first time it had to traverse its vfsmnts
if the rootfs wasn't sitting there terminating the list.

> I suppose if you use a distro that hides the magic behind some build
> utility it's easier to ignore it.

The kernel *already* hides the build-initramfs-from-stuff-on-the-local-
disk behind its build process. It's almost invisible: drop the right stuff
in your kernel source tree's usr/ and that's it.
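
For the curious, the `right stuff' can be as small as a file list in the
format the kernel's usr/gen_init_cpio understands. The sketch below just
prints such a list; the function name write_initramfs_list and the path
to the init script are hypothetical, and how you hand the list to the
build (for instance via CONFIG_INITRAMFS_SOURCE) depends on your kernel
version.

```shell
# Sketch: print a minimal initramfs file list in gen_init_cpio format.
# Entry formats:
#   dir  <name> <mode> <uid> <gid>
#   nod  <name> <mode> <uid> <gid> <type> <major> <minor>
#   file <name> <source path> <mode> <uid> <gid>
# init_src is a hypothetical path to your own init script.
write_initramfs_list() {
    init_src=$1
    cat <<EOF
dir /dev 0755 0 0
nod /dev/console 0600 0 0 c 5 1
dir /bin 0755 0 0
file /init $init_src 0755 0 0
EOF
}
```

A real list for root-on-LVM-on-RAID would also need entries for the
tools the init script calls.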

>>> Compared to commercial storage options, which effectively provide
>>> highly flexible redundant storage in software, Linux RAID/LVM has a
>>> way to go yet.
>> I dunno. I'd say I've *got* `highly flexible redundant storage in
>> software'. Many of the commercial RAID options in particular come with
>> lovely extra features:
>> <snip informative stuff>
>
> You get what you pay for as usual; the sort of stuff I was alluding to
> would probably cost a significant part of a person's lifetime salary
> over its 3-5 year lifetime.

Guess why I have no experience of it :/ my workplace is trying to save
up for a SAN on the dubious basis that GFS must be crap because it's not
expensive enough. Myself I'd vastly prefer GFS...

>                             There's no reason why the technology under
> such things won't trickle down into the free implementations though,
> provided it isn't too heavily patented.

Ah, well, Neil Brown (md) and Alasdair Kergon and co (LVM) don't need to
worry *too* much about patenting: one's in .au and one lot are in .uk
(but you probably knew that, Cambridge isn't all that terribly far
away).

> I'm being deliberately obtuse because I've been recently trying to
> make Linux RAID/LVM go fast, stay fast and be able to be made to go

Fast with RAID+LVM is interesting: you need to play games with the
offset of the first filesystem relative to the RAID chunks, so that the
LVM metadata doesn't shift your filesystems off chunk boundaries. If it
does, they think they're reading chunk by chunk when in fact each read
covers half of one chunk and half of the next. I scribbled some notes on
this and then lost them... :/
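
The arithmetic itself is simple enough to sketch, though. These are not
the lost notes, just an illustration: the function and variable names
are made up, all sizes are in KiB, and you have to supply the chunk size
and the offset at which LVM starts allocating extents on the PV yourself.

```shell
# Sketch: check whether the start of LVM's data area falls on a chunk
# boundary. All sizes are in KiB; the names are illustrative only.
#   pe_start_kib: offset of the first physical extent on the PV
#   chunk_kib:    md chunk size
is_chunk_aligned() {
    pe_start_kib=$1
    chunk_kib=$2
    [ $(( pe_start_kib % chunk_kib )) -eq 0 ]
}

# Round an offset up to the next chunk boundary (unchanged if already aligned).
next_chunk_boundary() {
    pe_start_kib=$1
    chunk_kib=$2
    echo $(( (pe_start_kib + chunk_kib - 1) / chunk_kib * chunk_kib ))
}
```

With a data area starting at 192 KiB, 64 KiB chunks are fine, but
128 KiB chunks are not: every extent then starts halfway through a
chunk, and the data area would need to start at 256 KiB instead.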

> I've also seen no mention of parity scrubbing of RAID 4/5/6 arrays
> under Linux; anyone know if it's possible?

echo check > /sys/block/md0/md/sync_action

or similar?
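
Spelled out a bit more, as a sketch rather than something I'd vouch for
on every kernel: the function names are made up, md0 is just the example
array from above, and whether your kernel's md driver exposes these
sysfs files at all is something to check first.

```shell
# Sketch: request a check pass on one md array and read back the mismatch
# count. md_dir defaults to md0's sysfs directory; pass another array's
# directory as the first argument. Function names are illustrative.
start_md_check() {
    md_dir=${1:-/sys/block/md0/md}
    [ -w "$md_dir/sync_action" ] || {
        echo "cannot write $md_dir/sync_action" >&2
        return 1
    }
    echo check > "$md_dir/sync_action"
}

# mismatch_cnt is only meaningful once the check has finished, that is,
# once sync_action reads "idle" again.
md_mismatches() {
    md_dir=${1:-/sys/block/md0/md}
    cat "$md_dir/mismatch_cnt"
}
```

Run start_md_check from cron at a quiet time and look at md_mismatches
afterwards; a nonzero count means the array's redundancy data disagreed
somewhere.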

>                                            I saw 500GB of data go down
> the pan last year because of an undetected disk error on a 14 disk
> RAID5 array. Which was nice.

Why on earth would a disk error go undetected? (And if a disk error
going undetected can smash your parity undetectably, aren't you already
running degraded?)

>>> [0] Maybe there are experimental ones, but not that I'd use for any
>>> data I care about
>> ext3?
>
> I probably should have mentioned that I was thinking "while both
> online and mounted" in the same thought, although I failed to mention
> that at any point. Resizing unmounted file systems is tantamount to
> recreating them, and thus cheating :)

ext3 has online resizing, I think. (ext2 certainly does and I think
this was up-ported to ext3 as well some time ago.)

-- 
`In the future, company names will be a 32-character hex string.'
  --- Bruce Schneier on the shortage of company names