[Nottingham] LVM and load balancing and ReiserFS

Sun Oct 28 11:46:13 GMT 2007

On Sat, 2007-10-27 at 13:43 +0100, Martin wrote:
> Duncan John Fyfe wrote:
> [---]
> > 
> > What are you trying to achieve ?
> 
> Thanks for the very good summaries. Time to think up a working scenario.
> 
> I've great prejudice against RAID for the increased latency and the mess
> if it goes wrong. If you're wanting a warm fuzzy feeling of resilience
> for your data, I much prefer using rsync/mirroring onto an independent
> system backed up with good backups all round.
> 
> 
> So... I have a large (1 TB +) ever expanding chunk of frequently
> accessed data that can't easily be divided up.

> So, being as it's got to go across multiple disks, I'd like to have some
> flexibility to swap in/out disks as the data grows further, and also
> gain some IO performance. All without needing to completely rewrite the
> whole lot onto new disks for each upgrade.
> 
> 

LVM2 won't stop a mess being made when things go wrong and, as Andy has
already said, RAID != backup.  RAID == (hopeful) fault tolerance.
If you want a filesystem to span disks you need either LVM2 (with or
without RAID) or a filesystem (lustrefs, ocfs2, unionfs) which might do
it for you.  No matter how you do it, you need to deal with the fact
that a disk (or more) in that set can go fubar.

If you want to gain IO performance then you should be using (possibly
hardware) RAID to aggregate the disk performance.

Remember that while hard disk transfer rates have improved considerably
over the years, seek times have not.  Seek times are ~ 5-10 ms so you
might expect ~ 100 -- 200 seeks per second from a disk.  Given this, if
your usage is in effect a random access pattern (handling lots of small
files or serving up lots of small chunks from a big file to different
users) then you will get better performance from 1TB spread across
several disks in a RAID than from a single 1TB disk.
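
A rough sketch of that arithmetic, for what it is worth.  The function
names are made up for illustration, and it assumes every request costs
one full seek and that independent requests spread evenly over the
disks; rotational latency, caching and RAID write penalties are ignored.

```python
def seeks_per_second(seek_time_ms):
    """Upper bound on random seeks per second for a single disk."""
    return 1000.0 / seek_time_ms


def aggregate_seeks_per_second(seek_time_ms, n_disks):
    """Idealised bound when requests spread evenly over n_disks."""
    return n_disks * seeks_per_second(seek_time_ms)


if __name__ == "__main__":
    for ms in (5, 10):
        single = seeks_per_second(ms)
        four = aggregate_seeks_per_second(ms, 4)  # 4 disks is an arbitrary example
        print(f"{ms} ms seek: ~{single:.0f}/s on one disk, ~{four:.0f}/s over 4 disks")
```

Real arrays will fall short of the multi-disk figure, but it shows why
spreading random IO over more spindles helps.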

> One option might be LVM with linear addressing *IF* ReiserFS does (or
> can be configured to) spread the data evenly-ish across the full
> physical storage space.
> 
???
If you configure LVM2 to be in linear mode you are telling it to fill
the physical devices sequentially.  (These could be disk partitions; it
is not restricted to whole devices.)  The overlying filesystem cannot
change this.
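
To illustrate the difference, here is a toy model of linear versus
striped placement.  It is only a sketch and not LVM2's real allocator:
the function names are hypothetical, the sizes are arbitrary, and real
LVM2 stripes in stripe-size units rather than one unit at a time.

```python
def linear_map(logical_extent, pv_sizes):
    """Map a logical extent to (pv_index, extent_on_pv), filling PVs in order.

    pv_sizes is a list of physical volume sizes, counted in extents.
    """
    offset = logical_extent
    for pv_index, size in enumerate(pv_sizes):
        if offset < size:
            return pv_index, offset
        offset -= size
    raise ValueError("logical extent is beyond the end of the volume")


def striped_map(logical_chunk, n_pvs):
    """Map a logical chunk to (pv_index, chunk_on_pv) by simple round-robin."""
    return logical_chunk % n_pvs, logical_chunk // n_pvs


if __name__ == "__main__":
    for i in range(6):
        print(i, "linear:", linear_map(i, [3, 3]), "striped:", striped_map(i, 2))
```

In the linear case the second device is not touched until the first is
full, whatever filesystem is on top.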

> Does ReiserFS optimise to use HDD cylinders to minimise head movements?
> If so, will ReiserFS still pick that up from LVM?...
> 

As has already been said, it cannot.
If reiserfs is sat on top of a virtual block device (RAID / LVM2) what
is the meaning of cylinders, head movement ... ?  You can build software
RAID on top of USB sticks plugged into USB ports if you want to.  Again,
hard disk properties have no meaning.  Yes, you can put RAID5 across N+1
USB sticks, pull one and reconstruct the array on replacing the pulled
one.  One more reason why Linux is fun/quirky :)
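
The reconstruction trick is just XOR parity.  A minimal sketch of one
stripe, assuming two data devices plus one parity device (the byte
strings are invented, and real RAID5 also rotates the parity across the
devices, which is ignored here):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"stick-one", b"stick-two"]  # two data "devices" (made-up contents)
parity = xor_blocks(data)            # the third device holds the parity

# Pull the first device, then rebuild it from the two survivors.
rebuilt = xor_blocks([data[1], parity])
print(rebuilt == data[0])
```

Any single missing block can be recovered the same way, which is why
the array survives losing one device but not two.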

mke2fs has options to help align the filesystem with an underlying RAID
configuration (-E stride=).  I don't know about reiserfs.  If you do use
reiserfs, make sure you use a recent kernel.  There was a long-standing
feature/bug fixed in June/July (??) which had an impact on performance
and fragmentation of files and was what led the mythtv people to
recommend not using reiserfs.
Details splattered around this thread: http://lkml.org/lkml/2006/7/21/109
(or is mentioning this spreading FUD?)
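
For the mke2fs case, the stride is the RAID chunk size counted in
filesystem blocks.  A small sketch of working it out; the 64 KiB chunk
and 4 KiB block size are assumed example values, not a recommendation,
and the function name is hypothetical:

```python
def ext_stride(chunk_size_kib, block_size_kib=4):
    """Filesystem blocks per RAID chunk, i.e. the N for mke2fs -E stride=N."""
    if chunk_size_kib % block_size_kib:
        raise ValueError("chunk size should be a multiple of the block size")
    return chunk_size_kib // block_size_kib


if __name__ == "__main__":
    # Assumed example: 64 KiB RAID chunks under a filesystem with 4 KiB blocks.
    print("stride =", ext_stride(64))
```

Check the chunk size your array actually uses before plugging a number
in.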

Personally, unless I was pushed for space I would configure it as RAID5
(2+1) with LVM2 on top.  If I had the cash I would also invest in a
hardware RAID card - not necessarily to use the hardware RAID but to get
some battery backed disk cache.
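
On the space cost of RAID5 (2+1): one disk's worth of capacity goes to
parity.  A quick sketch, assuming equal-sized disks; the 500 GB figure
is only an example and the function name is made up:

```python
def raid5_usable(n_disks, disk_size_gb):
    """Usable capacity of a RAID5 set of equal-sized disks, in GB."""
    if n_disks < 3:
        raise ValueError("RAID5 needs at least three disks")
    return (n_disks - 1) * disk_size_gb


if __name__ == "__main__":
    # Assumed example: three 500 GB disks in a 2+1 layout.
    print(raid5_usable(3, 500), "GB usable")
```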

Have fun,
Duncan
> 
> Or is LVM clever about striping if you then add another one or two HDDs?
> 
> 
> Or any better ideas?
> 
> Regards,
> Martin