[Gllug] Filesystems again :)
pauln at truemesh.com
Thu May 16 11:17:11 UTC 2002
On Thu, May 16, 2002 at 12:53:31PM +0200, John Hearns wrote:
> On Thu, 2002-05-16 at 11:37, pauln at truemesh.com wrote:
> > More pondering on how to do high availability storage :)
> >
> What do you mean by high availability storage?
Sadly not a SAN (we had EMC at my last place), as there's no budget. I
have a cluster of machines, and need all of them to get their data from
a central source (which is robust).
The servers get both user data and live feeds (updated every 2-4 mins).
The cluster is not physically in one rack (so shared storage may be
out), however they are on a fast local network.
The cluster is round robin, meaning writes to the storage could occur via
any of the servers. The servers currently both read and write the data
on a single server, while the feed gets delivered to all members. I'd
like to consolidate both into a more robust solution.
What I've played with so far:
1) rsync/unison in cron between all machines - this works, but is
expensive in polling. Also, as more machines are added, more syncs need
to be created.
2) rsync/unison event triggered using FAM. Less expensive as no
polling, however when cp'ing large files (I tried an iso), it was sync'd
in chunks, meaning a read may be incomplete.
3) nfs/samba/dav - still a single point of failure with the current
setup, as there is only one write master. Need to see if I can make the
write side robust (see point 5).
4) Berkeley DB - looks like it will do all I need, but code changes
would be needed.
5) Buy more machines and make the back end redundant with takeover (LVS,
http://people.redhat.com/jrfuller/cms/index.html, or something). This
could then use DAV/samba/whatever and failover.
6) Intermezzo and intersync - just compiling atm.
7) coda - haven't looked at yet.
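On the incomplete-read problem in (2): one possible way around it is for
the writer to put the file under a temporary name in the same directory
and rename it into place only once it is complete. A minimal sketch,
assuming a local POSIX filesystem and that the sync is triggered on the
final name only; publish_atomically and the ".incoming-" prefix are
made-up names for illustration, not part of rsync, unison or FAM:

```python
import os
import tempfile


def publish_atomically(data: bytes, dest_path: str) -> None:
    """Write data so readers see the old file or the new one, never a partial one."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # The temp file must live on the same filesystem as the destination,
    # otherwise the final rename is not atomic. Note that mkstemp creates
    # the file with mode 0600; chmod it if other users need to read it.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".incoming-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        # Atomically swap the complete file into place.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Whatever watches the directory would need to ignore the ".incoming-*"
names for this to help. Over NFS the rename guarantees are weaker, so
this is only a sketch of the idea, not a tested fix.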
I don't think there is an easy headless way to do it, so write master
with takeover is probably the way to go. It's a production system so
I'd prefer not to be running something bleeding edge.
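To make "write master with takeover" concrete, here is a rough sketch of
the selection rule only: every node keeps the time it last heard a
heartbeat from each candidate, and the first candidate in a fixed
priority list that is still fresh is treated as the master. The names
(pick_write_master, the 10 second timeout) are assumptions of mine, not
how LVS or any particular failover package does it:

```python
from typing import Dict, List, Optional


def pick_write_master(
    priority: List[str],
    last_heartbeat: Dict[str, float],
    now: float,
    timeout: float = 10.0,
) -> Optional[str]:
    """Return the highest-priority node whose heartbeat is recent enough."""
    for node in priority:
        seen = last_heartbeat.get(node)
        # A node counts as alive only if we have heard from it within `timeout`.
        if seen is not None and now - seen <= timeout:
            return node
    # Nobody is alive as far as this node can tell.
    return None
```

For example, with priority ["store1", "store2"], store2 takes over as
soon as store1's last heartbeat is more than the timeout old. This does
nothing about two nodes that each think the other is dead, which needs
fencing or a quorum on top.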
Problems I've thought about: write collisions, merging, node removal,
nodes not being able to talk to other nodes.
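For the write collision case, one common approach is a version check on
the write master: each writer says which version it last read, and the
write is refused if someone else got in first. A toy in-memory sketch
(VersionedStore and VersionConflict are hypothetical names, and a real
store would need locking and persistence):

```python
from typing import Any, Dict, Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    def __init__(self) -> None:
        # key -> (version, value); a missing key is treated as version 0.
        self._items: Dict[str, Tuple[int, Any]] = {}

    def read(self, key: str) -> Tuple[int, Any]:
        return self._items.get(key, (0, None))

    def write(self, key: str, value: Any, expected_version: int) -> int:
        current, _ = self._items.get(key, (0, None))
        # Refuse the write if the key changed since the caller read it.
        if current != expected_version:
            raise VersionConflict(
                f"{key}: expected version {expected_version}, found {current}"
            )
        self._items[key] = (current + 1, value)
        return current + 1
```

Two servers that both read version 0 and then both write will see the
first write succeed and the second raise VersionConflict, at which point
the loser has to re-read and merge or retry. It detects the collision
but does not solve merging.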
The round robin is annoying as it could mean that you make a change, and
when you look, you're looking at a machine it hasn't propagated to.
Paul
--
Gllug mailing list - Gllug at linux.co.uk
http://list.ftech.net/mailman/listinfo/gllug