[Gllug] Cluster management s/ware

James Hawtin oolon at ankh.org
Fri Nov 27 16:28:49 UTC 2009


> A compute cluster (ie number crunching), resilience is not the issue.
> Currently file sharing is via NFS (although each node has some hard disk),
> they have a SAN so use of GFS is possible - not looked at that yet.
> The nodes talk to each other by MPI (Message Passing Interface).
> 
> They want to be able to refresh (rebuild) the compute nodes with different
> installs.
> 

GFS really requires the full cluster suite of tools. While it works fine, it
would take some work to automate adding it to the cluster. The automated
builds could be done very nicely using network booting to run an automated
install. Using a simple script, the machines could decide their role in life
by what IP address they have, etc. As part of the network boot, a machine
could check the install currently on it: if it's wrong it does an install,
otherwise it boots from local disk. Replacing machines would be nice and
easy too.
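
A rough sketch of the decision the network boot step could make, in Python.
Everything named here is made up for illustration: the subnets, the role
names, the install labels and the "install:..." / "boot-local" action
strings are assumptions, not anything from the existing setup. It only shows
the logic (role from IP address, compare wanted install with what is on the
disk); how the node reports its current install and how the result gets
turned into a boot menu entry is left open.

```python
import ipaddress

# Hypothetical table: subnet -> (role, install that role should be running).
ROLE_BY_SUBNET = {
    "10.0.1.0/24": ("compute", "compute-v2"),
    "10.0.2.0/24": ("storage", "storage-v1"),
}


def role_for_ip(ip):
    """Return (role, wanted_install) for an address, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for subnet, entry in ROLE_BY_SUBNET.items():
        if addr in ipaddress.ip_network(subnet):
            return entry
    return None


def boot_action(ip, installed):
    """Decide what a node should do at network boot.

    `installed` is whatever label the node reports for its current install
    (None if it has nothing). Unknown addresses are left alone and just
    boot from local disk.
    """
    match = role_for_ip(ip)
    if match is None:
        return "boot-local"
    _role, wanted = match
    if installed != wanted:
        return "install:" + wanted
    return "boot-local"
```

With this table, a node at 10.0.1.17 reporting "compute-v1" would be told to
reinstall, and the same node reporting "compute-v2" would boot from local
disk. A replacement machine given the same IP address picks up the same role
with no extra configuration.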

James
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



