[Gllug] Redundant hosting recommendation
Vidar Hokstad
vidar at aardvarkmedia.co.uk
Fri Nov 7 10:28:01 UTC 2008
On 6 Nov 2008, at 13:57, Simon Perry wrote:
> I need to come up with a hosting solution for a new site that offers
> 100% uptime but not having done this before I don't know what solutions
> are available? What I have come up with so far is;
>
> 1) Two dedicated servers in separate data centres with a 3rd party
> round robin DNS service
If you're serious about this, at least pick two different hosting
providers for the data centers.
I've experienced:
- hosting providers going bankrupt, forcing us to move servers at a
day's notice (dot com bubble... Luckily we *didn't* rely on a single
provider)
- hosting providers that turn out to have systemic problems with
the way they handle hardware or software upgrades that affect several
of their sites at once, meaning the disaster recovery site would fail
at the same time as the primary far more often than you should expect.
- hosting providers who route all their outbound traffic from
multiple data centres through the same network path, so failures with
their network providers make everything fail at once.
Also consider that for two data centres to provide full failover you
need at least twice the equipment needed to run the site - each data
centre must be able to take the full load. If traffic is high enough,
going for three sites can be a cheaper alternative, as it allows you to
"only" have 50% extra capacity at each site and still be able to
handle a full site failure (the tradeoff is a slightly higher risk of
having to do a failover, of course, as you now have three sites that
can fail).
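
The arithmetic behind that can be sketched in a few lines of Python.
This is only an illustration: the function names are made up, the 1%
per-site failure probability is a hypothetical placeholder, and the
last function assumes sites fail independently, which the shared-
provider problems listed above show is often not true.

```python
from fractions import Fraction


def per_site_capacity(n_sites: int, failures: int = 1) -> Fraction:
    """Share of total peak load each site must be able to carry so the
    surviving sites can still serve everything after `failures` sites die."""
    survivors = n_sites - failures
    if survivors < 1:
        raise ValueError("need at least one surviving site")
    return Fraction(1, survivors)


def total_provisioned(n_sites: int, failures: int = 1) -> Fraction:
    """Total capacity bought, as a multiple of the peak load."""
    return n_sites * per_site_capacity(n_sites, failures)


def p_any_site_down(n_sites: int, p_site: float) -> float:
    """Chance that at least one site is down, assuming independent failures."""
    return 1 - (1 - p_site) ** n_sites


if __name__ == "__main__":
    P_SITE = 0.01  # hypothetical per-site failure probability, not a measurement
    for n in (2, 3, 4):
        print(n, per_site_capacity(n), total_provisioned(n),
              round(p_any_site_down(n, P_SITE), 4))
```

With two sites each one must carry the full load (2x in total); with
three, each carries half (1.5x in total), at the price of a higher
chance that some site is down at any given moment.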
DNS round robin is not entirely reliable, though providers like
UltraDNS / Neustar do a decent (but expensive) job of handling
failover. The problem is that you'll find many people cache DNS
entries longer than they are supposed to (i.e. beyond the TTL). You
*will* be unavailable to many people for at least a few minutes if a
single site fails and you rely on DNS. If this is not acceptable, your
only real choice if you want to do it all yourself is to set up BGP
(or have an ISP do it for you) so you can announce multiple routes and
have both sites answer on the same IP. You may still get "blips" when
the site fails over, but they can be made a lot shorter.
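
To illustrate the DNS case: a client holding a single stale cached
address for a dead site simply fails, whereas a client or monitoring
script that looks at every address in the round-robin set can move on
to the next one. A minimal Python sketch, with hypothetical function
names; the probe is a plain TCP connect, which is only an assumption
about what counts as "up" for your site.

```python
import socket
from typing import Callable, Iterable, List, Optional


def resolve_all(host: str, port: int) -> List[str]:
    """Return every distinct address the resolver hands back for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addrs: List[str] = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        if sockaddr[0] not in addrs:
            addrs.append(sockaddr[0])
    return addrs


def tcp_alive(addr: str, port: int, timeout: float = 2.0) -> bool:
    """Crude health probe: can we open a TCP connection at all?"""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(addrs: Iterable[str], port: int,
                    probe: Callable[[str, int], bool] = tcp_alive
                    ) -> Optional[str]:
    """Try each address in order; return the first that answers, else None."""
    for addr in addrs:
        if probe(addr, port):
            return addr
    return None
```

Something like first_reachable(resolve_all("www.example.com", 80), 80)
would be the typical call. Most browsers will not do this for you
consistently, which is why a dead site behind round robin DNS still
means an outage for some visitors.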
If your content is mainly static, you can sidestep (part of) the
problem by using a content distribution network such as Panther
Express, which can be set up to cache everything that has been
requested from your site more or less permanently. As a bonus you get
geographic load balancing that can significantly speed up access from
outside the UK - in one case, when I was testing Panther at a company
I worked for, they served up content to our test script faster than
our web server located on the local host. Most of these networks use
BGP to announce multiple routes, and so will be very reliable if well
managed. Some of them can also be set up to talk to multiple backends
at your end, so that _they_ handle the load balancing / failover
between your sites. They are at best problematic for dynamic content
though, as they are geared towards longer term caching.
Keep in mind that some data centres - especially smaller ones - in the
London area are also vulnerable to failures in core infrastructure in
Docklands - ask hard questions about where they get their
connectivity. You might find both the data centres you pick get all
their bandwidth from the exact same locations (one of the peering
points in Docklands, typically - if you're extra unlucky they may be
going via the same fibre bundles). The big peering points are
typically very reliable, but there are enough alternatives with
backup paths that don't go via Docklands that there's no need to
take the chance if the uptime is that critical to you.
> 2) A managed redundant hosting solution. Can anyone recommend a
> supplier?
I don't know anyone that sells this as a "packaged" solution. As other
people have pointed out, the data synchronization issues and
complexities that appear the moment dynamic data is involved tend to
be very application specific.
--
Vidar Hokstad
Technical Director
Aardvark Media Limited