[Nottingham] MySQL backed nameserver

Fri Nov 3 09:59:49 GMT 2006

On 02/11/2006 23:14, Tom Bird wrote:
> Well, the idea behind bind-dlz is a good one, and should be introduced 
> into the main BIND release IMO, however I do feel that backing it with 
> SQL databases is just pure folly.  I recently was the happy recipient of 
> a DNS attack of some kind, I still am not sure exactly why it happened 
> but I was seeing upwards of 10Mbit/s of DNS traffic at times on my two 
> hosted resolvers, thousands of queries a second (imagine the size of a 
> query packet and you get some idea).  BIND running as it does served the 
> queries and shrugged it off with a few percent of CPU used.

...because everything is in RAM. Nice and low-impact.

> Now imagine each query hits a full fat database rather than the quick 
> and cheap DB, optimised for serving DNS internal to BIND.  Plop, there 
> go my services.

Aha! There are a number of tricks to fix this, the first of which I would 
do (nowadays):

1. Frontline "authoritative" (from a client perspective) servers run a 
caching, forwarding-only service bound to the service IP addresses.

2. The forwarders for those could be local (bound to localhost or a 
private IP address) or on a separate machine. Either way, only the 
forwarder - which really is the authoritative server - has access to the 
DB, which could again be local or on a separate machine.

3. The DB instances being queried by the forwarders must support query 
caching, and must be slaves of another DB server (or more than one).

4. That (those) DB server(s) should NOT BE THE MASTER.

5. The master DB(s) should be on an entirely hardened, private box which 
then replicates forwards to the various slaves.
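
To illustrate why points 1 and 2 take the load off the DB, here is a 
rough sketch in Python. It is a toy model, not BIND: the class names, 
the single fixed TTL and the in-memory "DB" are all assumptions made up 
for the example (real negative caching uses a TTL taken from the SOA).

```python
import time


class DbBackedAuthority:
    """Hypothetical stand-in for the DB-backed authoritative server."""

    def __init__(self, records):
        self.records = records      # {(name, rtype): value}
        self.db_queries = 0         # lookups that actually reached the DB

    def resolve(self, name, rtype):
        self.db_queries += 1
        return self.records.get((name, rtype))  # None = negative answer


class CachingForwarder:
    """Hypothetical stand-in for the frontline caching, forward-only server."""

    def __init__(self, upstream, ttl=300.0, clock=time.monotonic):
        self.upstream = upstream
        self.ttl = ttl              # assumed: one TTL for every answer
        self.clock = clock
        self.cache = {}             # {(name, rtype): (expires_at, answer)}

    def resolve(self, name, rtype):
        key = (name, rtype)
        now = self.clock()
        hit = self.cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]           # served from RAM, the DB never sees it
        answer = self.upstream.resolve(name, rtype)
        # Cache positive and negative (None) answers alike.
        self.cache[key] = (now + self.ttl, answer)
        return answer


def demo():
    authority = DbBackedAuthority({("www.example.org", "A"): "192.0.2.10"})
    frontline = CachingForwarder(authority, ttl=300.0)
    for _ in range(10000):
        frontline.resolve("www.example.org", "A")    # exists
        frontline.resolve("nope.example.org", "A")   # does not exist
    return authority.db_queries
```

In this model the 20,000 client queries in demo() cost the backend one 
lookup per distinct name, as long as the cached entries have not expired. 
A flood of random, never-repeated names would still get through, which 
is where the DB-side caching and the slave layer come in.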

What this gives you is a number of buffers against various types of 
problem. Starting at the back:

  = Database backups can be done on the intermediate DB server(s) 
instead of the master. With MySQL at least, a complete dump locks 
the tables and would prevent any management interface from successfully 
doing any INSERT or UPDATE operations until the dump finished.

  = You can replicate in real time, or start/stop either the slave or 
the master link in some way such that corruption to the master isn't 
automagically propagated to every other machine.

  = The frontend servers' BIND (or otherwise) instances will cache 
positive and negative responses from the forwarders and will therefore 
not send as many queries to the DB.

  = The DB instances being queried by the forwarders will cache queries 
(positive and negative) and will return results from RAM rather than disk.

  = Having a timed replication means that the tables don't change in 
real time, so the query cache will remain until a replication takes 
place. Realtime replication of a fluid DB, with many INSERT, UPDATE 
or DELETE operations taking place, will destroy the query cache very 
quickly and reduce its efficiency.
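
The last point can be sketched with a small simulation, again in Python 
and again a toy: it assumes a single table, assumes that applying any 
replicated write throws away every cached query on that table (which is 
how the MySQL query cache treats a modified table), and uses a made-up 
workload. The function names and parameters are hypothetical.

```python
def cache_hits(events, apply_every):
    """Count query-cache hits on a slave for a stream of events on one table.

    events:      list of ("read", key) or ("write", None)
    apply_every: replicated writes are applied to the slave once every this
                 many events; 1 approximates realtime replication.
    """
    cache = set()       # keys whose query result is currently cached
    pending = 0         # writes received from the master, not yet applied
    hits = 0
    for i, (kind, key) in enumerate(events, start=1):
        if kind == "write":
            pending += 1
        elif key in cache:
            hits += 1
        else:
            cache.add(key)      # miss: run the query, cache the result
        if pending and i % apply_every == 0:
            cache.clear()       # applying writes invalidates the cached queries
            pending = 0
    return hits


def workload(n=1000, write_every=10, names=5):
    """Made-up traffic: one write per `write_every` events, reads otherwise."""
    events = []
    for i in range(n):
        if i % write_every == 0:
            events.append(("write", None))
        else:
            events.append(("read", "host%d" % (i % names)))
    return events


realtime = cache_hits(workload(), apply_every=1)
timed = cache_hits(workload(), apply_every=500)
```

With this workload the timed run gets noticeably more cache hits than 
the realtime one, because the cache is flushed twice instead of after 
every write. The price is that the slaves serve stale data until the 
next replication run, so the interval is a trade-off rather than a free 
win.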

Taken together, for a large system the steps above *should* result in a 
nice, robust system. They might not, mind you; I haven't touched that 
code in several months. Someone else is sort-of responsible for it until 
next Thursday, and then, well, who knows!

Graeme