[Wolves] Tinkle tinkle little disk...

chris procter chris-procter at talk21.com
Thu Jan 3 21:17:16 GMT 2008


--- Stuart Langridge <sil at kryogenix.org> wrote:
> > >read them. That's all. There are three problems
> with this, none of
> > >which I have a clear idea how to solve. They are:
> > >  rsyncness
> > >  encryption key loss
> > >  bandwidth
> >
> >
> > You could break the file up into blocks (say 4KB
> each) then calculate the difference between each
> block and the previous back up of that block, then
> if theres any changes, encrypt the diff, compress it
> and send it over.
> 
> ...also known as: reinvent rsync. You're right,
> though :-)

Not really: rsync divides the backup file into blocks
and calculates checksums for each block, then goes
through the live file byte by byte looking for blocks
of the same size whose checksums match. That avoids the
problem with fixed-size blocks described below.
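
A rough sketch of that matching step, in Python, using the 123456789abc
example from later in this message. The names (signatures, delta, BLOCK)
are made up for illustration, and the weak checksum here is a plain sum()
recomputed for every window, whereas real rsync updates a rolling checksum
incrementally. Treat it as a toy under those assumptions, not as rsync's
actual code.

```python
import hashlib

BLOCK = 4  # toy block size, matching the 4-character example in this thread


def signatures(old: bytes) -> dict:
    """Map (weak, strong) checksums of each full block of the backup copy
    to its 1-based block number."""
    sigs = {}
    for i in range(0, len(old) - BLOCK + 1, BLOCK):
        block = old[i:i + BLOCK]
        key = (sum(block), hashlib.md5(block).digest())
        sigs.setdefault(key, i // BLOCK + 1)
    return sigs


def delta(new: bytes, sigs: dict) -> list:
    """Slide a window over the live file one byte at a time, emitting
    ("block", n) for data the backup already has and ("literal", bytes)
    for everything else."""
    weak_seen = {weak for weak, _ in sigs}
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK]
        match = None
        # Cheap weak check first; only compute the strong hash on a weak hit.
        if len(window) == BLOCK and sum(window) in weak_seen:
            match = sigs.get((sum(window), hashlib.md5(window).digest()))
        if match is None:
            literal.append(new[pos])
            pos += 1
        else:
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("block", match))
            pos += BLOCK
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops
```

Calling delta(b"23456789abc", signatures(b"123456789abc")) gives three
literal bytes followed by references to blocks 2 and 3, so only "234" has
to cross the wire.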

Yes, I've read Tridge's rsync paper :)


> To be honest, it's probably better to do what Peter
> suggested, and
> break the files up into blocks, and encrypt each
> block, then just ship
> changed blocks. This does rely on a small change in
> a large file not
> having effects throughout the file, but that's
> relatively unlikely.

If you just ship changed blocks then you have a
problem.

Take 123456789abc, split it into three blocks
1234 5678 9abc, and send them all to the backup machine.

Delete the first character (the 1), then split it into
blocks again and you get
2345 6789 abc

which are all different from the original blocks we
sent, so they all need to be sent again.

If you only ship diffs then you just send "delete the
first byte", or if you're rsync you send "234, then
block 2, then block 3".
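
The fixed-block problem is easy to reproduce. This is a minimal Python
sketch with made-up helper names (split_blocks, changed_blocks) that
compares blocks position by position, which is all a "ship changed
blocks" scheme can do:

```python
def split_blocks(data: str, size: int = 4) -> list:
    """Cut data into consecutive fixed-size blocks (the last may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(old: str, new: str, size: int = 4) -> list:
    """Return the indices of blocks in `new` that differ from the block
    at the same position in `old`, i.e. the blocks that must be re-sent."""
    old_b = split_blocks(old, size)
    new_b = split_blocks(new, size)
    return [i for i, blk in enumerate(new_b)
            if i >= len(old_b) or old_b[i] != blk]
```

With old = "123456789abc", overwriting one character in place dirties a
single block, but deleting the leading 1 shifts every boundary and
changed_blocks reports all three blocks as changed.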


> > You actually dont need to reconstruct the file
> from the diffs untill you restore it so all the
> remote site ends up storing is encryted compressed
> messages that say "at 23:00 on 29thFeb in file X in
> the 23rd 4k block changed the seventh byte from a to
> e". To restore you get all of the patch files and
> rerun them in time order.
> 
> Erf. Then reassembling your backups becomes a
> horrible jigsaw puzzle,
> I think, which is made a lot more complex by how
> some of your backup
> nodes likely won't be on the network :)

Well, you timestamp the pieces and reassemble them in
order, so it's not that hard really, and having missing
nodes doesn't affect this system more than any other.
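
Roughly what I mean by replaying in time order, as a Python sketch. The
Patch record and its fields are invented for illustration, encryption and
compression are left out, and the check against the old bytes is my own
addition so that a gap in the patch history is noticed rather than
silently producing a wrong file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    timestamp: float  # when the change was recorded
    offset: int       # byte offset in the file
    old: bytes        # bytes expected at that offset before the change
    new: bytes        # bytes to put there instead


def restore(base: bytes, patches) -> bytes:
    """Rebuild a file by applying patches to the base copy, oldest first."""
    data = bytearray(base)
    for p in sorted(patches, key=lambda p: p.timestamp):
        end = p.offset + len(p.old)
        if bytes(data[p.offset:end]) != p.old:
            raise ValueError(f"patch at {p.timestamp} does not fit; "
                             "an earlier patch is probably missing")
        data[p.offset:end] = p.new
    return bytes(data)
```

The patches can arrive from the storage nodes in any order, since
restore() sorts them by timestamp before applying anything.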

 
Actually, thinking about it, the easiest way to do all
this is to say stuff it, encrypt the complete files
and transfer everything, trusting that we'll all have
100Mb fibre to the home before the end of the decade
anyway :)

> Yeah, so I gotta ask the user for a passphrase when
> they first use the
> system. Still, that's the flaw with encrypting it, I
> suppose...

If you can come up with an encryption system that
doesn't require any information from the user or their
hardware then I will bow before you :)


chris

