[Gllug] Hacker Attack, and a wild aside about version-controlled filesystems

Tethys sta296 at astradyne.co.uk
Thu Jan 12 01:30:35 UTC 2006


Nix writes:

>File access is, unfortunately, going to be slow, scaling as O(mn) where
>m = the number of versions needed to traverse to reconstruct the file
>contents, and n = the number of blocks in the file.

Not necessarily. A few alternative options:

1. Use an RCS-like approach of storing the most recent version in its
   entirety, and deltas back to the beginning of time, to make m=1 for
   accessing the current version. Yes, as Larry McVoy is so keen to point
   out, this can lead to undetected corruption of previous versions.
   However, his insistence on using SCCS style storage isn't the only
   solution. You could also have a periodic reconstruction of the first
   checked-in version, which could be checksummed to detect corruption.
   The period can be as frequent or infrequent as you wish, but pick a
   sensible default, and you should be fine.
   
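A minimal sketch of that reverse-delta-plus-checksum idea (the class name
and the use of Python's difflib are mine for illustration, not how RCS
actually stores things):

```python
import difflib
import hashlib

class RcsStore:
    """Reverse-delta store: full text of the newest revision, plus
    deltas walking backwards through history, so m=1 for the head."""

    def __init__(self):
        self.head = None      # full lines of the newest revision
        self.rdeltas = []     # rdeltas[i] rebuilds revision i from revision i+1
        self.base_sum = None  # checksum of revision 0, for corruption checks

    def commit(self, lines):
        if self.head is None:
            # First revision: record its checksum so later periodic
            # reconstructions can be verified against it.
            self.base_sum = hashlib.sha256("\n".join(lines).encode()).hexdigest()
        else:
            # Delta leading from the new head back to the old head.
            self.rdeltas.append(list(difflib.ndiff(lines, self.head)))
        self.head = list(lines)

    def checkout(self, rev):
        # The head is free; older revisions apply reverse deltas in turn.
        lines = self.head
        for delta in reversed(self.rdeltas[rev:]):
            lines = list(difflib.restore(delta, 2))
        return lines

    def verify_base(self):
        # The "periodic reconstruction": rebuild revision 0 and compare
        # its checksum with the one recorded at commit time.
        rebuilt = "\n".join(self.checkout(0)).encode()
        return hashlib.sha256(rebuilt).hexdigest() == self.base_sum
```

Run verify_base() as often as your paranoia demands; a mismatch means a
delta somewhere in the chain has been corrupted.
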
2. Store a full version of the file every c revisions, so at most you need
   to go back c revisions, rather than allowing m to grow arbitrarily large.
   Yes, this incurs a storage overhead, but for most non-niche cases today,
   storage is cheap.
   
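A sketch of the periodic-snapshot scheme (again my own illustration,
using difflib for the deltas), where a checkout never has to apply more
than c-1 deltas:

```python
import difflib

class SnapshotStore:
    """Forward deltas, with a full snapshot every c revisions, so the
    delta chain for any checkout is bounded by c rather than by the
    total number of revisions."""

    def __init__(self, c):
        self.c = c
        self.full = {}    # rev -> full lines (every c-th revision)
        self.delta = {}   # rev -> delta from revision rev-1
        self.prev = None  # lines of the previous commit
        self.n = 0        # next revision number

    def commit(self, lines):
        rev = self.n
        if rev % self.c == 0:
            self.full[rev] = list(lines)   # pay the storage cost here
        else:
            self.delta[rev] = list(difflib.ndiff(self.prev, lines))
        self.prev = list(lines)
        self.n += 1
        return rev

    def checkout(self, rev):
        base = (rev // self.c) * self.c    # nearest snapshot at or below rev
        lines = self.full[base]
        for r in range(base + 1, rev + 1):
            lines = list(difflib.restore(self.delta[r], 2))
        return lines
```

Tuning c trades disk for reconstruction time: c=1 is a full copy per
revision, large c approaches the unbounded chain you started with.
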
3. Every c revisions, store not the delta from the previous version, but
   from the file as it was c versions ago. This is a kind of halfway house,
   in that if you set c=10, you reduce m by nearly an order of magnitude,
   without paying the storage penalty of saving the whole file. Thus to get
   to revision 32, you'd go from 0 -> 10 -> 20 -> 30 -> 31 -> 32, making
   m=5, rather than m=32. Checkin time increases, but the ratio of reads to
   checkins is likely to make this a win for most files.
   
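The chain that scheme produces is easy to compute; a small sketch (the
function name is mine) that walks the revisions visited for a given
target, matching the 0 -> 10 -> 20 -> 30 -> 31 -> 32 example above:

```python
def delta_path(rev, c=10):
    """Revisions visited to reconstruct `rev` under the skip-delta
    scheme: every c-th revision stores its delta against the version
    c back, all others against their immediate predecessor."""
    path = [rev]
    while rev > 0:
        rev = rev - c if rev % c == 0 else rev - 1
        path.append(rev)
    return list(reversed(path))

# delta_path(32) -> [0, 10, 20, 30, 31, 32], i.e. m = 5 deltas
```
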
Tet
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug
