[Gllug] Biometrics?

Nix nix at esperi.org.uk
Tue Mar 24 00:25:47 UTC 2009


On 23 Mar 2009, Bernard Peek uttered the following:
> The amount of data in the whole genome is vast, so they don't store
> that. Instead they use what amounts to a hash function. 

No they don't. That would require sequencing the whole genome, which
would be far too expensive (although this is changing fast). In
any case, hashing the whole genome would give a different result for
virtually every sample, because the error rate of human DNA polymerases
is approximately 10^-10 per base per replication. The human genome is
3x10^9 bases long, so that is about 0.3 mutations per copy, or roughly
one mutation every three cell divisions. Virtually all of these land in
unused junk, and many of the rest are synonymous... but every one of
them would change any whole-genome hash.
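
For anyone who wants to check the arithmetic, here is a minimal
back-of-the-envelope sketch in Python. The two constants are simply the
figures quoted above; the function names are made up for illustration,
and the model (a flat per-base error rate, one genome copy per division)
is a deliberate simplification.

```python
# Rough check of the mutation-rate arithmetic. Both constants are the
# approximate figures from the post, not measured values.
ERROR_RATE_PER_BASE = 1e-10   # polymerase errors per base per replication
GENOME_LENGTH_BASES = 3e9     # length of the human genome in bases


def mutations_per_division(error_rate=ERROR_RATE_PER_BASE,
                           genome_length=GENOME_LENGTH_BASES):
    """Expected number of new mutations each time the genome is copied."""
    return error_rate * genome_length


def divisions_per_mutation(error_rate=ERROR_RATE_PER_BASE,
                           genome_length=GENOME_LENGTH_BASES):
    """Average number of cell divisions needed to pick up one mutation."""
    return 1.0 / mutations_per_division(error_rate, genome_length)


if __name__ == "__main__":
    print(f"mutations per division: {mutations_per_division():.2f}")
    print(f"divisions per mutation: {divisions_per_mutation():.1f}")
```

That comes out at about 0.3 mutations per copy, i.e. one every three
divisions or so, which is where the figure above comes from.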

Hashes are really, really the wrong thing here. As John Edwards has
pointed out, what they do is count the *length* of tandem repeats.
Not a hash function at all.
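
To make the difference concrete, here is a toy sketch in Python. It is
not how a forensic lab processes samples: the sequences, the GATA motif
and both function names are invented for illustration. It only shows
that a single-base change outside the repeat alters a hash of the
sequence while leaving the repeat count untouched.

```python
import hashlib
import re


def whole_sequence_hash(seq):
    """SHA-256 of the full sequence: any single-base change alters it."""
    return hashlib.sha256(seq.encode("ascii")).hexdigest()


def longest_tandem_repeat(seq, motif):
    """Largest number of back-to-back copies of `motif` found in `seq`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    return max((len(run) // len(motif) for run in runs), default=0)


# Two made-up samples from the "same person": identical repeat region,
# but the second has one point mutation in the flanking sequence.
REPEAT = "GATA" * 7
sample_a = "ACGTTGCA" + REPEAT + "CCGTAAGT"
sample_b = "ACGTAGCA" + REPEAT + "CCGTAAGT"   # T -> A at position 5

if __name__ == "__main__":
    print(whole_sequence_hash(sample_a) == whole_sequence_hash(sample_b))
    print(longest_tandem_repeat(sample_a, "GATA"),
          longest_tandem_repeat(sample_b, "GATA"))
```

The hashes differ, the repeat counts agree, and the repeat count is the
sort of thing that stays stable from sample to sample.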
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



