[Nottingham] [Misc] Making a hash of it...

Martin martin at ml1.co.uk
Sat Jun 6 22:40:45 UTC 2020


Folks,

Here be perhaps a bit of a giggle...


Long long ago, I began writing a little bash script to run through a
filesystem and collect into a pretty-printed log the md5sum for all the
files.
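
A minimal sketch of that idea, assuming GNU find, xargs and md5sum. This
is not the original script (which isn't shown here); the function name
hash_tree and the example path are made up for illustration:

```shell
# Hypothetical helper: print one "md5  path" line for every regular file
# under the given directory (defaults to the current directory).
hash_tree() {
    local root="${1:-.}"
    # -print0 / -0 keep filenames containing spaces intact;
    # -r stops xargs running md5sum at all when find returns nothing.
    find "$root" -type f -print0 | xargs -0 -r md5sum
}

# Example use (the path is an assumption):
#   hash_tree /mnt/bigdisk > md5.log
```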

The idea was that this could then be used as a long term integrity check
and also to check for duplicate files.
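
Both uses can be sketched roughly as below, assuming the log is plain
md5sum output and that GNU coreutils are available (uniq -w and
--all-repeated are GNU options). The helper names are hypothetical:

```shell
# Hypothetical helpers, assuming the log holds plain md5sum output
# ("<32 hex digits>  <path>" per line).

# Integrity check: re-hash every file named in the log and report
# only the files that no longer match.
verify_log() {
    md5sum -c --quiet "$1"
}

# Duplicate check: print every log line whose digest occurs more than once,
# with a blank line between groups of identical files.
find_dupes() {
    # An MD5 digest is 32 hex characters, so compare only those columns.
    sort "$1" | uniq -w 32 --all-repeated=separate
}
```

Matching MD5 sums only suggest that files are identical; a byte-for-byte
compare (cmp) of each candidate pair would confirm it.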

For speed, the script is pipelined at every step and multiple
checksums/hashes can run in parallel.
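
One way to get that effect, sketched here with xargs -P (a GNU/BSD
extension) rather than whatever the original script does. The function
name and the default job and batch counts are assumptions:

```shell
# Hypothetical helper: hash files under a directory with several md5sum
# processes running at once.
#   $1 = directory, $2 = number of parallel jobs, $3 = files per md5sum call
hash_tree_parallel() {
    local root="${1:-.}" jobs="${2:-4}" batch="${3:-64}"
    # find streams filenames into xargs, which keeps up to $jobs md5sum
    # processes busy, handing each one at most $batch files.
    find "$root" -type f -print0 |
        xargs -0 -r -n "$batch" -P "$jobs" md5sum
}

# Example use (the path is an assumption):
#   hash_tree_parallel /mnt/bigdisk 4 | sort -k 2 > md5.log
```

The output order depends on which worker finishes first, hence the sort
in the example.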

The script worked but was never finished.


Roll forwards to now and I've a big disk to check through for
duplicates... So the old script has been resurrected and tidied up, and
the test output looks good. The script really was very nearly finished;
odd that I never completed it.

Next was to check what hash to use today and to benchmark for the
fastest and/or most worthwhile...


One brew later:

OK (testing rhash on a few GBytes of tmpfs)... So md4sum, md5sum,
sha1sum, tiger, btih, aich, ed2k, has160, edon-r256 and edon-r512 are all
way faster than any HDD, even for one single-threaded instance!

Pointless to go parallel: the disk is the bottleneck, not the hash!!
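
For anyone wanting to repeat that sort of timing, here is a rough sketch.
It uses the coreutils checksum commands as stand-ins, since the exact
rhash invocations aren't given above; the function name, the sample file
and /dev/shm being a tmpfs are all assumptions, and date +%s%N needs GNU
date:

```shell
# Hypothetical helper: time each named checksum tool on one file.
#   $1 = file to hash, remaining arguments = commands to try
bench_hashes() {
    local file="$1" tool start end
    shift
    for tool in "$@"; do
        # Skip tools that are not installed rather than failing.
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$(date +%s%N)          # nanoseconds since the epoch (GNU date)
        "$tool" "$file" >/dev/null
        end=$(date +%s%N)
        printf '%-10s %6d ms\n' "$tool" $(( (end - start) / 1000000 ))
    done
}

# Example use (paths and sizes are assumptions):
#   dd if=/dev/urandom of=/dev/shm/sample.bin bs=1M count=1024
#   bench_hashes /dev/shm/sample.bin md5sum sha1sum sha256sum
```

Dividing the file size by the elapsed time gives a MB/s figure to set
against what the disk itself can sustain.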


In this case, md5sum may as well do.

Still, 'twas good fun playing with pipes :-)


Enjoy!
Martin


