[Gllug] entropykey: why did nobody ever mention this thing before?

Nix nix at esperi.org.uk
Mon Aug 2 23:04:13 UTC 2010


On 2 Aug 2010, Andrew Farnsworth told this:

> I know several people who have used large quantities of random numbers and
> could definitely have used something like this as they were running out.  In
> case anyone wants to know what running out means, it means looping around
> and starting the list over.

That is not what happens if you run out of entropy on most modern
systems.  The entropy pool is hashed and mixed back into the pool: if
the entropy in the pool falls to zero, it is merely a sign that all the
unpredictability in the pool has been fed to the outside world, so an
attacker could potentially predict its future contents. Of course this
would require breaking SHA-1 first, and if SHA-1 gets broken the kernel
will switch to something unbroken. But still, raving paranoia dictates
maintaining a decent amount of entropy. (Also, if you're creating
cryptographic keys, or nonces that are themselves used in cryptography,
bear in mind that a lot of crypto algorithms use SHA-1 internally, and
it's long been considered a bad idea to hash the output of a hash with
itself. A *lot* of things create keys and random nonces from noise
extracted from /dev/(u)random and then use them for crypto, so it's best
to keep the pool topped up with entropy.)
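On Linux you can watch the kernel's entropy estimate directly, and
os.urandom() is the portable way to read from the kernel CSPRNG. A
minimal sketch (the /proc path is Linux-specific and won't exist on
other systems, hence the fallback):

```python
import os

# Kernel's current estimate of bits of entropy in the pool
# (Linux-specific; the file does not exist on other systems).
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"

def entropy_estimate():
    """Return the kernel's entropy estimate in bits, or None if unavailable."""
    try:
        with open(ENTROPY_PATH) as f:
            return int(f.read().strip())
    except OSError:
        return None

# os.urandom() reads from the kernel CSPRNG (/dev/urandom on Linux):
# it never blocks and never "runs out" -- output stays unpredictable
# as long as the pool was seeded at some point.
key = os.urandom(16)
print("entropy estimate:", entropy_estimate())
print("16 random bytes:", key.hex())
```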

What you're talking about is quite different: a PRNG being used for
longer than its period, which is of course a catastrophically bad
idea. However, as long as any entropy at all arrives in the pool over
that time, this does not apply to randomness derived from /dev/urandom,
and it is vanishingly unlikely that this would not happen in the period
of the algorithm used by /dev/random: asphyxiating because all the air
molecules in the room suddenly moved to the other side is probably more
likely. (Of course it doesn't apply to randomness derived from
/dev/random because it won't feed you any randomness at all unless there
is entropy in the pool.)
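A deliberately tiny generator makes the period problem concrete. This
sketch (toy parameters of my own choosing, nothing like any real
generator) shows a full-period LCG exhausting its sixteen states and
then "starting the list over":

```python
def tiny_lcg(seed, n):
    """Yield n values from a toy LCG with modulus 16.
    a=5, c=3 satisfy the Hull-Dobell full-period conditions, so the
    period is all 16 states -- after which the sequence repeats exactly."""
    x = seed
    out = []
    for _ in range(n):
        x = (5 * x + 3) % 16
        out.append(x)
    return out

seq = tiny_lcg(0, 32)
# The first 16 outputs visit every state exactly once...
assert sorted(seq[:16]) == list(range(16))
# ...and then the generator "runs out" and repeats from the start:
assert seq[16:] == seq[:16]
print(seq)
```

Real generators are the same picture with a vastly longer cycle: draw
more numbers than the period and you are sampling the same list again.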

In any case, the periodicity of half-decent PRNGs is enormous; the very
Wikipedia article you point to mentions the Mersenne twister, with its
period of 2^19937-1. Periodicity is not a problem you need concern
yourself with. Repetition in smaller sequences of bits within the
results *is*: the horrible old rand() used to *alternate* its least
significant bit (you could find the repetition by hand with a piece of
graph paper; it was truly appalling: I'm not sure that even *counts* as
a spectral test).
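The alternating-bit misbehaviour is easy to demonstrate. The example
rand() parameters from the C standard (a = 1103515245, c = 12345,
m = 2^31) have an odd multiplier and odd increment over a power-of-two
modulus, so the lowest state bit flips on every call; implementations
that returned those low-order bits showed exactly this pattern. A
sketch:

```python
def lcg_states(seed, n):
    """Raw states of the C standard's example rand() LCG
    (a=1103515245, c=12345, m=2**31), low-order bits included."""
    x = seed
    states = []
    for _ in range(n):
        x = (1103515245 * x + 12345) % 2**31
        states.append(x)
    return states

low_bits = [x & 1 for x in lcg_states(1, 16)]
print(low_bits)
# With odd a and odd c over a power-of-two modulus,
# x' mod 2 == (x + 1) mod 2: the low bit strictly alternates.
assert low_bits == [0, 1] * 8
```

(Sensible implementations discard the low-order bits for exactly this
reason; the higher-order bits of an LCG behave far better.)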

>                              If you have not used the random number
> generator in most PCs and Languages today, you may not realise that they are
> really pseudo random in that they are generated using a formula on a seed
> number to get a new number which it returns as the random result and then

/dev/random is a different kettle of fish. If it is properly fed with
entropy, it is truly random even to someone who can reverse SHA-1 on
demand. The problem is acquiring enough entropy: PCs don't have terribly
good hardware sources of randomness as a rule.

>           My friend was a PhD candidate and was seeing patterns in her data
> that just could not be explained, she finally tracked it down to the random
> number generator and the fact that she was using orders of magnitude more
> numbers than the algorithm could produce.

Either it was an awful algorithm or she was hitting periodicities in
smaller sequences of bits than the complete output. She wasn't using rand(),
was she? That barely counts as a random number generator at all...

>                                            She ended up buying a very large
> set of random numbers (who would have thought that was a product) so it is

A product from the very earliest days of computing, no less.

<http://www.amazon.com/Million-Random-Digits-Normal-Deviates/dp/0833030477>

> nice to see this type of setup where you can get random numbers generated
> simply and easily.  She probably would have still purchased the large block
> of random numbers as she was using them much faster than the Key here would
> generate them so it was better for her to purchase a large block than to
> have to wait for the numbers to be generated.

Yeah, if you want huge numbers of random numbers, there are better ways
to generate them... but they're much more expensive and generally
require more care to keep going than a plug-and-forget USB key. Several
records in this area just got broken in quick succession. The current
record stands at 3Gbits/sec, or 300Mbits/sec if you insist on quantum rather
than thermal noise: <http://www.technologyreview.com/blog/arxiv/25355/>.

> For more information... see
> http://en.wikipedia.org/wiki/Random_number_generation

For more information than you can believe, see the *entirety* of chapter
3 of _The Art of Computer Programming_, 194 pages of excellently-written
and only-slightly-dated info (we have better statistical tests and
generators now, but everything written in there is still true as far as
I know, and I don't know anywhere else that provides so much info in one
digestible lump).

(Section 3.5 covers the question (asked earlier in this thread) of what makes a
sequence of numbers random.)
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug
