Removing bias
It's possible to remove bias from a stream of random numbers.
Three popular methods are:
-
1, The Von Neumann method.
Take two independent 1-bit samples. If the
bits are the same, discard them and take two more. If they are different,
use 1 of the bits as the output. This method has the nice property that
it's theoretically perfect - regardless of how biased the original source is,
the output is exactly as likely to be a 1 as a 0. But it has the unnice
property that it might never produce any bits. So, there's method
-
2, Exclusive Or.
Exclusive or (XOR) involves taking a chunk of the stream, and reducing
the number of bits by XOR'ing them together. If the original
source is 1 90% of the time, and 0 10%, then the XOR of 10 bits
will be 1 about 55% of the time, a vast improvement. XOR'ing enough
bits brings the probably of a 1 close enough to 50% for all practical
purposes. This is guaranteed to produce numbers in a known amount of
time, but it uses up a lot of bits, and so there's method:
-
3, Hashing.
Hashing involves running a chunk of bits through a hashing
function. CRC16 only takes about 6 bits to equal the performance of
10 XOR'ed bits, and larger hashing functions work better.
Plus, you can feed almost anything into a hashing function,
and distill it down to just the randomness.
Keyrand uses a simple CRC function
to hash key stroke timing. SHA256 and MD5 are popular hashing
functions in cryptographic random number generators like
Fortuna
SHA256 is used mostly for security reasons, not because it
produces a higher quality of random number. However, it does
do a better job of mixing than CRC does.