Wednesday, September 29, 2010

Don't leave randomness to chance (or, avoid using rand if it matters)

Just a little caution.

What do you think this piece of code prints?

#!/usr/bin/perl
use strict; use warnings;
my %r = map { rand() => undef } 1 .. 1_000_000;
print scalar keys %r, "\n";

On my Windows system with both ActiveState perl 5.10.1 and Strawberry perl 5.12.1, it prints 32768.

That is, the default random number generator on this system will only ever give you 32,768 (2**15) distinct values, no matter how many times you call it.
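If you want to see what your own perl was built with, the core Config module has randfunc and randbits entries describing the function behind rand and how many random bits it is expected to supply. A minimal sketch, assuming those entries are filled in accurately for your build (rand_config is just a name made up for this example):

```perl
#!/usr/bin/perl
use strict; use warnings;
use Config;

# rand_config() is a hypothetical helper, not part of any module.
# It returns what this perl build reports about its rand() implementation.
sub rand_config {
    return (
        randfunc => $Config{randfunc},   # name of the underlying function
        randbits => $Config{randbits},   # random bits it is expected to supply
    );
}

my %c = rand_config();
printf "randfunc=%s randbits=%s\n",
    $c{randfunc} // 'unknown', $c{randbits} // 'unknown';

# With b random bits per call, rand() cannot return more than 2**b distinct values.
print "at most ", 2 ** $c{randbits}, " distinct values\n"
    if defined $c{randbits};
```

A build that reports randbits=15 could return at most 2**15 = 32768 distinct values from rand, which would be consistent with the count above.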

Use a module such as Math::Random when randomness matters.

Now, your system might be better. You can feel good about that.

However, the point of this caution is to alert, once more, all those programmers who think all rands are created equal.

They are not.

Do not leave things to chance when randomness matters.

Use a well-known pseudorandom number generator with established properties.

Not just any odd function called rand that the default runtime gives you.
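To check a candidate generator the same way as the script at the top, here is a small sketch that counts distinct values from any code ref. The names distinct_count and $rand15 are made up for this example, and $rand15 is only a stand-in that imitates a generator limited to 15 bits; it is not the actual Windows implementation.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Count the distinct values seen in $n draws from $gen (a code ref).
sub distinct_count {
    my ($gen, $n) = @_;
    my %seen;
    $seen{ $gen->() } = undef for 1 .. $n;
    return scalar keys %seen;
}

# Stand-in for a 15-bit generator: only 2**15 possible outputs in [0, 1).
my $rand15 = sub { int(rand(32768)) / 32768 };

my $n = 200_000;
printf "built-in rand: %d distinct of %d\n",
    distinct_count(sub { rand() }, $n), $n;
printf "15-bit rand:   %d distinct of %d\n",
    distinct_count($rand15, $n), $n;
```

The second count can never exceed 32768, however large $n gets. To try a generator from a CPAN module, pass a code ref that wraps it in place of sub { rand() }.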

6 comments:

  1. Well, if randomness really matters, you need something hooked up to a truly random process, such as radioactive decay. It's not so hard to do that with a few bits and bobs from an electronics store. There used to be people who'd already done that and would give you the random numbers they collected. :)

  2. Of course, true randomness is probably the least appropriate thing when the quality of your random number generator matters (simulation, statistical analysis, etc.).

  3. Is this a Windows thing? I popped in after your SO reference as I'd never heard of it, and I'm running an ancient version of perl:


    # perl testrnd.pl
    1000000
    # more testrnd.pl
    #!/usr/bin/perl
    use strict; use warnings;
    my %r = map { rand() => undef } 1 .. 1_000_000;
    print scalar keys %r, "\n";
    # perl -v

    This is perl, v5.10.0 built for i386-linux-thread-multi

    Copyright 1987-2007, Larry Wall

  4. This particular case is a Windows issue. However, the general point is not to rely on default C runtime implementations of rand if the quality of the PRNG matters.

  5. Just a brief update: On my 32-bit Windows system I get 32768 when running that script using ActiveState perl v5.10, but 1000000 when using Cygwin perl v5.14. Not sure, though, if this is related to ActiveState or to the perl version.

    Replies
    1. Cygwin is not using the C runtime that comes with Windows.
