Archive for December, 2007

Shuffling Analysis

This post is in regard to Jeff Atwood’s post on Shuffling and The Danger of Naïveté.

In his second post he used an example of shuffling three-card deck 600,000 times to explain how naive solutions can be dangerous. The post is extremely well written. I enjoyed every bit of it. However I had two questions that I couldn’t find an answer for in Jeff’s post. Why exactly 3 permutations (out of the 6 possible ones) are biased? and why those 3 specifically are biased and not any other permutations?

I spent sometime thinking about it and using a pencil and a paper I was able to get some answers.
First here is the naive method: (I reversed the loop to be as similar as possible to the correct method)

for (int i = cards.Length - 1; i >= 0; i--)
{
    int n = random.Next(cards.Length);
    Swap(ref cards[i], ref cards[n]);
}

And here is the correct one (Knuth Shuffle)

for (int i = cards.Length - 1; i > 0; i--)
{
    int n = rand.Next(i + 1);
    Swap(ref cards[i], ref cards[n]);
}

Now let’s agree on some facts real quick. For a three-card deck:

  1. There are exactly 6 unique possible permutations for that deck (3!)
  2. Using the Naive shuffling method, there are 27 possible sequences of random-numbers on any given run (33)

The answer to the first question above, is that 27 mod 6 is equal to 3. What that means is there are 3 extra sequences of random numbers that have to map to 1 (or 2 or 3) of the 6 unique permutations of the deck.

To answer the second question we have to note something first about using swapping to perform a shuffle, take a look at the state of the deck at each iteration given the following random numbers:

Step Random Number Deck
0 N/A 123
1 2 132
2 3 123
3 3 321
Step Random Number Deck
0 N/A 123
1 1 321
2 1 231
3 2 321

Notice how the two different sequences of random-numbers resulted in the same permutation of the deck. Now remember there are 27 possible sequences and only 6 unique permutations. if we divide 27/6 we get 4 (with remainder 3) which means there are at least 4 paths (or sequences) for every unique permutation of the deck, the 3 extra possible sequence lead to 3 of the unique permutations giving them an advantage over the other 3.

so basically we have 6 permutations of the deck, 3 of them have 4 random-number sequences that lead to them, and the other 3 have 5 sequences that lead to them.

In this particular example if you follow the algorithm above and try the following random-number sequences you will get the same deck permutation for all 5.

  • 1,1,2
  • 3,1,3
  • 2,2,2
  • 1,3,1
  • 2,3,3

Conclusion

The problem with the Naive shuffling method is swapping! The fact that swapping elements in 2 (or more) different sequences can lead to the same deck permutation is a subtle problem.

The Knuth shuffle algorithm avoids that problem by ensuring N! possible random sequences where N is the number of elements in a deck (or any container).