Doug's MATLAB Video Tutorials
June 16th, 2010
Puzzler: Running it twice
As with many math geeks, I am fascinated with games of “probability theory and applied psychology” (also known as Poker!). At a recent game this debate came out: “Does running it twice change your expected value?” What does this question mean in plain English?
Let’s give an example: Two more cards will be dealt from the remaining 45. If either or both of them are one of the four remaining Kings, the underdog will win $100. So, to LOSE the hand he will have to NOT get a King (41/45) and then he will have to NOT get a King (40/44) on the second card too. Multiplying this out, 82.82% of the time he will lose, or 17.17% time he will win. This means that on average this hand is worth $17.17. Since we are going to only do this once, his variance from expectation will be quite large since his two possible outcomes are either $0 (82.82%) or $100 (17.17%). However if he were in this same exact situation twice for half the amount of winnings for each time, his outcomes will be either $0 (68.59%), $50 (28.44%) or $100 (2.94%). The more times it is dealt out, the lower the variance. There is no question that if the deck were shuffled between the two trials, the math above holds. However, if two cards are dealt and NOT replaced then two more cards are dealt from the remaining 43 does the above math hold?
I was distracted at the table for the rest of the night puzzling this out. It turns out that “running it twice” does not change the average return, though it changes the variance. My buddy Jesse was kind enough to write some MATLAB code to ‘prove’ this. I did the same, only in a a Monte Carlo simulation way. Rather than work out the closed form solution, I figure just run through half a million iterations and see!
It always amazes me the many different ways that people use MATLAB to ‘prove’ their point on simple problems like this. Just like the Frisbee flipping puzzler, send your most interesting case, made in MATLAB and I will send out a MATLAB t-shirt to the most interesting entry.
By
Doug Hull
Doug first used MATLAB in 1994, could not figure it out until he got some help in 1995. He is now dedicated to making sure that no one else wastes a year of their life not knowing MATLAB like he did.
14:16 UTC |
Posted in Level: Basic, Topic: Puzzler |
Permalink |
12 Comments »
You can follow any responses to this entry through the RSS 2.0 feed.
You can skip to the end and leave a response. Pinging is currently not allowed.
Leave a Reply
|
OK, slapped this together during lunch time.
Without loss of generality, we can consider 8 cards remaining, instead of 45. (On the superfast, megamemory MATLAB of the future, we could set “numberCards” in the following code to 45.)
Also, this only proves the mean. It does not address the variance.
Proof by exhaustion:
@Cyclist,
This is rather elegant for a brute force solution! :) I did mine monte carlo style, so I could run all the cards, but I get a wee bit of variance even on half a million iterations.
Well done!
Doug
@dhull,
Maybe the variance that you observe using a half million iterations is due to the nature of the “random numbers” generated by Matlab. Are all the random numbers really random and independent? I have observed similar trends with the variance in my Monte Carlo simulations.
Claudio,
When doing any Monte Carlo or other analysis that involves (pseudo)randomness, an important question to ask oneself is, “How much variance should I EXPECT in my simulation, given the number of iterations I am running, etc.”
If Doug had cared to, he could have constructed confidence intervals around his estimates, and there is every reason to believe that these intervals would cover the true exact value.
You are making a vague and unsubstantiated suggestion that the MATLAB pseudorandom number generator is suspect, but the reality is that their suite of generators are amongst the most robust ones out there. If you have concrete evidence to the contrary, I’d be interested to see it.
cyclist,
Thanks for your reply. Sure, you can check these two studies:
[DEAD LINK REMOVED- Doug]
http://www2.cs.cas.cz/~savicky/papers/rand2006.pdf
Claudio
I don’t need MATLAB to prove that running it twice with replacement and without replacement gives the same expectation. Distributive property is all I need.
Suppose number of cards unseen is 44. This number could be generalized later. Suppose underdog’s number of outs is x. In the expectation formulas for the favorite below, I’m not writing out the part that evaluates to zero.
Expectation for favorite
To generalize, change 44 to “t” and 43 to “t – 1″ where t is total number of unseen cards.
Algebra rocks!
= 100 * (44 – x)/44
Note: The above reasoning is for situation with one card to come (44 cards unseen). The stated problem is for the situation of two cards to come (45 cards unseen), but similar work could be done to prove the equalities.
Here is a pure probability treatment. The result of puzzler(45) is identical to Doug’s Example. However, the Cyclist’s results for n=8 match only the mean but not the distribution for the case without replacement. For the case with replacement (shuffle) his wins are either 0 or 2, but strangely they give the same mean! Are we solving two different problems?
Here is an update to cover all aspects:
1. The mean does not change if you run once, twice, or three (or more) times.
2. The variance decreases as rounds increase
2. Non-replacement does not change the distribution
Claudio, just for the record, the second of those two links you posted about random number generators in MATLAB points to a paper that describes a generator that has not been the default since R2006b, and for which there has been an alternative since something like R14sp2 (c2005). The current default generator in MATLAB is the Mersenne Twister, for which the only flaws that I’ve seen discussed have to do with crytography, which it is not intended for. Two other generators that are available specifically for parallel simulation are L’Ecuyer’s combined multiple recursive generator, and a multiplicative lagged Fibonacci generator.
I can’t tell what the first link you posted was to; it went to citeseer but the document was not found.
So, who was the “most interesting entry” that got the t-shirt? I always have my eye out for MATLAB swag.
@The Cyclist
Do’h, this one slipped past! I actually liked yours the best, though it does not yet scale, it was a clever way of doing this. Because it is an exact solution I feel confident with the result and that it will generalize. With my own Monte Carlo there was still that nagging doubt that the difference in the two cases was small enough to hide within the error bands of the problem.
I think you have every shirt we make at this point, but send me your snail mail and I will drop something your way again (let me know what you already have)
-Doug