# Celebrate the year of the rooster

According to the Chinese zodiac, 2017 is the year of the rooster. In fact, today (January 28, 2017) is the start of the lunar New Year, which is the first day of the year of the rooster. What better way to celebrate the year of the rooster than to work a related math puzzle or perform a related random experiment!

At the office yesterday, the conversation at the beginning of a meeting, before the main topics were discussed, centered on the Chinese zodiac animal signs (the Chinese zodiac system is a 12-year cycle with a different animal representing each year). One coworker mentioned he is a tiger. Another coworker did not know his sign and Googled to find out that he is a rat! Another coworker is a rooster. It turns out that a pig is also represented. Imagine that you keep picking people at random and asking each person for his or her animal sign. How many people do you have to ask in order to have met all 12 animal signs?

The random experiment that has been described is this. Put 12 slips of paper numbered 1 through 12 in a hat. Randomly draw a slip, note the number, and then put it back into the hat. Keep drawing until all 12 numbers have been chosen. Let $X$ be the number of draws required to complete this random experiment. Of course, you can expand the sample space to include more slips of paper (i.e. more numbers), though the context would then no longer be animal signs.

There are two ways to get a handle on the random variable $X$ as described above. One is through simulation and the other is through math.

Before discussing the simulation or the math, let’s point out that the problem discussed here is a classic problem in probability that goes by the name “the coupon collector problem”. The numbers 1 to 12 in a hat are like coupons (prizes) that are randomly given out when purchasing a product (e.g. a box of cereal). The coupon collector wants to collect the entire set of coupons, and the question is how many purchases that takes.
___________________________________________________________________________

Simulation

To get a sense of how long it will take, simulate random numbers from 1 through 12 until all numbers have appeared. The following are 5 iterations of the experiment.

9, 2, 6, 8, 9, 10, 2, 9, 8, 1, 5, 11, 1, 1, 2, 10, 9, 8, 8, 9, 5, 11, 9, 3, 7, 9, 8, 8, 4, 3, 1, 4, 3, 12 (34 draws)

7, 7, 1, 10, 11, 11, 10, 4, 5, 8, 8, 2, 6, 4, 6, 2, 12, 6, 6, 12, 9, 5, 8, 10, 1, 5, 10, 4, 9, 4, 1, 11, 11, 6, 2, 1, 6, 6, 3 (39 draws)

9, 5, 2, 2, 1, 5, 6, 11, 7, 11, 4, 6, 1, 12, 3, 7, 8, 3, 3, 2, 2, 3, 5, 6, 2, 5, 1, 6, 8, 5, 4, 10 (32 draws)

1, 5, 5, 4, 5, 12, 10, 1, 8, 1, 3, 9, 1, 3, 11, 9, 10, 3, 9, 11, 4, 4, 4, 7, 7, 3, 1, 11, 11, 4, 10, 6, 3, 2 (34 draws)

6, 7, 6, 1, 12, 6, 1, 1, 7, 1, 11, 10, 3, 3, 9, 6, 9, 4, 2, 6, 11, 7, 7, 11, 2, 6, 2, 1, 7, 2, 5, 9, 6, 12, 6, 11, 1, 11, 11, 2, 5, 6, 7, 5, 2, 11, 2, 2, 6, 2, 12, 5, 5, 5, 12, 10, 3, 11, 1, 10, 10, 6, 9, 11, 10, 7, 11, 5, 1, 9, 11, 9, 8 (73 draws)

Each number is generated using =RANDBETWEEN(1, 12) in Excel. In each iteration, numbers are generated until all 12 numbers have appeared.
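The same experiment can be sketched outside of Excel. The following Python snippet is only an illustration, not the spreadsheet used for the runs above. The function name `draws_until_complete` and the seed value are assumptions made for this example, and `randint` plays the role of RANDBETWEEN since both include the two endpoints.

```python
import random

def draws_until_complete(n_signs=12, rng=random):
    """Draw numbers 1..n_signs with replacement until every number has appeared.

    Returns the full sequence of draws, so its length is the value of X.
    """
    seen = set()
    draws = []
    while len(seen) < n_signs:
        value = rng.randint(1, n_signs)  # both endpoints included
        draws.append(value)
        seen.add(value)
    return draws

if __name__ == "__main__":
    rng = random.Random(2017)  # arbitrary fixed seed so the run is repeatable
    for _ in range(5):
        run = draws_until_complete(12, rng)
        print(", ".join(map(str, run)), f"({len(run)} draws)")
```

A different seed gives different sequences, which is exactly the short run unpredictability seen in the 5 iterations above.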

There is considerable fluctuation in these 5 iterations of the experiment. With the 5th one being exceptionally long, it is clear that it can take a long time to find all 12 animal signs. The average of the first iteration is obviously 34. The average of the first two iterations is 36.5. The averages of the first 3, 4, and 5 iterations are 35, 34.75, and 42.4, respectively. The last average of 42.4 is quite different from the theoretical average of about 37 discussed below.

What if we continue to run the experiment, say, 10,000 times? What would the long run averages look like? The following graph shows the averages from the first 100 runs of the experiment. It plots the average of the first $n$ iterations from $n=1$ to $n=100$.

Figure 1 – Long run averages from 100 runs

Figure 1 shows that there is quite a bit of fluctuation in the averages in the first 25 runs or so. Eventually, the averages settle around 37 but still retain noticeable fluctuation. The following graph shows the averages from the first 1000 runs of the experiment.

Figure 2 – Long run averages from 1000 runs

The graph in Figure 2 is smoother as it moves toward 1000, but still departs noticeably from 37 (in fact the graph is mostly below 37). The following graph shows the averages from the first 10,000 runs of the experiment.

Figure 3 – Long run averages from 10000 runs

The graph in Figure 3 shows the average of the first $n$ iterations as $n$ goes from 1 to 10,000. The graph is for the most part a horizontal line slightly above 37, especially after $n=3000$. In fact the average of all 10,000 iterations is 37.3381, which is close to the theoretical average of 37.2385.

The simulation is an illustration of the law of large numbers. The essence of the law of large numbers is that the short run results of a random experiment are unpredictable while the long run results are stable and predictable and eventually settle around the theoretical average.

The first 5 runs of the experiment (as shown above) are certainly unpredictable. It may take 34 draws or it may take 73 draws. The first 100 simulations also have plenty of ups and downs, even though the graph in Figure 1 shows a movement toward 37. The first 1000 simulations display more stable results, but the averages are below the theoretical average as the graph moves toward 1000 (Figure 2). In simulating the experiment 10,000 times (Figure 3), the long run averages settle around the theoretical average of 37.2385.
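The long run averages plotted in the three figures can be reproduced in spirit with a short script. This is a minimal sketch, assuming Python's built-in random number generator in place of Excel. The names `draws_needed` and `running_averages` and the default seed are choices made for this illustration, so the numbers it prints will not match the 37.3381 reported above.

```python
import random

def draws_needed(n_signs, rng):
    """Number of draws with replacement until all n_signs numbers have appeared."""
    seen = set()
    count = 0
    while len(seen) < n_signs:
        seen.add(rng.randint(1, n_signs))
        count += 1
    return count

def running_averages(n_runs, n_signs=12, seed=0):
    """Average of the first k runs of the experiment, for k = 1, ..., n_runs."""
    rng = random.Random(seed)
    total = 0
    averages = []
    for k in range(1, n_runs + 1):
        total += draws_needed(n_signs, rng)
        averages.append(total / k)
    return averages

if __name__ == "__main__":
    averages = running_averages(10_000)
    for k in (1, 5, 100, 1000, 10_000):
        print(f"average of first {k:>6} runs: {averages[k - 1]:.4f}")
```

Plotting `averages` against the run number gives a graph of the same kind as Figures 1 to 3: erratic at the start and nearly flat close to 37 toward the end.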

So if you survey people about their animal signs, the time it takes is subject to a great deal of random fluctuation. It may take 34 asks or 73 asks (as shown in the first 5 simulations). If the experiment is done repeatedly, the results are predictable, i.e. the average is around 37.

The long run results of a gambling game are predictable too and will settle around the theoretical average. The theoretical average of a gambling game is usually referred to as the house edge. For example, for the game of roulette, the house edge is 5.26%. For each bet of \$1, the gambler is expected to lose 5.26 cents. In playing a few games, the gambler may win big. In the long run, the house is expected to gain 5.26 cents per one dollar bet. Thus the law of large numbers can mean financial ruin for the gambler (or profits for the casino). For an illustration of the law of large numbers in the context of the game of Chuck-a-Luck, see here. For an illustration in the context of the roulette wheel, see here.

Another piece of useful information from the 10,000 simulated runs of the experiment is the frequency distribution.

Table 1
Frequency Distribution of the 10,000 Simulated Runs
$\begin{array}{rrrrr} \text{Interval} & \text{ } & \text{Frequency} & \text{ } & \text{Relative Frequency} \\
\text{ } & \text{ } \\
\text{10 to 19} & \text{ } & 375 & \text{ } & 3.75 \% \\
\text{20 to 29} & \text{ } & 2817 & \text{ } & 28.17 \% \\
\text{30 to 39} & \text{ } & 3267 & \text{ } & 32.67 \% \\
\text{40 to 49} & \text{ } & 1931 & \text{ } & 19.31 \% \\
\text{50 to 59} & \text{ } & 901 & \text{ } & 9.01 \% \\
\text{60 to 69} & \text{ } & 379 & \text{ } & 3.79 \% \\
\text{70 to 79} & \text{ } & 190 & \text{ } & 1.90 \% \\
\text{80 to 89} & \text{ } & 88 & \text{ } & 0.88 \% \\
\text{90 to 99} & \text{ } & 30 & \text{ } & 0.30 \% \\
\text{100 to 109} & \text{ } & 13 & \text{ } & 0.13 \% \\
\text{110 to 119} & \text{ } & 3 & \text{ } & 0.03 \% \\
\text{120 to 129} & \text{ } & 3 & \text{ } & 0.03 \% \\
\text{130 to 139} & \text{ } & 1 & \text{ } & 0.01 \% \\
\text{140 to 149} & \text{ } & 2 & \text{ } & 0.02 \% \\
\text{ } & \text{ } & \text{ } & \text{ } & \text{ } \\
\text{Total } & \text{ } & 10000 & \text{ } & 100.00 \% \end{array}$

Figures 1 to 3 tell us the long run behavior of the simulations (e.g. the long run average is 37). Table 1 gives the counts of the simulations that fall into each interval and the corresponding relative frequency (the percentage). Table 1 tells us how often or how likely a given possibility occurs. The total number of simulations that fall within the range 20 to 49 is 8015. So about 80% of the time, the experiment ends in 20 to 49 draws. Furthermore, 92.95% of the simulations fall into the interval 20 to 69. This really tells us what the likely results would be if we perform the experiment.

The frequency distribution also tells us what is unlikely. There is only a 3.75% chance that the experiment can be completed with fewer than 20 draws. In the simulations, there are two runs that are above 140 (they are 141 and 142). These extreme results can happen but are extremely rare. They happened only 2 times in the 10,000 simulations.

___________________________________________________________________________

The math angle

There is also a mathematical description of the random experiment of surveying people until all 12 animal signs are obtained. For example, there is a formula for calculating the mean, and there is also a formula for calculating the variance. There is also a probability function, i.e. a formula for calculating probabilities (akin to Table 1).

The formula for the mean is actually simple to describe. Let $X_n$ be the number of draws from the set $\left\{1,2,3,\cdots,n \right\}$ with replacement such that each number in the set is picked at least once. The expectation of $X_n$ is the following.

$\displaystyle E[X_n]=n \biggl[ \frac{1}{n}+\frac{1}{n-1}+ \cdots+ \frac{1}{3}+\frac{1}{2}+1 \biggr]$

The 37.2385 theoretical average discussed above comes from this formula.
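One way to see where the formula comes from: once $k$ distinct numbers have been seen, each draw produces a new number with probability $(n-k)/n$, so the wait for the next new number averages $n/(n-k)$ draws, and adding these waits for $k=0,1,\cdots,n-1$ gives the sum above. The following Python sketch evaluates the formula with exact fractions. The function name `expected_draws` is an assumption made for this illustration.

```python
from fractions import Fraction

def expected_draws(n):
    """E[X_n] = n * (1 + 1/2 + ... + 1/n), computed exactly as a fraction."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))

if __name__ == "__main__":
    mean_12 = expected_draws(12)
    print(mean_12, float(mean_12))
```

Using `Fraction` avoids rounding in the intermediate sums. Converting to `float` at the end gives the familiar decimal value.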
In the case of $n=12$, the mean is

$\displaystyle E[X_{12}]=12 \biggl[ \frac{1}{12}+\frac{1}{11}+ \cdots+ \frac{1}{3}+\frac{1}{2}+1 \biggr]=37.23852814$

For more details of the math discussion, see this previous post. For a more in-depth discussion including the probability function, see this post in a companion blog on probability.

___________________________________________________________________________
$\copyright \ 2017 \text{ by Dan Ma}$

# The Game of Chuck-a-Luck

Chuck-a-Luck is a game of chance that is often played at carnivals and sometimes used as a fundraiser for charity. The game is easy to understand and easy to play. The odds seem attractive. The carnival makes money consistently. It is a game of chance that seems to be popular with all sides. This post aims to shine a light on this game, from how it is played to how to evaluate its expected payout.

When you gamble for a short duration, the results can be unpredictable. In playing just a few games, you may win big or stay even. What about the long run results? We will show that the house edge is about 8 cents per dollar wagered. That is, the expected payout for the casino (or carnival) is about 8 cents per \$1 bet, which means that the player would on average lose about 8 cents per \$1 bet. We use the game of Chuck-a-Luck to open up a discussion on the law of large numbers.

The following is a Chuck-a-Luck set.

A Chuck-a-Luck set

The set consists of three dice in a wire cage and a betting surface with spots for 1 through 6. Here’s how the game is played:

• The player picks a number from 1, 2, 3, 4, 5 and 6 and then places a stake on the spot corresponding to the chosen number.
• The three dice in the cage are then rolled.
• If the player’s number appears on one, two or three of the dice, then the player receives one, two or three times the original stake, respectively, and also keeps the original stake.
• If the player’s number does not appear on any of the three dice, then the player loses the original stake.

Essentially the player or the bettor (the person who makes the bet on the chosen number) wins the amount of the bet for each occurrence of the chosen number. When the player wins, he or she also keeps the original stake. However, when the chosen number does not show up among the three dice, the player loses the original stake.

Suppose \$1 is placed on the number 2. If the three rolls are 2, 5 and 1, then the player wins \$1 and keeps the original stake of \$1. If the results of the three rolls are 2, 1 and 2, then the bettor wins \$2 and keeps the original stake of \$1. If the results are 3, 1 and 4, then the player loses the original stake of \$1. Of course, if the results are 2, 2 and 2, then the player wins \$3 and keeps the original stake of \$1.

Note that the original stake that is returned to you (when you win) is not considered part of the winning since it is your own money to start with. This is an important point to keep in mind when calculating the expected payout.

___________________________________________________________________________

A Counting Reasoning

The game of Chuck-a-Luck seems to be an attractive game. There are four distinct outcomes when three dice are rolled: zero, one, two and three occurrences of the chosen number. The player wins in 3 of these outcomes and loses in only one of the four. The game of Chuck-a-Luck is thus often viewed favorably, as having at least even odds. The reasoning for even odds might go like this:

The probability of a two (if two is the chosen number) in a roll is 1/6. Since there are three dice, the probability of having at least one two is 1/6 + 1/6 + 1/6 = 3 (1/6) = 1/2, representing an even chance of winning.

This reasoning is faulty. The three events are not mutually exclusive (more than one die can show a two), so their probabilities cannot simply be added.
To calculate the odds of winning Chuck-a-Luck, it will be helpful to find out the likelihood (or probability) of obtaining zero, one, two or three appearances of the chosen number in a roll of three dice. We can use an informal counting reasoning to do this. Again, we assume the chosen number is 2.

The roll of one die has 6 outcomes. So rolling three dice has 6 x 6 x 6 = 216 possible outcomes. How many of these 216 outcomes are the losing outcomes for the player? In other words, how many of these 216 outcomes have no twos? Each roll has 5 outcomes with no two. Thus in three rolls, there are 5 x 5 x 5 = 125 outcomes with no twos.

Out of 216 bets, the player on average will lose 125 of them. So the probability of the player losing the game is 125/216 = 0.58 (about 58% chance of losing). That means when you play this game repeatedly, you win about 42% of the time (much worse than even odds).

Let’s find out how often the player will win one, two or three times the original stake. Out of the 216 possible outcomes in rolling three dice, there are 125 outcomes in which the player loses, and the other 91 outcomes are ones in which the player wins (216 = 125 + 91). Let’s break down these 91 winning outcomes.

In rolling three dice, how many of the 216 outcomes have exactly one 2? The patterns are 2-X-X, X-2-X and X-X-2. The X is a die that cannot be a 2 (and has 5 possibilities). The first pattern has 1 x 5 x 5 = 25 possibilities, and so does each of the other two. Thus there are 3 x 25 = 75 outcomes with exactly one appearance of the chosen number.

Let’s count the outcomes that have exactly two appearances of the chosen number. The patterns are 2-2-X, 2-X-2 and X-2-2. The first pattern has 1 x 1 x 5 = 5 possibilities, and so does each of the other two. Thus there are 3 x 5 = 15 outcomes with exactly two appearances of the chosen number. Finally, there is only one outcome with three appearances of the chosen number, namely 2-2-2. These counts are summarized in the following table.
Table 1
Expected Frequency of Winning/Losing
$\begin{array}{rrrrr} \text{Outcome} & \text{ } & \text{Frequency} & \text{ } & \text{Percentage} \\
\text{ } & \text{ } \\
\text{No 2} & \text{ } & 125 & \text{ } & 57.9 \% \\
\text{One 2} & \text{ } & 75 & \text{ } & 34.7 \% \\
\text{Two 2's} & \text{ } & 15 & \text{ } & 6.9 \% \\
\text{Three 2's} & \text{ } & 1 & \text{ } & 0.5 \% \\
\text{ } & \text{ } & \text{ } & \text{ } & \text{ } \\
\text{Total } & \text{ } & 216 & \text{ } & 100 \% \end{array}$

The game is far from a fair game. The player on average loses about 57.9% of the time. Furthermore, the winning outcomes (happening 42.1% of the time) skew toward winning just one time the original stake. So the game is lopsidedly in favor of the house (the carnival or the casino). There is another angle that we haven’t explored yet, namely the expected payout. The above table only shows the likelihood of winning and losing. It does not take into account the size of the winning in each of the four scenarios.

___________________________________________________________________________

Expected Payout

Table 1 above does not take into account the size of the winning in each of the four scenarios. Let’s look at the following table.
Table 2
Expected Winning/Losing
$\begin{array}{rrrrrrr} \text{Outcome} & \text{ } & \text{Frequency} & \text{ } & \text{Winning per Bet} & \text{ } & \text{Expected Winning}\\
\text{ } & \text{ } \\
\text{No 2} & \text{ } & 125 & \text{ } & -\$1 & \text{ } & -\$125\\
\text{One 2} & \text{ } & 75 & \text{ } & \$1 & \text{ } & \$75\\
\text{Two 2's} & \text{ } & 15 & \text{ } & \$2 & \text{ } & \$30\\
\text{Three 2's} & \text{ } & 1 & \text{ } & \$3 & \text{ } & \$3\\
\text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }\\
\text{Total } & \text{ } & 216 & \text{ } & \text{ } & \text{ } & -\$17 \end{array}$

The last column in Table 2 gives the expected winning for each scenario, which is obtained by multiplying the frequency and the winning per bet. For example, for the “No 2” scenario, the player is expected to lose \$125 (out of 216 bets of \$1). For the “two 2’s” scenario, the player is expected to win 15 x \$2 = \$30 (out of 216 bets). What is interesting is the sum of that column, which totals -\$17. By putting up \$216 (\$1 for each bet), the player is expected to lose \$17. The expected payout is -17 / 216 = -\$0.079 (a negative payout).
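The counts and the expected payout can be double-checked by listing all 216 ordered rolls. The snippet below is a small sketch of that brute-force check, and the function names `outcome_counts` and `expected_payout` are assumptions made for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def outcome_counts(chosen=2):
    """Count the 216 ordered rolls of three dice by how many dice show `chosen`."""
    return Counter(roll.count(chosen) for roll in product(range(1, 7), repeat=3))

def expected_payout(chosen=2):
    """Expected winning per $1 bet: -1 on no match, otherwise +1 per matching die."""
    counts = outcome_counts(chosen)
    total = sum(counts.values())
    net = sum((-1 if matches == 0 else matches) * freq
              for matches, freq in counts.items())
    return Fraction(net, total)

if __name__ == "__main__":
    print(dict(outcome_counts()))
    print(expected_payout(), float(expected_payout()))
```

The result does not depend on which number is chosen, since the six faces are treated symmetrically.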

___________________________________________________________________________

Simulations

The short run results are unpredictable. In playing a few games, the player may win most of the games and may even win big. What is the expected result if the player keeps playing? To get an idea, let’s simulate 100,000 plays of Chuck-a-Luck. We use the function Rand() in Microsoft Excel to generate random numbers between 0 and 1. The simulated values of the dice are generated based on the following rules.

Table 3
Rules for Simulations
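$\begin{array}{rrrrr} \text{Random Number} & \text{ } & \text{Simulated Die Value} & \text{ } & \text{ } \\
\text{ } & \text{ } \\
\displaystyle 0<x<\frac{1}{6} & \text{ } & 1 & \text{ } & \text{ } \\
\displaystyle \frac{1}{6} \le x<\frac{2}{6} & \text{ } & 2 & \text{ } & \text{ } \\
\displaystyle \frac{2}{6} \le x<\frac{3}{6} & \text{ } & 3 & \text{ } & \text{ } \\
\displaystyle \frac{3}{6} \le x<\frac{4}{6} & \text{ } & 4 & \text{ } & \text{ } \\
\displaystyle \frac{4}{6} \le x<\frac{5}{6} & \text{ } & 5 & \text{ } & \text{ } \\
\displaystyle \frac{5}{6} \le x<1 & \text{ } & 6 & \text{ } & \text{ } \end{array}$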

Three die values are simulated at a time. The process is repeated 100,000 times. We cannot display all the simulated rolls of dice. The following table shows the first 20 simulated plays of Chuck-a-Luck.

Table 4
Simulated Chuck-a-Luck Games (Chosen Number = 5, \$1 per Bet)
First 20 Plays
$\begin{array}{rrrrrrrrr} \text{Play} & \text{ } & \text{Die 1} & \text{ } & \text{Die 2} & \text{ } & \text{Die 3} & \text{ } & \text{Winning}\\
\text{ } & \text{ } \\
1 & \text{ } & 2 & \text{ } & 3 & \text{ } & 1 & \text{ } & -\$1 \\
2 & \text{ } & 1 & \text{ } & 2 & \text{ } & 5 & \text{ } & \$1 \\
3 & \text{ } & 3 & \text{ } & 6 & \text{ } & 1 & \text{ } & -\$1 \\
4 & \text{ } & 3 & \text{ } & 5 & \text{ } & 4 & \text{ } & \$1 \\
5 & \text{ } & 4 & \text{ } & 4 & \text{ } & 5 & \text{ } & \$1 \\
6 & \text{ } & 5 & \text{ } & 3 & \text{ } & 2 & \text{ } & \$1 \\
7 & \text{ } & 6 & \text{ } & 6 & \text{ } & 1 & \text{ } & -\$1 \\
8 & \text{ } & 4 & \text{ } & 6 & \text{ } & 5 & \text{ } & \$1 \\
9 & \text{ } & 6 & \text{ } & 4 & \text{ } & 2 & \text{ } & -\$1 \\
10 & \text{ } & 1 & \text{ } & 1 & \text{ } & 6 & \text{ } & -\$1 \\
11 & \text{ } & 4 & \text{ } & 3 & \text{ } & 5 & \text{ } & \$1 \\
12 & \text{ } & 5 & \text{ } & 6 & \text{ } & 3 & \text{ } & \$1 \\
13 & \text{ } & 4 & \text{ } & 6 & \text{ } & 5 & \text{ } & \$1 \\
14 & \text{ } & 3 & \text{ } & 4 & \text{ } & 1 & \text{ } & -\$1 \\
15 & \text{ } & 6 & \text{ } & 4 & \text{ } & 6 & \text{ } & -\$1 \\
16 & \text{ } & 2 & \text{ } & 4 & \text{ } & 3 & \text{ } & -\$1 \\
17 & \text{ } & 6 & \text{ } & 2 & \text{ } & 3 & \text{ } & -\$1 \\
18 & \text{ } & 5 & \text{ } & 5 & \text{ } & 4 & \text{ } & \$2 \\
19 & \text{ } & 1 & \text{ } & 5 & \text{ } & 5 & \text{ } & \$2 \\
20 & \text{ } & 5 & \text{ } & 4 & \text{ } & 6 & \text{ } & \$1 \end{array}$

In the above 20 plays, the total winning adds up to \$4. So the player is ahead. If he/she quits the game at this point, that would be a good night. The average winning in these 20 games is 4 / 20 = \$0.20 (20 cents per game). According to the calculation based on Table 2, the expected payout is negative 8 cents per \$1 bet. Does that mean Table 2 is wrong?
The short run results of Chuck-a-Luck (or any other gambling game) are unpredictable. The player can win big or lose big or stay even in a small number of games. The following graph shows the average winning of the first 1000 simulated plays of the game. The graph plots the average of the first game, the average of the first 2 games, the average of the first 3 games, and so on all the way to the average of the first 1000 games.

Figure 1
Long Run Averages in the First 1000 Games

The average winning is for the most part positive (staying above zero) during the first 100 games. But after 200 games, the average is clearly in negative territory. The end result of the first 1000 simulated games is a loss of \$79, giving a payout of -\$0.079 per game. The following shows the long run averages in the 100,000 simulated games.

Figure 2
Long Run Averages in 100,000 Games

The graph in Figure 2 is for the most part a horizontal line below zero, except for the one spike at the beginning. The end result of the 100,000 simulated plays is a loss of \$8057. That is, out of 100,000 bets of \$1, the player has lost \$8057. The average payout is -8057 / 100,000 = -\$0.08057, a slightly larger loss than, but in general agreement with, the expected payout of -\$0.079.

___________________________________________________________________________

The Law of Large Numbers

The lesson is clear. The short run results are unpredictable. The first 20 games in Table 4 are actually profitable for the player. The first 100 simulated games have ups and downs but the averages are mostly above zero (Figure 1). The remainder of Figure 1 and the entire Figure 2 indicate that the long run average winning (actually a loss in this case) is stable and predictable: a loss of about 8 cents per \$1 bet. Any player who has sustained large losses and is trying to make up for the losses by playing more and more games should think twice.

The 100,000 simulated plays of Chuck-a-Luck shown in Figure 1 and Figure 2 are just one instance of a simulation. If we run another set of simulations, the results will look different. But the long run results and averages will look stable and similar to the results shown above. That is, any long run results (simulated or actual playing) will have slightly different ups and downs from Figure 2 but will eventually settle at the average of -\$0.079.
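To see how separate instances of the simulation compare, the game can be simulated several times with different seeds. The following is a minimal sketch in Python rather than the Excel worksheet behind the figures. The names `play_once` and `average_winning` and the seeds are assumptions for this illustration, and the mapping `int(6 * u) + 1` is one way of splitting the interval from 0 to 1 into six equal parts.

```python
import random

def play_once(chosen, rng):
    """Net winning for a $1 bet on `chosen` in one play of Chuck-a-Luck."""
    # rng.random() is uniform on [0, 1); int(6 * u) + 1 turns it into a die value 1..6
    dice = [int(6 * rng.random()) + 1 for _ in range(3)]
    matches = dice.count(chosen)
    return matches if matches > 0 else -1

def average_winning(n_plays, chosen=5, seed=0):
    """Average winning per $1 bet over n_plays simulated plays."""
    rng = random.Random(seed)
    total = sum(play_once(chosen, rng) for _ in range(n_plays))
    return total / n_plays

if __name__ == "__main__":
    for seed in range(3):
        print(f"seed {seed}: average winning {average_winning(100_000, seed=seed):.5f}")
```

Each seed plays the role of a different night at the carnival. The individual plays differ, but the averages over 100,000 plays should all land near -17/216.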

Thus the long run average results of playing Chuck-a-Luck are stable and predictable and will eventually settle around a loss of about \$0.08 (8 cents) per \$1 bet. This is the essence of the law of large numbers. In any random experiment, as more and more observations are obtained, the averages will approach the theoretical average. This applies to the game of Chuck-a-Luck or any other gambling game.

The game of Chuck-a-Luck is a good game. It can be played for entertainment or for giving to a charity. In either case, an individual player usually stops playing once the “budget” is depleted – the amount that is set aside for losing or for giving. Playing more and more games to recoup the losses after a losing streak will just lead to a deeper and deeper spiral of losses. Using the game as an “investment” vehicle is a sure recipe for financial ruin. Figure 2 should make that clear.

As indicated above, the house edge of Chuck-a-Luck is about 8 cents per dollar bet. There are other gambling games with better odds for the player. For example, the house edge for the game of roulette is only 5.26 cents per dollar bet (see here for more information). This previous post has another take on the law of large numbers.
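The two house edges can be put side by side with a short calculation. The sketch below assumes an American roulette wheel with 38 pockets and a single-number bet paying 35 to 1. Those details are not stated in this post and are used here only because they reproduce the 5.26 cents figure. The function name `roulette_single_number_payout` is likewise an assumption for this illustration.

```python
from fractions import Fraction

def roulette_single_number_payout(pockets=38, payout=35):
    """Expected net winning per $1 bet on one number (assumed American wheel)."""
    p_win = Fraction(1, pockets)
    return p_win * payout - (1 - p_win)

if __name__ == "__main__":
    chuck_a_luck = Fraction(-17, 216)  # from Table 2
    roulette = roulette_single_number_payout()
    print(f"Chuck-a-Luck: {float(chuck_a_luck):.4f} per $1 bet")
    print(f"Roulette:     {float(roulette):.4f} per $1 bet")
```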

___________________________________________________________________________
$\copyright \ 2016 \text{ by Dan Ma}$