Celebrate the year of the rooster


According to the Chinese zodiac, the year 2017 is the year of the rooster. In fact, today (January 28, 2017) is the start lunar New Year, which is the first day of the year of the rooster. What better way to celebrate the year of the rooster than working a related math puzzle, or to perform a related random experiment!

At the office yesterday, conversation at the beginning of a meeting, before discussing the main topics, centered on the Chinese zodiac animal signs (the Chinese zodiac system is a 12-year cycle with a different animal representing each year). One coworker mentioned he is a tiger. Another coworker did not know his sign and Googled to find out that he is a rat! Another coworker is a rooster. It turns out that a pig is also represented. Imagine that you keep picking people at random and ascertain his/her animal sign. How many people do you have to ask in order to have met all 12 animal signs?

The number of people you have to ask is a random quantity. If you are very lucky, you may encounter all 12 signs by asking just 12 people (the minimum number of people you have to ask). The outcome of twelve people is not likely (it has only about 0.005% chance of happening). It is almost certain that you have to ask many more people. It turns out that the first few animals are easy to obtain (just like in the meeting with 4 different animal signs for 4 people). As you ask more people, the animal signs tend to repeat. So you have to keep asking more people. As you encounter more animal signs, it takes longer to get a new animal sign. For example, if you have encountered 11 animal signs already, you will have to ask on average 12 more people to find the last animal sign. So the overall average is not a number near 12. On average, you will have to ask about 37 people to get the entire set of 12 animals.

The random experiment that has been described is this. Put 12 slips of papers numbered 1 through 12 in a hat. Randomly draw a piece of paper and note the number and then put it back into the hat. Keep drawing until all 12 numbers have been chosen. Let X be the number of selections that are required to perform this random experiment. Of course, you can expand the sample space to include more slips of papers (i.e. with more numbers). But the context will not be picking animal signs.

There are two ways to get a handle on the random variable X as described above. One is through simulation and the other is through math.

Before discussing the simulation or the math, let’s point out that the problem discussed here is a classic problem in probability that goes by the name “the coupon collector problem”. The numbers 1 to 12 in a hat are like coupon (prizes) that are randomly given out when purchasing a product (e.g. a box of cereal). The problem discussed here is that the coupon collector desires to collect the entire set of coupons.


To get a sense of how long it will take, simulate random numbers from 1 through 12 until all numbers have appeared. The following is 5 iterations of the experiment.

    9, 2, 6, 8, 9, 10, 2, 9, 8, 1, 5, 11, 1, 1, 2, 10, 9, 8, 8, 9, 5, 11, 9, 3, 7, 9, 8, 8, 4, 3, 1, 4, 3, 12 (34 draws)

    7, 7, 1, 10, 11, 11, 10, 4, 5, 8, 8, 2, 6, 4, 6, 2, 12, 6, 6, 12, 9, 5, 8, 10, 1, 5, 10, 4, 9, 4, 1, 11, 11, 6, 2, 1, 6, 6, 3 (39 draws)

    9, 5, 2, 2, 1, 5, 6, 11, 7, 11, 4, 6, 1, 12, 3, 7, 8, 3, 3, 2, 2, 3, 5, 6, 2, 5, 1, 6, 8, 5, 4, 10 (32 draws)

    1, 5, 5, 4, 5, 12, 10, 1, 8, 1, 3, 9, 1, 3, 11, 9, 10, 3, 9, 11, 4, 4, 4, 7, 7, 3, 1, 11, 11, 4, 10, 6, 3, 2 (34 draws)

    6, 7, 6, 1, 12, 6, 1, 1, 7, 1, 11, 10, 3, 3, 9, 6, 9, 4, 2, 6, 11, 7, 7, 11, 2, 6, 2, 1, 7, 2, 5, 9, 6, 12, 6, 11, 1, 11, 11, 2, 5, 6, 7, 5, 2, 11, 2, 2, 6, 2, 12, 5, 5, 5, 12, 10, 3, 11, 1, 10, 10, 6, 9, 11, 10, 7, 11, 5, 1, 9, 11, 9, 8 (73 draws)

Each of the number is generated by using the =RANDBETWEEN(1, 12) in Excel. In each iteration, the numbers are generated until all 12 numbers have been generated.

There is considerable fluctuation in this 5 iterations of the experiment. With the 5th one being exceptionally long, it is possible that it takes a long time to find all 12 animal signs. The average of the first iteration is obviously 34. The average of the first two iteration is 36.5. The averages of the first 3, 4, and 5 iterations are 35, 34.75, and 42.4, respectively.The last average of 42.4 is quite different from the average of 37 indicated earlier.

What if we continue to run the experiment, say, for 10,000 times? What would the long run averages look like? The following graph shows the averages from first 100 runs of the experiment. It plots the average of the n iterations from n=1 to n=100.

    Figure 1 – Long run averages from 100 runs

Figure 1 shows that there is quite a bit of fluctuation in the averages in the first 25 runs or so. Eventually, the averages settle around 37 but still retain noticeable fluctuation. The following graph shows the averages from first 1000 runs of the experiment.

    Figure 2 – Long run averages from 1000 runs

The graph is Figure is smoother as it moves toward 1000, but still has noticeable fluctuation from 37 (in fact the graph is mostly below 37). The following graph shows the averages from first 10000 runs of the experiment.

    Figure 3 – Long run averages from 10000 runs

The graph in Figure 3 shows the average of the first n iterations with n goes from 1 to 10,000. The graph is for the most parts a horizontal line slightly above 37, especially after n=3000. In fact the average of all 10,000 iterations is 37.3381, which is close to the theoretical average of 37.2385.

The simulation is an illustration of the law of larger numbers. The essence of the law of large numbers is that the short run results of a random experiment are unpredictable while the long run results are stable and predictable and eventually settle around the theoretical average.

The first 5 runs of the experiment (as shown above) are certainly unpredictable. It may take 34 draws or may take 73 draws. The first 100 simulations also have plenty of ups and downs, even though graph in Figure 1 shows a movement toward 37. The first 1000 simulations display more stable results but are below average as the graph move toward 1000 (Figure 2). In simulating the experiment 10,000 times (Figure 3), the long run averages settle around the theoretical average of 37.2385.

So if you survey people their animal signs, the time it takes has a great deal of random fluctuations. It may take 34 asks or 73 asks (as shown in the first 5 simulations). If the experiment is done repeatedly, the results are predictable, i.e. the average is around 37.

The long run results of a gambling game are predictable too and will settle around the theoretical average. The theoretical average of a gambling game is usually referred to as the house edge. For example, for the game of roulette, the house edge is 5.26%. For each bet of $1, the gambler is expected to lose 5.26 cents. In playing a few games, the gambler may win big. In the long run, the house is expected to gain 5.26 cents per one dollar bet. Thus the law of large numbers can mean financial ruin for the gambler (or profits for the casino). For an illustration of the law of large numbers in the context of the game of Chuck-a-Luck, see here. For an illustration in the context of the roulette wheel, see here.

Another piece of useful information from the 10,000 simulated runs of the experiment is the frequency distribution.

Table 1
Frequency Distribution of the 10,000 Simulated Runs
\begin{array}{rrrrr}      \text{Interval} & \text{ } & \text{Frequency} & \text{ } & \text{Relative Frequency} \\      \text{ } & \text{ } \\      \text{10 to 19} & \text{ } & 375 & \text{ } & 3.75 \% \\      \text{20 to 29}  & \text{ } & 2817 & \text{ } & 28.17 \% \\      \text{30 to 39}  & \text{ } & 3267 & \text{ } &  \text{ } 32.67 \% \\      \text{40 to 49}  & \text{ } & 1931 & \text{ } & \text{ } 19.31 \% \\  \text{50 to 59}  & \text{ } & 901 & \text{ } & \text{ } 9.01 \% \\  \text{60 to 69}  & \text{ } & 379 & \text{ } & \text{ } 3.79 \% \\  \text{70 to 79}  & \text{ } & 190 & \text{ } & \text{ } 1.90 \% \\  \text{80 to 89}  & \text{ } & 88 & \text{ } & \text{ } 0.88 \% \\  \text{90 to 99}  & \text{ } & 30 & \text{ } & \text{ } 0.30 \% \\  \text{100 to 109}  & \text{ } & 13 & \text{ } & \text{ } 0.13 \% \\  \text{110 to 119}  & \text{ } & 3 & \text{ } & \text{ } 0.03 \% \\  \text{120 to 129}  & \text{ } & 3 & \text{ } & \text{ } 0.03 \% \\  \text{130 to 139}  & \text{ } & 1 & \text{ } & \text{ } 0.01 \% \\  \text{140 to 149}  & \text{ } & 2 & \text{ } & \text{ } 0.01 \% \\      \text{ }  & \text{ } & \text{ } & \text{ } & \text{ } \\      \text{Total }  & \text{ } & 10000 & \text{ } & 100.00 \%    \end{array}

Figures 1 to 3 tell us the long run behavior of the simulations (e.g. the long run average is 37). Table 1 gives the counts of the simulations that fall into each interval and the corresponding relative frequency (the percentage). Table 1 tells us how often or how likely a given possibility occurs. The total number of simulations that fall within the range 20 to 49 is 8015. So about 80% of the time, the experiment ends in 20 to 49 draws. Furthermore, 92.95% of the simulations fall into the interval 20 to 69. This really tells us what the likely results would be if we perform the experiment. The frequency distribution also tells us what is unlikely. There is only 3.75% chance that the experiment can be completed with less than 20 draws. In the simulations, there are two that are above 140 (they are 141 and 142). These extreme results can happen but are extremely rare. They only happened about 2 times per 10,000 simulations.


The math angle

There is also a mathematical description of the random experiment of surveying people until all 12 animal signs are obtained. For example, there is a formula for calculating mean, and there is also a formula for calculating the variance. There is also a probability function, i.e. a formula for calculating probabilities (akin to Table 1). The formula for the mean is actually simple to describe.

Let X_n be the number of draws from the set \left\{1,2,3,\cdots,n \right\} with replacement such that each number in the set is picked at least once. The expectation of X_n is the following.

    \displaystyle E[X_n]=n \biggl[ \frac{1}{n}+\frac{1}{n-1}+ \cdots+ \frac{1}{3}+\frac{1}{2}+1    \biggr]

The 37.2385 theoretical average discussed above comes from this formula. The the case of n=12, the mean would be

    \displaystyle E[X_{12}]=12 \biggl[ \frac{1}{12}+\frac{1}{11}+ \cdots+ \frac{1}{3}+\frac{1}{2}+1    \biggr]=37.23852814

For more details of the math discussion, see this previous post. For a more in-depth discussion including the probability function, see this post in a companion blog on probability.

\copyright \ 2017 \text{ by Dan Ma}

The coupon collector problem

The coupon collector problem is a classic problem in probability. It is usually described in this way: a person (called the coupon collector) is trying to collect all coupons (promotional gift items) when purchasing a certain type of products (e.g. boxes of breakfast cereal). Suppose that there are n different types of coupons. Suppose that when someone buys a unit of the product, he or she does not know what type of coupons that is inside the box. How many units of the product does the coupon collector have to buy in order to collect the entire set of coupons?

Note that the number of units to buy is a random quantity. To see it for yourself, roll a fair die until each of the six faces appears at least once. You can try this with your own die and record the numbers as you go. We rolled our own die and the following shows three such experiments.

    3, 4, 6, 2, 6, 1, 3, 2, 1, 4, 6, 2, 6, 1, 6, 1, 2, 3, 5

    6, 6, 4, 1, 3, 6, 6, 4, 1, 5, 4, 2

    6, 2, 1, 1, 5, 5, 1, 1, 1, 3, 6, 1, 3, 5, 4

The first experiment has 19 trials, the second one has 12 trials and the third one has 15 trials. If we run the random experiment another time, we would get another sequence. Also note that the first 5 of the faces (coupons) show up fairly quickly. It takes a few more trials just to get the last coupon.

Rolling a dice is simply a way to generate random numbers. Thus we can describe the coupon collector problem as a random sampling problem. Draw numbers repeatedly from the sample space S=\left\{1,2,3,\cdots n \right\} with replacement until each of the n numbers has been chosen at least once. Let X_n be the number of trials that are required to perform this experiment. As demonstrated above (or as confirmed by your rolling of a die), X_n is a random variable. The smallest value it can take on is n. Theoretically, the random variable X_n could be n or some other integer greater than n. The goal is to describe the probability distribution of the random variable X_n.

Some interesting things to know about this probability distribution are the expected value E[X_n] (the mean or the average number of trials that are needed to complete the experiment). Another one is the probability function (or probability mass function) so that we can determine how likely it is to complete the experiment in k trials. For a more detailed discussion, go to this previous post. In the remainder of the post, we discuss the expected value.

The expected value E[X_n] has a clear formula and is easily computed.

    \displaystyle E[X_n]=n \biggl[ 1 + \frac{1}{2}+\frac{1}{3}+ \cdots+\frac{1}{n-1}+\frac{1}{n} \biggr]

For the 6-coupon case (or the rolling of a die), it would take on average 15 trials to get all 6 faces.

    \displaystyle E[X_6]=6 \biggl[ 1 + \frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6} \biggr]=14.7 \approx 15

Of course, 15 is simply the average. The actual results may be more or less than 15 as the above illustrations show (or your own rolling of a die). To find out how likely it is to have a result around 15, it is necessary to have the probability function of X_6. See the previous post for more information on that.

Consider this example. Suppose that you want to throw candies at random to a group of 10 kids. Each candy is assigned to each child at random. You want to throw candies until each of the children has at least one candy. Of course some kids would receive a few candies. But the goal is that no one would come away empty handed. On average, you would need to throw about 30 candies.

    \displaystyle E[X_{10}]=10 \biggl[ 1 + \frac{1}{2}+\frac{1}{3}+ \cdots+\frac{1}{9}+\frac{1}{10} \biggr]=29.29 \approx 30

If there are 50 coupons, it would take on average 225 trials to get all 50 coupons. One interesting thing is that it is relatively easy to get the first few coupons. To get the last coupon, it would take a quite a few trials. In fact, it would take on average n trials to get the last coupon (if n is the number of coupon types).

Another point we would like to make is that the coupon collector problem requires an understanding of the geometric distribution. A random experiment that can modeled by a geometric distribution is the performance of a series of independent trials until some criterion is satisfied. The random variable X_n described here has some similarity with a geometric model (but not totally similar). The similarity: we keep drawing numbers with replacement until all numbers are drawn.

Let’s look at the third experiment 6, 2, 1, 1, 5, 5, 1, 1, 1, 3, 6, 1, 3, 5, 4 listed above. We draw the first number (happens to be 6). Next we wish to draw another number different from 6. Luckily it takes only one trail to get 2. Now we wish to draw a number different from 6 and 2. It also takes one trial (to get 1). Next, continue to draw until we get a number different from 6, 2 and 1. This time it takes two trials to get a 5. Then continue to draw until we get a number different from 6, 2, 1, and 5. It takes 5 trials to get a 3. Lastly it takes 5 more trials to get the last number 4. What this shows is that the random variable X_6 is a sum of 6 geometric random variables.


Each of the C_i is a geometric random variable. The first one C_1 is kind of special. The value of C_1 is always 1 (it takes one trial to get the first number). The second variable C_2 is a geometric random variable (the number of trials to get a number different from the first chosen number. Then C_3 is the number of trials to get a number different from the first two chosen numbers and so on. If there are n coupons, then X_n is the sum of n geometric random variables.

    X_n=C_1+C_2+C_3+\cdots +C_{n-1}+C_n

The formula for the expected value E[X_n] is derived through this connection with the geometric distribution. See the previous post for a more in-depth discussion.

\copyright \ 2017 \text{ by Dan Ma}

An interview with the solver of a 350-year old math problem

Anyone who has a passion for solving math problems or learning math should find the interview with the mathematician highlighted here interesting. The interview is an old one (posted in year 2000). But the lessons that can be drawn from it are timeless and it can inspire math students and enthusiasts at all levels. You probably know who this mathematician is. If not, here’s some hints. He spent 7 years working in isolation to tackle a problem that no one could solve for almost 350 years! The problem was a famous one that bears the name of the person who posed it. The poser gave a tease that he had a proof of the problem but the margin in the page of the book in which he was scribbling notes was too small to contain the proof. The mathematician is of course Andrew Wiles. The problem was Fermat’s Last Theorem. The statement of the problem can be readily found online. Here is an explanation of it (as a preface to the explanation of a generalization of Fermat’s Last Theorem called Beal’s conjecture).

Interesting reads about Wiles are also readily available in book forms and in web sites. This interview is a good start for anyone who is not familiar with Wiles. The interview was posted in the official website of Nova, a science program of PBS. Of course the 7 years he spent working on the problem did not include the times when he worked in it in his youth. He discovered Fermat’s Last Theorem when he was 10 years in reading the book The Last Problem by Eric Temple Bell. He was completely captivated by the conjecture that he was obsessed with it for a while until he realized that he was limited by his knowledge. So he gave up the dream of being the first person to solve the problem and instead focused on his education in mathematics.

Then in 1986 there was a breakthrough in that Fermat’s Last Theorem was shown to be equivalent to another unsolved math problem called Taniyama-Shimura conjecture, which is a modern math problem that provided a new angle for attacking the intractable original problem.

One thing that struck me the most was the question posed by the interviewer, “On a day-to-day basis, how did you go about constructing your proof?” This is something that anyone who learns math or work on math problems have to face. What do you do to advance your goal on a daily basis? If there is insight or if there are ideas that seem promising, no problem. You would just focus on those ideas. But if you are stuck, or if you have no ideas on what to do, what happens then? Of course, taking a break is a good idea. Let your mind rest. Eventually you may come back to the same place where you got stuck. What do you do? The following is his answer to the question.

    I used to come up to my study, and start trying to find patterns. I tried doing calculations which explain some little piece of mathematics. I tried to fit it in with some previous broad conceptual understanding of some part of mathematics that would clarify the particular problem I was thinking about. Sometimes that would involve going and looking it up in a book to see how it’s done there. Sometimes it was a question of modifying things a bit, doing a little extra calculation. And sometimes I realized that nothing that had ever been done before was any use at all. Then I just had to find something completely new; it’s a mystery where that comes from. I carried this problem around in my head basically the whole time. I would wake up with it first thing in the morning, I would be thinking about it all day, and I would be thinking about it when I went to sleep. Without distraction, I would have the same thing going round and round in my mind. The only way I could relax was when I was with my children. Young children simply aren’t interested in Fermat. They just want to hear a story and they’re not going to let you do anything else.

One element of his day-to-day approach is about going small. Work on a little piece of a bigger problem and try to see how it fits in a broader framework. Another element is about tweaking something that had been done before and then trying to extend it. Then he also realized that known approach may not be useful at all for the problem at hand. So he had to find something new. But finding new idea or new approach is a mysterious process. There is not much point for him or anyone to describe how that mysterious process works. Here’s probably the most important point in the above paragraph. He carried the problem with him everywhere (in his head of course). When Wiles had no distraction, the problems and ideas in him just sloshed around in his head. Then when new ideas or good ideas came around, he was able to take advantage of them.

Fortunate for Wiles and for the world of mathematics, after years of digging and pushing the frontiers a little at a time (on many days no progress at all), certain right ideas finally emerged.

Is the experience of Wiles relevant to someone who is a mere mortal (mathematically speaking)? Obviously not many people are trying to solve problems that eluded the efforts of the brightest minds for centuries. I can see that the same ideas from Wiles’ epic conquest hold true in my own situation (in a much much smaller scale for sure). Going small definitely resonates with me. If I cannot solve a bigger problem, I try to pare down the problem, perhaps looking at a special case. Tweaking a known approach can also be useful. If doing so does not produce results, at least I know that a new idea is probably needed. I am a topologist by training. I certain use similar approaches when I learn something new or when I try to solve a problem.

Spending several years in isolation on a problem is certainly not easy to do for most of us. But we can be persistent in our own ways. We can be dogged in a way that fits our circumstances (not many of us has a professorship or a job that allows such unfettered access to free time). In other words, the example of Andrew Wiles can inspire us to chase our own dreams, solving problems that are meaningful to us.

Another thing that resonates with me is that he advises “heart” when choosing a problem – “always try the problem that matters most to you.” Otherwise it may be hard to stay with the problem through thick and thin.

The success meant that it was a melancholy moment for Wiles. It was a melancholy moment for many other people in mathematics too. Wiles said, “we’ve lost something that’s been with us for so long, and something that drew a lot of us into mathematics. But perhaps that’s always the way with math problems, and we just have to find new ones to capture our attention.”

A good popular science book about the search for a proof of Fermat’s Last Theorem is Simon Singh’s book. Singh also produced a documentary on the proof. Here is a video from PBS. For anyone who really wants to dig deep into the math, here is a two-volume book on the basics and the proof of Fermat’s Last Theorem by Takeshi Saito.

Here’s an article from The Guardian about Wiles’ being awarded the Abel Prize.

\copyright \ 2017 \text{ by Dan Ma}