Celebrate the year of the dog

A little over one year ago, this blog marked the beginning of the year of the rooster in this blog post. Now it’s time to celebrate the year of the dog.

2018 is the year of the dog

The Lunar New Year actually falls on February 16 this year. To be precise, the latest year of the dog will span from February 16, 2018 to February 4, 2019. We mark the New Year by highlighting a math problem that is related to the Chinese zodiac signs. In fact, this problem was also discussed in the blog post on the year of the rooster.

The Problem

The problem is this. There are 12 animal signs in the Chinese zodiac – Rat, Ox, Tiger, Rabbit, Dragon, Snake, Horse, Goat, Monkey, Rooster, Dog and Pig. You repeatedly ask people their animal signs until all 12 signs have been represented. How many people do you have to ask in order to have encountered all 12 animal signs? We assume that the 12 animal signs are equally represented. This assumption makes the math more tractable.

The number of people you ask is a random quantity. Obviously the number of people you have to ask is at least 12. It is likely that you have to ask many more than 12 people. In the previous post, we discussed this problem in two different ways – through simulations and using a math formula based on the coupon collector problem. In this post we view this problem as an occupancy problem and solve it using the method of Markov chains.

The Occupancy Problem

Imagine there are 12 cells (e.g. boxes or urns). Balls are thrown into the cells one at a time at random (assuming that each ball always lands in one of the 12 cells). On average, how many balls do we need to throw so that each cell has at least one ball? In other words, how many balls do we need to throw so that there are no empty cells? This is one form of the occupancy problem. The occupancy problem described here is a natural reformulation of the problem of asking people until all 12 animal signs are represented.

The problem can also be described as sampling with replacement. Put 12 balls labeled 1 through 12 in an urn. Draw balls at random one at a time with replacement. How many selections do we need to make so that each of the 12 numbers is drawn at least once?

Of course, there is nothing special about 12 cells. The number of cells in the occupancy problem can vary. Consider the problem of 6 cells. So balls are thrown until all 6 cells are occupied. Or an urn has 6 balls labeled 1 through 6. Then balls are drawn from the urn until each of the 6 numbers has been chosen at least once. There is another way of looking at the 6-cell occupancy problem – rolling a die repeatedly until each face has appeared at least once.

Markov Chain Approach

There is more than one way to solve the occupancy problem (two ways are discussed in the previous posts).

The method to highlight here is to use Markov chains. A Markov chain is a series of random steps. Each random step is identified by a state. Each state is dependent only on the preceding state and not on the states prior to the preceding state. We use X_n to denote the state after the nth random step or at time n.

In the 12-cell occupancy problem, a state is the number of occupied cells after a ball is thrown. For example, X_0=0, which is the initial state (we assume that before throwing the first ball, the cells are empty). Then X_1=1 (after throwing one ball, one cell is occupied). After throwing the second ball, the number of occupied cells is random. For example, it could be that X_2=1 (if the second ball goes into the occupied cell holding the first ball) or it could be that X_2=2 (if the second ball goes into an empty cell). The first scenario has probability 1/12 and the second scenario has probability 11/12. Similarly, we can list out all the scenarios for X_3 (the number of occupied cells after throwing the third ball) and their probabilities. In fact, it will be much more efficient if we describe all these probabilities in a matrix.

\mathbf{P} = \bordermatrix{
   & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \cr
 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr
 1 & 0 & 1/12 & 11/12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr
 2 & 0 & 0 & 2/12 & 10/12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr
 3 & 0 & 0 & 0 & 3/12 & 9/12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr
 4 & 0 & 0 & 0 & 0 & 4/12 & 8/12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr
 5 & 0 & 0 & 0 & 0 & 0 & 5/12 & 7/12 & 0 & 0 & 0 & 0 & 0 & 0 \cr
 6 & 0 & 0 & 0 & 0 & 0 & 0 & 6/12 & 6/12 & 0 & 0 & 0 & 0 & 0 \cr
 7 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 7/12 & 5/12 & 0 & 0 & 0 & 0 \cr
 8 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 8/12 & 4/12 & 0 & 0 & 0 \cr
 9 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 9/12 & 3/12 & 0 & 0 \cr
 10 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 10/12 & 2/12 & 0 \cr
 11 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 11/12 & 1/12 \cr
 12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \cr
} \qquad

The above matrix looks kind of huge. But there is an order behind it. The Markov chain of the 12-cell occupancy problem has 13 states (0, 1, 2, …, 12). In this case, a state refers to the number of occupied cells after a ball is thrown into the cells. So X_n is the number of occupied cells after the nth ball is thrown. For convenience, we can also think of the subscripts n as time. The type of Markov chains discussed here would be called discrete-time Markov chains.

The above matrix \mathbf{P} is called the transition probability matrix. It contains the probabilities of transitioning from one state into various states. Here’s how to read the matrix. Pick a row, say the row labeled 2. This refers to the situation of the process being at a point when 2 of the 12 cells are occupied. The next state can be 2, meaning that the next ball goes into one of the two occupied cells. Since there are two occupied cells, the probability is 2/12. The next state can be 3, meaning that the next ball goes into one of the empty cells. Since there are 10 cells not occupied, the probability is 10/12.

In general, this matrix tells us that if the process starts at state i, the process will remain at state i in the next time period with probability i/12 and will transition to state i+1 with probability 1-i/12.
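
To make the structure concrete, here is a minimal sketch in Python (not from the original post; it assumes the numpy library is available) that builds the 13 x 13 transition matrix \mathbf{P} described above.

    import numpy as np

    # Build the 13 x 13 transition matrix P for the 12-cell occupancy problem.
    # State i = number of occupied cells; state 12 is absorbing.
    N = 12
    P = np.zeros((N + 1, N + 1))
    for i in range(N):
        P[i, i] = i / N          # next ball lands in an already occupied cell
        P[i, i + 1] = 1 - i / N  # next ball lands in an empty cell
    P[N, N] = 1.0                # once all cells are occupied, the chain stays put

    print(P[2])  # row for state 2: 2/12 in column 2 and 10/12 in column 3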

The state 12 is an interesting one. When all the cells are occupied, any additional balls thrown at them will still leave all the cells occupied. Thus when the process reaches state 12, the process remains at 12 indefinitely. A state is called an absorbing state if the process does not leave that state once that state is entered. Thus state 12 is an absorbing state in the 12-cell occupancy problem. The other states (0 to 11) are called transient states.

The probabilities in the matrix \mathbf{P} are called one-step transition probabilities because these probabilities tell us how the process transitions from a given step to the next step. These probabilities are denoted by P_{ij}, signifying the probability that the state in the next period is j given that the process is currently in state i. For example, P_{5,6}=7/12 and P_{10,11}=2/12. What about the two-step transition probabilities P_{ij}^2? The quantity P_{ij}^2 is the probability that the process will be in state j after two steps given that the process is currently in state i. The idea generalizes quite naturally. The quantity P_{ij}^k, a k-step transition probability, is the probability that the process will be in state j after k steps are taken given that the process is currently in state i.

To compute the k-step transition probabilities, all we need to do is to raise the matrix \mathbf{P} to the kth power. A matrix calculator will be helpful. Using this online matrix calculator, we raise \mathbf{P} to the 5th power. The following shows the non-zero elements in the first row (the row for state 0), which give the distribution of the number of occupied cells after 5 balls are thrown.

    \displaystyle P_{0,1}^5=\frac{1}{20736}=0.00005

    \displaystyle P_{0,2}^5=\frac{165}{20736}=0.00796

    \displaystyle P_{0,3}^5=\frac{2750}{20736}=0.13262

    \displaystyle P_{0,4}^5=\frac{9900}{20736}=0.47743

    \displaystyle P_{0,5}^5=\frac{7920}{20736}=0.38194

After throwing 5 balls into 12 empty cells, the above probabilities tell us how likely each count of occupied cells is. It is very unlikely to have 1 or 2 occupied cells. The most likely outcomes are 4 or 5 occupied cells. To determine the likelihoods after throwing 6 balls into 12 empty cells, raise \mathbf{P} to the 6th power, and so on.
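
The same computation can be done without an online calculator. The following sketch (an illustration, not the post's original method; it assumes numpy) rebuilds \mathbf{P} and raises it to the 5th power.

    import numpy as np

    N = 12
    P = np.zeros((N + 1, N + 1))
    for i in range(N):
        P[i, i], P[i, i + 1] = i / N, 1 - i / N
    P[N, N] = 1.0

    # First row of P^5 = distribution of the number of occupied cells after 5 balls.
    P5 = np.linalg.matrix_power(P, 5)
    print(np.round(P5[0], 5))  # non-zero entries: 0.00005, 0.00796, 0.13262, 0.47743, 0.38194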

For a detailed discussion on calculation of k-step transition probabilities, see this blog post.

Solving the Occupancy Problem

We now give an indication of how to solve the 12-cell occupancy problem (or the 12 animal signs problem). Recall that state 12 is an absorbing state. Now consider the matrix obtained by removing the row and column for state 12 from \mathbf{P}. Call this new matrix Q. It is a 12 x 12 matrix. Let I be the 12 x 12 identity matrix. Compute the matrix I-Q. Then use an online matrix calculator to find the inverse of I-Q. Let (I-Q)^{-1} be the inverse. The sum of all the elements in the first row of (I-Q)^{-1} is the average number of steps that the Markov process takes to reach state 12. The following is the sum of the elements in the first row of (I-Q)^{-1}.

    \displaystyle 1+\frac{12}{11}+\frac{6}{5}+\frac{4}{3}+\frac{3}{2}+\frac{12}{7}+2+\frac{12}{5}+3+4+6+12=37.2385

On average, we need to throw 37.24 balls into the 12 cells to have no empty cells. With respect to the 12 animal signs, on average, it requires sampling 37.24 people to have encountered all 12 signs.
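
Again, the matrix work takes only a few lines. This is a rough sketch (it assumes numpy and simply follows the recipe above; it is not code from the original post).

    import numpy as np

    N = 12
    P = np.zeros((N + 1, N + 1))
    for i in range(N):
        P[i, i], P[i, i + 1] = i / N, 1 - i / N
    P[N, N] = 1.0

    Q = P[:N, :N]                      # drop the row and column of the absorbing state 12
    F = np.linalg.inv(np.eye(N) - Q)   # the inverse (I - Q)^{-1}
    print(F[0].sum())                  # 37.2385..., the expected number of balls thrown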

The matrix (I-Q)^{-1} is called the fundamental matrix of the Markov chain represented by the matrix \mathbf{P}. The method is detailed in this blog post. That blog post also discusses the occupancy problem in some detail.

The discussion here is a roundabout and long-winded way to solve the coupon collector problem, which is discussed in this blog post from a year ago. However, the example discussed here is an excellent introduction to Markov chains. Further information can be found in the blog posts referenced above.


\copyright 2018 – Dan Ma


Tower of Hanoi

The following shows a set for the game of Tower of Hanoi.

8-disk Tower of Hanoi (Wikipedia)

The game is discussed in a companion blog. The first paragraph in that blog post:

    The tower of Hanoi is a game that works on multiple levels. It is a challenging game that tests the agility and organization skills of the player. It is also a game mathematicians would love since the game is an excellent illustration of math concepts such as mathematical induction and exponential growth. It is also a concrete illustration of a recursive algorithm. To read more …

The game was also discussed previously in this blog post on this blog, which goes into more detail on the strategy for playing the game.

\copyright 2017 – Dan Ma

Celebrate the year of the rooster

year-of-rooster

According to the Chinese zodiac, the year 2017 is the year of the rooster. In fact, today (January 28, 2017) is the start of the Lunar New Year, which is the first day of the year of the rooster. What better way to celebrate the year of the rooster than to work a related math puzzle, or to perform a related random experiment!

At the office yesterday, conversation at the beginning of a meeting, before discussing the main topics, centered on the Chinese zodiac animal signs (the Chinese zodiac system is a 12-year cycle with a different animal representing each year). One coworker mentioned he is a tiger. Another coworker did not know his sign and Googled to find out that he is a rat! Another coworker is a rooster. It turns out that a pig is also represented. Imagine that you keep picking people at random and ascertain his/her animal sign. How many people do you have to ask in order to have met all 12 animal signs?

The number of people you have to ask is a random quantity. If you are very lucky, you may encounter all 12 signs by asking just 12 people (the minimum number of people you have to ask). The outcome of twelve people is not likely (it has only about 0.005% chance of happening). It is almost certain that you have to ask many more people. It turns out that the first few animals are easy to obtain (just like in the meeting with 4 different animal signs for 4 people). As you ask more people, the animal signs tend to repeat. So you have to keep asking more people. As you encounter more animal signs, it takes longer to get a new animal sign. For example, if you have encountered 11 animal signs already, you will have to ask on average 12 more people to find the last animal sign. So the overall average is not a number near 12. On average, you will have to ask about 37 people to get the entire set of 12 animals.

The random experiment that has been described is this. Put 12 slips of paper numbered 1 through 12 in a hat. Randomly draw a slip, note the number, and then put it back into the hat. Keep drawing until all 12 numbers have been chosen. Let X be the number of selections required in this random experiment. Of course, you can expand the sample space to include more slips of paper (i.e. with more numbers), but then the context would no longer be picking animal signs.

There are two ways to get a handle on the random variable X as described above. One is through simulation and the other is through math.

Before discussing the simulation or the math, let’s point out that the problem discussed here is a classic problem in probability that goes by the name “the coupon collector problem”. The numbers 1 to 12 in a hat are like coupons (prizes) that are randomly given out when purchasing a product (e.g. a box of cereal). The problem discussed here is that the coupon collector desires to collect the entire set of coupons.
___________________________________________________________________________

Simulation

To get a sense of how long it will take, simulate random numbers from 1 through 12 until all numbers have appeared. The following shows 5 iterations of the experiment.

    9, 2, 6, 8, 9, 10, 2, 9, 8, 1, 5, 11, 1, 1, 2, 10, 9, 8, 8, 9, 5, 11, 9, 3, 7, 9, 8, 8, 4, 3, 1, 4, 3, 12 (34 draws)

    7, 7, 1, 10, 11, 11, 10, 4, 5, 8, 8, 2, 6, 4, 6, 2, 12, 6, 6, 12, 9, 5, 8, 10, 1, 5, 10, 4, 9, 4, 1, 11, 11, 6, 2, 1, 6, 6, 3 (39 draws)

    9, 5, 2, 2, 1, 5, 6, 11, 7, 11, 4, 6, 1, 12, 3, 7, 8, 3, 3, 2, 2, 3, 5, 6, 2, 5, 1, 6, 8, 5, 4, 10 (32 draws)

    1, 5, 5, 4, 5, 12, 10, 1, 8, 1, 3, 9, 1, 3, 11, 9, 10, 3, 9, 11, 4, 4, 4, 7, 7, 3, 1, 11, 11, 4, 10, 6, 3, 2 (34 draws)

    6, 7, 6, 1, 12, 6, 1, 1, 7, 1, 11, 10, 3, 3, 9, 6, 9, 4, 2, 6, 11, 7, 7, 11, 2, 6, 2, 1, 7, 2, 5, 9, 6, 12, 6, 11, 1, 11, 11, 2, 5, 6, 7, 5, 2, 11, 2, 2, 6, 2, 12, 5, 5, 5, 12, 10, 3, 11, 1, 10, 10, 6, 9, 11, 10, 7, 11, 5, 1, 9, 11, 9, 8 (73 draws)

Each number is generated using the formula =RANDBETWEEN(1, 12) in Excel. In each iteration, numbers are generated until all 12 numbers have appeared.
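
For readers who prefer code to a spreadsheet, here is a rough Python analogue of the Excel simulation (an illustrative sketch; the function name is ours, not the author's).

    import random

    def draws_until_all_signs(n=12):
        """Count random draws from 1..n (like =RANDBETWEEN) until every value has appeared."""
        seen, count = set(), 0
        while len(seen) < n:
            seen.add(random.randint(1, n))
            count += 1
        return count

    # Five iterations of the experiment, analogous to the five lists above.
    print([draws_until_all_signs() for _ in range(5)])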

There is considerable fluctuation in these 5 iterations of the experiment. With the 5th one being exceptionally long, it is possible that it takes a long time to find all 12 animal signs. The average of the first iteration is obviously 34. The average of the first two iterations is 36.5. The averages of the first 3, 4, and 5 iterations are 35, 34.75, and 42.4, respectively. The last average of 42.4 is quite different from the average of 37 indicated earlier.

What if we continue to run the experiment, say, for 10,000 times? What would the long run averages look like? The following graph shows the averages from the first 100 runs of the experiment. It plots the average of the first n iterations for n=1 to n=100.

    Figure 1 – Long run averages from 100 runs
    animal-signs-simulations-100b

Figure 1 shows that there is quite a bit of fluctuation in the averages in the first 25 runs or so. Eventually, the averages settle around 37 but still retain noticeable fluctuation. The following graph shows the averages from first 1000 runs of the experiment.

    Figure 2 – Long run averages from 1000 runs
    animal-signs-simulations-1000

The graph in Figure 2 is smoother as it moves toward 1000, but still has noticeable fluctuation from 37 (in fact the graph is mostly below 37). The following graph shows the averages from the first 10000 runs of the experiment.

    Figure 3 – Long run averages from 10000 runs
    animal-signs-simulations-10000

The graph in Figure 3 shows the average of the first n iterations with n going from 1 to 10,000. The graph is for the most part a horizontal line slightly above 37, especially after n=3000. In fact the average of all 10,000 iterations is 37.3381, which is close to the theoretical average of 37.2385.
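
The running averages plotted in Figures 1 to 3 can be generated along the following lines (a sketch under the same assumptions as the earlier snippet; the actual figures came from the author's own simulation).

    import random

    def draws_until_all_signs(n=12):
        seen, count = set(), 0
        while len(seen) < n:
            seen.add(random.randint(1, n))
            count += 1
        return count

    running_avg, total = [], 0
    for i in range(1, 10001):
        total += draws_until_all_signs()
        running_avg.append(total / i)      # average of the first i runs

    print(running_avg[99], running_avg[999], running_avg[-1])  # should settle near 37.24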

The simulation is an illustration of the law of large numbers. The essence of the law of large numbers is that the short run results of a random experiment are unpredictable while the long run results are stable and predictable and eventually settle around the theoretical average.

The first 5 runs of the experiment (as shown above) are certainly unpredictable. It may take 34 draws or it may take 73 draws. The first 100 simulations also have plenty of ups and downs, even though the graph in Figure 1 shows a movement toward 37. The first 1000 simulations display more stable results but stay below the average as the graph moves toward 1000 (Figure 2). In simulating the experiment 10,000 times (Figure 3), the long run averages settle around the theoretical average of 37.2385.

So if you survey people about their animal signs, the number of people you have to ask fluctuates a great deal. It may take 34 asks or 73 asks (as shown in the first 5 simulations). But if the experiment is done repeatedly, the results are predictable, i.e. the average is around 37.

The long run results of a gambling game are predictable too and will settle around the theoretical average. The theoretical average of a gambling game is usually referred to as the house edge. For example, for the game of roulette, the house edge is 5.26%. For each bet of $1, the gambler is expected to lose 5.26 cents. In playing a few games, the gambler may win big. In the long run, the house is expected to gain 5.26 cents per one dollar bet. Thus the law of large numbers can mean financial ruin for the gambler (or profits for the casino). For an illustration of the law of large numbers in the context of the game of Chuck-a-Luck, see here. For an illustration in the context of the roulette wheel, see here.

Another piece of useful information from the 10,000 simulated runs of the experiment is the frequency distribution.

Table 1
Frequency Distribution of the 10,000 Simulated Runs
\begin{array}{rrrrr}
\text{Interval} & \text{ } & \text{Frequency} & \text{ } & \text{Relative Frequency} \\
\text{ } \\
\text{10 to 19} & \text{ } & 375 & \text{ } & 3.75 \% \\
\text{20 to 29} & \text{ } & 2817 & \text{ } & 28.17 \% \\
\text{30 to 39} & \text{ } & 3267 & \text{ } & 32.67 \% \\
\text{40 to 49} & \text{ } & 1931 & \text{ } & 19.31 \% \\
\text{50 to 59} & \text{ } & 901 & \text{ } & 9.01 \% \\
\text{60 to 69} & \text{ } & 379 & \text{ } & 3.79 \% \\
\text{70 to 79} & \text{ } & 190 & \text{ } & 1.90 \% \\
\text{80 to 89} & \text{ } & 88 & \text{ } & 0.88 \% \\
\text{90 to 99} & \text{ } & 30 & \text{ } & 0.30 \% \\
\text{100 to 109} & \text{ } & 13 & \text{ } & 0.13 \% \\
\text{110 to 119} & \text{ } & 3 & \text{ } & 0.03 \% \\
\text{120 to 129} & \text{ } & 3 & \text{ } & 0.03 \% \\
\text{130 to 139} & \text{ } & 1 & \text{ } & 0.01 \% \\
\text{140 to 149} & \text{ } & 2 & \text{ } & 0.02 \% \\
\text{ } \\
\text{Total} & \text{ } & 10000 & \text{ } & 100.00 \%
\end{array}

Figures 1 to 3 tell us the long run behavior of the simulations (e.g. the long run average is 37). Table 1 gives the counts of the simulations that fall into each interval and the corresponding relative frequency (the percentage). Table 1 tells us how often or how likely a given possibility occurs. The total number of simulations that fall within the range 20 to 49 is 8015. So about 80% of the time, the experiment ends in 20 to 49 draws. Furthermore, 92.95% of the simulations fall into the interval 20 to 69. This really tells us what the likely results would be if we perform the experiment. The frequency distribution also tells us what is unlikely. There is only a 3.75% chance that the experiment can be completed with fewer than 20 draws. In the simulations, there are two that are above 140 (they are 141 and 142). These extreme results can happen but are extremely rare. They happened only about 2 times per 10,000 simulations.
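
A frequency table such as Table 1 is easy to tabulate from simulated results. The following is one way to do it (an illustrative sketch, not the author's code).

    import random
    from collections import Counter

    def draws_until_all_signs(n=12):
        seen, count = set(), 0
        while len(seen) < n:
            seen.add(random.randint(1, n))
            count += 1
        return count

    runs = [draws_until_all_signs() for _ in range(10000)]
    bins = Counter(10 * (x // 10) for x in runs)   # group results into intervals of width 10
    for low in sorted(bins):
        print(f"{low} to {low + 9}: {bins[low]:5d}  {bins[low] / len(runs):.2%}")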

___________________________________________________________________________

The math angle

There is also a mathematical description of the random experiment of surveying people until all 12 animal signs are obtained. For example, there is a formula for calculating the mean, and there is also a formula for calculating the variance. There is also a probability function, i.e. a formula for calculating probabilities (akin to Table 1). The formula for the mean is actually simple to describe.

Let X_n be the number of draws from the set \left\{1,2,3,\cdots,n \right\} with replacement such that each number in the set is picked at least once. The expectation of X_n is the following.

    \displaystyle E[X_n]=n \biggl[ \frac{1}{n}+\frac{1}{n-1}+ \cdots+ \frac{1}{3}+\frac{1}{2}+1    \biggr]

The 37.2385 theoretical average discussed above comes from this formula. In the case of n=12, the mean would be

    \displaystyle E[X_{12}]=12 \biggl[ \frac{1}{12}+\frac{1}{11}+ \cdots+ \frac{1}{3}+\frac{1}{2}+1    \biggr]=37.23852814
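
Evaluating the formula takes only a line or two of code; here is a small sketch (an illustration of the formula above, using exact fractions to avoid rounding error).

    from fractions import Fraction

    def expected_draws(n):
        """E[X_n] = n * (1 + 1/2 + ... + 1/n), the coupon collector mean."""
        return n * sum(Fraction(1, k) for k in range(1, n + 1))

    print(float(expected_draws(12)))  # approximately 37.2385281385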

For more details of the math discussion, see this previous post. For a more in-depth discussion including the probability function, see this post in a companion blog on probability.

___________________________________________________________________________
\copyright \ 2017 \text{ by Dan Ma}

Asking people for money

Consider this thought experiment. Suppose that you ask the first person you meet for one dollar. Ask the second person you meet for half of a dollar. Ask the third person you meet for one-third of a dollar. To the one-hundredth person you meet, you ask for one-hundredth of a dollar. To the one thousand and first person you meet, you ask for \frac{1}{1001} of a dollar and so on. Suppose that a dollar can be broken down into as tiny a fraction as needed (and that the person you ask will give the money to you). Suppose that there is an unlimited number of people for you to ask. If you keep asking and receiving in this manner, how much money will you have?

Mind you that the amounts you ask for and receive are getting smaller and smaller, to the point that they are practically zero. From the 100th person, you get one cent. The amount you get from a person beyond the first one hundred seems trivially small. From the first 100 people, the total amount would be about $5.19. Is it possible that going beyond 100 people might not net you much more than five dollars? You may want to think about the puzzle.

Here’s one way to look at the thought experiment. From the first person, you get $1. From the second person, you get \frac{1}{2} of a dollar. From the next two people, you get \frac{1}{3}+\frac{1}{4}, which is greater than \frac{1}{4}+\frac{1}{4}=\frac{1}{2}. Now from the next 4 people, you get \frac{1}{5}+\frac{1}{6}+\frac{1}{7}+\frac{1}{8}, which is greater than \frac{1}{8}+\frac{1}{8}+\frac{1}{8}+\frac{1}{8}=\frac{1}{2}. From the next 8 people, you get

    \displaystyle \frac{1}{9}+\frac{1}{10}+\frac{1}{11}+\frac{1}{12}+\frac{1}{13}+\frac{1}{14}+\frac{1}{15}+\frac{1}{16},

which is greater than

    \displaystyle \frac{1}{16}+\frac{1}{16}+\frac{1}{16}+\frac{1}{16}+\frac{1}{16}+\frac{1}{16}+\frac{1}{16}+\frac{1}{16}=\frac{1}{2}.

And so on. From the third group onward, each group of people you ask is twice as big as the previous group, yet the amount of money from each group is always \frac{1}{2} of a dollar or more. Indeed, if you can ask enough groups as described above, you can have as much money as you would like. Remember, the first group has one person (giving you $1), and the second group has one person (giving you half a dollar). The third group has 2 people and the fourth group has 4 people. Each subsequent group has twice as many people as the previous group, but each group will net you at least half a dollar. To get one million dollars, you ask two million groups. With each group giving you at least half a dollar, you will get at least one million dollars in total. Of course, the total number of people in these 2 million groups would be astronomical! To get one trillion dollars, you would ask 2 trillion groups of people. The idea here is that if you want N dollars, you can receive more than that amount by asking enough groups of people (i.e. by asking enough people). In fact, this is like a definition of infinity: for any target, no matter how big, you can always reach that target by performing enough iterations. Mathematically speaking, the total is a limitless phenomenon. So we can write the following:

    \displaystyle 1+ \frac{1}{2}+ \frac{1}{3}+ \frac{1}{4}+ \cdots + \frac{1}{n}+ \cdots = \infty

    \displaystyle \sum \limits_{n=1}^\infty \frac{1}{n}=\infty

The above are two different ways to express the same infinite series. The first way is to write out the sum of the first several terms and then to follow that by dot dot dot. The second way uses the upper case Greek letter Sigma, which denotes sum. So it is saying that it is a sum of \frac{1}{n} as n starts from 1 and increases without bound (or from 1 to infinity). The terms in the sum are getting smaller and smaller, becoming so small that they are practically and essentially zero (a more economical way of saying this is that the limit of the term \frac{1}{n} is zero). Here, we are summing numbers that are close to zero and yet the sum is infinity. Another way to say it is that this infinite series diverges. This may seem counterintuitive, but the math is pretty clear if you think about it and let it sink in.
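
A quick numerical check (a sketch, not a proof) shows how slowly the partial sums grow, even though they eventually exceed any target.

    def harmonic(n):
        """Partial sum 1 + 1/2 + ... + 1/n."""
        return sum(1.0 / k for k in range(1, n + 1))

    for n in (100, 10_000, 1_000_000):
        print(n, round(harmonic(n), 4))
    # roughly 5.1874, 9.7876, 14.3927 -- about $5.19 from the first 100 people,
    # yet the total keeps climbing past any target if you keep asking.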

The above infinite series goes by the name of the harmonic series. The name has a musical connection; it derives from the concept of overtones or harmonics in music. Here, we stick with the mathematical side of things. So the harmonic series diverges (i.e. it does not converge to a finite number).

___________________________________________________________________________

Another Proof

The above proof that the harmonic series diverges is a classical proof. It was first given in the 14th century. Anyone who wants to study the concept of infinite series should learn this proof. However, it requires the mental leap of grouping the terms in groups of exponential sizes (2, 4, 8, 16, and so on). Here’s another proof that requires only grouping two terms at a time.

    \displaystyle S=1+\frac{1}{2}+\biggl(\frac{1}{3}+\frac{1}{4}\biggr)+\biggl(\frac{1}{5}+\frac{1}{6}\biggr)+\biggl(\frac{1}{7}+\frac{1}{8}\biggr)+\cdots

which is greater than

    \displaystyle 1+\frac{1}{2}+\biggl(\frac{1}{4}+\frac{1}{4}\biggr)+\biggl(\frac{1}{6}+\frac{1}{6}\biggr)+\biggl(\frac{1}{8}+\frac{1}{8}\biggr)+\cdots

    \displaystyle =1+\frac{1}{2}+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\cdots=\frac{1}{2}+S

If S were finite, then S>\frac{1}{2}+S, which is a contradiction. So S must be infinite. This proof is an old one too. It is a clever proof and is just as valid mathematically as the first one. Both proofs are useful in helping us understand the harmonic series.

___________________________________________________________________________

Other Sequences

To contrast, what if we do not sum all the terms \frac{1}{n} but only a subset of them? Here are several examples.

    \displaystyle 1+ \frac{1}{2^1}+ \frac{1}{2^2}+ \frac{1}{2^3}+ \cdots + \frac{1}{2^n}+ \cdots

    \displaystyle 1+ \frac{1}{2^2}+ \frac{1}{3^2}+ \frac{1}{4^2}+ \cdots + \frac{1}{n^2}+ \cdots

    \displaystyle \frac{1}{2}+ \frac{1}{3}+ \frac{1}{5}+ \frac{1}{7}+ \frac{1}{11}+ \frac{1}{13}+\frac{1}{17}+ \cdots

Which of the above series converge and which diverge? In the first series, the denominator in each term is a power of 2. In the second series, the denominator in each term is the square of an integer. In the third, the denominator of the terms ranges over all the prime numbers.

The harmonic series is a great example to introduce the concepts of series and infinite series. But it is just a beginning. We do not intend to delve too deeply into the subject. For more information about the harmonic series, see the Wikipedia entry. Here’s the Wikipedia entry on series. This is a concept that is both wide and deep in mathematics. Google online or find some good math books. A good book on infinite series and related topics is Principles of Mathematical Analysis by Walter Rudin.

___________________________________________________________________________
\copyright \ 2016 \text{ by Dan Ma}

How to play the tower of Hanoi

This post shows how to play the tower of Hanoi, which is a fun game that is also mathematically interesting.

___________________________________________________________________________

The Tower of Hanoi

The standard form of the game involves three rods and a number of disks of different sizes that can be slid onto any rod. At the start of the game, all the disks are placed on one rod in ascending order with the smallest disk on top. The following diagram shows a tower of Hanoi with 6 disks in the starting position.

Figure 1 – Tower of Hanoi with 6 Disks

towerhanoi-display-1

To play the game, you can buy a tower of Hanoi set from Amazon. Or play the game online. Another way is to create a homemade tower of Hanoi set. The following is a homemade set made with dinner plates, plastic covers, jar covers and a quarter at the top (for a total of eight disks). The three positions identified by paper take the place of rods.

Figure 2 – Home Made Tower of Hanoi with 8 Disks

towerhanoihomemade1

The following is a tower of Hanoi with 32 disks found in Mexico as shown in Wikipedia.

Figure 3 – Tower of Hanoi with 32 Disks

tower-of-hanoi-32-disk

___________________________________________________________________________

The Rules of the Game

Given a tower of Hanoi such as the one set up in Figure 1, the objective is to move all the disks to another rod (also in ascending order with the smallest disk on top). In making the moves, the following rules must be obeyed.

  • Only one disk can be moved at a time.
  • Only the uppermost disk on a rod can be moved.
  • No disk can be placed on top of a smaller disk.

The rod where all the disks are placed at the beginning (the leftmost rod in Figure 1 or the leftmost paper) is called the starting rod. The rod where all the disks land at the end of the game is called the destination rod. The destination rod can be either of the two rods other than the starting rod (either the middle rod or the rightmost rod in Figure 1). The rod that is neither the starting rod nor the destination rod is called the intermediate rod. The distinction among the three rods is important as we describe the game in more detail below.

The number of disks used in the game is called the height of the tower.

___________________________________________________________________________

Playing The Game

To get a sense of how the game is played, try playing with a tower with a small number of disks (e.g. three disks). Once the three-disk game is mastered, move onto the four-disk game. The following diagram shows the moves in a three-disk game.

Figure 4 – The Moves for the 3-Disk Game

towerhanoi-33

In Figure 4, the first row shows the three disks (red, orange and green) on the starting rod (the leftmost rod). The destination rod is chosen to be the rightmost rod. The first move is to place the red disk onto the destination rod. The second move is to place the orange disk onto the intermediate rod, and so on. The game is completed in seven moves. The following two diagrams show the moves in a 4-disk game.

Figure 5 – The Moves for the 4-Disk Game

towerhanoi-43-1

Figure 6 – The Moves for the 4-Disk Game (continued)

towerhanoi-43-2

As in the 3-disk game, the leftmost rod is the starting rod and the rightmost rod is the destination rod. In Figures 5 and 6, there is the additional blue disk (the largest disk). The game is completed in 15 moves – seven moves shown in Figure 5 and eight moves shown in Figure 6.

The diagrams for the 3-disk game and 4-disk game are very informative. They show that the game can be played in a recursive fashion. Note that the seven moves in Figure 5 transfer the top three disks into the middle rod (the intermediate rod). The next move is to transfer the largest disk (blue disk) from the starting rod into the destination rod. This is accomplished in the first move in Figure 6. The moves in the rest of Figure 6 are then to transfer the three disks from the middle rod into the destination. Essentially, the 4-disk game is accomplished by playing the 3-disk game twice with one additional move in between the two 3-disk games (to transfer the largest disk to the destination rod). It turns out that this recursive strategy is an optimal way to play the game.

___________________________________________________________________________

A Closer Look

As demonstrated by the above diagrams, the game of the tower of Hanoi is played recursively. With n disks, move the top n-1 disks onto the intermediate rod (by following the rules of course). Then move the largest disk on the starting rod onto the destination rod. To complete the game, move the n-1 disks sitting on the intermediate rod onto the destination rod. In other words, the n-disk game is executed by playing the (n-1)-disk game twice with a move of the largest disk in between.
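
The recursion translates directly into a short program. The following is a minimal sketch (illustrative only, not the post's own code) that lists the moves for an n-disk game.

    def hanoi(n, start="A", destination="C", intermediate="B", moves=None):
        """Return the list of moves (from_rod, to_rod) for an n-disk game."""
        if moves is None:
            moves = []
        if n == 0:
            return moves
        hanoi(n - 1, start, intermediate, destination, moves)  # top n-1 disks to the intermediate rod
        moves.append((start, destination))                     # largest disk to the destination rod
        hanoi(n - 1, intermediate, destination, start, moves)  # n-1 disks back on top of it
        return moves

    print(hanoi(3))        # seven moves, as in Figure 4
    print(len(hanoi(8)))   # 255 moves for the 8-disk game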

The idea of recursion sounds simple enough. Games with up to four disks are easy to play. When the disk count is larger, there are a lot of sub games to keep track of. Without some guiding principles, it is easy to get “lost”. Here are some ideas for carefully managing the sub games that are required in the recursion.

The first move.

  • If the number of disks is even, move the smallest disk into the intermediate rod.
  • If the number of disks is odd, move the smallest disk into the destination rod.

For example, in moving 4 disks from rod A to rod C, the first move is to move the smallest disk (among the four disks) into rod B. On the other hand, in moving 3 disks from rod A to rod C, the first move is to move the smallest disk (among the three disks) into rod C.

The subsequent moves.

  • When moving m disks from rod A to rod B (rod A is the starting rod and rod B is the destination rod in this sub game, with rod C being the intermediate rod), first move m-1 disks to rod C. Then move the largest disk (of the m disks) to rod B. Then move the m-1 disks on rod C to rod B.
  • The moving of the m-1 disks is another sub game which can be executed by playing two (m-2)-disk games and one move in between with its own starting rod and destination rod.
  • The breaking down into smaller games continues until the number of disks is small enough to be handled as one continuous process. For example, the 3-disk game can be considered the smallest sub game that can be played without being further reduced. If preferred, the 2-disk game can be used as the smallest unit.

To illustrate, consider the 8-disk game. The game requires a minimum of 255 moves. Instead of showing all the 255 moves, we break down the game into phases as shown in the following two diagrams.

Figure 7 – Phases of the 8-Disk Game

A = Leftmost Rod, B = Middle Rod and C = Rightmost Rod
towerhanoi-83-phases-i

Figure 8 – Phases of the 8-Disk Game (continued)

A = Leftmost Rod, B = Middle Rod and C = Rightmost Rod
towerhanoi-83-phases-ii

Figure 7 describes all the moves at a high level to place the top 7 disks onto rod B. Then Figure 8 describes all the moves to place the 7 disks on rod B onto rod C. In other words, the figures outline how to play two 7-disk games. Not shown in these two figures is the move of the largest disk (black) from rod A to rod C. The two figures are the results of thinking backward. For example, to achieve the result shown in phase 7 (7 disks on rod B), we must first place 6 disks on rod C, and for that to happen, we must first place 5 disks on rod B and so on. In Figure 8, to achieve the results shown in phase 14, we must first place 6 disks on rod A and before that place 5 disks on rod C (there are 6 disks on rod C but the one at the bottom is fixed).

The early phases in each figure are relatively easy. The later phases are very involved. For example, to go from phase 3 to phase 4, simply move the top disk on rod A in phase 3 to rod C. Then move the three disks on rod B in phase 3 to rod C. Going from phase 5 to phase 6 requires a lot more maneuvering. First move the top disk on rod A in phase 5 to rod C and then move the 5 disks on rod B in phase 5 to rod C. To make that happen, first move the top 3 disks to rod C, later 4 disks to rod A and then 5 disks to rod C.

To go from phase 6 to phase 7, first move the top disk in rod A in phase 6 to rod B. Then move the 6 disks in rod C in phase 6 to rod B. To do that, first move 3 disks to rod A, then 4 disks to rod B, then 5 disks to rod A and finally 6 disks into rod B. In other words, each phase will have multiple recursions. When it involves a large number of disks, it may be helpful to have a diagram such as Figure 7 or 8 drawn out for that phase.

___________________________________________________________________________

How Efficient is this Approach?

The 8-disk game described above in Figure 7 and Figure 8 indicates that the game of the tower of Hanoi can be very convoluted (this has only 8 disks)! The movement of disks is not direct. To move n disks into one particular rod, we must maneuver smaller blocks of disks among the three rods (for every smaller possible size and multiple times for each size). Is there a more efficient way to play the game other than the recursive idea described above? The answer is no.

In the algorithm described above, the number of moves required to complete the n-disk game is 2^n-1. This number is the minimum, i.e. it is impossible to play the game in fewer moves. In general, let M_n be the minimum number of moves for the n-disk game. Recall that the n-disk game is executed by playing the (n-1)-disk game twice with a move of the largest disk in between. So we have the following relation:

    M_n=2 \times M_{n-1}+1

The above relation says that when the n-disk game is executed by doing the two plays of the (n-1)-disk game in the minimal way plus one additional move, the resulting number of moves is minimal for the n-disk game.

It is clear that M_1 is 1, which is 2^1-1. It is also clear that M_2 is 3, which is 2^2-1. Suppose that M_{n-1}=2^{n-1}-1. Consider the following calculation:

    M_n=2 \times M_{n-1}+1=2 \times (2^{n-1}-1)+1=2^n-1

This inductive reasoning shows that a best strategy for playing the n-disk game of the tower of Hanoi is by executing an (n-1)-disk game with one subsequent move of the largest disk and by executing another (n-1)-disk game. The resulting number of moves is 2^n-1.

___________________________________________________________________________

A Different Take on the Game

The game of the tower of Hanoi was invented by the French mathematician Édouard Lucas in 1883. A year later, the writer Henri de Parville told an interesting tale about the origin of the tower of Hanoi. The following is a translation into English, cited in [1].

    In the Temple of Benares, beneath the dome which marks the centre of the world, rests a brass-plate in which are fixed three diamond needles, each a cubit high and as thick as the body of a bee. On one of these needles, at the creation, God placed sixty-four discs of pure gold, the largest disc resting on the brass plate, and the others getting smaller and smaller up to the top one. This is the Tower of Bramah! Day and night unceasingly the priests transfer the discs from one diamond needle to another according to the fixed and immutable laws of Bramah, which require that the priest must not move more than one disc at a time and that he must place this disc on a needle so that there is no smaller disc below it. When the sixty-four discs shall have been thus transferred from the needle on which at the creation God placed them to one of the other needles, tower, temple and Brahmins alike will crumble into dust, and with a thunderclap the world will vanish.

This is an enchanting story. It is not known whether Lucas, the inventor of the game, invented this legend or was inspired by it. One thing is clear. The legend accurately describes the enormity of the 64-disk game of the tower of Hanoi. The least number of moves that are required to play the 64-disk game is 2^{64}-1, which is

    18,446,744,073,709,551,615

Assuming that the priests can move one disk per second, it would take about 5.85 \times 10^{11} years, or 585 billion years, to complete the 64-disk game. In contrast, the age of the universe is believed to be 13.82 billion years. The age of the sun is believed to be 4.6 billion years. The remaining lifetime of the sun is believed to be around 5 billion years. So by the time the sun dies out the game is still not finished!

Consider the 32-disk tower of Hanoi shown in Figure 3 above. The number of moves would be 2^{32}-1. The total time would be 4,294,967,295 seconds, assuming one second per move. This would be about 136.2 years. The 32-disk tower, half as tall as the one in the Temple of Benares, would be a fast project in comparison.

Towers such as the one in the Temple of Benares or even the 32-disk tower are not for us mortals. We play the game for enjoyment and, for some of us, for the mathematical aspects. The following table gives an estimate of the required times (assuming moving one disk per second).

    \displaystyle \begin{array}{lllllll}
    \text{Disks} & \text{ } & \text{Approx Time} & \text{ } & \text{Disks} & \text{ } & \text{Approx Time} \\
    \text{ } \\
    7 & \text{ } & 2.12 \text{ min} & \text{ } & 18 & \text{ } & 3.03 \text{ days} \\
    8 & \text{ } & 4.25 \text{ min} & \text{ } & 19 & \text{ } & 6.07 \text{ days} \\
    9 & \text{ } & 8.52 \text{ min} & \text{ } & 20 & \text{ } & 12.14 \text{ days} \\
    10 & \text{ } & 17.05 \text{ min} & \text{ } & 21 & \text{ } & 24.27 \text{ days} \\
    11 & \text{ } & 34.12 \text{ min} & \text{ } & 22 & \text{ } & 48.55 \text{ days} \\
    12 & \text{ } & 1.14 \text{ hours} & \text{ } & 23 & \text{ } & 97.09 \text{ days} \\
    13 & \text{ } & 2.28 \text{ hours} & \text{ } & 24 & \text{ } & 194.18 \text{ days} \\
    14 & \text{ } & 4.55 \text{ hours} & \text{ } & 25 & \text{ } & 1.06 \text{ years} \\
    15 & \text{ } & 9.10 \text{ hours} & \text{ } & 26 & \text{ } & 2.13 \text{ years} \\
    16 & \text{ } & 18.20 \text{ hours} & \text{ } & 27 & \text{ } & 4.26 \text{ years} \\
    17 & \text{ } & 36.41 \text{ hours} & \text{ } & 28 & \text{ } & 8.51 \text{ years} \\
    \end{array}

One thing that stands out from the table is that adding one more disk to the game doubles the running time. This illustrates that the tower of Hanoi game is an exponential time problem. Recall that the n-disk game is executed by playing two (n-1)-disk games with one additional move in between. So an n-disk game would take twice as long as an (n-1)-disk game, assuming the speed for one move is uniform throughout.
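
The entries in the table follow directly from the 2^n-1 move count. Here is a rough sketch that reproduces them (the unit boundaries and the 365-day year are our assumptions, not the author's).

    def playing_time(disks, seconds_per_move=1):
        """Approximate playing time for a tower with the given number of disks."""
        seconds = (2 ** disks - 1) * seconds_per_move
        for unit, size in (("years", 31_536_000), ("days", 86_400),
                           ("hours", 3_600), ("min", 60)):
            if seconds >= size:
                return f"{seconds / size:.2f} {unit}"
        return f"{seconds} sec"

    for d in (7, 12, 20, 32, 64):
        print(d, playing_time(d))
    # 2.12 min, 1.14 hours, 12.14 days, about 136 years, and about 585 billion years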

The game of the tower of Hanoi is played in exponential time. This is evidenced by the above observation that by increasing one unit (one disk in this case), the total time (the total number of moves in this case) is increased by a constant multiple (doubling in this case). The relation that describes the game of the tower of Hanoi is T \approx 2^n, and in general T \approx a^n. To contrast, a game is played in linear time if by increasing one unit, the total time increases by a constant but additive amount. The relation that describes the linear time increase is T \approx a \times n.

What would be the increase in time (the number of moves) if the number of disks is doubled? From the above discussion, the total time for the 32-disk game is about 136.2 years (assuming one second per move), while the total time for the 64-disk game is about 585 billion years (again assuming one second per move). Stated in different units, the comparison is not obvious. By comparing in the same unit of measurement (e.g. in seconds), we see that the time for the 64-disk game is roughly the square of the time for the 32-disk game! This is a phenomenal increase, as there is an unbridgeable gulf between 1.36 centuries and 585 billion years! This is exponential time increase. Assuming the original number of disks is n, the total time for 2n disks is T \approx 2^{2n}=(2^n)^2. For a smaller illustration, the total time for an 8-disk game is 2^8-1=255 while the total time for a 16-disk game is 2^{16}-1=65535. The square of 255 is 65,025, which is roughly 65,535.

Towers with 8 or fewer disks are good practice for anyone learning the game. Any game with 9 to 11 disks takes about half an hour or less and should be challenging but doable for someone who desires to play by hand. Any game listed in the table that is stated in hours would be extremely challenging. The number of moves to keep track of would be staggering (for a human mind). The 16-disk game would take over 65,000 moves (65,535 to be exact)! This and the larger games listed in the table would be good computer programming exercises.

For the mathematical facts and/or generalizations of the 3-rod tower of Hanoi game, see this Wikipedia entry and [1].

___________________________________________________________________________

Reference

  1. Hinz, Andreas M., Klavžar, Sandi, Milutinović, Uroš, and Petr, Ciril, The Tower of Hanoi – Myths and Maths, Springer Basel, Heidelberg, New York, Dordrecht, London, 2013.

___________________________________________________________________________
\copyright \ 2016 \text{ by Dan Ma}