More on the Game of Tower of Hanoi

The tower of Hanoi is a game that works on multiple levels. It is a challenging game that test the agility and organization skills of the player. It is also a game mathematicians would love since the game is an excellent illustration of math concepts such as mathematical induction and exponential growth. It is also a concrete illustration of a recursive algorithm.

The game of tower of Hanoi involves moving disks from one rod into another rod. The following is a tower of Hanoi game with 8 disks and three rods.

8-disk Tower of Hanoi (Wikipedia)

The goal of the game is to move all the disks from the left most rod to the right most rod, one disk at a time. Only the uppermost disk on a rod can be moved. In addition, you can only place a smaller disk on top of a larger disk.

Tower of Hanoi sets such as the one shown above are available from Amazon. A home made tower of Hanoi set can also be created. For example, use paper to mark three spots (to serve as rods). Then stack books of varying sizes in one spot and proceed to move the books to another spot according to the rules described above. Kitchen plates can also be used in place of books.

There are also many online versions of the game. Tow examples: version from Math is Fun (from 3-disk to 8-disk game) and 3-disk game to 9-disk game are available in this version. Both sites are easy to use. I prefer the Math is Fun version since the disks are in different colors. Of course, there are many others to choose from (simply Google tower of Hanoi game).

Obviously, the more disks there are in the game, the more difficult it is to successfully to transfer the disks. It is possible that the player may make more moves than necessary if the player is not organized or gets lost.

A 3-disk game can be played in 7 moves and no less than 7 moves. A 4-disk game can be played in a minimum of 15 moves. For a player who gets lost may end up taking more than 15 moves in a 4-disk game. Any player in the know can finish the 4-disk game in 15 moves. The 5-disk game can be played in 31 moves. For 6-disk games, 63 moves. For 7-disk game, 127 moves. To see these for yourself, explore the game using a home made set or play online. The game is also discussed here in a companion blog.

Notice that whenever an additional disk is added to the game, the minimum number of moves is doubled, e.g. from 7 moves to 15 moves (from 3 disks to 4 disks), from 15 to 31 (from 4 disks to 5 disks) and so on. In general, the n-disk game requires a minimum of 2^n-1 moves. Thus the tower of Hanoi is a concrete example of an illustration of exponential growth – increasing the size of the game by one disk doubles the time required to play the game.

In general exponential growth is a phenomenon such that increasing the input by one unit increases the output by a constant multiple (e.g. doubling, tripling, or multiplying with other constant). In contrast, linear growth (or growing linearly) means that increasing the input by one unit increases the output by a constant but as an additive amount.

The exponential growth is even easier to see if the moves are converted into time. Assume that it takes one second to move a disk. It would take 63 seconds to play the 6-disk game, roughly one minute. It would take 127 seconds to play the 7-disk game, roughly 2 minutes. In that two minutes, the play would need to know exactly what the moves should be. Otherwise it would be easy to make a mistake and hence taking more moves than necessary. So converting the moves to seconds further illustrates the exponential growth inherent in the tower of Hanoi game.

A more subtle aspect of the tower of Hanoi game is that in order to play it successfully (i.e. in the minimum number of moves), the game must be played recursively. Take the 4-disk game as example. Imagine that the 4 disks are at first in the left rod. The goal is to move them to the right rod (the destination rod). The rod in the middle is the intermediate rod. The strategy is to move the first 3 disk to the middle rod. Then move the 4th disk (the largest disk) from the left rod to the right rod. The remaining moves will be to move the three disks in the middle rod to the right rod.

With n disks, move the top n-1 disks into the intermediate rod (by following the rules of course). Then move the largest disk in the starting rod into the destination rod. To finish off the game, move the n-1 disks in the intermediate rod into the destination rod. So the n-disk game is executed by playing two (n-1)-disk games with the move of the largest disk in between. So the tower of Hanoi is a great introduction to a recursive algorithm. The tower of Hanoi game would be a great computer programming exercise.

Because of the recursive nature of the game, it would be a challenge to keep track of the moves when the number of disks is large. In a 4-disk game, you would play 3-disk games twice with one move of the largest disk in between. This can be managed with ease after some minimal practice. Say, you want to play the 8-disk game, you would need to play 7-disk game twice with one move of the largest disk in between. For each of the two 7-disk games, you would need to play 6-disk game twice with one more move in between. That would mean four 6-disk games. Then in each of the 6-disk game, you need to play 5-disk game twice with one more move in between. The recursion can get complicated fast! It will be helpful to use diagrams to keep track of all the sub games that are required in the recursive algorithm. This is discussed here in a companion blog.

Now that we know adding one disk to the game of tower of Hanoi doubles the number of moves, hence doubling the time it takes to play. What about doubling the number of disks?

The 8-disk game only requires a minimum of 255 moves (about 4 minutes with one second per move). The 16-disk game would require 65,535 moves, over 1,000 minutes (assuming one second per move) or over 18 hours! The following shows a 32-disk tower of Hanoi set, which is located in a museum in Mexico.

tower-of-hanoi-32-disk

32-disk Tower of Hanoi (Wikipedia)

A 32-disk game would require 2^{32}-1 moves, which is 4,294,967,295. Assuming one second per move, this would be over 136 years! If the workers in the museum is required to move the disks from one rod to another rod by following the rules of the game, that’s would be job security!

The game of Tower of Hanoi is a deceptively simple game. Yet the effect of doubling the number of disks is very dramatic. What about doubling the number of disks to 64, twice as many disks as the one shown above? The following is an interesting tale of the origin of the game of Tower of Hanoi [1].

    In the Temple of Benares, beneath the dome which marks the centre of the world, rests a brass-plate in which are fixed three diamond needles, each a cubit high and as thick as the body of a bee. On one of these needles, at the creation, God placed sixty-four discs of pure gold, the largest disc resting on the brass plate, and the others getting smaller and smaller up to the top one. This is the Tower of Bramah! Day and night unceasingly the priests transfer the discs from one diamond needle to another according to the fixed and immutable laws of Bramah, which require that the priest must not move more than one disc at a time and that he must place this disc on a needle so that there is no smaller disc below it. When the sixty-four discs shall have been thus transferred from the needle on which at the creation God placed them to one of the other needles, tower, temple and Brahmins alike will crumble into dust, and with a thunderclap the world will vanish.

The game of the tower of Hanoi was invented by the French mathematician Édouard Lucas in 1883. A year later, an author called Henri de Parville told of the above interesting tale about the origin of the tower of Hanoi.

It is not known whether Lucas, the inventor of the game, invented this legend or was inspired by it. One thing is clear. The legend accurately describes the enormity of the 64-disk game of the tower of Hanoi.

The least number of moves that are required to play the 64-disk game is 2^{64}-1, which is 18,446,744,073,709,551,615, when converted to years would be 585 billion years (again, assuming one second per disk). In contrast, the age of the universe is believed to be 13.82 billion years. The age of the sun is believed to be 4.6 billion years. The remaining lifetime of the sun is believed to be around 5 billion years. So by the time the sun dies out the game is still not finished!

Back to the question about what happens when the number of disks is doubled. For the 8-disk game, the number of moves is 255. For the 16-disk game, the number of moves is 65,535. Note that the square of 255 is 65,025. So doubling the number of disks has the effect of squaring the number of moves. This is another demonstration of exponential growth.

Reference

  1. Hinz Andreas M., Kla. Dzar Sandi, Milutinovic Uros, Ciril Petr, The Tower of Hanoi – Myths and Maths, Springer Basel, Heidelberg, New York, Dordrecht, London, 2013.

\copyright 2017 – 2023 – Dan Ma

Celebrate the year of the dog

A little over one year ago, this blog marked the beginning of the year of the rooster in this blog post. Now it’s time to celebrate the year of the dog.

2018 is the year of the dog

The Lunar New Year actually falls on February 16 this year. To be precise, the latest year of the dog will span from February 16, 2018 to February 4, 2019. We mark the New Year by highlighting a math problem that is related to the Chinese zodiac signs. In fact, this problem was also discussed in the blog post on the year of the rooster.

The Problem

The problem is this. There are 12 animal signs in Chinese zodiac – Rat, Ox, Tiger, Rabbit, Dragon, Snake, Horse, Goat, Monkey, Rooster, Dog and Pig. You ask people repeatedly their animal signs until all 12 signs have been represented. How many people do you have to ask in order to have encountered all 12 animal signs? We assume that the 12 animal signs are equally represented. This assumption makes the math more tractable.

The number of people you ask is a random quantity. Obviously the number of people you have to ask is at least 12. It is likely that you have to ask many more than 12 people. In the previous post, we discuss this problem in two different ways – through simulations and using a math formula based on the coupon collector problem. In this post we view this problem as an occupancy problem and solve it using the method of Markov chains.

The Occupancy Problem

Imagine there are 12 cells (e.g. boxes or urns). Balls are thrown into the cells one at a time at random (assuming that each ball always lands into one of the 12 cells). On average, how many balls do we need to throw so that each cell has at least one ball? In other words, how many balls do we need to throw so that there are no empty cell? This is one form of the occupancy problem. The occupancy problem being described here is a natural reformulation of the problem about asking people until all 12 animal signs are represented.

The problem can also be described as sampling with replacement. Put 12 balls labeled 1 through 12 in an urn. Draw balls at random one at a time with replacement. How many selections do we need to make so that each of the 12 numbers are drawn at least once?

Of course, there is nothing special about 12 cells. The number of cells in the occupancy problem can vary. Consider the problem of 6 cells. So balls are thrown until all 6 cells are occupied. Or an urn has 6 balls labeled 1 through 6. Then balls are drawn from the urn until all each of the 6 numbers has been chosen at least once. There is another way of looking at the 6-cell occupancy problem – rolling a die repeatedly until each face has appeared at least once.

Markov Chain Approach

There are more than one way to solve the occupancy problem (two ways are discussed in the previous posts).

The method to highlight here is to use Markov chains. A Markov chain is a series of random steps. Each random step is identified by a state. Each state is dependent only on the preceding state and not on the states prior to the preceding state. We use X_n to denote the state after the nth random step or at time n.

In the 12-cell occupancy problem, a state is the number of occupied cells after a ball is thrown. For example, X_0=0, which is the initial state (we assume that before throwing the first ball, the cells are empty). Then X_1=1 (after throwing one ball, one cell is occupied). After throwing the second ball, the number of occupied cells is random. For example, it could be that X_2=1 (if the second ball goes into the occupied cell holding the first ball) or it could be that X_2=2 (if the second ball goes into an empty cell). The first scenario has probability 1/12 and the second scenario has probability 11/12. Similarly, we can list out all the scenarios for X_3 (the number of occupied cells after throwing the third ball) and their probabilities. In fact, it will be much more efficient if we describe all these probabilities in a matrix.

\mathbf{P} =       \bordermatrix{ & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12\cr        0 & 0 & 1 & 0  & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr        1 & 0 & 1/12 & 11/12  & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \ \cr        2 & 0 & 0 & 2/12  & 10/12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr        3 & 0 & 0 & 0  & 3/12 & 9/12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr        4 & 0 & 0 & 0  & 0 & 4/12 & 8/12 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \cr        5 & 0 & 0 & 0  & 0 & 0 & 5/12 & 7/12 & 0 & 0 & 0 & 0 & 0 & 0 \cr        6 & 0 & 0 & 0  & 0 & 0 & 0 & 6/12 & 6/12 & 0 & 0 & 0 & 0 & 0 \cr        7 & 0 & 0 & 0  & 0 & 0 & 0 & 0 & 7/12 & 5/12 & 0 & 0 & 0 & 0 \cr        8 & 0 & 0 & 0  & 0 & 0 & 0 & 0 & 0 & 8/12 & 4/12 & 0 & 0 & 0 \cr        9 & 0 & 0 & 0  & 0 & 0 & 0 & 0 & 0 & 0 & 9/12 & 3/12 & 0 & 0 \cr         10 & 0 & 0 & 0  & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 10/12 & 2/12 & 0 \cr        11 & 0 & 0 & 0  & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 11/12 & 1/12 \cr       12 & 0 & 0 & 0  & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \cr   } \qquad

The above matrix looks kind of huge. But there is an order behind it. The Markov chain of the 12-cell occupancy problem has 13 states (0, 1, 2, …, 12). In this case, a state refers to the number of occupied cells after a ball is thrown into the cells. So X_n is the number of occupied cells after the nth ball is thrown. For convenience, we can also think of the subscripts n as time. The type of Markov chains discussed here would be called discrete-time Markov chains.

The above matrix \mathbf{P} is called the transition probability matrix. It contains the probabilities of transitioning from one state into various states. Here’s how to read the matrix. Pick a row, say the row labeled 2. This refers to the situation of the process being at a point when 2 of the 12 cells are occupied. The next state can be 2, meaning that the next ball goes into one of the two occupied cells. Since there are two occupied cells, the probability is 2/12. The next state can be 3, meaning that the next ball goes into one of the empty cells. Since there are 10 cells not occupied, the probability is 10/12.

In general, this matrix tells us that if the process starts at state i, the process will remain at state i in the next time period with probability i /12 and the process will transition to i+1 with probability 1-i/12.

The state 12 is an interesting one. When all the cells are occupied, any additional balls thrown at them will still mean that the cells are occupied. Thus when the process reaches state 12, the process remains at 12 indefinitely. A state is called an absorbing state if the process does not leave that state once that state is entered. Thus state 12 is an absorbing state in 12-cell occupancy problem. The other states (0 to 11) are called transient states.

The probabilities in the matrix \mathbf{P} are called one-step transition probabilities because these probabilities tell us how the process transitions from a given step to the next step. These probabilities are notated by P_{ij}, signifying the probability that the state in the next period is j given that the process begins in state i. For example, P_{5,6}=7/12 and P_{10,11}=2/12. What about the two-step transition probabilities P_{ij}^2? The quantity P_{ij}^2 is the probability that the process will be in state j after two steps given that the process is currently in state i. The idea generalizes quite naturally. The quantity P_{ij}^k, a k-step transition probability, is the probability that the process will be in state j after k steps are taken given that the process is currently in state i.

To compute the k-step transition probabilities, all we need to do is to raise the matrix \mathbf{P} to the kth power. A matrix calculator will be helpful. Using this online matrix calculator, we raise \mathbf{P} to the 4th power, the following shows the non-zero elements in the first row.

    \displaystyle P_{0,1}^4=\frac{1}{20736}=0.00048

    \displaystyle P_{0,2}^4=\frac{165}{20736}=0.00796

    \displaystyle P_{0,3}^4=\frac{2750}{20736}=0.13262

    \displaystyle P_{0,4}^4=\frac{9900}{20736}=0.47743

    \displaystyle P_{0,5}^4=\frac{7920}{20736}=0.38194

After throwing 5 balls into 12 empty cells, the above probabilities tells us the likelihoods of the number of occupied cells. It is very unlikely to have 1 or 2 occupied cells. The most likely would be 4 or 5 occupied cells. To determine the likelihoods after throwing 6 balls into 12 empty cells, raise \mathbf{P} to 6 and so on.

For a detailed discussion on calculation of k-step transition probabilities, see this blog post.

Solving the Occupancy Problem

We now give an indication on how to solve the 12-cell occupancy problem (or the 12 animal signs problem). Recall that state 12 is an absorbing state. Now consider the matrix obtained by removing state 12 from \mathbf{P}. Call this new matrix Q. It is a 12 x 12 matrix. Let I be the 12 x 12 identity matrix. Compute the matrix I-Q. Then use an online matrix calculator to find the inverse matrix of I-Q. Let (I-Q)^{-1} be the inverse. The sum of the all the elements in the first row of (I-Q)^{-1} is the average number of steps that the Markov process takes to reach state 12. The following is the sum of the elements on the first row of (I-Q)^{-1}.

    \displaystyle 1+\frac{12}{11}+\frac{6}{5}+\frac{4}{3}+\frac{3}{2}+\frac{12}{7}+2+\frac{12}{5}+3+4+6+12=37.2385

On average, we need to throw 37.24 balls into the 12 cells to have no empty cells. With respect to the 12 animal signs, on average, it requires sampling 37.24 people to have encountered all 12 signs.

The matrix I-Q is called the fundamental matrix of the Markov chain represented by the matrix \mathbf{P}. The method is detailed in this blog post. That blog post also discusses the occupancy problem in some details.

The discussion here is a roundabout way and a long winded way to solve the coupon collector problem, which is discussed in this blog post from a year ago. However, the example discussed here is an excellent introduction to Markov chains. Further information can be found in the following blog posts.

\text{ }

\text{ }

Dan Ma math blog
Daniel Ma math blogs

Dan Ma math
Daniel Ma math

Dan Ma mathematics
Daniel Ma mathematics

\copyright 2018 – Dan Ma

Using a 4000-year old clay tablet to solve math problems

This post discusses the clay tablet that is known as Plimpton 322. This tablet gives us a glimpse of the powerful mathematics practiced in ancient Babylonia almost 4000 years ago. We aim to give a sense of why Plimpton 322 is fascinating. The math in the tablet informs us of the past as well as informing us of the present. We also work some trigonometry problems using Plimpton 322.

What is Plimpton 322?

Figure 1 – Plimpton 322 (Credit: UNSW/Andrew Kelly)

The dimensions of the tablet are 8.8 cm by 12.7 cm (3.5 inches by 5 inches), about the size of a small pocket calculator. It was purchased by the New York publisher George Arthur Plimpton in 1923 from Edgar J. Banks and was donated to Columbia University upon Plimpton’s death in 1936. Henceforth, the tablet had been known as Plimpton 322, signifying that it is the 322nd item in Plimpton’s collection.

According to Edgar J. Banks, the tablet came from a location near the ancient city of Larsa (modern Tell Senkereh) in Southern Iraq. Eleanor Robson, an Oriental scholar at the University of Oxford, estimated that Plimpton 322 was created around 1800 BC in Babylonia, more specifically about six decades before Larsa fell to Hammurabi of Babylon in 1762 BC. Thus Plimpton 322 dated back to the Old Babylonian period in Mesopotamia about 4,000 years ago.

What is in Plimpton 322?

The Plimpton 322 was written in cuneiform script. The numbers contained in the tablet are sexagesimal numbers (base 60). It was at first assumed to be just another Babylonian ledgerbook. In the 1940s, Otto Neugebauer, a historian of ancient science at Brown University, and his assistant Abraham Sachs found that Plimpton 322 actually contains interesting mathematical contents. The entries in the tablet are essentially Pythagorean triples, i.e. the integer solutions to the equation a^2+b^2=c^2.

In order to appreciate Plimpton 322, let’s look at how the contents of the tablet are structured. The front side of Plimpton 322 has 15 lines of numbers displayed in four columns. The line at the top above the numbers contains some labels. The rightmost column contains the row numbers (or line numbers) from 1 to 15. The middle two columns contain the short side s and the hypotenuse d of 15 right triangles. In other words, the second and third columns of Plimpton 322 are two sides a right triangle such as the one shown below.

Figure 2 – A right triangle

The third column of the tablet shows d, the hypotenuse of a right triangle (or diagonal). The second column shows s, the short side of a right triangle. The long side l of a right triangle is not shown. The first column in Plimpton 322 is the square of a ratio, which can be one of two interpretations, either the square of \frac{d}{l} (diagonal over long side) or the square of \frac{s}{l} (short side over long side). The following diagram shows the descriptions of the four columns.

Figure 3 – The structure of Plimpton 322

What is Special about Plimpton 322?

The discovery made by Otto Neugebauer and his assistant in the 1940s was an important one. The numbers in Plimpton 322 are what are now called Pythagorean triples. It gives the short side and the diagonal (hypotenuse) of 15 right triangles. The long sides of the right triangles are not shown. As we will see below, the 15 right triangles have steadily decreasing slopes. The Sumerians in the Old Babylonian period knew about the Pythagorean theorem over 1,000 years before the time of Pythagoras!

Since the discovery made by Otto Neugebauer, Plimpton 322 was a subject of extensive research by mathematicians. Obviously mathematicians are intrigued by the connection of a 4000-year tablet with modern mathematics. Because of the intricate mathematical interpretations they made of the tablet, many mathematicians thought highly of the tablet. For example, the author of the tablet must be a mathematical prodigy or a professional mathematician, doing high level research in the Old Babylonian Period.

However, there are opposing views. Eleanor Robson does not view Plimpton 322 as the work of a math prodigy or professional mathematician. Her view of Plimpton 322 is more mundane. She believes that Plimpton 322 was created as teaching aid with a purpose of generating problems involving right triangles and reciprocal pairs. Links are provided below for research stating these different points of view.

Looking at Plimpton 322 in Decimal Numbers

High level math research or merely teaching aid, the fact that the tablet contains Pythagorean triples is fascinating and interesting from a mathematical point of view. Let’s continue to examine the tablet. We give a small demonstration that it can be used for working trigonometry problems. The numbers in the tablet are sexagesimal numbers (base 60). To make things easy for us, the following table shows the decimal conversion of Plimpton 322, taken from the Wikipedia entry on Plimpton 322.

Table 1 – Decimal Conversion of Plimpton 322

Squared Ratios Short Side Diagonal Row
(1).9834028 119 169 1
(1).9491586 3367 4825 2
(1).9188021 4601 6649 3
(1).8862479 12709 18541 4
(1).8150077 65 97 5
(1).7851929 319 481 6
(1).7199837 2291 3541 7
(1).6927094 799 1249 8
(1).6426694 481 769 9
(1).5861226 4961 8161 10
(1).5625 45 75 11
(1).489417 1679 2929 12
(1).4500174 161 289 13
(1).430289 1771 3229 14
(1).3871605 56 106 15

The first column is either the square of the diagonal over the long side or the square of the short side over the long side. For example, the long side in Row 1 is 120. The square of 169/120 is 1.9834. The square of 119/120 is 0.9834. To help us work problems, we expand the table with three more columns.

Table 2 – Decimal Conversion of Plimpton 322 (Expanded)

Squared Ratios Short Side Diagonal Row Long Side S/L D/L
(1).9834028 119 169 1 120 0.99167 1.40832
(1).9491586 3367 4825 2 3456 0.97425 1.39612
(1).9188021 4601 6649 3 4800 0.95854 1.38521
(1).8862479 12709 18541 4 13500 0.94141 1.37341
(1).8150077 65 97 5 72 0.90278 1.34722
(1).7851929 319 481 6 360 0.88611 1.33611
(1).7199837 2291 3541 7 2700 0.84852 1.31148
(1).6927094 799 1249 8 960 0.83229 1.30104
(1).6426694 481 769 9 600 0.80167 1.28167
(1).5861226 4961 8161 10 6480 0.76559 1.25941
(1).5625 45 75 11 60 0.75 1.25
(1).489417 1679 2929 12 2400 0.69958 1.22042
(1).4500174 161 289 13 240 0.67083 1.20417
(1).430289 1771 3229 14 2700 0.65593 1.19593
(1).3871605 56 106 15 90 0.62222 1.17778

The three additional columns are the long side and the ratios of Short over Long and Diagonal over Long. The square of these two ratios would be the first column of the table. The 6th column (S/L) is the slope of the right triangle in Figure 1. So the 15 right triangles in the table have steadily decreasing slopes. The angle between the diagonal and the long side goes from 44.76 degrees (in Row 1) to 31.89 degrees (in Row 15).

How did the creator of Plimpton 322 calculate the long side l in Table 2? For example, in Row 2, the short side is 3367 and the diagonal side is 4825. Modern calculation for the long side would be the square root \sqrt{4825^2-3367^2}=\sqrt{11943936}=3456. How was 3456 obtained for the creator of Plimpton 322? It turns out that the triples (s, l, d) in the tables are of special form. The three sides can be expressed as:

    s=p^2-q^2

    d=p^2+q^2

    l=2 \times p \times q

such that p and q are integers. The ingenuity is that the special right triangles obtained in this fashion can be used for solving trigonometric problems as demonstrated below.

Working Examples

Solve for the unknown side for each of the following two right triangles A and B.

Figure 4 – Solve for the unknown sides using Plimpton 322

Obviously the triangles are not drawn to scale. They are only meant to convey the problems. We show how to use Plimpton 322 to estimate x and y.

In triangle A, we are given the diagonal and the long side. Immediately we can compute the ratio D/L=190/145=1.310344828. This ratio is closest to the D/L ratio in Row 7 in Table 2. We use the right triangle in Row 7 as the reference triangle, i.e. the triangle in Row 7 and triangle A are (approximately) congruent. The ratios of the short side to the long side for both triangles should be approximately identical.

    \displaystyle \frac{x}{190}=\frac{2291}{3541}

Solving for x gives 122.9285513. The modern approach would be to use the Pythagorean theorem with the help of a calculator in taking square root. Thus the exact answer is x=\sqrt{190^2-145^2}=\sqrt{15075}=122.7802916. Of course, this approach was not possible in the Old Babylonian period as the concept of square root was not known at the time.

In triangle B, the short side and the long side are given. It follows that the square of the diagonal equals the square of the short side plus the square of the long side (this is known as the Pythagorean theorem to us but the relationship is also known to Babylonian users of Plimpton 322). Compute the following ratio.

    \displaystyle \frac{56^2+79^2}{79^2}=\frac{9377}{6241}=1.502483576

The above result is identical to the square of the ratio of the diagonal over the long side. Then compare this result to the first column of Table 2 (or Table 1). The closest is the number 1.489417 in Row 12. Then use the right triangle in Row 12 as the reference triangle. Thus the two triangles are approximately congruent.

    \displaystyle \frac{y}{79}=\frac{2929}{2400}

Solving for y gives 96.4291667. The answer from a modern approach would be y=\sqrt{56^2+79^2}=\sqrt{9377}=96.83491106.

Why the Examples are Special

Both answers using Plimpton 322 are quite close to the exact answers, even though the discrepancies are significant (0.148 for x and 0.422 for y). However the problem solving using Plimpton 322 is indeed special. Essentially we are solving trigonometric problems without using trigonometry, i.e. using angles and sine and cosine functions. In the Old Babylonian period, there was no concept of angles and there certainly was no trigonometry as we know it today. Hipparchus (190-120 BC), a Greek astronomer, geographer, and mathematician, is considered to be the father of trigonometry. He lived 1,600 years after the creation of Plimpton 322!

Note that the modern answer for x is \sqrt{15075} and for y is \sqrt{9377}. These two square roots are the results of applying the Pythagorean theorem. Plimpton 322 allows taking square root without using square root! Pythagoras (570-495 BC) lived more than a thousand years after the creation of Plimpton 322. So using a cheat sheet from 1800 BC, we can solve trigonometric problems without using methods that only came thousand or more years later.

Recent research by Daniel Mansfield and N. J. Wildberger compared the methods of using Plimpton 322 to using the well-known sine table created by the Indian astronomer-mathematician Madhava (1340–1425 AD), over 3,000 years after Plimpton 322. Their problems are similar to the examples given here. The approach using Plimpton 322 produces much more accurate answers than the approach of using the sine table of Madhava. It is amazing that an 1800 BC “trigonometric” table beats a trigonometric table that came 3,000 later! If the Babylonians were indeed using Plimpton 322 as a trigonometric table, then it preceded Hiapparchus’ table of chords by about 1,600 years.

The accuracy of using Plimpton 322 is highly dependent on the given sides of the right triangles, i.e. the approach is accurate only if the squared ratios are close to the ratios in Plimpton 322. However, Mansfield and Wildberger showed that the usefulness of Plimpton 322 can be extended using interpolation (the ancient Babylonians were big on interpolation).

The calculation demonstrated here is only scratching the surface. What we have shown is just a small demonstration of the mathematical specialness of Plimpton 322. The examples we work are in no way a suggestion that it is how the stone tablet was used in the Old Babylonian period.

The math in Plimpton 322 is wonderful and exciting. There is actually quite a bit of controversy about the tablet. What purpose did the tablet serve at its time? There are divergent views just on this questions alone.

According to Mansfield and Wildberger, Plimpton 322 not only replaces Hiapparchus’ table of chords as the world’s oldest trigonometric table, it is the world’s only completely accurate trigonometric table. For more information, see the article by Mansfield and Wildberger. Mansfield and Wildberger believe that Plimpton 322 would have been used in engineering calculations for the construction of palaces, canals or perhaps the Hanging Gardens of Babylon.

According to Eleanor Robson, Plimpton 322 was merely a teaching aid for problems involving right triangles (article). According to Robson, ancient mathematical texts and artifacts such as Plimpton 322 must be viewed in light of their historical and cultural contexts (in addition to the mathematical). The mathematics contained in Plimpton 322 should not be examined in isolation.

Though the modern mathematical interpretations of Plimpton 322 may have no relation to its original use, it is undeniable that the mathematics in Plimpton 322 is fascinating. It makes for good material for any math teacher’s lesson plans. It ought to be reassuring to students that the math topics that they deal with were also practiced by students 4,000 years ago. The proof is in the tablet called Plimpton 322.

Reporting of Plimpton 322 is easily found on the Internet (examples: here and here). The wikipedia entry on Plimpton 322 is a good source of information, as is the entry on Edgar J. Banks.

\text{ }

\text{ }

Dan Ma math blog
Daniel Ma math blogs

Dan Ma math
Daniel Ma math

Dan Ma mathematics
Daniel Ma mathematics

\copyright 2017 – Dan Ma

Should dictionary words be used in passwords?

It is commonly advised that dictionary words should not be used when forming passwords. We would like to make the case that dictionary words can be used as long as the words are randomly chosen. This post illustrates how this may be done.

We pick 5 words at random from the following dictionary.

The idea is to choose 5 pages at random. Then choose a word at random from each page. There are 1,317 pages. We calculate the Excel function =RANDBETWEEN(1, 1317) 5 times to generate the following random numbers, each of which is considered a page number in the dictionary.

    562, 1292, 397, 857, 1171

Assuming that there are around 50 words in a page, calculate the Excel function =RANDBETWEEN(1, 50) and generate the following random numbers.

    40, 8, 19, 13, 29

Thus the first random word is the 40th word in the 562nd page in the dictionary, the second word is the 8th word in the 1292nd page in the dictionary and so on. The 5 random words are:

    idiotic, wideopen, evulsion, pinhead, theodolite

Putting these 5 words in a string produces the following password, which is 41-character long.

    idioticwideopenevulsionpinheadtheodolite

How secure is this password? The 5 words are selected at random from a fairly large dictionary. It has 1317 pages. Assuming 50 words per page, the dictionary would have around 65,000 words. According to the multiplication principle, there would be 65000^5=1.16 \times 10^{24} many ways to choose 5 words from this dictionary. This is 1 followed by 24 zeros, which is 1 septillion. When 1 is followed by 12 zeros, the result is 1 trillion. So 1 followed by 24 zeros is the same as 1 trillion times 1 trillion.

So a brute force dictionary attack would have to cover the universe of these 1 septillion 5-word strings. To get a sense of how big 1 septillion is, try this scenario. For a computer than can check 1,000 5-word strings per second, it will take over 1 million years to exhaust all the 1 septillion 5-word strings. Such a brute force attack may be more suitable for a parallel computing project that involves a massive number of computers than for a cyber criminal who has only a limited number of computers. Examples of parallel computing projects include the ones for searching for the largest known prime number (one example is GIMPS – Great Internet Mersenne Prime Search).

The words have to be chosen at random for this approach to work. If the words are based on movie titles, sport team names, names of celebrities and other types of familiar proper nouns as well as idiomatic phrases, then the universe of the word strings would be much smaller, maybe 20,00 or 30,000. In relation 1 septillion, 30,000 is in effect zero. The word strings from this tiny universe would be vulnerable to brute force attack.

Of course, the security of the random 5-word strings can be further enhanced. Use more random words, for example. Another possibility is to make them case sensitive. The above 41-character string can become the following:

    iDIoTIcwideoPenevuLsIonpiNhEADthEodoLItE

Another possibility is to add numeric characters and special characters ($, *, # etc).

Of course, the password will be harder to remember if it is made case sensitive (especially if the upper cases letters are chosen at random). So a possible compromise is to make the first letter of a word upper case just to satisfy the case sensitivity requirement of many systems and websites along with throwing in some numbers and special characters. Simply add more random words for enhanced security.

In general, the approach of using multi-word phrase should be taken with care. The 5-word string that is demonstrated above requires some effort to produce – randomly selecting pages in the dictionary and randomly selecting one word from each selected page. I actually use a function in Excel to generate the random numbers to locate the pages. Instead, I can randomly flip through the pages. For some, that may still be too much effort. The danger is that someone may get lazy and simply use familiar proper nouns like favorite movies and sport teams such as the following:

    PirateoftheCaribbeanLALakers

Instead, the following is a better alternative.

    MfmiPotCaIaadhLALf

The above string is taken from the first letters of the sentence “My favorite movie is Pirate of the Caribbean and I am a die hard LA Lakers fan”. It is an 18-character password that is taken from a memorable phrase. The resulting password is definitely much more secure than stringing the movie title and the basket team name together. See here for information on the approach of using a memorable phrase or several phrases.

\text{ }

\text{ }

\text{ }

Dan Ma math
Daniel Ma math

Dan Ma mathematics
Daniel Ma mathematics

\copyright 2017 – Dan Ma

Strength in numbers

It is commonplace for frequent users of the Internet to have multiple online accounts that require passwords. Advice and guidelines abound on the Internet on picking safe and strong passwords. They are usually common sense guidelines – e.g. it has to be long enough and it should not be easily guessed. Sometimes a long enough password may not be safe and secure. The word password is an 8-letter word but sits on top of just about every list of no-no passwords. On the other hand, the 8-character string such as kHgfDsAa is a reasonable choice.

Recently it was reported in Cnet.com that security researcher Troy Hunt has released a searchable tool that taps the database of previously compromised passwords (here’s the Cnet.com article and here’s the tool). There are 306 millions of them. Supposedly the utility of the tool is that no one should use the passwords in this list. The thinking is that 306 million is a large number, and since it is such a large number, the search database would be a public service.

A counting exercise is sometimes needed in order to get a proper perspective on password strength. How big is 306 million? Consider the total number of 8-character strings. It would be 26^8 which is 208,827,064,576. This is 208 billion or 208,827 million, about 682 times of 306 million. The universe of 8-letter passwords is 682 times bigger than this searchable database tool. This universe of 208 billion 8-letter passwords is only for lower case passwords. If including case sensitive passwords, this would be even a much bigger universe. It can be even bigger by adding the possibility of numeric characters and special symbols. The following table gives the numbers of possible passwords from lengths of 8 to 12 (just letters, case insensitive).

Number of case insensitive passwords

Length Total Total
8-letter 26^8 208,827,064,576
9-letter 26^9 5,429,503,678,976
10-letter 26^{10} 141,167,095,653,376
11-letter 26^{11} 3,670,344,486,987,780
12-letter 26^{12} 95,428,956,661,682,200

It is a good idea to use the searchable tool. But bear in mind that any password that is reasonably long and reasonably strong is likely not to be in this list of 306 million. Note that the number of 12-letter case insensitive (letter only) passwords is 95,428 trillion! The passwords represented in this table alone would dwarf the 306 million in this searchable database.

Here’s a peculiar way to find strong passwords. This scheme is to produce 26-letter passwords such that every letter is known and is fixed! In fact, the first letter of the password is the first letter in the English alphabets, the second letter of the password is the second letter of the English alphabets and so on. The length of the password is long but every letter is fixed. This scheme is discussed in this blog post. This universe of passwords is not as big as the ones in the above table. But it is a big enough collection of possibilities that it is all but impossible to hack without computer help. There are 67,108,864 many different possibilities (over 67 million). How does this scheme work? Why is it that every letter is known but the passwords can be strong?

Curious? Think about it or go to this blog post. This particular scheme is a way to learn the concept of binomial distribution. Any one who understands this scheme understands binomial distribution.

Having to come up with multiple passwords for multiple online accounts is a fact of life in the age of the Internet. Having a good way to generate secure passwords is critical. Keeping track of all the passwords is definitely a challenge. Often times, what is overlooked is that thinking about passwords is a good way to get close to the mathematics of counting.

\text{ }

\text{ }

\text{ }

\copyright 2017 – Dan Ma

Tower of Hanoi

The following shows a set for the game of Tower of Hanoi.

8-disk Tower of Hanoi (Wikipedia)

The game is discussed in a companion blog. The first paragraph in that blog post:

    The tower of Hanoi is a game that works on multiple levels. It is a challenging game that test the agility and organization skills of the player. It is also a game mathematicians would love since the game is an excellent illustration of math concepts such as mathematical induction and exponential growth. It is also a concrete illustration of a recursive algorithm. To read more …

The game is discussed previously in this blog post in this blog, which is more detailed in terms of strategy of playing the game.

\copyright 2017 – Dan Ma

The probabilities of poker hands

This post works with 5-card Poker hands drawn from a standard deck of 52 cards. The discussion is mostly mathematical, using the Poker hands to illustrate counting techniques and calculation of probabilities

Working with poker hands is an excellent way to illustrate the counting techniques covered previously in this blog – multiplication principle, permutation and combination (also covered here). There are 2,598,960 many possible 5-card Poker hands. Thus the probability of obtaining any one specific hand is 1 in 2,598,960 (roughly 1 in 2.6 million). The probability of obtaining a given type of hands (e.g. three of a kind) is the number of possible hands for that type over 2,598,960. Thus this is primarily a counting exercise.

___________________________________________________________________________

Preliminary Calculation

Usually the order in which the cards are dealt is not important (except in the case of stud poker). Thus the following three examples point to the same poker hand. The only difference is the order in which the cards are dealt.

These are the same hand. Order is not important.

The number of possible 5-card poker hands would then be the same as the number of 5-element subsets of 52 objects. The following is the total number of 5-card poker hands drawn from a standard deck of 52 cards.

    \displaystyle \binom{52}{5}=\frac{52!}{5! \ 47!}=2,598,960

The notation \binom{n}{r} is called the binomial coefficient and is pronounced “n choose r”, which is identical to the number of r-element subsets of a set with n objects. Other notations for \binom{n}{r} are _nC_r, C_{n,r} and C(n,r). Many calculators have a function for _nC_r. Of course the calculation can also be done by definition by first calculating factorials.

Thus the probability of obtaining a specific hand (say, 2, 6, 10, K, A, all diamond) would be 1 in 2,598,960. If 5 cards are randomly drawn, what is the probability of getting a 5-card hand consisting of all diamond cards? It is

    \displaystyle P(\text{all diamond})=\frac{1287}{2598960}=0.000495198=0.0495198 \%

This is definitely a very rare event (less than 0.05% chance of happening). The numerator 1,287 is the number of hands consisting of all diamond cards, which is obtained by the following calculation.

    \displaystyle \binom{13}{5} \times \binom{39}{0}=\frac{13!}{5! \ 8!} \times 1=1,287

The reasoning for the above calculation is that to draw a 5-card hand consisting of all diamond, we are drawing 5 cards from the 13 diamond cards and drawing zero cards from the other 39 cards. Since \binom{39}{0}=1 (there is only one way to draw nothing), \binom{13}{5}=1287 is the number of hands with all diamonds.

If 5 cards are randomly drawn, what is the probability of getting a 5-card hand consisting of cards in one suit? The probability of getting all 5 cards in another suit (say heart) would also be 1287/2598960. So we have the following derivation.

    \displaystyle \begin{aligned} P(\text{one suit})&=P(\text{all diamond})+P(\text{all heart})+P(\text{all club})+P(\text{all spade}) \\&=\frac{1287}{2598960}+\frac{1287}{2598960}+\frac{1287}{2598960}+\frac{1287}{2598960} \\&=\frac{5148}{2598960} \\&=0.001980792 \\&=0.198 \% \end{aligned}

Thus getting a hand with all cards in one suit is 4 times more likely than getting one with all diamond, but is still a rare event (with about a 0.2% chance of happening). Some of the higher ranked poker hands are in one suit but with additional strict requirements. They will be further discussed below.

Another example. What is the probability of obtaining a hand that has 3 diamonds and 2 hearts? The answer is 22308/2598960 = 0.008583433. The number of “3 diamond, 2 heart” hands is calculated as follows:

    \displaystyle \binom{13}{3} \times \binom{13}{2} \times \binom{26}{0}=\frac{13!}{3! \ 11!} \times \frac{13!}{2! \ 13!}=286 \times 78=22308

One theme that emerges is that the multiplication principle is behind the numerator of a poker hand probability. For example, we can think of the process to get a 5-card hand with 3 diamonds and 2 hearts in three steps. The first is to draw 3 cards from the 13 diamond cards, the second is to draw 2 cards from the 13 heart cards, and the third is to draw zero from the remaining 26 cards. The third step can be omitted since the number of ways of choosing zero is 1. In any case, the number of possible ways to carry out that 2-step (or 3-step) process is to multiply all the possibilities together.

___________________________________________________________________________

The Poker Hands

Here’s a ranking chart of the Poker hands.

The chart lists the rankings with an example for each ranking. The examples are a good reminder of the definitions. The highest ranking of them all is the royal flush, which consists of 5 consecutive cards in one suit with the highest card being Ace. There is only one such hand in each suit. Thus the chance for getting a royal flush is 4 in 2,598,960.

Royal flush is a specific example of a straight flush, which consists of 5 consecutive cards in one suit. There are 10 such hands in one suit. So there are 40 hands for straight flush in total. A flush is a hand with 5 cards in the same suit but not in consecutive order (or not in sequence). Thus the requirement for flush is considerably more relaxed than a straight flush. A straight is like a straight flush in that the 5 cards are in sequence but the 5 cards in a straight are not of the same suit. For a more in depth discussion on Poker hands, see the Wikipedia entry on Poker hands.

The counting for some of these hands is done in the next section. The definition of the hands can be inferred from the above chart. For the sake of completeness, the following table lists out the definition.


Definitions of Poker Hands

Poker Hand Definition
1 Royal Flush A, K, Q, J, 10, all in the same suit
2 Straight Flush Five consecutive cards,
all in the same suit
3 Four of a Kind Four cards of the same rank,
one card of another rank
4 Full House Three of a kind with a pair
5 Flush Five cards of the same suit,
not in consecutive order
6 Straight Five consecutive cards,
not of the same suit
7 Three of a Kind Three cards of the same rank,
2 cards of two other ranks
8 Two Pair Two cards of the same rank,
two cards of another rank,
one card of a third rank
9 One Pair Three cards of the same rank,
3 cards of three other ranks
10 High Card If no one has any of the above hands,
the player with the highest card wins

___________________________________________________________________________

Counting Poker Hands

Straight Flush
Counting from A-K-Q-J-10, K-Q-J-10-9, Q-J-10-9-8, …, 6-5-4-3-2 to 5-4-3-2-A, there are 10 hands that are in sequence in a given suit. So there are 40 straight flush hands all together.

Four of a Kind
There is only one way to have a four of a kind for a given rank. The fifth card can be any one of the remaining 48 cards. Thus there are 48 possibilities of a four of a kind in one rank. Thus there are 13 x 48 = 624 many four of a kind in total.

Full House
Let’s fix two ranks, say 2 and 8. How many ways can we have three of 2 and two of 8? We are choosing 3 cards out of the four 2’s and choosing 2 cards out of the four 8’s. That would be \binom{4}{3} \times \binom{4}{2} = 4 x 6 = 24. But the two ranks can be other ranks too. How many ways can we pick two ranks out of 13? That would be 13 x 12 = 156. So the total number of possibilities for Full House is

    \displaystyle 13 \times 12 \times \binom{4}{3} \times \binom{4}{2}= 13 \times 12 \times 4 \times 6 =3744

Note that the multiplication principle is at work here. When we pick two ranks, the number of ways is 13 x 12 = 156. Why did we not use \binom{13}{2} = 78?

Flush
There are \binom{13}{5} = 1,287 possible hands with all cards in the same suit. Recall that there are only 10 straight flush on a given suit. Thus of all the 5-card hands with all cards in a given suit, there are 1,287-10 = 1,277 hands that are not straight flush. Thus the total number of flush hands is 4 x 1277 = 5,108.

Straight
There are 10 five-consecutive sequences in 13 cards (as shown in the explanation for straight flush in this section). In each such sequence, there are 4 choices for each card (one for each suit). Thus the number of 5-card hands with 5 cards in sequence is 10 \times 4^5= 10240. Then we need to subtract the number of straight flushes (40) from this number. Thus the number of straight is 10240 – 10 = 10,200.

Three of a Kind
There are 13 ranks (from A, K, …, to 2). We choose one of them to have 3 cards in that rank and two other ranks to have one card in each of those ranks. The following derivation reflects all the choosing in this process.

    \displaystyle \binom{13}{1} \times \binom{4}{3} \times \binom{12}{2} \times \binom{4}{1} \times \binom{4}{1}=13 \times 4 \times 66 \times 4 \times 4 = 54912

Two Pair and One Pair
These two are left as exercises.

High Card
The count is the complement that makes up 2,598,960.

The following table gives the counts of all the poker hands. The probability is the fraction of the 2,598,960 hands that meet the requirement of the type of hands in question. Note that royal flush is not listed. This is because it is included in the count for straight flush. Royal flush is omitted so that he counts add up to 2,598,960.


Probabilities of Poker Hands

Poker Hand Count Probability
2 Straight Flush 40 0.0000154
3 Four of a Kind 624 0.0002401
4 Full House 3,744 0.0014406
5 Flush 5,108 0.0019654
6 Straight 10,200 0.0039246
7 Three of a Kind 54,912 0.0211285
8 Two Pair 123,552 0.0475390
9 One Pair 1,098,240 0.4225690
10 High Card 1,302,540 0.5011774
Total 2,598,960 1.0000000

___________________________________________________________________________
\copyright 2017 – Dan Ma

How to calculate winning odds in Powerball

Powerball is a lottery game available for play in 44 states in the United States and in the District of Columbia, Puerto Rico and the US Virgin Islands. It is a “mega” lottery because of the usually huge jackpot. As of the writing of this post, the estimated jackpot is $113 million (see the below picture). The largest Powerball jackpot is $1.59 billion on January 13, 2016 (this was also the largest in US lottery history). The average Powerball jackpot is $62.71 million. Recent jackpots had been in the range of hundreds of million dollars.

With such large jackpots, it is not surprising that Powerball is a popular lottery game, especially when the impending jackpot is in the hundreds of millions dollars. Bear in mind that the odds of winning the jackpot are slim: 1 in 292,201,338, i.e. one in over 292 million. This post is to demonstrate how to mathematically derive the winning odds of the jackpot and other smaller prizes of Powerball (the first objective). Calculating the odds is a great math lesson. It is equally interesting and equally important to make sense of the large numbers in the small winning odds. The second objective is to focus on the dim prospect of Powerball players as told by the mathematical odds.

Figure 1 – Powerball Results on April 26, 2017

___________________________________________________________________________

The Game of Powerball

This is how the game is played. In a Powerball playslip, a player picks 5 numbers from 1 through 69 and 1 number from 1 through 26 (this is the Powerball number). Drawings are held Wednesday and Saturday evenings at 10:59 p.m. Eastern Time. In each drawing, five balls are drawn from a drum with 69 white balls (labeled 1 through 69) and one ball is drawn from a drum with 26 red balls (labeled 1 through 26). Figure 1 above shows that the winning numbers on April 26, 2017 are 1, 15, 18, 26 and 51 (white) and 26 (red). The winnings are determined by whether the numbers in the playslip match the balls drawn at the Powerball drawing.

The grand prize is awarded to the player (or players) whose ticket matches all of the numbers on the five chosen white balls and the one chosen red ball. Smaller prizes are awarded to players whose tickets match fewer number of white balls with or without the red ball. The following table shows the levels of prizes and the odds. These Powerball payout rules had been in place since October 7, 2015.

Figure 2 – Powerball Prizes and Odds

\bold S \bold o \bold u \bold r \bold c \bold e: \bold w \bold w \bold w. \bold p \bold o \bold w \bold e \bold r \bold b \bold a \bold l \bold l. \bold c \bold o \bold m

As shown by the table, the prizes are for scenarios ranging from matching zero white balls to all 5 of the white balls (with or without the match of the red ball). Of course, matching all 6 balls would lead to the grand prize. Matching just the 5 white balls without the red ball would be the fixed prize of $1 million. Matching any 4 of the white balls plus the red ball would lead to the prize of $50,000. The rest of the prizes are for practically insignificant amounts.

Each Powerball ticket is $2. For an additional $1 per game, a player may activate the Power Play option. When this option is activated, the lower prizes that are $50,000 or less are multiplied by a factor from 1 up to 5 or 10 (10x is available when the jackpot is under $150 million). With the Power Play Option, the prize for the “5 white balls and no red ball” match is increased to $2 million. That is, the “5 + 0” prize is doubled under the Power Play Option. For further information on Powerball, see the Wikipedia entry.

The odds represented in the above table are essentially winning probabilities for a Powerball player (per $2 bet). The odds for the grand prize is 1 in 292,201,338. So the probability of winning the grand prize with one ticket is 1 / 292,201,338, which of course is essentially zero. This probability is essentially this statement: out of 292,201,338 different possible Powerball combinations and only one of them is the winning combination.

The odds for the $1 million prize is 1 in 11,688,353.52, about 1 in 11 million (still very slim odds). So the probability of winning $1 million with one ticket is 1 / 11,688,353.52, which is also essentially zero. Note that 1 / 11,688,353.52 is the same as 25 / 292,201,338. This last probability is the statement: out of 292,201,338 different possible Powerball combinations, only 25 of them are winning (i.e. matching all 5 white balls without the matching the red Powerball number).

Thus calculating the odds for Powerball (or any other lottery game for that matter) is about counting correctly the number of possible lottery ticket combinations (the denominator) and counting the number of winning tickets (the numerator). So this is a counting exercise. In the area of mathematics called combinatorics, which at the elementary level concerns with the counting of number of ways objects can be chosen from a set (a set of balls in the case of lottery). Several previous posts of this blog are devoted to this subject. We will reference them when necessary. However, we strive to make the discussion here as self contained as possible.

___________________________________________________________________________

Using a Toy Lottery

As indicated above, there are 292,201,338 different possible Powerball tickets. Out of these, there are 25 possible winning pickets for the $1 million prize. Of the 292,201,338 possible tickets, there are 320 possible winning tickets for the $50,000 prize. Instead of attacking these problems head on, we use a “toy” lottery to illustrate the idea. Any reader who feels that such introduction is not needed can skip to the next section.

Based on the last paragraph in the preceding section, the winning probability is the number of possible ways to win over the number of possible “tickets.”

    Probability of winning = \displaystyle \frac{\text{the number of winning numbers}}{\text{the total number of possible lottery numbers}}

For example, if 40 tickets are sold in a raffle and only one of them is the winner, then the odds of winning would be 1 in 40. The probability of winning would be 1 / 40 = 0.025 (2.5% chance). If two of the 40 tickets are winners, then the odds are 1 in 20. The probability of winning would then be 2 / 40 = 0.05 (5% chance). Though the Powerball calculation is based on this simple idea, there are subtle points that need to be addressed in order to fully understand the calculation. So we take the approach of using a toy lottery.

Let’s call the game “toy Powerball.” The player picks two numbers from 5 white balls (labeled 1 through 5) and picks one number from 4 red balls (labeled 1 through 4). The prizes are:

  • grand prize (match all 2 white balls and 1 red ball),
  • $100 prize (match 2 white balls and not matching the red ball),
  • $10 prize (match 1 white ball and the red ball).

How many different possible tickets are there? Since there are two drawings (one for the white ball and one for the red ball), we need to count them separately and multiply the results. Since there are only 5 white balls, we can actually count the number of ways to choose 2 balls out of 5 white balls.

    1, 2
    1, 3
    1, 4
    1, 5
    2, 3
    2, 4
    2, 5
    3, 4
    3, 5
    4, 5

There are obviously 4 ways to choose one ball out of 4 red balls. So the total number of different possible tickets are 10 x 4 = 40. This is based on the so called multiplication principle – if one event can occur in M ways and another event can occur, independent of the first event, in N ways, then the two events together can occur in M \times N ways. The multiplication principle, discussed here, will be a great help in calculating the odds. For example, the first event is the choosing of the white balls and the second event is the choosing of the red balls. This idea can be used to find the number of possibilities in both the denominator and the numerator.

There is only one possible winning ticket for the grand prize (of course, if two people pick the same winning combination, they will split the prize). So the odds for winning the grand prize is 1 in 40. The probability of winning the grand prize is 1 / 40 = 0.025 (2.5%).

Is there a way to calculate the number of ways to choose two balls out of 5 white balls (labeled 1 through 5) without writing down all possibilities? This will be key to the Powerball calculation. There are two ways. One is an intuitive approach using the multiplication principle and the other is to use a formula for combination.

The first idea. Choosing two numbers is like filling two spots with numbers. In this case, the first spot has 5 choices and the second spot has 4 choices after the first spot is filled. This gives a total of 5 x 4 = 20. But 20 is an over count. For example, the result 3-2 (first number is 3 and the second number is 2) is actually the same as 2-3 as far as the lottery is concerned. So each 2-number combination appears twice in the count of 20. Thus dividing by 2 gives 20 / 2 = 10.

The other way is via a formula for combination. In the toy Powerball example, we need to choose 2 balls out of 5 balls. The following is the calculation.

    \displaystyle \frac{5!}{2! \ 3!}=\frac{120}{2  \times 6}=10

Let’s unpack this calculation. The notation 5!, read 5 factorial, is 5 x 4 x 3 x 2 x 1 = 120, in other words, obtained by multiplying together 5 and all the positive integers below 5. Thus 6! = 6 x 5 x 4 x 3 x 2 x 1 = 6 x 120 = 720. In general, whenever n is a positive integer, n!, read n factorial, is obtained by multiplying n and all the positive integers below n. To make the calculation works out, we define 0! = 1.

Is there any natural interpretation of factorial? For example, 5! is the number ways to arrange 5 people in a row for a group photo. Five people are to be assigned into 5 spots. There are 5 possibilities for the first spot, 4 possibilities for the second spot after the first person is chosen and so on. By the multiplication principle, the total number of ways to do this would be 5 x 4 x 3 x 2 x 1. In general n! is the number of ordered arrangements of n objects.

Thus the number of ways to choose 2 balls out of 5 balls is 10 (also the number of ways to choose two people out of 5 to form a committee of 2, the number of ways to choose 2 students out of 5 for awards, or the number of ways to choose 2 ice cream flavors out of 5 – the possibility is endless). The notation for “choose 2 from 5” is \binom{5}{2}. The top number in this notation is the total number of objects to choose from and the bottom number is the number of objects to be chosen. In general, the number of ways to choose r objects out of n objects is

    \displaystyle \binom{n}{k}=\frac{n!}{r! \ (n-r)!}

The number \binom{n}{r} is called the number of possible combinations of n objects chosen r at a time. Other notations include _nC_r and C(n,r). Regardless the notations, it is the number of ways to choose r objects from a set of n objects or the number of different groups of size r that can be chosen from a set of n distinct objects. Many calculators have a function for the number \binom{n}{r} (in a calculator the notation is probably _nC_r). If it is to be calculated by hand, bear in mind that n!, factorial of the total number of object, is in the numerator and the two factorials in the denominator are r! and (n-r)! where r is the number of objects to be chosen. Note that the sum of r and n-r is n.

For more information about how to calculate combination, see here and the summary section here.

Let’s look at the second prize in toy Powerball – match 2 white balls and no match for red ball. As example, let’s say the winning numbers are 1 and 3 (white) and 2 (red). How many 3-number combinations satisfy the winning criteria for this prize – matching two white numbers and not the red number? There is still only one way to match the two white numbers. But there are 3 ways to not match the red winning numbers (the non-winning red numbers would be 1, 3, and 4). The number of possible winning tickets would be 1 x 3 = 3. So the odds for winning the $100 prize are 3 in 40. The probability of winning the $100 prize is 3 / 40 = 0.075 (7.5%).

The third prize is $10 won by matching 1 white ball and 1 red ball. Again, using the example of winning numbers of 1 and 3 (white) and 2 (red), how many 3-number combinations satisfy the winning criteria for this prize – matching 1 white number and the red number? Remember that the player of the game chooses 2 white numbers and 1 red number. Since there is only 1 correct match for white number, there are 1 winning white number and 1 non-winning white number in the ticket. There are two ways to match 1 winning white number (1 or 3) and there are 3 ways to match 1 non-winning white number (2, 4 and 5). Of course, there is only one way to match the winning red number. The number of possible winning tickets would be 2 x 3 x 1 = 6 (by the multiplication principle again). This observation will be crucial in understanding the Powerball calculation below.

So the odds for winning the $10 prize are 6 in 40. The probability of winning would be 6 / 40 = 0.15 (15%).

___________________________________________________________________________

Back to Powerball

The winning probability of a prize in a lottery would be the fraction of all of the possible lottery numbers which count as winning. In other words, the winning probability would be a fraction with the denominator being the total number of possible number combinations that can be picked by the lottery players and the numerator being the number of possible winning combinations. So the denominator is always fixed for a given lottery. The numerator would vary depending on the prize. The bigger the prize, the smaller the numerator (the harder it is to win).

For the Powerball game uses a 6-number combination (5 white numbers + 1 red number). the 5 white numbers are chosen out of 69 numbers and the 1 red number is chosen out of 26 numbers. So based on the discussion in the preceding section, the total number of possible 6-number combinations would be:

Total Number of Possible Powerball Combinations

    \displaystyle \binom{69}{5} \cdot \binom{26}{1}=\text{11,238,513} \times 26=\text{292,201,338}

There are \binom{69}{5} many possible 5-number sets that can be picked from 69 numbers. There are over 11 millions ways to match the 5 winning white balls. There are \binom{26}{1} = 26 possible ways to pick one number from 26 numbers. By the multiplication principle, the product of these numbers would be the total number of possible Powerball combinations.

For the winning probability of any Powerball prize, the denominator would be 292,201,338, which is a staggering number. To help with understanding what should go into the numerator, let’s use the latest winning combination: 1, 15, 18, 26 and 51 (white balls) and 26 (red ball), drawn on April 26, 2017 (shown in Figure 1 above). So all the winning calculations below are based on this example. The number that goes into the numerator would be the number of tickets matching some or all of the white numbers with or without matching the red number depending on the rules. Let’s consider the prizes one by one.

Match 5 white balls + 1 red ball (grand prize)
There is only winning combination. So the winning probability is 1 over 292,201,338. The winning odds would be 1 in 292,201,338.

Match 5 white balls + no red ball ($1,000,000)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{5} \cdot \binom{1}{0} \cdot \binom{25}{1}}{292201338}=\frac{1 \cdot 25}{292201338}=\frac{25}{292201338}

The relevant question here is: how many ticket combinations are that that match all 5 winning white numbers and that do match the 1 red winning number? There is only one way to match all 5 white numbers 1, 15, 18, 26 and 51 (as in the grand prize). There are 25 ways to not match the winning red number 26 (1 through 25). So the numerator is 25. So the odds for winning $1 million are 25 in 292,201,338 or 1 in 11,688,053.52 (1 in over 11 million).

Match 4 white balls + 1 red ball ($50,000)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{4} \cdot \binom{64}{1} \cdot \binom{1}{1} \cdot \binom{25}{0}}{292201338}=\frac{5 \cdot 64}{292201338}=\frac{320}{292201338}

To win, a player needs to match 4 of the white numbers. To count all the winning combinations, choose 4 numbers from the 5 winning white numbers and choose 1 number from the 64 non-winning white numbers Recall that the ticket has 5 white numbers. If only 4 of them match the winning numbers, then one of the white numbers on the ticket must from a non-winning white number.

Furthermore, choose 1 number from the 1 winning red number and 0 numbers from the 25 non-winning red numbers. This observation explains what appear in the numerator. Then the odds for winning the $50,000 prize are 320 in 292,201,338 or 1 in 913,129.18.

Match 4 white balls + no red ball ($100)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{4} \cdot \binom{64}{1} \cdot \binom{1}{0} \cdot \binom{25}{1}}{292201338}=\frac{5 \cdot 64 \cdot 25}{292201338}=\frac{8000}{292201338}

Without the red ball match, there are more ways to win (25 more to be precise). To count the number of winning tickets, choose 4 from the 5 winning white numbers, choose 1 from the 64 non-winning white numbers, choose 0 from the 1 winning red number and choose 1 from the 25 non-winning red numbers. Doing this gives 8,000. Then the odds for winning the $100 prize are 8,000 in 292,201,338 or 1 in 36,525.17.

Match 3 white balls + 1 red ball ($100)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{3} \cdot \binom{64}{2} \cdot \binom{1}{1} \cdot \binom{25}{0}}{292201338}=\frac{10 \cdot 2016 \cdot 1}{292201338}=\frac{20160}{292201338}

To count the number of winning tickets, choose 3 from the 5 winning white numbers, choose 2 from the 64 non-winning white numbers, choose 1 from the 1 winning red number and choose 0 from the 25 non-winning red numbers. Doing this gives 20,160. Then the odds for winning the second $100 prize are 20,160 in 292,201,338 or 1 in 14,494.11.

Match 3 white balls + no red ball ($7)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{3} \cdot \binom{64}{2} \cdot \binom{1}{0} \cdot \binom{25}{1}}{292201338}=\frac{10 \cdot 2016 \cdot 25}{292201338}=\frac{504000}{292201338}

Let’s use the latest winning numbers 1, 15, 18, 26 and 51 (white balls) and 26 (red ball) to illustrate. How many ticket combinations would satisfy this criteria: match 3 numbers from 1, 15, 18, 26 and 51 (white balls) and not match the winning red number 25? The answer is the number of ways to choose 3 from 1, 15, 18, 26 and 51, choose 2 from the other 64 white numbers, choose 0 from 1 winning red number (26) and choose 1 from 25 non-winning red numbers. This explains what is in the numerator, which is 504,000. Then the odds for winning the first $7 prize are 504,000 in 292,201,338 or 1 in 579.76.

Match 2 white balls + 1 red ball ($7)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{2} \cdot \binom{64}{3} \cdot \binom{1}{1} \cdot \binom{25}{0}}{292201338}=\frac{10 \cdot 41664 \cdot 1}{292201338}=\frac{416640}{292201338}

Using similar idea to set up the numerator, the numerator is 416,640. Then the odds for winning the second $7 prize are 416,640 in 292,201,338 or 1 in 701.33.

Match 1 white ball + 1 red ball ($4)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{1} \cdot \binom{64}{4} \cdot \binom{1}{1} \cdot \binom{25}{0}}{292201338}=\frac{5 \cdot 635376 \cdot 1}{292201338}=\frac{3176880}{292201338}

As discussed earlier, when the match of the white balls is partial, the white numbers that do not match the winning white numbers must come from the 64 non-winning numbers. So we choose 1 from 5 winning white numbers and choose 4 from 64 non-winning white numbers. There is only one way to pick the winning red number. The numerator is 3,176,880. Then the odds for winning the first $4 prize are 3,176,880 in 292,201,338 or 1 in 91.98.

Match no white ball + 1 red ball ($4)

    Probability of winning = \displaystyle \frac{\displaystyle \binom{5}{0} \cdot \binom{64}{5} \cdot \binom{1}{1} \cdot \binom{25}{0}}{292201338}=\frac{1 \cdot 7624512 \cdot 1}{292201338}=\frac{7624512}{292201338}

The numerator is 7,624,512. Then the odds for winning the second $4 prize are 7,624,512 in 292,201,338 or 1 in 38.32.

The overall odds of winning a prize
The total number of possible 6-number combinations that result in a prize is obtained by summing all the numerators in the above calculation of odds. There is a total of 11,750,538 many possible prizes. The following table shows the breakdown.

Number of Possible Prizes
\begin{array}{rrrrrrr}      \text{Match} & \text{ } & \text{Prize} & \text{ } & \text{Odds} & \text{ } & \text{Number of Prizes}\\      \text{ } & \text{ } \\      \text{5 white, 1 red} & \text{ } & \text{Grand Prize} & \text{ } & \text{1 in 292,201,338} & \text{ } & 1\\      \text{5 white, no red}  & \text{ } & \$ \text{1 million} & \text{ } & \text{1 in 11,688,053.52} & \text{ } & 25\\      \text{4 white, 1 red}  & \text{ } & \$ \text{50,000} & \text{ } & \text{1 in 913,129.18} & \text{ } & 320\\      \text{4 white, no red}  & \text{ } & \$ 100 & \text{ } & \text{1 in 36,525.17}  & \text{ } & \text{8,000}\\  \text{3 white, 1 red}  & \text{ } & \$ 100 & \text{ } & \text{1 in 14,494.11}  & \text{ } & \text{20,160}\\  \text{3 white, no red}  & \text{ } & \$ 7 & \text{ } & \text{1 in 579.76}  & \text{ } & \text{504,000}\\  \text{2 white, 1 red}  & \text{ } & \$ 7 & \text{ } & \text{1 in 701.33}  & \text{ } & \text{416,640}\\  \text{1 white, 1 red}  & \text{ } & \$ 4 & \text{ } & \text{1 in 91.98}  & \text{ } & \text{3,176,880}\\  \text{no white, 1 red}  & \text{ } & \$ 4 & \text{ } & \text{1 in 38.32}  & \text{ } & \text{7,624,512}\\      \text{ }  & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }\\      \text{ }  & \text{ } & \text{Any Prize} & \text{ } & \text{1 in 24.87} & \text{ } & 11,750,538    \end{array}

The odds of winning any prize would be 11,750,538 in 292,201,338, or 1 in 24.87. It ought to be mentioned that the last column in the table is only for the number of potential prizes in a category (they are number of possible winning combinations). A prize is awarded only when someone purchased the winning combination. For example, on April 26, 2017, no one won the grand prize since no one had purchased the winning numbers of 1, 15, 18, 26, and 51 and 26. No one had won the $2 million prize for match 5 power play. However, there were three winners for the match 5 (they did not pay for the extra power play option). There are 25 possible winning combinations but only three were won. Figure 1 also shows that there were 592,253 winners on that day. Potentially, there could be 11,750,538 (or more) winners in each drawing.

Remarks on Calculations
First, we emphasize again the point that has been made several times already. The smaller prizes only require partial matching of white numbers with or without matching of the red number. The matching of white numbers must account for winning and non-winning numbers. For example, if the prize requires the matching of 3 white numbers, then we need to count 3 numbers from the 5 winning white numbers and 2 numbers from 64 non-winning white numbers. Hence \binom{5}{3} \times \binom{64}{2}. Recall that to play Powerball, the player chooses 5 numbers from the 69 white balls. If the player matches only 3 winning white balls, the other two white numbers chosen by the player must come from the 64 non-winning white balls.

The odds for the last prize (matching the red number only) are not 1 in 26. This is a common misconception. There are only 26 red balls. So matching the red ball must have a 1 in 26 chance. However, we cannot ignore the white balls! The fact that only the red ball is matching means that the 5 white numbers chosen by the player must come from the 64 non-winning white numbers. So the correct value to go into the numerator is \binom{5}{0} \times \binom{64}{5}.

___________________________________________________________________________

Do You Feel Lucky Today?

The odds for winning the Powerball jackpot are mathematically zero (being 1 in 292 millions). The entire population of the United States is 321.4 millions in 2015, with 248 millions of them age 18 or over. If every adult in the U.S. purchases a Powerball ticket, it is still possible that there will be no winner of the grand prize (but there could be a few $1 million winners). To get a sense of how big the number 292 million is, look at this piece from WSJ. The piece strives to illustrate how vast a quantity 292,201,338 is. Just to scroll the page over that many dots is a near impossible task.

One thing is for sure. With no winners of jackpot in a several drawings in a row, the excitement for Powerball is turned into a frenzy. In fact, the last revision of the Powerball rules in 2015 made winning of the jackpot much less likely. As a result the jackpot usually keeps building until it reaches hundreds of millions range or the 1 billion range. The rules were designed to rachet up the excitement and as a result driving up sales (discussed in this piece from Washington Post).

Playing Powerball is an excellent entertainment. With a $2 admission price, you can dream and fantasize for a few days. Once a week habit would be about $100 a year in entertainment cost. Regular customers of Starbucks would spend more than that amount. Of the regular lottery players, roughly the top 5% spend a few thousand dollars a year (a hundred thousand dollars on average in their lifetime, assuming no interest) for an illusive chance to win $1 million or more. What if these regular players invest the money elsewhere?

If they invest the money instead on a conservative fund earning 2% a year, they would accumulate an nest egg of hundreds of thousands of dollars. Assuming a rate of returns at 2% a year, investing $3,000 a year would yield approximately $185,000. In fact, if they invest in a broad base stock market index (e.g. S&P 500 index), they would do even better in the long run. The long run historical S&P 500 returns are around 10% a year (7% after inflation). Assuming a rate of returns at 7% a year, investing $3,000 a year would yield around $640,000!

Even in the rare chance that someone wins $1 million, the winning is taxed and would be greatly reduced, e.g. the winning could be $500,000 after tax. Thus the rates of return of a lifetime investment in playing lottery are not as great as people imagine. Without winning, the lifetime lottery investment of a few thousands dollars a year would be money going down the drain. For a more detailed discussion, see this piece from Washington Post.

The calculation of the Powerball winning odds is a great math lesson. Rather than looking at the winning odds, perhaps it is more instructive to look at losing odds. The odds for not winning the Powerball grand prize (per $2 bet) are 292,201,337 to 292,201,338, which are essentially 1 in 1.

___________________________________________________________________________
\copyright 2017 – Dan Ma

Multinomial coefficients

This is the fourth post in a series of posts on combinatorial analysis. The post is opened with the following problem. This post builds on the previous post on binomial coefficients.

    Figure 1 – three choices for lunch

Multinomial Lunch Problem
Each of the twelve people have 3 choices for lunch – McDonald, Burger King and IN-N-OUT, all fast food restaurants. The choice for each person is independent of the choices of the other diners.

  1. In how many ways can the lunch choices for this 12-person group be made?
  2. In how many ways can the lunch choices for this 12-person group be made in such a way that 3 people choose McDonald, 4 people choose Burger King and 5 people choose IN-N-OUT?
  3. In how many ways can the lunch choices for this 12-person group be made in such a way that 3 people choose one restaurant, 4 people choose another restaurant and 5 people choose the remaining restaurant?

The previous posts on combinatorial analysis are: binomial coefficients, combination and permutation and multiplication principle.

___________________________________________________________________________

Multinomial Coefficients

The problem for lunch choices is a multinomial coefficient problem. The second question in the problem is equivalent to any one of the following question.

  1. How many ways can a set of 12 distinct objects be divided into 3 subgroups, one consisting of 3 objects, one consisting of 4 objects and one consisting of 5 objects?
  2. How many different 12-letter strings are there consisting of 3 M’s, 4 B’s and 5 I’s?
  3. Suppose each of the questions in a 12-question multiple choice test has three choices (M, D and I). A student chooses the answers by pure guessing. How many of the possible test answers consist of 3 M’s, 4 B’s and 5 I’s?
  4. How many ways can 12 people be assigned to 3 committees such that one committee consists of 3 people, one committee consists of 4 people and one committee consists of 5 people? Assume that each person is assigned to only one committee.
  5. An urn contains three letters M, B and I. Randomly sample 12 letters from the urn with replacement. What is the number of sample results that consist of 3 M’s, 4 B’s and 5 I’s?

The first question is basically the definition of multinomial coefficient. It is the total number of ways to divide a set of objects into several subsets such that each subset is of a pre-determined size. In the second question, the objects are the 12 positions in the letter strings. In the third question, the objects to be divided are the 12 multiple choice questions. In the fourth question, the objects to be divided are the 12 people to be assigned into committees.

The fifth question is from a random sampling perspective. In any of the question, the dividing can be thought of as a choice (or decision) for each object. For each of the 12 objects, should the object be put into group 1, group 2 or group 3 (or should choice 1 be taken, choice 2 be taken, choice 3 be taken)? In the lunch problem, each diner has 3 choices. The choices in their totality would be viewed as the results from a random sampling of the population of 3.

The answer to any of the above questions is a multinomial coefficient.

Multinomial Coefficient

If n_1+n_2+\cdots+n_k=n, the notation for the multinomial coefficient is \displaystyle \binom{n}{n_1,n_2,\cdots,n_k} and is defined by \displaystyle \binom{n}{n_1,n_2,\cdots,n_k}=\frac{n!}{n_1! n_2! \cdots n_k!}. The multinomial coefficient is the number of ways a set of n distinct objects can be divided into k subsets, one of which consists of n_1 objects, one of which consists of n_2 objects, …., one of which consists of n_k objects.

It is clear that multinomial coefficients are a generalization of binomial coefficients. Instead of having to choose from one of two choices in each random selection, there are multiple choices to choose from. Even the notation \displaystyle \binom{n}{n_1,n_2,\cdots,n_k} is a generalization of the notation \displaystyle \binom{n}{n_1}, the notation for binomial coefficients. Note that \displaystyle \binom{n}{n_1,n-n_1}=\binom{n}{n_1}.

___________________________________________________________________________

Multinomial Lunch Problem

Viewing the lunch choices as random sampling (each diner randomly chooses from M, B and I), there are 3^{12}= 531,441 many ways can the lunch choices be made for the whole 12-person group. This count contains all the possible choices including possibilities that some restaurant is not chosen. The following is one possible outcome that fall under question 2.

    B-M-B-I-I-M-B-B-M-I-I-I

The above string shows that the first person chooses Burger King, the second person chooses McDonald, the third person chooses Burger King and so on. How many other strings are like this one in the sense that there are 3 M’s, 4 B’s and 5 I’s? This would be a multinomial coefficient.

    \displaystyle \frac{12!}{3! \ 4! \ 5!}=\frac{12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6}{3 \cdot 2 \cdot 1 \cdot 4 \cdot 3 \cdot 2 \cdot 1}=\text{27,720}

The answer to the third question is more than 27,720. For example, there could be three diners choosing IN-N-OUT, 4 diners choosing Burger King and 5 diners choosing McDonald. In other words, we need to apply multinomial coefficient on the three positions (three restaurants). There are 6 ways to divide 3 positions resulting in three subsets with one each since \frac{3!}{1! \ 1! \ 1!}=6. Then there are 6 x 27,720 = 166,320 ways for 12 diners making choices so that 3 of them go to one place, 4 of them go to another place and five of them go to the remaining place. This is an example of the double use of the multinomial coefficients – the first application is on dividing the objects into subgroups and the second application on the dividing the subgroups. The following example gives another illustration.

Example 1
A fair die is rolled 15 times. How many of the outcomes consist of four 2’s, five 3’s and six 4’s? How many of the outcomes consist of four of one face, five of another face and six of another face (different from the other two faces)?

The first question is a straight application of one multinomial coefficient.

    \displaystyle \frac{15!}{4! \ 5! \ 6!}=\frac{15 \cdot 14 \cdot 13 \cdot 12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7}{4 \cdot 3 \cdot 2 \cdot 1 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}=\text{630,630}

To understand the second question, consider the notation (0, 4, 5, 6, 0, 0), which means that there are zero 1’s, four 2’s, five 3’s, six 4’s, zero 5’s and zero 6’s from rolling the die 15 times. Basically it shows how many rolls fall into each position (face). Now we are interested in counting other outcomes that have 4, 5 and 6 of another 3 positions. Basically we are trying to divide 6 positions into four groups, one groups with 3 positions that do not appear in the 15 rolls, one groups with one position that appear 4 times, one group with one position that appear 5 times, and one group with one position that appears 6 times. The following shows how many ways to divide the 6 positions.

    \displaystyle \frac{6!}{3! \ 1! \ 1! \ 1!}=120

For each of the 120 times, there are 630,630 outcomes. Thus the answer to the second question is 120 x 630,630 = 75,675,600. \square

___________________________________________________________________________

Multinomial Sampling

We now explore the random sampling angle of multinomial coefficients. To make it easier to follow, we continue to use the Multinomial Lunch Problem. The problem, as discussed above, is like sampling 12 times with replacement from an urn with 3 letters M, B and I. Each of the 12 random selections has 3 outcomes. Thus the experiment in total has 3^{12} = 531,441 outcomes, each of which can be written as a string of 12 letters. Three examples:

    B-M-B-I-I-M-B-B-M-I-I-I

    I-M-M-I-M-M-M-I-M-I-I-M

    M-M-M-M-M-M-M-M-M-M-M-M

These three strings can be called ordered samples for the experiment of randomly selecting from the urn 12 times with replacement. These display the results of each draw sequentially. We can also use unordered samples, which do not contain the orders but contain the number of times each letter is drawn. For the first string, the unordered sample would be (3, 4, 5) where the first number is the number of people choosing McDonald, the second number is the number of people choosing Burger King and the third number is the number of people choosing IN-N-OUT. For the second string, the unordered sample would be (7, 0, 5). For the third, everyone goes to McDonald and so it is (12, 0, 0).

Note that (3, 4, 5) is called an unordered sample because it does not tell us the order of the chosen letters. It only tells us how many times each letter is chosen. The number of ordered samples resulting in an unordered sample would be precisely the multinomial coefficient. For the unordered sample (3, 4, 5), there would be

    \displaystyle \frac{12!}{3! \ 4! \ 5!}=\frac{12 \cdot 11 \cdot 10 \cdot 9 \cdot 8 \cdot 7 \cdot 6}{3 \cdot 2 \cdot 1 \cdot 4 \cdot 3 \cdot 2 \cdot 1}=\text{27,720}

many ordered samples. In randomly selecting letters 12 times with replacement from an urn containing the letters M, B and I, how many unordered samples are there? To answer this question, it is helpful to look at unordered samples using a star-and-bar diagram. The star-and-bar diagrams for above three unordered samples are:

    * * * | * * * * | * * * * *

    * * * * * * * | \text{ } | * * * * *

    * * * * * * * * * * * * | \text{ } | \text{ }

In the random experiment in question, a star-and-bar diagram has 12 stars, representing the 12 letters selected, and 2 bars, creating three spaces for the three letters M, B and I. In each star-and-bar diagram, there are 12 + 2 = 14 positions, 12 of which are for the stars and the remaining 2 positions are for the bars. In other words, each of the 14 positions can be either a star or a bar. Thus star-and-bar diagrams are the results of a binomial experiment. How many possible star-and-bar diagrams? There are

    \displaystyle \binom{12+2}{11}=\binom{14}{11}=\binom{14}{2}=364

many different diagrams. Thus there are 364 many unordered samples from drawing letters 12 times from an urn containing the letters M, B and I.

A related question is: how many unordered samples are there in this experiments such that each letter is selected at least once? This means that each space in the star-and-bar diagram has at least one star. Then the remaining 12 – 3 = 9 stars will have to be distributed in the three spaces. There are

    \displaystyle \binom{9+2}{9}=\binom{11}{9}=55

unordered samples in which each letter is chosen at least once.

There is another way to interpret the unordered samples. What is the total number of non-negative integer solutions to the equation x + y + x = 12? The solution x = 3, y = 4, z = 5 would be the unordered sample (3, 4, 5). Thus there are 364 different solutions. What is the total number of positive integer solutions to the equation x + y + x = 12? There are 55 such solution. Each such solution would be an unordered sample where each letter is selected at least once.

___________________________________________________________________________

Multinomial Theorem

We now generalize the discussion on multinomial sampling.

Multinomial Sampling

Suppose that a multinomial experiment consisting of n independent trials with each trial resulting in exactly k outcomes is performed. For convenience, these k outcomes are labeled by the symbols x_1,x_2,\cdots,x_k. The result of the experiment can be recorded by an ordered sample, which is a string of n symbols with each symbol being one of x_1,x_2,\cdots,x_k. The result of the experiment can also be recorded by an unordered sample, which is a sequence of k numbers with the first number being the number occurrences of the outcome x_1, the second number being the number of occurrences of the outcome x_2 and so on. The number of possible ordered samples in the multinomial experiment is k^n. The number of ordered samples resulting in a given unordered sample (n_1,n_2,\cdots,n_k) is the multinomial coefficient \displaystyle \binom{n}{n_1,n_2,\cdots,n_k}=\frac{n!}{n_1! \ n_2! \cdots n_k!}. The number of unordered samples in the experiment is \displaystyle \binom{n+k-1}{n} based on a combinatorial argument using stars and bars. The number of unordered samples such that in each sample, each of the outcomes x_1,x_2,\cdots,x_k occurs at least once is \displaystyle \binom{n-k+k-1}{n-k}=\binom{n-k+k-1}{k-1}=\binom{n-1}{k-1}.

Number of Integer Solutions to Equations

The number of non-negative integer solutions to the equation x_1+x_2+\cdots+x_k=n is \displaystyle \binom{n+k-1}{n}, which is the total number of unordered samples as indicated above. The number of positive integer solutions to the equation x_1+x_2+\cdots+x+k=n is \displaystyle \binom{n-1}{k-1}, which is the number of unordered samples such that each symbol from x_1,x_2,\cdots,x_k occurs at least once in the multinomial sampling.

There is a lot to unpack in the above two paragraphs. It is helpful to follow the random sampling idea on a specific example, either the lunch problem discussed above or another example. The ideas in the above paragraphs contains all the information for stating the multinomial theorem.

Sum of all Multinomial Coefficients

In the multinomial experiment described above, the sum of all possible multinomial coefficients is k^n, i.e. \displaystyle \sum \limits_{a_1+a_2+\cdots a_k=n} \frac{n!}{a_1! \ a_2! \cdots a_k!}=k^n

Multinomial Theorem

The formula shown below shows how to raise a multinomial to a power. \displaystyle (x_1+x_2+\cdots x_k)^n =\sum \limits_{a_1+a_2+\cdots a_k=n} \frac{n!}{a_1! \ a_2! \cdots a_k!} \ x_1^{a_1} x_2^{a_2} \cdots  x_k^{a_k}

The count k^n is identical to the sum of all the possible multinomial coefficients in the experiment. This fact follows from the observations we make about the multinomial sampling. The count k^n is the total number of all ordered samples. The ordered samples can be divided into groups where each group consists of of all ordered samples associating with the same unordered sample. Note that the size of each of these groups is a multinomial coefficient.

The multinomial theorem is a compact way of describing the multinomial experiment. When it is expanded out, the left-hand-side lists out all the k^n ordered samples. A typical ordered sample consists of a_1 many x_1, a_2 many x_2 and so on, which can be represented by x_1^{a_1} x_2^{a_2} \cdots  x_k^{a_k}, which is a term on the right-hand-side. Thus a term such as x_1^{a_1} x_2^{a_2} \cdots  x_k^{a_k} is another way to notate an unordered sample and it represents \frac{n!}{a_1! \ a_2! \cdots a_k!} many ordered samples. Thus \frac{n!}{a_1! \ a_2! \cdots a_k!} x_1^{a_1} x_2^{a_2} \cdots  x_k^{a_k} is the result of aggregating all the ordered samples for a given unordered sample. Summing all these aggregated results give all the possible ordered samples in the multinomial experiment.

___________________________________________________________________________

Examples

The best way to absorb the concepts discussed here is to work excises. Three more examples are presented. Several exercises are given in the section below.

Example 2
Suppose that 11 new hires are assigned to four office locations – the headquarter, branch office North, branch office South and branch office East.

  • How many assignments can be made?
  • How many assignments can be made such that 3 new hires are assigned to the headquarter, 3 new hires are assigned to branch office North, 4 new hires are assigned to branch office South and 1 new hire is assigned to branch office East?
  • How many assignments can be made such that 3 new hires are assigned to one location, 3 new hires are assigned to another location, 4 new hires are assigned to another location and 1 new hire is assigned to the remaining location?

The answer to the first question is 4^{11} = 4,194,304. The answer to the second question is a multinomial coefficient.

    \displaystyle \frac{11!}{3! \ 3! \ 4! \ 1!}=46200

Each assignment can be viewed as a string of 11 letters, each of which is chosen from H, N, S and E (these are the ordered samples discussed above). An unordered sample is a sequence of 4 numbers. For example, the unordered sample for the second question is (3, 3, 4, 1). The third question would include other unordered samples with the numbers summing to 11, e.g. (1, 3, 4, 3). Basically we are trying to divide the 4 letters into 3 groups, one group of one letter appearing once, one group with two letters appearing 3 times each, one group with one letter appearing 4 times. This is another multinomial coefficient.

    \displaystyle \frac{4!}{1! \ 1! \ 2!}=12

Thus the answer to the third question is 12 x 46,200 = 554,400. \square

Example 3
Suppose that 16 new hires are assigned to four office locations – the headquarter, branch office North, branch office South and branch office East. Six of the new hires are engineers and only work in the headquarter or branch office East. The other ten new hires are technicians and can be assigned to any one of the four locations.

  • How many assignments can be made?
  • How many assignments can be made such that four of the engineers will work in the headquarter and two of the new technicians will work in the headquarter, 3 new technicians are assigned to branch office North, 4 new technicians are assigned to branch office South and 1 new technician is assigned to branch office East?
  • How many assignments can be made such that four of the engineers will work in either the headquarter or branch office East and two of the new technicians will work in one of the 4 locations, 3 new technicians are assigned to another location, 4 new technicians are assigned to a third location and 1 new technician is assigned to the remaining location?

There are two assignments, one for engineers and one for technicians. The answer would be obtained by multiplying the two assignment counts. The answer to the first question is 2^6 \cdot 4^{11} = 640,000. The following is the answer to the second question.

    \displaystyle \binom{6}{4} \times \frac{10!}{2! \ 3! \ 4! \ 1!}=15 \times 12600=189000

Note that there are \binom{6}{4} = 15 ways to 4 of the engineers to the headquarter (and thus assigning the other two engineers to the other office). The number of ways to assign 10 technicians to the 4 locations in the indicated numbers is the multinomial coefficient indicated above (the one resulting in 12,600). The following is the answer to the third question.

    \displaystyle \binom{6}{4} \times 2 \times \frac{10!}{2! \ 3! \ 4! \ 1!} \times \frac{4!}{1! \ 1! \ 1! \ 1!}=30 \times 302400=9072000

Note that for the third question, a second multinomial coefficient on the locations is required for each type of workers. \square

Example 4
Fifteen identical brand new mail delivery trucks are assigned to 5 post offices.

  • How many assignments can be made?
  • How many assignments can be made such that each post office is assigned at least one truck?

The problem can be done using the combinatorial argument with stars and bars discussed above. The 15 trucks are stars and there 4 bars creating 5 spaces representing the post offices. Any assignment of trucks would be like a string of 15 stars and 4 bars. The following is an example.

    * * * * | * * * * * * | | * * * * *

The above star-and-bar diagram shows that 4 trucks are assigned to the first post office, 6 trucks are assigned to the second post office, zero trucks are assigned to the third post office and 5 trucks are assigned to the fourth post office. What is more important to notice is that there are 19 positions in the diagram. The problem is to choose 15 of them to place the stars. The following is the answer to the first question.

    \displaystyle \binom{19}{15}=3876

The second question requires that each space has at least one star. We place one star in each space. There are 11 stars left. We now consider 11 stars and 3 bars. The following is the number of ways of arranging 11 stars in 14 positions.

    \displaystyle \binom{14}{11}=364

Thus there are 364 ways to assign 15 trucks to four post offices such that each office gets at least one truck. \square

___________________________________________________________________________

Exercises

Exercise 1
A father is to distribute 9 gifts to his three children.

  1. In how many ways can the gifts be distributed if the eldest child is to receive 2 gifts, the middle child is to receive 2 gifts and the youngest child is to receive 5 gifts?
  2. In how many ways can the gifts be distributed if one child is to receive 2 gifts, another child is to receive 2 gifts and the remaining child is to receive 5 gifts?

Exercise 2
Eleven job assignments are randomly distributed to four workers Marcus, Issac, Samantha and Paul. In how many ways can these jobs be assigned to the four workers such that Marcus will receive one job, Issac will receive 4 jobs, Samantha will receive 4 jobs and Paul will receive 2 jobs?

Exercise 3

  1. Ten students are to be assigned to two math classes. How many ways can the students be divided into the two math classes with 5 students in each class?
  2. Fifteen students are to be assigned to three math classes. How many ways can the students be divided into the three math classes with 5 students in each class?

Exercise 4
An investor has 25 thousand dollars to invest among 5 possible investments. The amount to invest in each investment is in the unit of one thousand dollars. Suppose that the entire amount of 25 thousand dollars is to be invested.

  1. How many different investment strategies are possible?
  2. How many different investment strategies are possible if at least one unit is put into each investment choice?

Exercise 5
Recall the Multinomial Lunch Problem. Each of the twelve people have 3 choices for lunch – McDonald, Burger King and IN-N-OUT, all fast food restaurants. The choice for each person is independent of the choices of the other diners. In how many ways can the lunch choices for this 12-person group be made in such a way that each restaurant is visited by at least 3 diners?

\text{ }

\text{ }

\text{ }

\text{ }

\text{ }

\text{ }

\text{ }

\text{ }

Answers

Exercise 1

  1. 756
  2. 2,268

Exercise 2
34,650

Exercise 3

  1. 252
  2. 756,756

Exercise 4

  1. 23,751
  2. 10,626

Exercise 5
256,410
Hint: There are three cases, each of which requires the use of two multinomial coefficients, one for the diners and one for the restaurants.

___________________________________________________________________________
\copyright 2017 – Dan Ma

Binomial coefficients

This is the third post in a series of posts on combinatorial analysis. The post is opened with the following problem.

    Figure 1
    path-problem

Path Problem
Point Q is 7 blocks east and 6 blocks north of Point P (see Figure 1). A person wants to walk from Point P to Point Q by walking 13 blocks. Assume that all the paths from any point to any point in the above diagram are available for walking. In how many ways can the person go from Point P to Point Q?

One way to solve this problem is to count the number of paths in a “brute force” approach by tracing all possible paths in the diagram. The total number of paths is 1,716. So counting directly is not a practical approach. It turns out that the path problem and other similar problems will become routine once the concept of binomial coefficients is clearly understood.

___________________________________________________________________________

Review

First we highlight the important concepts introduced in the previous post on permutations and combinations.

The number n!, read n factorial, is defined by the product n!=n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1. It is the number of ordered arrangements of n objects.

For convenience we define 0! = 1. An ordered arrangement of a set of objects is called a permutation. How many permutations are there of 7 objects? For example, how many ways can the following 7 people be arranged in a row for a group photo: Abby, Beth, Charlie, David, Edward, Frank, and George? An equivalent question: how many permutations are there of the letters A, B, C, D, E, F and G? The answer is 7! = 5040. The reasoning is based on the multiplication principle (see here). There are 7 people to choose for the first position, and there are 6 people to choose from for the second position after the first position is fixed, and so on. Thus there are 7 x 6 x 5 x 4 x 3 x 2 x 1 = 5040.

The number of permutations of k objects chosen from n objects is defined by \displaystyle P(n,k)=\frac{n!}{(n-k)!}.

Out of the 7 people mentioned above, how many ways can three people be randomly selected to receive three cash prizes of $1000, $500 and $100? An equivalent question: how many permutations are there of 3 letters chosen from the 7 letters A, B, C, D, E, F and G? The answer can also be obtained by the multiplication principle. There are 7 x 6 x 5 = 210 permutations, which is identical to P(7,3)=\frac{7!}{(7-3)!}=\frac{7!}{4!}=7 \times 6 \times 5.

The number of different groups of size k that can be chosen from a set of n distinct objects is \displaystyle \binom{n}{k}=\frac{n!}{k! \ (n-k)!} where n is a positive integer and k=0,1,2,\cdots,n. This number is called the number of possible combinations of n objects chosen k at a time. The values \displaystyle \binom{n}{k} are often referred to as binomial coefficients because of their connection with the binomial theorem.

Out of the 7 people mentioned above, how many ways can three people be randomly selected to receive three cash prizes, each of which is $1000? An equivalent question: how many three-letter sets can be chosen from the 7 letters A, B, C, D, E, F and G? Here, the order does not matter. The answer 210 obtained previously is too large because it contains many repeats. For example, the letters A, B and C has 3! = 6 permutations. In the answer 210, each 3-letter combination occurs 6 times. So the answer is 210 / 6 = 35, which is identical to \binom{7}{3}=35.

Other notations for the binomial coefficient \binom{n}{k} are _nC_k, C(n,k) and C_{n,k}. Permutation and combination can all be computed directly in terms of factorials. The calculation can also be done using calculator or software. Most calculators have functions for combination (binomial coefficient) and permutation. In Excel, the binomial coefficient is computed by the formula =COMBIN(n, k), which will give the result as \binom{n}{k}. The Excel formula for permutation is =PERMUT(n,k).

___________________________________________________________________________

The Path Problem

Recall that the problem is to walk 7 blocks East and 6 blocks North starting at point P. Let use E to mean walking one block East and N to mean walking one block North. The following is one specific path that the person might take.

    Figure 2
    path-problem-specific-path

The path in Figure 2 can be described by the string E-E-N-N-N-E-N-E-E-E-N-N-E. Of course there are many other possible paths that can be taken. Observe that each possible path correspond to a string of 7 E’s and 6 N’s. On the other hand, each possible 13-character string with 7 E’s and 6 N’s correspond to a path in the problem. So the problem is about finding the number of possible 13-character string with 7 E’s and 6 N’s. In other words, we are in interested in all the ways we can write 7 E’s and 6 N’s in the following 13 positions.

    \LARGE \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box \ \ \ \Box

Writing 7 E’s and 6 N’s in 13 boxes is the same as choosing 7 boxes out of 13 boxes to fill with 7 E’s. The number of ways to choose 7 objects out of 13 objects is simply \binom{13}{7}=1716.

___________________________________________________________________________

Several Interpretations of the Binomial Coefficients

The path problem makes it clear that there are several ways of looking at the binomial coefficient \displaystyle \binom{n}{k}. Each of the following is a count that is computed by \displaystyle \binom{n}{k}.

  1. The number of different groups of size k that can be chosen from a set of n objects.
  2. The number of ways a set of n objects can be labeled by 0 or 1.
  3. The number of different n-character strings that can be formed by using k of one symbol and n-k of another symbol.
  4. The number of permutations of n items, of which k are alike and n-k are alike.
  5. A random experiment consists of n trials, each of which results in one of two outcomes (outcome 1 and outcome 2). The number of ways the experiment can be performed such that k of the trials result in outcome 1 and n-k of the trials result in outcome 2.
  6. An urn contains a large number of balls, each of which is labeled 0 or 1. Randomly select a ball n times from the urn with replacement. The number of ways the balls can be selected in such a way that k of the balls are labeled 1 and n-k of the balls are labeled 0.

The same theme runs through the above 6 ideas. The first one is the standard definition of binomial coefficient. The number \binom{n}{k}, according to idea 1, would be the number of all possible subsets of size k that can be chosen from a set of n objects. We can also think of choosing a subset as a random experiment of n trials where each trial has two distinct outcomes. Out of the n objects, we only want k of them with k \le n. Then for each object, there is a decision – choose it or not choose it (two distinct outcomes). So there are n trials and each trial has two outcomes (idea 5). We can also think of a random sampling process too. Perform each trial that ends in two outcomes will be like a random sampling from a population consisting of 0’s and 1’s (idea 6).

Going back to the n objects in idea 1, deciding whether to choose each object is also like labeling that object with 0 or 1 (idea 2). Since each object generates two outcomes (choose or not choose), the n decisions successively is like a string of n characters, each of which is one of two distinct symbols (ideas 3 and 4).

Though the interpretation may be nuanced, the calculation of the binomial coefficient is straightforward. Thinking of the binomial coefficient as the number of ways to making a series of two-outcome decisions is crucial to the understanding of binomial distribution. In the remainder of the post, we discuss other properties of the binomial coefficients.

___________________________________________________________________________

The Binomial Theorem

The sum of two symbols, say x+y, is called a binomial. The binomial theorem is a formula for deriving the power of a binomial, i.e. for deriving (x+y)^n for n=1,2,3,\cdots. We explain the idea behind the formula. The ideas 3 and 4 discussed above are particularly useful ways of looking at the binomial theorem. Consider the first several expansions.

    \displaystyle \begin{aligned} (x+y)^1&=x+y  \end{aligned}

    \displaystyle \begin{aligned} (x+y)^2&=(x+y) (x+y)\\&=xx+xy+yx+yy \\&=x^2+2xy+y^2 \end{aligned}

    \displaystyle \begin{aligned} (x+y)^3&=(x+y) (x+y) (x+y)\\&=xxx+xxy+xyx+yxx+xyy+yxy+yyx+yyy \\&=x^3+3x^2y^1+3xy^2+y^3  \end{aligned}

    \displaystyle \begin{aligned} (x+y)^4&=(x+y) (x+y) (x+y) (x+y)\\&=xxxx+xxxy+xxyx+xyxx+yxxx\\& \ \ +xxyy+ xyxy+xyyx+yxxy+yxyx+yyxx\\&\ \ +xyyy+yxyy+yyxy+yyyx+yyyy \ \ \ \ \ \ \ \ \ \text{(16 terms)} \\&=x^4+4x^3y^1+6x^2y^2+4xy^3+y^4  \end{aligned}

    \displaystyle \begin{aligned} (x+y)^5&=(x+y) (x+y) (x+y) (x+y) (x+y)\\&=xxxxx+xxxxy+xxxyx+xxyxx+xyxxx+yxxxx \cdots \\& \ \ +xxxyy+xxyxy+xyxxy+yxxxy+xxyyx \\& \ \ + xyxyx+yxxyx+xyyxx+yxyxx+yyxxx \ \ + \cdots (\text{32 terms}) \\&=x^5+5x^4y^1+10x^3y^2+10x^2y^3+5xy^4+y^5  \end{aligned}

The above derivations are deliberately detailed to make a point. For example, the result of (x+y)^5 is product of 5 identical factors of x+y. Multiplying out these five factors of x+y results in 2^5=32 terms. Each of these 32 items is a string of x’s and y’s. Let’s look at the strings with 3 x’s and 2 y’s. they are:

    xxxyy+xxyxy+xyxxy+yxxxy+xxyyx

    xyxyx+yxxyx+xyyxx+yxyxx+yyxxx

To find out how many such strings are possible, we need to find out the number of 5-letter strings of which 3 letters are the same and the other 2 letters are the same. That would be \binom{5}{3}=\binom{5}{2}=10.

In general, (x+y)^n is obtained by multiplying n factors of x+y. In the expansion, a typical term is x^{n-j} y^j, which can be viewed as a n-letter string of n-j of the letters are x and j of the letters are y. We can also look at the situation as choosing j of the n positions to write the letter y. There are \binom{n}{j} many ways of writing j instances of y in n positions. Thus \binom{n}{j} x^{n-j} y^j would be one of the terms in the final result of the expansion. This is the reason why \binom{n}{j} is referred to as binomial coefficients.

The binomial theorem shows how to derive the power of a binomial. The formula is: \displaystyle (x+y)^n=\binom{n}{0} x^n+\binom{n}{1} x^{n-1}y^1+\binom{n}{2} x^{n-2}y^2+\cdots+\binom{n}{j} x^{n-j}y^j+\cdots+\binom{n}{n-1} x^{1}y^{n-1}+\binom{n}{n} y^n.

A more compact way of stating the binomial theorem is: \displaystyle (x+y)^n=\sum \limits_{j=1}^n \binom{n}{j} x^{n-j}y^j.

___________________________________________________________________________

A Recursive Formula

The combination \displaystyle \binom{n}{k} can be evaluated using calculator or software. We may not need the following formula for the purpose of calculation. However, it does provide a great deal of insight.

    \displaystyle \binom{n}{k}=\binom{n-1}{k}+\binom{n-1}{k-1}

The formula can be derived by the following thought process. A k-subset of a set of n objects is a subset consisting of exactly k elements. The quantity \displaystyle \binom{n}{k} is the number of k-subsets of a set of n objects. For our purpose here, the n objects are the numbers 1,2,3,\cdots,n. Let fix one of the objects, say 1. There are two types of k-subsets – the ones containing 1 and the ones not containing 1. The number of k-subsets of the first type is \displaystyle \binom{n-1}{k-1}. The point 1 is known to be in each k-subset of the first type and the other k-1 points must be chosen from n-1 objects. The number of k-subsets of the second type is \displaystyle \binom{n-1}{k}. The point 1 is not in each k-subset of the second type. So all k are chosen from the other n-1 objects. Thus \displaystyle \binom{n}{k} is the sum of these two numbers.

___________________________________________________________________________

Pascal’s Triangle

No discussion of the binomial coefficients is complete without mentioning the Pascal’s triangle. The triangle is usually represented as an isosceles triangle (or an equilateral triangle). We use a right angle representation.

\begin{array}{rrrrrrrrr}      1 & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }    \\ 1 & 1 & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }   \\ 1 & 2 & 1 & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }   \\ 1 & 3 & 3 & 1 & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }  \\ 1 & 4 & 6 & 4 & 1 & \text{ } & \text{ } & \text{ } & \text{ }  \\ 1 & 5 & 10 & 10 & 5 & 1 & \text{ } & \text{ } & \text{ }  \\ 1 & 6 & 15 & 20 & 15 & 6 & 1 & \text{ } & \text{ }  \\ 1 & 7 & 21 & 35 & 35 & 21 & 7 & 1 & \text{ }  \\ 1 & 8 & 28 & 56 & 70 & 56 & 28 & 8 & 1        \end{array}

Each row in the triangle gives the binomial coefficients in the expansion (x+y)^n, starting at n=0. Thus the second row gives the coefficients for the expansion of (x+y)^1, the third row gives the coefficients for (x+y)^2 and so on. The most interesting part about the triangle is that it is recursive. Each row is derived from the previous row.

Each cell in a row is the sum of the two cells in the row above it – the cell directly above and the cell to the left of that. For example, the last row in the above triangle shows the coefficients for (x+y)^8. The third number from the left is 28, which is the sum of 21 and 7. The fourth element 56 is the sum of 35 (the cell directly above it) and 21 (the cell to the left of 35). As a result, any row can be derived from the previous row. The next row will be for (x+y)^9 have the following coefficients:

    1, 9, 36, 84, 126, 126, 84, 36, 9, 1

The Pascal’s triangle is really the recursive formula discussed above. A typical cell in the triangle is the binomial coefficient \binom{n}{k}. The cell directly above it is \binom{n-1}{k}. The cell to the left of that one is \binom{n-1}{k-1}.

___________________________________________________________________________

How Many Subsets Are There of a Given Set?

Given n objects, we now know that there are \binom{n}{k} many sets of k elements from these n objects. There are \binom{7}{3}=35 many sets of 3 letters from the letters C, K, M, L, T, P and O. There are several ways to come up with the answer.

One way is to use the binomial theorem. Let x=y=1 and n=7. Write out the binomial theorem. The following is the result.

    \displaystyle 128=2^7=(1+1)^7=\binom{7}{0}+\binom{7}{1}+\binom{7}{2}+\binom{7}{3}+\binom{7}{4}+\binom{7}{5}+\binom{7}{6}+\binom{7}{7}

There are 128 different subsets of the 7 letters (or any other set of 7 objects). The reason is given in the right side of the above equation, which is the sum of the number of subsets with zero elements, plus the number of subsets with exactly one element, plus the number of subsets with exactly two elements, …, plus the number of subsets with seven elements.

When buying a hamburger at a fast food restaurant, customers can always add toppings. A basic hamburger comes with a beef patty inside a bun. A customer can choose to add toppings from this list: Cheese, Ketchup, Mustard, Lettuce, Tomato, Pickles and Onion (compare the first letters with the 7 letters indicated earlier). How many different hamburgers can be created? Basically the problem is to find out the number of different subsets of the 7 letters C, K, M, L, T, P and O. The answer is of course 2^7=128. Interestingly the answer of 128 includes one choice that is the plain burger with no toppings – just the beef and the bun. So the total number of different hamburgers with at least one topping would be 127.

In general, the total number of subsets of a set with n elements is 2^n. This is confirmed by the plugging x=1 and y=1 into the binomial theorem: \displaystyle 2^n=(1+1)^n=\binom{n}{0}+\binom{n}{1}+\binom{n}{2}+\binom{n}{3}+\cdots+\binom{n}{n-1}+\binom{n}{n}

___________________________________________________________________________

Exercises

Exercise 1
Starting at Point P, a student walks 6 blocks to a diner for breakfast (represented by the red dot in the following diagram). After breakfast, the students 7 blocks to school that is located at Point Q. In how many ways can the student walk from Point P to Point Q? Assume that all the paths from any point to any point in the above diagram are available for walking.

    Figure 2

Exercise 2
In ordering a pizza, a customer can choose from a list of 10 toppings: mushroom, onion, olive, bell pepper, pineapple, spinach, extra cheese, sausage, ham, and pepperoni.

  • How many different pizza can the customer create if any topping in the list is desirable to the customer and if the customer chooses at least one topping?
  • Suppose that the customer is a meat lover. He will always choose the three meat toppings: sausage, ham and pepperoni. He will also choose up to 3 vegetable toppings. How many different pizza can the customer create?

Exercise 3
A committee of 8 people is to be chosen from a class consisting of 8 men and 9 women. If the committee must consist of at least 3 men and at least 4 women, how many different committees are possible?

Exercise 4

  • In how many ways can a father divide 8 gifts among his two children if the eldest is to receive 5 gifts and the other child 3 gifts?
  • In how many ways can a father divide 8 gifts among his two children if one child is to receive 5 gifts and the other child 3 gifts?

\text{ }

\text{ }

\text{ }

\text{ }

\text{ }

Answers

Exercise 1
315

Exercise 2

  • 1023
  • 792

Exercise 3
15876

Exercise 4

  • 56
  • 112

___________________________________________________________________________
\copyright \ 2017 \text{ by Dan Ma}