## Wednesday, 12 January 2011

### Getting the Full Set

I recently wrote about some different forms of probability distributions, Binomial, Multinomial, Geometric and Poisson. Today I want to address a pretty type of problem that doesn't fit any of those.

Everybody who plays around much with recreational mathematics has encountered the famous "birthday problem." It is often stated similar to this: "How many randomly selected people must be assembled so that the probability that at least two of them have the same birthday is at least p (for some specific p, often 1/2)."

But today I want you to think about a slightly different variation. How many people (on average) would you have to gather in order to have at least one with a birthday on every day of the year" (like the birthday problem, we make some assumptions that are not exactly true... births occur with equal probability on every day of the year (they don't, see here. consider only 365 days a year, etc..)

That's probably a hard one to start, so let's break down to an easier problem. How many times should you have to flip a coin until you get at least one head and one tail?

The probability you get one of them on the first roll is 1. But how long should it take to get the "other" one. Well, each flip the probability it happens is p=1/2, and we know that the expected value of this geometric event is Ev= 1/p.. or 2 flips. So we figure that, on average, we will get both outcomes in three flips. To be suggestive, I will suggest that you see that as 2/2 + 2/1.

Ok, upping the ante a little. How many times would you have to roll a die to get all six possible outcomes? Again, break it into packets. How many times should you have to roll until you get one of the six outcomes? Good, you recognize that ONE of them has to happen on the first roll. Now to get one of the five others on the following rolls has a probability of 5/6, so it should take only 6/5 rolls (on average) to get a second outcome. By now you can see we will probably end up with 1+6/5+6/4+...+6/1. I point out here that that can be written as $\frac{6}{6}+\frac{6}{5}+\frac{6}{4}+\frac{6}{3}+\frac{6}{2}+\frac{6}{1} = \sum_{k=1}^{6}\frac{6}{k}=6\sum_{k=1}^{6}\frac{1}{k}$.
Ok, but that summation of 1/k as k goes from one to some number is a harmonic series. It grows very slowly, and no body wants to add up 365 fractions that get very small very quickly. Fortunately, Euler came to our rescue. What Euler did was come up with a really good approximation for 1/1 + 1/2 + 1/3 + ..... + 1/n for any n... It turns out that as n gets bigger and bigger, the sum gets closer and closer to n*ln(n).. and Euler came up with a really good estimate of how wrong it would be. As n gets really big the sum of $n\sum_{k=1}^{n}\frac{1}{k}$ is about n ln(n) + .577. So 365 outcomes is not much more difficult to calculate than six was. (365)ln(365)+.577 turns out to be about 2154 people. So if you go to a pretty big school, maybe there is a person enrolled who has a birthday on each day of the year.

So you need a challenge.... Suppose you don't have 2154 people on campus. What if you have say 1000. Now what is the probability that there is someone on campus with all 365 birth dates.

Be the first on YOUR campus to solve it, and send the solution here. It might be easier to start by making the problem smaller.