Monday, 1 December 2008

More on Birthday Problems...

Four questions about Birthdays left unanswered, so I wanted to get to that. The first:

"What happens to the birthday problem if you switch to a different number of days in the year (say 400 or 1000) or what if we use weeks or months... in short, is there a general formula for the number of people to reach the p=1/2 that there is at least one match?" The problem can be attacked in the same way as shown above for any number of categories, but a nice approximation (I believe this was from Perci Diaconis) is that for n categories, the approximate number of selections to have a probability of 1/2 of a match is 1.2 times the square root of n. So for 1000 days it would be about 38 people. You can reach 95% confindence of a match by adjusting the formula to 2.5 times the square root of n. This means that you can be 95% sure of a match for the conventional 365 day case with only 48 people.

The second problem asks, "what is the number of people needed to make the probability that EVERYONE has a match equal to or greater than 1/2?" This is called the "Strong Birthday Problem." I will leave it to you to search the derivation of the solution, but it seems that with more than 3064 people, the probability becomes more than 1/2 that ALL of them share a common birthday.

"And what if we only wanted to have two people who came close. What is the number of people that must be present to make the probaiblity that at least two have birthdays no more than one day apart?" Well, a little research led me to this, "It turns out, for example, that it takes just 14 people in a room to have even odds of finding two birthdays that are identical or fall on consecutive days. Among seven people, there is about a 60 percent probability that two will have birthdays within a week of each other. Among four people, the probability that two will have birthdays within 30 days of each other is about 70 percent. " I was a little surprised that it took so few to have either a match or adjacent days. I would think that means that with the traditional 25 person classroom, you might not get a match, but you would almost certainly get a "near-match."

"...suppose a group (several hundred) are entering an auditorium, and you know that the first person to enter having a birthday that matches someone already inside will win a prize. Ok, you know not to go first, and if you go too late, it will already be gone.... so where in the line do you insert yourself to have the highest probability of being the first to enter and have a birthday match in the room? " Well if you look at the change in the probability of a match after each person enters the room. This requires only that we calculate p(n) − p(n − 1) and find its max value. It turns out to be twenty, so if you are the twentieth person in line you have a slightly higher chance of being the first match than people before or after you.