PlanetMath (more info)
 Math for the people, by the people. Sponsor PlanetMath
Encyclopedia | Requests | Forums | Docs | Wiki | Random | RSS  
Login
create new user
name:
pass:
forget your password?
Main Menu
Owner confidence rating: High Entry average rating: Very high
[parent] approximating the birthday problem (Derivation)

A straightforward generalization of the birthday problem can be stated as:

Given $N$ possible outcomes, how many random samples must be taken so that with a probability of at least $1/2$ there are two equal samples?
Answer: Approximately $1.2\sqrt{N}$ samples must be taken.

So in the typical birthday problem setting the $N=365$ - the number of days in the typical year, and the result is that 23 people in a group gives at least a 50% chance of two people with a common birthday. This result can be obtained by a concise computation of the probilities for various sizes of the group until the proper value of 23 is found, but this approach does not lend itself to finding the value of 23 efficiently.

The result however has an easy to compute approximation of $1.2\sqrt{N}$ , so in the case of $N=365$ , the result is approximated by $1.2\sqrt{365}\approx 23$ . We now validate the estimate.

Recall the probability of selecting a duplicate from $N$ possible outcomes within $k$ choices is $$ p(N,k)=\binom{N}{k}\frac{1}{N^k} =\left(1-\frac{1}{N}\right)\left(1-\frac{2}{N}\right) \cdots\left(1-\frac{k-1}{N}\right) $$ As $$ e^x=\lim_{m\rightarrow \infty}\left( 1+ \frac{x}{m}\right)^{m $$ we can roughly approximate the formal product above by: $$ p(N,k)\approx e^{-1/N-2/N-\cdots - k/N}=e^{-\binom{k+1}{2}\frac{1}{N}} $$ Now we are after a formula in which given $N$ and the desired probability of $1/2$ we can estimate $k$ . So we solve for $k$ in the equation: $$ \frac{1}{2}=p(N,k)=e^{-\binom{k}{2}\frac{1}{N}}=e^{-\frac{k(k-1)}{2N}} $$ $$ -2N\ln \frac{1}{2} = k^2-k $$ $$ \frac{1}{4}+2N\ln 2 = k^2-k+\frac{1}{4}=\left( k-\frac{1}{2}\right)^2 $$ \begin{equation}\label{eq:2} k=\frac{1}{2}+\sqrt{\frac{1}{4}+2N\ln 2}. \end{equation} A less precise but easier to compute approximation comes from simplifying our original estimate further: $$ p(N,k)\approx e^{-\frac{k(k-1)}{2N}}\approx e^{-k^2/2N} $$ Then set $\displaystyle 1/2=e^{-\frac{k(k-1)}{2N}}\approx e^{-k^2/2N}$ and once again solve for $k$ to find: $$ k=\sqrt{2N \ln 2}\approx 1.2\sqrt{N} $$

We also see that if we wish to have an outcome with probability $\varepsilon>0$ of no matches then we can estimate the sample size by: $$ \varepsilon=p(N,k)\approx e^{-k^2/2N};\qquad k\approx \sqrt{2N \ln \frac{1}{\varepsilon}} $$ So the number of samples required to find a match with probability $\varepsilon>0$ is approximately $$ k=\sqrt{2N\frac{1}{1-\varepsilon}} $$

Bibliography

1
Wikipedia, Birthday paradox, http://en.wikipedia.org/w/index.php?title=Birthday_paradox&oldid=69569553. Accessed 14 August, 2006.




Anyone with an account can edit this entry. Please help improve it!

"approximating the birthday problem" is owned by Algeboy. [ full author list (5) ]
(view preamble | get metadata)

View style:


This object's parent.
Log in to rate this entry.
(view current ratings)

Cross-references: equation, formula, product, estimate, approximation, sizes, group, number, outcomes, birthday problem

This is version 6 of approximating the birthday problem, born on 2006-08-14, modified 2007-05-13.
Object id is 8249, canonical name is ApproximatingTheBirthdayProblem.
Accessed 2127 times total.

Classification:
AMS MSC60-00 (Probability theory and stochastic processes :: General reference works )
 05A10 (Combinatorics :: Enumerative combinatorics :: Factorials, binomial coefficients, combinatorial functions)
 60C05 (Probability theory and stochastic processes :: Combinatorial probability)

Pending Errata and Addenda
None.
Discussion
Style: Expand: Order:
forum policy

No messages.

Interact
post | correct | update request | add example | add (any)