# proof of variance of the hypergeometric distribution

We will first prove a useful property of binomial coefficients. We know

 ${n\choose k}=\frac{n!}{k!(n-k)!}.$

For $k\geq 1$ this can be rewritten as

 ${n\choose k}=\frac{n}{k}\frac{(n-1)!}{(k-1)!(n-1-(k-1))!}=\frac{n}{k}{n-1\choose k-1}.$ (1)
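Identity (1) can be spot-checked numerically. The following is a minimal sketch using Python's standard `math.comb`; the helper name `absorption_holds` is hypothetical and introduced only for illustration. The identity is tested in the division-free form $k{n\choose k}=n{n-1\choose k-1}$, so the check stays in exact integer arithmetic.

```python
from math import comb

def absorption_holds(n: int, k: int) -> bool:
    """Check k * C(n, k) == n * C(n-1, k-1), the division-free form of (1)."""
    return k * comb(n, k) == n * comb(n - 1, k - 1)

# Test every pair with 1 <= k <= n for small n.
all_ok = all(
    absorption_holds(n, k)
    for n in range(1, 30)
    for k in range(1, n + 1)
)
print(all_ok)
```

A finite check like this is of course no substitute for the algebraic argument above; it only guards against a slip in the indices.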

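Let $X$ follow the hypergeometric distribution that counts the marked items in a sample of size $n$ drawn without replacement from $M$ items, $K$ of which are marked. Its expected value is $\frac{nK}{M}$, so the variance $\operatorname{Var}[X]$ of $X$ is given by: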

 $\operatorname{Var}[X]=\sum_{x=0}^{n}\left(x-\frac{nK}{M}\right)^{2}\frac{{K\choose x}{M-K\choose n-x}}{{M\choose n}}.$

We expand the square on the right-hand side:

 $\begin{aligned}\operatorname{Var}[X]&=\sum_{x=0}^{n}\frac{x^{2}{K\choose x}{M-K\choose n-x}}{{M\choose n}}-\frac{2nK}{M}\sum_{x=0}^{n}\frac{x{K\choose x}{M-K\choose n-x}}{{M\choose n}}\\&\quad+\frac{n^{2}K^{2}}{M^{2}}\sum_{x=0}^{n}\frac{{K\choose x}{M-K\choose n-x}}{{M\choose n}}.\end{aligned}$

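The second of these sums is the expected value of the hypergeometric distribution, $\frac{nK}{M}$, so the second term equals $-\frac{2n^{2}K^{2}}{M^{2}}$. The third sum is $1$, since it adds up all probabilities of the distribution, so the third term equals $\frac{n^{2}K^{2}}{M^{2}}$. Combining these two terms we have: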

 $\operatorname{Var}[X]=-\frac{n^{2}K^{2}}{M^{2}}+\sum_{x=0}^{n}\frac{x^{2}{K\choose x}{M-K\choose n-x}}{{M\choose n}}.$
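The two facts used in this step, that the probabilities sum to $1$ and that the expected value is $\frac{nK}{M}$, can be confirmed for a concrete case. The sketch below uses exact rational arithmetic from Python's standard library; the function name `hypergeom_pmf` and the parameter values $M=20$, $K=7$, $n=5$ are assumptions chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x: int, M: int, K: int, n: int) -> Fraction:
    """Exact probability C(K, x) * C(M-K, n-x) / C(M, n)."""
    return Fraction(comb(K, x) * comb(M - K, n - x), comb(M, n))

# Illustrative parameters (an assumption, not taken from the text).
M, K, n = 20, 7, 5

# Third sum in the expansion: all probabilities added up.
total = sum(hypergeom_pmf(x, M, K, n) for x in range(n + 1))

# Second sum in the expansion: the expected value.
mean = sum(x * hypergeom_pmf(x, M, K, n) for x in range(n + 1))

print(total == 1)
print(mean == Fraction(n * K, M))
```

Using `Fraction` rather than floating point means both comparisons are exact, so no tolerance has to be chosen.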

The $x=0$ term of the remaining sum vanishes, so the sum may start at $x=1$:

 $\operatorname{Var}[X]=-\frac{n^{2}K^{2}}{M^{2}}+\sum_{x=1}^{n}\frac{x^{2}{K\choose x}{M-K\choose n-x}}{{M\choose n}}.$

Applying equation (1) to both ${K\choose x}$ and ${M\choose n}$, and then writing the remaining factor as $x=(x-1)+1$, we get:

 $\operatorname{Var}[X]=-\frac{n^{2}K^{2}}{M^{2}}+\frac{nK}{M}\sum_{x=1}^{n}\frac{(x-1){K-1\choose x-1}{M-K\choose n-x}}{{M-1\choose n-1}}+\frac{nK}{M}\sum_{x=1}^{n}\frac{{K-1\choose x-1}{M-K\choose n-x}}{{M-1\choose n-1}}.$
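The rewriting of the second moment into two shifted sums is the step where an index error is most likely, so it may be worth checking on an example. This is a minimal sketch in exact arithmetic; the parameter values and all variable names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

# Illustrative parameters (an assumption, not taken from the text).
M, K, n = 20, 7, 5

# Second moment, summed directly from the definition (x = 0 contributes 0).
second_moment = sum(
    Fraction(x * x * comb(K, x) * comb(M - K, n - x), comb(M, n))
    for x in range(1, n + 1)
)

# The two sums obtained after applying (1) and splitting x = (x - 1) + 1.
denom = comb(M - 1, n - 1)
first_sum = sum(
    Fraction((x - 1) * comb(K - 1, x - 1) * comb(M - K, n - x), denom)
    for x in range(1, n + 1)
)
second_sum = sum(
    Fraction(comb(K - 1, x - 1) * comb(M - K, n - x), denom)
    for x in range(1, n + 1)
)

decomposition_ok = second_moment == Fraction(n * K, M) * (first_sum + second_sum)
print(decomposition_ok)
print(first_sum, second_sum)
```

Printing `first_sum` and `second_sum` separately lets the reader compare each of them with the value claimed for it in the proof.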

Setting $l:=x-1$, the first sum becomes the expected value of a hypergeometric distribution with parameters $M-1$, $K-1$ and $n-1$, and therefore equals $\frac{(n-1)(K-1)}{M-1}$. The second sum adds up all probabilities of that same distribution and is therefore equal to $1$. So we get:

 $\begin{aligned}\operatorname{Var}[X]&=-\frac{n^{2}K^{2}}{M^{2}}+\frac{nK(n-1)(K-1)}{M(M-1)}+\frac{nK}{M}\\&=\frac{-n^{2}K^{2}(M-1)+Mn(n-1)K(K-1)+KnM(M-1)}{M^{2}(M-1)}\\&=\frac{nK(M^{2}+(-K-n)M+nK)}{M^{2}(M-1)}\\&=\frac{nK(M-K)(M-n)}{M^{2}(M-1)}\\&=n\frac{K}{M}\left(1-\frac{K}{M}\right)\frac{M-n}{M-1}.\end{aligned}$

This is the formula we wanted to prove.
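As a sanity check on the final result, the closed form can be compared with the variance computed directly from the definition. The sketch below does this exactly for all parameters with $2\leq M\leq 12$, $0\leq K\leq M$ and $0\leq n\leq M$; the function names and the size of the grid are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

def variance_by_definition(M: int, K: int, n: int) -> Fraction:
    """Sum of (x - nK/M)^2 * P(X = x) over x = 0..n, in exact arithmetic."""
    mean = Fraction(n * K, M)
    return sum(
        (x - mean) ** 2 * Fraction(comb(K, x) * comb(M - K, n - x), comb(M, n))
        for x in range(n + 1)
    )

def variance_closed_form(M: int, K: int, n: int) -> Fraction:
    """n * (K/M) * (1 - K/M) * (M - n)/(M - 1), written as one fraction."""
    return Fraction(n * K * (M - K) * (M - n), M * M * (M - 1))

# Collect every parameter triple on the grid where the two values differ.
mismatches = [
    (M, K, n)
    for M in range(2, 13)
    for K in range(M + 1)
    for n in range(M + 1)
    if variance_by_definition(M, K, n) != variance_closed_form(M, K, n)
]
print(mismatches)
```

The grid starts at $M=2$ because the closed form has $M-1$ in the denominator. An empty list of mismatches supports the formula on the grid but does not replace the proof.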

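Title: proof of variance of the hypergeometric distribution; Canonical name: ProofOfVarianceOfTheHypergeometricDistribution; Date of creation: 2013-03-22 13:27:41; Last modified on: 2013-03-22 13:27:41; Owner: mathwizard (128); Last modified by: mathwizard (128); Numerical id: 13; Author: mathwizard (128); Entry type: Proof; Classification: msc 62E15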