proof of variance of the hypergeometric distribution


We will first prove a useful property of binomial coefficients. We know

\[
\binom{n}{k} = \frac{n!}{k!\,(n-k)!}.
\]

For $k \ge 1$ this can be transformed to

\[
\binom{n}{k} = \frac{n}{k}\cdot\frac{(n-1)!}{(k-1)!\,(n-1-(k-1))!} = \frac{n}{k}\binom{n-1}{k-1}. \tag{1}
\]
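Equation (1) can be sanity-checked numerically. The following is a minimal sketch, not part of the original proof: it tests the equivalent integer form $k\binom{n}{k} = n\binom{n-1}{k-1}$ over a small range of values, and the function name `identity_holds` is chosen only for illustration.

```python
from math import comb


def identity_holds(n: int, k: int) -> bool:
    """Check k * C(n, k) == n * C(n-1, k-1), the integer form of equation (1)."""
    return k * comb(n, k) == n * comb(n - 1, k - 1)


# Equation (1) divides by k, so only k >= 1 (and hence n >= 1) is tested.
all_ok = all(identity_holds(n, k) for n in range(1, 30) for k in range(1, n + 1))
print(all_ok)
```

Multiplying through by $k$ keeps the check in exact integer arithmetic, so no rounding can hide a failure.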

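Let $X$ be hypergeometrically distributed with parameters $M$, $K$ and $n$, so that its expected value is $\frac{nK}{M}$. The variance $\operatorname{Var}[X]$ of $X$ is given by: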

\[
\operatorname{Var}[X] = \sum_{x=0}^{n} \left(x - \frac{nK}{M}\right)^{2} \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}.
\]

We expand the right hand side:

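\begin{align*}
\operatorname{Var}[X] ={}& \sum_{x=0}^{n} x^{2}\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} \\
& - \frac{2nK}{M} \sum_{x=0}^{n} x\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} \\
& + \frac{n^{2}K^{2}}{M^{2}} \sum_{x=0}^{n} \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}.
\end{align*}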

The second of these sums is the expected value of the hypergeometric distribution, which equals $\frac{nK}{M}$; the third sum is 1, as it adds up all probabilities in the distribution. The last two terms therefore combine to $-\frac{2n^{2}K^{2}}{M^{2}} + \frac{n^{2}K^{2}}{M^{2}} = -\frac{n^{2}K^{2}}{M^{2}}$. So we have:

\[
\operatorname{Var}[X] = -\frac{n^{2}K^{2}}{M^{2}} + \sum_{x=0}^{n} x^{2}\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}.
\]
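The reduction so far can be checked with exact rational arithmetic. The sketch below is an illustration under assumed example values ($M=20$, $K=7$, $n=5$ are arbitrary), and the helper names `pmf` and `moments` are not from the original entry. It confirms that the probabilities sum to 1, that the mean is $nK/M$, and that the variance by definition equals the second moment minus $n^2K^2/M^2$.

```python
from fractions import Fraction
from math import comb


def pmf(x, M, K, n):
    """P(X = x) = C(K, x) * C(M-K, n-x) / C(M, n), as an exact fraction."""
    return Fraction(comb(K, x) * comb(M - K, n - x), comb(M, n))


def moments(M, K, n):
    """Return (sum of p, sum of x*p, sum of x^2*p) over x = 0..n."""
    ps = [pmf(x, M, K, n) for x in range(n + 1)]
    total = sum(ps)
    mean = sum(x * p for x, p in enumerate(ps))
    second = sum(x * x * p for x, p in enumerate(ps))
    return total, mean, second


M, K, n = 20, 7, 5  # arbitrary example values, assumed for illustration
total, mean, second = moments(M, K, n)
mu = Fraction(n * K, M)

# Variance computed directly from its definition.
var_by_definition = sum((x - mu) ** 2 * pmf(x, M, K, n) for x in range(n + 1))

assert total == 1                                 # third sum
assert mean == mu                                 # second sum
assert var_by_definition == second - mu ** 2      # reduced expression
```

Using `Fraction` rather than floating point means the three assertions test exact equalities.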

The $x=0$ term of the last sum is zero, so the summation can start at $x=1$:

\[
\operatorname{Var}[X] = -\frac{n^{2}K^{2}}{M^{2}} + \sum_{x=1}^{n} x^{2}\,\frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}.
\]

Applying equation (1) to both $\binom{K}{x}$ and $\binom{M}{n}$, and writing the remaining factor $x$ as $(x-1)+1$, we get:

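\begin{align*}
\operatorname{Var}[X] ={}& -\frac{n^{2}K^{2}}{M^{2}} + \frac{nK}{M} \sum_{x=1}^{n} (x-1)\,\frac{\binom{K-1}{x-1}\binom{M-K}{n-x}}{\binom{M-1}{n-1}} \\
& + \frac{nK}{M} \sum_{x=1}^{n} \frac{\binom{K-1}{x-1}\binom{M-K}{n-x}}{\binom{M-1}{n-1}}.
\end{align*}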
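This substitution can be checked term by term. The following sketch is only an illustration with hypothetical helper names (`lhs_term`, `rhs_term`): it compares each summand before and after applying equation (1), using exact fractions over a small grid of parameters with $1 \le K \le M$ and $1 \le x \le n \le M$.

```python
from fractions import Fraction
from math import comb


def lhs_term(x, M, K, n):
    """x^2 * C(K, x) * C(M-K, n-x) / C(M, n): the summand before the substitution."""
    return Fraction(x * x * comb(K, x) * comb(M - K, n - x), comb(M, n))


def rhs_term(x, M, K, n):
    """(nK/M) * ((x-1) + 1) * C(K-1, x-1) * C(M-K, n-x) / C(M-1, n-1)."""
    q = Fraction(comb(K - 1, x - 1) * comb(M - K, n - x), comb(M - 1, n - 1))
    return Fraction(n * K, M) * ((x - 1) * q + q)


# Compare both forms for every admissible (M, K, n, x) in a small grid.
terms_agree = all(
    lhs_term(x, M, K, n) == rhs_term(x, M, K, n)
    for M in range(2, 13)
    for K in range(1, M + 1)
    for n in range(1, M + 1)
    for x in range(1, n + 1)
)
print(terms_agree)
```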

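Setting $l := x-1$, the first sum becomes $\sum_{l=0}^{n-1} l\,\binom{K-1}{l}\binom{M-K}{n-1-l} \big/ \binom{M-1}{n-1}$. This is the expected value of a hypergeometric distribution with parameters $M-1$, $K-1$ and $n-1$, and is therefore equal to $\frac{(n-1)(K-1)}{M-1}$. The second sum adds up all the probabilities of that same distribution and is therefore equal to 1. So we get: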

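\begin{align*}
\operatorname{Var}[X] &= -\frac{n^{2}K^{2}}{M^{2}} + \frac{nK(n-1)(K-1)}{M(M-1)} + \frac{nK}{M} \\
&= \frac{-n^{2}K^{2}(M-1) + Mn(n-1)K(K-1) + KnM(M-1)}{M^{2}(M-1)} \\
&= \frac{nK\left(M^{2} + (-K-n)M + nK\right)}{M^{2}(M-1)} \\
&= \frac{nK(M-K)(M-n)}{M^{2}(M-1)} \\
&= n\,\frac{K}{M}\left(1 - \frac{K}{M}\right)\frac{M-n}{M-1}.
\end{align*}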
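The closed form can be compared against the defining sum by brute force. This is a minimal sketch rather than part of the proof: `variance_direct` and `variance_closed_form` are illustrative names, and $M \ge 2$ is assumed so that the factor $\frac{M-n}{M-1}$ is defined.

```python
from fractions import Fraction
from math import comb


def variance_direct(M, K, n):
    """Variance from the definition: sum of (x - nK/M)^2 * P(X = x) over x = 0..n."""
    mu = Fraction(n * K, M)
    return sum(
        (x - mu) ** 2 * Fraction(comb(K, x) * comb(M - K, n - x), comb(M, n))
        for x in range(n + 1)
    )


def variance_closed_form(M, K, n):
    """n * (K/M) * (1 - K/M) * (M - n) / (M - 1); assumes M >= 2."""
    p = Fraction(K, M)
    return n * p * (1 - p) * Fraction(M - n, M - 1)


# Exhaustive comparison on a small grid, including the edge cases K = 0, K = M, n = 0, n = M.
formulas_agree = all(
    variance_direct(M, K, n) == variance_closed_form(M, K, n)
    for M in range(2, 15)
    for K in range(M + 1)
    for n in range(M + 1)
)
print(formulas_agree)
```

For example, with $M=20$, $K=7$, $n=5$ both functions return `Fraction(273, 304)`.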

This is the formula we wanted to prove.

Title proof of variance of the hypergeometric distribution
Canonical name ProofOfVarianceOfTheHypergeometricDistribution
Date of creation 2013-03-22 13:27:41
Last modified on 2013-03-22 13:27:41
Owner mathwizard (128)
Last modified by mathwizard (128)
Numerical id 13
Author mathwizard (128)
Entry type Proof
Classification msc 62E15