conditional distribution of multi-variate normal variable


Theorem.

Let $X$ be a random variable taking values in $\mathbb{R}^n$, normally distributed with a non-singular covariance matrix $\Sigma$ and a mean of zero.

Suppose $Y$ is defined by $Y = B^* X$ for some linear transformation $B \colon \mathbb{R}^k \to \mathbb{R}^n$ of maximum rank, that is, of rank $k$, so that $B$ is injective. (Here ${}^*$ denotes the transpose operator.)

Then the distribution of $X$ conditioned on $Y$ is multi-variate normal, with conditional mean and covariance:

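$$\mathbb{E}[X \mid Y] = \Sigma B (B^* \Sigma B)^{-1} Y, \qquad \operatorname{Var}[X \mid Y] = \Sigma - \Sigma B (B^* \Sigma B)^{-1} (\Sigma B)^*.$$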
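
The two formulas translate directly into a few lines of linear algebra. The following is a minimal sketch in Python with NumPy, not part of the original entry; the function name `conditional_normal` and the numerical values are illustrative assumptions. It solves linear systems with $B^* \Sigma B$ rather than forming the inverse explicitly.

```python
import numpy as np

def conditional_normal(sigma, B, y):
    """Mean and covariance of X given Y = B^T X = y, for zero-mean X ~ N(0, sigma).

    sigma : (n, n) non-singular covariance matrix
    B     : (n, k) matrix of rank k
    y     : (k,) observed value of Y
    """
    sigma_B = sigma @ B                  # Sigma B, shape (n, k)
    gram = B.T @ sigma_B                 # B^T Sigma B = Var[Y], shape (k, k)
    mean = sigma_B @ np.linalg.solve(gram, y)
    cov = sigma - sigma_B @ np.linalg.solve(gram, sigma_B.T)
    return mean, cov

# Hypothetical example: observe the first coordinate of a 3-dimensional X.
sigma = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])
B = np.array([[1.0], [0.0], [0.0]])
mean, cov = conditional_normal(sigma, B, np.array([4.0]))
```

In this example the observed coordinate has zero conditional variance, and the remaining block of `cov` is the Schur complement familiar from the partitioned form of the same result.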

If $k=1$, so that $B$ is simply a vector in $\mathbb{R}^n$, these formulas reduce to:

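$$\mathbb{E}[X \mid Y] = \frac{\Sigma B \, Y}{\operatorname{Var}[Y]}, \qquad \operatorname{Var}[X \mid Y] = \Sigma - \frac{\Sigma B B^* \Sigma}{\operatorname{Var}[Y]}.$$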
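
As a quick illustration of the case $k=1$, take the hypothetical values (chosen here, not from the original entry) $\Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ and $B = (1, 1)^*$, so that $Y = X_1 + X_2$. Then:

$$
\begin{aligned}
\Sigma B &= \begin{pmatrix} 3 \\ 3 \end{pmatrix}, \qquad \operatorname{Var}[Y] = B^* \Sigma B = 6, \\
\mathbb{E}[X \mid Y] &= \frac{\Sigma B \, Y}{\operatorname{Var}[Y]} = \begin{pmatrix} Y/2 \\ Y/2 \end{pmatrix}, \\
\operatorname{Var}[X \mid Y] &= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} - \frac{1}{6} \begin{pmatrix} 9 & 9 \\ 9 & 9 \end{pmatrix} = \begin{pmatrix} 1/2 & -1/2 \\ -1/2 & 1/2 \end{pmatrix}.
\end{aligned}
$$

The conditional covariance is singular, as expected: once $Y$ is known, the sum $X_1 + X_2$ has no remaining variance.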

If $X$ does not have zero mean, then the formula for $\mathbb{E}[X \mid Y]$ is modified by adding $\mathbb{E}[X]$ and replacing $Y$ by $Y - \mathbb{E}[Y]$, and the formula for $\operatorname{Var}[X \mid Y]$ is unchanged.
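
Written out, with the shorthand $\mu = \mathbb{E}[X]$ (a symbol introduced here for brevity, not used in the original entry) and noting that $\mathbb{E}[Y] = B^* \mu$, the modified formula reads:

$$\mathbb{E}[X \mid Y] = \mu + \Sigma B (B^* \Sigma B)^{-1} (Y - B^* \mu).$$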

Proof.

We split up $X$ into two stochastically independent parts, the first part containing exactly the information embodied in $Y$. Then the conditional distribution of $X$ given $Y$ is simply the unconditional distribution of the second part, which is independent of $Y$.

To this end, we first change variables to express everything in terms of a standard multi-variate normal $Z$. Let $A \colon \mathbb{R}^n \to \mathbb{R}^n$ be a “square root” factorization of the covariance matrix $\Sigma$, so that:

$$A A^* = \Sigma, \qquad Z = A^{-1} X, \qquad X = A Z, \qquad Y = B^* A Z.$$
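
One concrete choice of square root is the Cholesky factor. The sketch below, in Python with NumPy, uses an illustrative $\Sigma$ that is an assumption of this example rather than part of the entry. It builds $A$, draws standard normal samples $Z$, and maps them to $X = AZ$ and back.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative covariance matrix (assumption); it is symmetric positive definite.
sigma = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])

# Cholesky returns a lower-triangular A with A A^T = Sigma.
A = np.linalg.cholesky(sigma)

# Columns of Z are independent standard normal vectors; X = A Z has covariance Sigma.
Z = rng.standard_normal((3, 100_000))
X = A @ Z
Z_back = np.linalg.solve(A, X)     # Z = A^{-1} X

sample_cov = np.cov(X)             # empirical covariance, close to Sigma for large samples
```

Any other matrix with $A A^* = \Sigma$, such as the symmetric square root, would serve equally well; the proof only uses this identity and the invertibility of $A$.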

We let $H \colon \mathbb{R}^n \to \mathbb{R}^n$ be the orthogonal projection onto the range of $A^* B \colon \mathbb{R}^k \to \mathbb{R}^n$, and decompose $Z$ into orthogonal components:

$$Z = HZ + (I-H)Z.$$

It is intuitively obvious that orthogonality of the two random normal vectors implies their stochastic independence. To show this formally, observe that the Gaussian density function for $Z$ factors into a product:

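$$(2\pi)^{-n/2} \exp\left(-\tfrac{1}{2} \|z\|^2\right) = (2\pi)^{-n/2} \exp\left(-\tfrac{1}{2} \|Hz\|^2\right) \exp\left(-\tfrac{1}{2} \|(I-H)z\|^2\right).$$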
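
The factorization rests on the Pythagorean identity for the orthogonal decomposition of $z$. A short derivation, using only the facts that an orthogonal projection satisfies $H^* = H$ and $H^2 = H$:

$$
\begin{aligned}
\|z\|^2 &= \|Hz + (I-H)z\|^2 = \|Hz\|^2 + 2 \langle Hz, (I-H)z \rangle + \|(I-H)z\|^2, \\
\langle Hz, (I-H)z \rangle &= z^* H^* (I-H) z = z^* (H - H^2) z = 0.
\end{aligned}
$$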

We can construct an orthonormal system of coordinates on $\mathbb{R}^n$ under which the components for $Hz$ are completely disjoint from the components of $(I-H)z$. On the other hand, the densities for $Z$, $HZ$, and $(I-H)Z$ remain invariant even after changing coordinates, because they are radially symmetric. Hence the variables $HZ$ and $(I-H)Z$ are separable in their joint density and they are independent.

$HZ$ embodies the information in the linear combination $Y = B^* A Z$. Indeed, we have the identity:

$$Y = (B^* A) Z = (B^* A)(HZ + (I-H)Z) = (B^* A) HZ + 0.$$

The last term is null because $(I-H)Z$ is orthogonal to the range of $A^* B$ by definition. (Equivalently, $(I-H)Z$ lies in the kernel of $(A^* B)^* = B^* A$.) Thus $Y$ can always be recovered by a linear transformation on $HZ$.

Conversely, $Y$ completely determines $HZ$, from the analytical expression for $H$ that we now give. In general, the orthogonal projection onto the range of an injective transformation $T$ is $T (T^* T)^{-1} T^*$. Applying this to $T = A^* B$, we have

$$
\begin{aligned}
H &= A^* B (B^* A A^* B)^{-1} B^* A \\
&= A^* B (B^* \Sigma B)^{-1} B^* A.
\end{aligned}
$$

We see that $HZ = A^* B (B^* \Sigma B)^{-1} Y$.
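
The properties of $H$ used so far can be checked numerically. This is a small sketch in Python with NumPy; the matrices `sigma` and `B` and the vector `z` are illustrative assumptions, and the Cholesky factor is just one possible choice of $A$.

```python
import numpy as np

# Illustrative values (assumptions, not from the source): n = 3, k = 2.
sigma = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                      # rank 2, so B is injective

A = np.linalg.cholesky(sigma)                   # A A^T = Sigma
T = A.T @ B                                     # T = A^T B, injective because A is invertible
H = T @ np.linalg.solve(T.T @ T, T.T)           # H = T (T^T T)^{-1} T^T

# A fixed vector z standing in for one realisation of Z.
z = np.array([0.3, -1.2, 0.7])
y = B.T @ A @ z                                 # Y = B^T A Z
hz_from_y = T @ np.linalg.solve(B.T @ sigma @ B, y)   # A^T B (B^T Sigma B)^{-1} Y

print(np.allclose(H @ H, H), np.allclose(H, H.T))     # H is idempotent and symmetric
print(np.allclose(B.T @ A @ (np.eye(3) - H), 0.0))    # B^T A annihilates the range of I - H
print(np.allclose(H @ z, hz_from_y))                  # HZ agrees with the expression in Y
```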

We have proved that conditioning on $Y$ and conditioning on $HZ$ are equivalent, and so:

$$\mathbb{E}[Z \mid Y] = \mathbb{E}[Z \mid HZ] = \mathbb{E}[HZ + (I-H)Z \mid HZ] = HZ + 0,$$

and

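$$
\begin{aligned}
\operatorname{Var}[Z \mid Y] = \operatorname{Var}[Z \mid HZ] &= \operatorname{Var}[HZ + (I-H)Z \mid HZ] \\
&= 0 + \operatorname{Var}[(I-H)Z] \\
&= \mathbb{E}[(I-H) Z Z^* (I-H)^*] \\
&= (I-H)(I-H)^* \\
&= I - H - H^* + H H^* = I - H,
\end{aligned}
$$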

using the defining property $H^2 = H = H^*$ of orthogonal projections.

Now we express the result in terms of X, and remove the dependence on the transformation A (which is not uniquely defined from the covariance matrix):

$$\mathbb{E}[X \mid Y] = A \, \mathbb{E}[Z \mid Y] = A H Z = \Sigma B (B^* \Sigma B)^{-1} Y$$

and

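$$\operatorname{Var}[X \mid Y] = A \operatorname{Var}[Z \mid Y] A^* = A A^* - A H A^* = \Sigma - \Sigma B (B^* \Sigma B)^{-1} B^* \Sigma.$$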
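
That the result does not depend on the choice of $A$ can also be checked numerically. The sketch below, in Python with NumPy, compares two different square roots of an illustrative $\Sigma$ (the Cholesky factor and the symmetric square root) against the closed form; the helper name `covariance_via_projection` and all numerical values are assumptions of this example.

```python
import numpy as np

def covariance_via_projection(A, B):
    """Return A (I - H) A^T, where H projects orthogonally onto the range of A^T B."""
    T = A.T @ B
    H = T @ np.linalg.solve(T.T @ T, T.T)
    return A @ (np.eye(A.shape[0]) - H) @ A.T

# Illustrative values (assumptions, not from the source).
sigma = np.array([[2.0, 1.0, 0.0],
                  [1.0, 2.0, 1.0],
                  [0.0, 1.0, 2.0]])
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Two different factorizations with A A^T = Sigma.
A_chol = np.linalg.cholesky(sigma)
eigvals, eigvecs = np.linalg.eigh(sigma)
A_sym = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T

# Closed form from the theorem: Sigma - Sigma B (B^T Sigma B)^{-1} B^T Sigma.
closed_form = sigma - sigma @ B @ np.linalg.solve(B.T @ sigma @ B, B.T @ sigma)

cov_chol = covariance_via_projection(A_chol, B)
cov_sym = covariance_via_projection(A_sym, B)
```

Both `cov_chol` and `cov_sym` should agree with `closed_form` up to floating-point error, even though the two projections $H$ differ.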

Of course, the conditional distribution of $X$ given $Y$ is that of $AHZ + A(I-H)Z$ with $HZ$ held fixed; this is an affine image of $(I-H)Z$, and hence multi-variate normal.

The formula in the statement of this theorem, for the single-dimensional case, follows from substituting in $\operatorname{Var}[Y] = \operatorname{Var}[B^* X] = B^* \Sigma B$. The formula for when $X$ does not have zero mean follows from applying the base case to the shifted variable $X - \mathbb{E}[X]$. ∎

Title conditional distribution of multi-variate normal variable
Canonical name ConditionalDistributionOfMultivariateNormalVariable
Date of creation 2013-03-22 18:39:09
Last modified on 2013-03-22 18:39:09
Owner stevecheng (10074)
Last modified by stevecheng (10074)
Numerical id 5
Author stevecheng (10074)
Entry type Theorem
Classification msc 62E15
Classification msc 60E05