canonical correlation


Let $X$ be a $(T, n)$ matrix corresponding to one set of $n$ signals and $Y$ be a $(T, p)$ matrix corresponding to another set of $p$ signals. Each row corresponds to one time sample ($T$ time samples in total). Let $\Sigma_{11}$ and $\Sigma_{22}$ be the sample covariance matrices of $X$ and $Y$, respectively, and let $\Sigma_{12} = \Sigma_{21}^\top$ be the sample cross-covariance matrix between $X$ and $Y$. For simplicity, we suppose that all signals have zero mean.

Canonical correlation analysis (CCA) finds the linear combinations of the columns of $X$ and of the columns of $Y$ that have the largest correlation; i.e., it finds the weight vectors (loadings) $a$ and $b$ that maximize:

$$\rho = \frac{a^\top \Sigma_{12}\, b}{\sqrt{a^\top \Sigma_{11}\, a}\,\sqrt{b^\top \Sigma_{22}\, b}}. \tag{1}$$
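As an illustration of equation (1), the following minimal sketch evaluates $\rho$ for a given pair of weight vectors. The function name `correlation_of_combinations`, the synthetic data, and the $1/T$ normalization of the sample covariances are assumptions made for this example only; the normalization cancels in the ratio, so it does not affect $\rho$.

```python
import numpy as np

def correlation_of_combinations(X, Y, a, b):
    """Evaluate rho from equation (1) for given weight vectors a and b.

    X is (T, n), Y is (T, p); both are assumed to have zero-mean columns.
    """
    T = X.shape[0]
    # Sample covariance blocks (1/T normalization is an assumption here).
    S11 = X.T @ X / T
    S22 = Y.T @ Y / T
    S12 = X.T @ Y / T
    return (a @ S12 @ b) / np.sqrt((a @ S11 @ a) * (b @ S22 @ b))

# Synthetic example data (hypothetical): Y shares signal with X.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
Y = X[:, :2] + 0.5 * rng.standard_normal((500, 2))
X = X - X.mean(axis=0)  # enforce the zero-mean assumption
Y = Y - Y.mean(axis=0)

a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 0.0])
rho = correlation_of_combinations(X, Y, a, b)
```

For zero-mean signals, this value coincides with the ordinary correlation coefficient between the two combined signals $Xa$ and $Yb$.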

We follow the derivation of Johnson and perform a change of basis: $c = \Sigma_{11}^{1/2} a$ and $d = \Sigma_{22}^{1/2} b$.

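$$\rho = \frac{c^\top \Sigma_{11}^{-1/2} \Sigma_{12} \Sigma_{22}^{-1/2}\, d}{\sqrt{c^\top c}\,\sqrt{d^\top d}} \tag{2}$$

$$\rho \le \frac{\sqrt{c^\top \Sigma_{11}^{-1/2} \Sigma_{12} \Sigma_{22}^{-1/2} \Sigma_{22}^{-1/2} \Sigma_{21} \Sigma_{11}^{-1/2}\, c}\,\sqrt{d^\top d}}{\sqrt{c^\top c}\,\sqrt{d^\top d}} = \sqrt{\frac{c^\top \Sigma_{11}^{-1/2} \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \Sigma_{11}^{-1/2}\, c}{c^\top c}}. \tag{3}$$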

The inequality above follows from the Cauchy-Schwarz inequality, and it is an equality when $\Sigma_{22}^{-1/2} \Sigma_{21} \Sigma_{11}^{-1/2} c$ and $d$ are collinear. The right-hand side of the expression above is the square root of a Rayleigh quotient, and it is maximal when $c$ is the eigenvector corresponding to the largest eigenvalue of $\Sigma_{11}^{-1/2} \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21} \Sigma_{11}^{-1/2}$ (the other canonical directions are obtained from the eigenvectors associated with the other eigenvalues, taken in decreasing magnitude). This result is the basis of CCA. Undoing the change of basis gives $a = \Sigma_{11}^{-1/2} c$ and $b = \Sigma_{22}^{-1/2} d$, from which we can compute the two canonical variables: $U_1 = Xa$ and $V_1 = Yb$.
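The procedure described above can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions, not a reference implementation: the helper names `inv_sqrt` and `first_canonical_pair` are hypothetical, the covariances use a $1/T$ normalization, the data are synthetic, and $\Sigma_{11}$ and $\Sigma_{22}$ are assumed to be invertible. The vector $b$ is recovered from the collinearity condition, which gives $b \propto \Sigma_{22}^{-1} \Sigma_{21} a$.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def first_canonical_pair(X, Y):
    """Return (a, b, rho) for the first canonical pair of zero-mean X and Y."""
    T = X.shape[0]
    S11 = X.T @ X / T
    S22 = Y.T @ Y / T
    S12 = X.T @ Y / T
    S11_is = inv_sqrt(S11)
    S22_inv = np.linalg.inv(S22)
    # Symmetric matrix whose Rayleigh quotient appears in equation (3).
    M = S11_is @ S12 @ S22_inv @ S12.T @ S11_is
    w, V = np.linalg.eigh(M)       # eigenvalues in ascending order
    c = V[:, -1]                   # eigenvector of the largest eigenvalue
    a = S11_is @ c                 # undo c = S11^{1/2} a
    b = S22_inv @ S12.T @ a        # equality case: b proportional to S22^{-1} S21 a
    rho = np.sqrt(max(w[-1], 0.0)) # largest canonical correlation
    return a, b, rho

# Synthetic example data (hypothetical).
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))
Y = X[:, :2] + 0.5 * rng.standard_normal((500, 2))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

a, b, rho = first_canonical_pair(X, Y)
U1 = X @ a  # first canonical variable of X
V1 = Y @ b  # first canonical variable of Y
```

The sample correlation between `U1` and `V1` should agree with `rho`, the square root of the largest eigenvalue, up to rounding error.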

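We can continue in this way to find the subsequent pairs of weight vectors: the $k$-th pair is obtained from the eigenvector associated with the $k$-th largest eigenvalue.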
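Continuing to the subsequent vectors might look like the sketch below, which keeps the eigenvectors of the $\min(n, p)$ largest eigenvalues instead of only the first. The function name `canonical_pairs`, the $1/T$ normalization, and the synthetic data are assumptions made for illustration, and the covariance matrices are again assumed to be invertible.

```python
import numpy as np

def canonical_pairs(X, Y):
    """Return (A, B, rhos): weight vectors as columns, correlations descending."""
    T = X.shape[0]
    S11, S22, S12 = X.T @ X / T, Y.T @ Y / T, X.T @ Y / T
    # Inverse symmetric square root of S11 via its eigen-decomposition.
    w11, V11 = np.linalg.eigh(S11)
    S11_is = V11 @ np.diag(w11 ** -0.5) @ V11.T
    S22_inv = np.linalg.inv(S22)
    M = S11_is @ S12 @ S22_inv @ S12.T @ S11_is
    w, C = np.linalg.eigh(M)
    # Keep the min(n, p) largest eigenvalues, in decreasing order.
    k = min(X.shape[1], Y.shape[1])
    order = np.argsort(w)[::-1][:k]
    rhos = np.sqrt(np.clip(w[order], 0.0, None))
    A = S11_is @ C[:, order]   # columns a_1, ..., a_k
    B = S22_inv @ S12.T @ A    # columns b_1, ..., b_k (up to scale)
    return A, B, rhos

# Synthetic example data (hypothetical).
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 3))
Y = X[:, :2] + 0.5 * rng.standard_normal((500, 2))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

A, B, rhos = canonical_pairs(X, Y)
U = X @ A  # canonical variables of X, one per column
V = Y @ B  # canonical variables of Y, one per column
```

Because the eigenvectors are orthonormal, the columns of `U` come out mutually uncorrelated with unit variance, and the correlation between the $k$-th columns of `U` and `V` matches `rhos[k]`.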

Title canonical correlation
Canonical name CanonicalCorrelation
Date of creation 2013-03-22 19:16:11
Last modified on 2013-03-22 19:16:11
Owner tony_bruguier (26297)
Last modified by tony_bruguier (26297)
Numerical id 4
Author tony_bruguier (26297)
Entry type Definition
Classification msc 62H20