Loading [MathJax]/jax/output/CommonHTML/jax.js

Rayleigh-Ritz theorem


Let Aβˆˆπ‚nΓ—n be a Hermitian matrixMathworldPlanetmath. Then its eigenvectorsMathworldPlanetmathPlanetmathPlanetmath are the critical points (vectors) of the ”Rayleigh quotient”, which is the real function R:β„‚n\{𝟎}→ℝ

R(𝐱)=𝐱HA𝐱𝐱H𝐱,βˆ₯𝐱βˆ₯β‰ 0

and its eigenvaluesMathworldPlanetmathPlanetmathPlanetmathPlanetmath are its values at such critical points.

As a consequence, we have:

Ξ»max=maxβˆ₯𝐱βˆ₯β‰ 0𝐱HA𝐱𝐱H𝐱

and

Ξ»min=minβˆ₯𝐱βˆ₯β‰ 0𝐱HA𝐱𝐱H𝐱

Proof:

First of all, let’s observe that for a hermitian matrix, the number 𝐱HA𝐱 is a real one (actually, <𝐱,A𝐱β‰₯ 𝐱HA𝐱=(AH𝐱)H𝐱=(A𝐱)H𝐱=<A𝐱,𝐱β‰₯<𝐱,A𝐱>βˆ—, whence <𝐱,A𝐱β‰₯𝐱HA𝐱 is real), so that the Rayleigh quotient is real as well.

Let’s now compute the critical points ¯𝐱 of the Rayleigh quotient, i.e. let’s solve the equations system dR(¯𝐱)d𝐱=𝟎T. Let’s write 𝐱=𝐱(R)+j𝐱(I), 𝐱(R) and 𝐱(I) being respectively the real and imaginary part of 𝐱. We have:

dR(𝐱)d𝐱=dR(𝐱)d𝐱(R)+jdR(𝐱)d𝐱(I)

so that we must have:

dR(¯𝐱)d𝐱(R)=dR(¯𝐱)d𝐱(I)=𝟎T

Using derivativesPlanetmathPlanetmath rules, we obtain:

dR(𝐱)d𝐱(R) = dd𝐱(R)(𝐱HA𝐱𝐱H𝐱)=d(𝐱HA𝐱)d𝐱(R)𝐱H𝐱-𝐱HA𝐱d(𝐱H𝐱)d𝐱(R)(𝐱H𝐱)2=d(𝐱HA𝐱)d𝐱(R)-R(𝐱)d(𝐱H𝐱)d𝐱(R)𝐱H𝐱.

Applying matrix calculus rules, we find:

d(𝐱HA𝐱)d𝐱(R) = 𝐱HAd𝐱d𝐱(R)+𝐱TATdπ±βˆ—d𝐱(R)=𝐱HA+𝐱TAT=𝐱HA+(𝐱HAH)βˆ—=

and since A=AH,

=𝐱HA+(𝐱HA)βˆ—=2β„œ(𝐱HA).

In a similarPlanetmathPlanetmath way, we get:

d(𝐱H𝐱)d𝐱(R)=2β„œ(𝐱H).

Substituting, we obtain:

dR(𝐱)d𝐱(R)=2β„œ(𝐱HA)-R(𝐱)β„œ(𝐱H)𝐱H𝐱

and, after a transpositionMathworldPlanetmath, equating to the null column vectorMathworldPlanetmath,

𝟎 = (β„œ(¯𝐱HA)-R(¯𝐱)β„œ(¯𝐱H))T=
= β„œ(ATΒ―π±βˆ—)-R(¯𝐱)β„œ(Β―π±βˆ—)=β„œ((AH¯𝐱)βˆ—)-R(¯𝐱)β„œ(Β―π±βˆ—)=
= β„œ((A¯𝐱)βˆ—)-R(¯𝐱)β„œ(Β―π±βˆ—)=β„œ(A¯𝐱)-R(¯𝐱)β„œ(¯𝐱)

and, since R(𝐱) is real,

β„œ(A¯𝐱-R(¯𝐱)¯𝐱)=𝟎

Let’s then evaluate dR(𝐱)d𝐱(I):

dR(𝐱)d𝐱(I) = dd𝐱(I)(𝐱HA𝐱𝐱H𝐱)=d(𝐱HA𝐱)d𝐱(I)𝐱H𝐱-𝐱HA𝐱d(𝐱H𝐱)d𝐱(I)(𝐱H𝐱)2=d(𝐱HA𝐱)d𝐱(I)-R(𝐱)d(𝐱H𝐱)d𝐱(I)𝐱H𝐱.

Applying again matrix calculus rules, we find:

d(𝐱HA𝐱)d𝐱(I) = 𝐱HAd𝐱d𝐱(I)+𝐱TATdπ±βˆ—d𝐱(I)=j𝐱HA-j𝐱TAT=j(𝐱HA-(𝐱HAH)βˆ—)=

and since A=AH,

=j(𝐱HA-(𝐱HA)βˆ—)=j(2jβ„‘(𝐱HA))=-2β„‘(𝐱HA).

In a similar way, we get:

d(𝐱H𝐱)d𝐱(I)=j𝐱H-j𝐱T=j(𝐱H-(𝐱H)βˆ—)=j(2jβ„‘(𝐱H))=-2β„‘(𝐱H).

Substituting, we obtain:

dR(𝐱)d𝐱(I)=-2β„‘(𝐱HA)-R(𝐱)β„‘(𝐱H)𝐱H𝐱

and, after a transposition, equating to the null column vector,

𝟎 = (β„‘(¯𝐱HA)-R(¯𝐱)β„‘(¯𝐱H))T=
= β„‘(ATΒ―π±βˆ—)-R(¯𝐱)β„‘(Β―π±βˆ—)=β„‘((AH¯𝐱)βˆ—)-R(¯𝐱)β„‘(Β―π±βˆ—)=
= β„‘((A¯𝐱)βˆ—)-R(¯𝐱)β„‘(Β―π±βˆ—)=-β„‘(A¯𝐱)+R(¯𝐱)β„‘(¯𝐱)

and, since R(𝐱) is real,

β„‘(A¯𝐱-R(¯𝐱)¯𝐱)=𝟎

In conclusionMathworldPlanetmath, we have that a stationary vector ¯𝐱 for the Rayleigh quotient satisfies the complex eigenvalue equation

A¯𝐱-R(¯𝐱)¯𝐱=𝟎

whence the thesis.β–‘

Remarks:

1) The two relationsMathworldPlanetmath Ξ»max=maxβˆ₯𝐱βˆ₯β‰ 0𝐱HA𝐱𝐱H𝐱 and Ξ»min=minβˆ₯𝐱βˆ₯β‰ πŸŽπ±HA𝐱𝐱H𝐱 can also be obtained in a simpler way. By Schur’s canonical form theorem, any normal (and hence any hermitian) matrix is unitarily diagonalizable, i.e. a unitary matrixMathworldPlanetmath U exists such that A=UΞ›UH with Ξ›=diag(Ξ»1,Ξ»2,β‹―,Ξ»n). So, since all eigenvalues of a hermitian matrix are real, it’s possible to write:

𝐱HA𝐱 =𝐱 UHΞ›UH𝐱=(UH𝐱)HΞ›(UH𝐱)=𝐲HΛ𝐲=nβˆ‘i=1Ξ»i|yi|2≀λmaxnβˆ‘i=1|yi|2=
= λmax𝐲H𝐲=λmax(UH𝐱)H(UH𝐱)=λmax(𝐱HUUH𝐱)=λmax(𝐱H𝐱)

whence

Ξ»maxβ‰₯𝐱HA𝐱𝐱H𝐱

But, having defined A𝐯M=λmax𝐯M, we have:

𝐯HMA𝐯M𝐯HM𝐯M=𝐯HMλmax𝐯M𝐯HM𝐯M=λmax

so that

Ξ»max=maxβˆ₯𝐱βˆ₯β‰ πŸŽπ±HA𝐱𝐱H𝐱

In a much similar way, we obtain

Ξ»min=minβˆ₯𝐱βˆ₯β‰ πŸŽπ±HA𝐱𝐱H𝐱

2) The above relations yield the following noteworthing bounds for the diagonal entries of a hermitian matrix:

Ξ»min≀aii≀λmax

In fact, having defined

𝐞i=[i-1⏞0,0,…,0,1,0,…,0]T

and observing that aii=𝐞HiA𝐞i𝐞Hi𝐞i=R(𝐞i), we have:

Ξ»min=minβˆ₯𝐱βˆ₯β‰ πŸŽR(𝐱)≀R(𝐞i)≀maxβˆ₯𝐱βˆ₯β‰ πŸŽR(𝐱)=Ξ»max.
Title Rayleigh-Ritz theorem
Canonical name RayleighRitzTheorem
Date of creation 2013-03-22 15:37:18
Last modified on 2013-03-22 15:37:18
Owner gufotta (12050)
Last modified by gufotta (12050)
Numerical id 8
Author gufotta (12050)
Entry type Theorem
Classification msc 15A18