Rayleigh-Ritz theorem

Let Aβˆˆπ‚nΓ—n be a Hermitian matrixMathworldPlanetmath. Then its eigenvectorsMathworldPlanetmathPlanetmathPlanetmath are the critical points (vectors) of the ”Rayleigh quotient”, which is the real function R:β„‚n\{𝟎}→ℝ

R⁒(𝐱)=𝐱H⁒A⁒𝐱𝐱H⁒𝐱,βˆ₯𝐱βˆ₯β‰ 0

and its eigenvaluesMathworldPlanetmathPlanetmathPlanetmathPlanetmath are its values at such critical points.

As a consequence, we have:

Ξ»m⁒a⁒x=maxβˆ₯𝐱βˆ₯β‰ 0⁑𝐱H⁒A⁒𝐱𝐱H⁒𝐱


Ξ»m⁒i⁒n=minβˆ₯𝐱βˆ₯β‰ 0⁑𝐱H⁒A⁒𝐱𝐱H⁒𝐱


First of all, let’s observe that for a hermitian matrix, the number 𝐱H⁒A⁒𝐱 is a real one (actually, <𝐱,A𝐱β‰₯ 𝐱HA𝐱=(AH𝐱)H𝐱=(A𝐱)H𝐱=<A𝐱,𝐱β‰₯<𝐱,A𝐱>βˆ—, whence <𝐱,A⁒𝐱β‰₯𝐱H⁒A⁒𝐱 is real), so that the Rayleigh quotient is real as well.

Let’s now compute the critical points 𝐱¯ of the Rayleigh quotient, i.e. let’s solve the equations system d⁒R⁒(𝐱¯)d⁒𝐱=𝟎T. Let’s write 𝐱=𝐱(R)+j⁒𝐱(I), 𝐱(R) and 𝐱(I) being respectively the real and imaginary part of 𝐱. We have:


so that we must have:


Using derivativesPlanetmathPlanetmath rules, we obtain:

d⁒R⁒(𝐱)d⁒𝐱(R) = dd⁒𝐱(R)⁒(𝐱H⁒A⁒𝐱𝐱H⁒𝐱)=d⁒(𝐱H⁒A⁒𝐱)d⁒𝐱(R)⁒𝐱H⁒𝐱-𝐱H⁒A⁒𝐱⁒d⁒(𝐱H⁒𝐱)d⁒𝐱(R)(𝐱H⁒𝐱)2=d⁒(𝐱H⁒A⁒𝐱)d⁒𝐱(R)-R⁒(𝐱)⁒d⁒(𝐱H⁒𝐱)d⁒𝐱(R)𝐱H⁒𝐱⁒.

Applying matrix calculus rules, we find:

d⁒(𝐱H⁒A⁒𝐱)d⁒𝐱(R) = 𝐱H⁒A⁒d⁒𝐱d⁒𝐱(R)+𝐱T⁒AT⁒dβ’π±βˆ—d⁒𝐱(R)=𝐱H⁒A+𝐱T⁒AT=𝐱H⁒A+(𝐱H⁒AH)βˆ—=

and since A=AH,


In a similarPlanetmathPlanetmath way, we get:


Substituting, we obtain:


and, after a transpositionMathworldPlanetmath, equating to the null column vectorMathworldPlanetmath,

𝟎 = (β„œβ‘(𝐱¯H⁒A)-R⁒(𝐱¯)β’β„œβ‘(𝐱¯H))T=
= β„œβ‘(ATβ’π±Β―βˆ—)-R⁒(𝐱¯)β’β„œβ‘(π±Β―βˆ—)=β„œβ‘((AH⁒𝐱¯)βˆ—)-R⁒(𝐱¯)β’β„œβ‘(π±Β―βˆ—)=
= β„œβ‘((A⁒𝐱¯)βˆ—)-R⁒(𝐱¯)β’β„œβ‘(π±Β―βˆ—)=β„œβ‘(A⁒𝐱¯)-R⁒(𝐱¯)β’β„œβ‘(𝐱¯)

and, since R⁒(𝐱) is real,


Let’s then evaluate d⁒R⁒(𝐱)d⁒𝐱(I):

d⁒R⁒(𝐱)d⁒𝐱(I) = dd⁒𝐱(I)⁒(𝐱H⁒A⁒𝐱𝐱H⁒𝐱)=d⁒(𝐱H⁒A⁒𝐱)d⁒𝐱(I)⁒𝐱H⁒𝐱-𝐱H⁒A⁒𝐱⁒d⁒(𝐱H⁒𝐱)d⁒𝐱(I)(𝐱H⁒𝐱)2=d⁒(𝐱H⁒A⁒𝐱)d⁒𝐱(I)-R⁒(𝐱)⁒d⁒(𝐱H⁒𝐱)d⁒𝐱(I)𝐱H⁒𝐱⁒.

Applying again matrix calculus rules, we find:

d⁒(𝐱H⁒A⁒𝐱)d⁒𝐱(I) = 𝐱H⁒A⁒d⁒𝐱d⁒𝐱(I)+𝐱T⁒AT⁒dβ’π±βˆ—d⁒𝐱(I)=j⁒𝐱H⁒A-j⁒𝐱T⁒AT=j⁒(𝐱H⁒A-(𝐱H⁒AH)βˆ—)=

and since A=AH,


In a similar way, we get:


Substituting, we obtain:


and, after a transposition, equating to the null column vector,

𝟎 = (ℑ⁑(𝐱¯H⁒A)-R⁒(𝐱¯)⁒ℑ⁑(𝐱¯H))T=
= ℑ⁑(ATβ’π±Β―βˆ—)-R⁒(𝐱¯)⁒ℑ⁑(π±Β―βˆ—)=ℑ⁑((AH⁒𝐱¯)βˆ—)-R⁒(𝐱¯)⁒ℑ⁑(π±Β―βˆ—)=
= ℑ⁑((A⁒𝐱¯)βˆ—)-R⁒(𝐱¯)⁒ℑ⁑(π±Β―βˆ—)=-ℑ⁑(A⁒𝐱¯)+R⁒(𝐱¯)⁒ℑ⁑(𝐱¯)

and, since R⁒(𝐱) is real,


In conclusionMathworldPlanetmath, we have that a stationary vector 𝐱¯ for the Rayleigh quotient satisfies the complex eigenvalue equation


whence the thesis.β–‘


1) The two relationsMathworldPlanetmath Ξ»max=maxβˆ₯𝐱βˆ₯β‰ 0⁑𝐱H⁒A⁒𝐱𝐱H⁒𝐱 and Ξ»min=minβˆ₯𝐱βˆ₯β‰ πŸŽβ‘π±H⁒A⁒𝐱𝐱H⁒𝐱 can also be obtained in a simpler way. By Schur’s canonical form theorem, any normal (and hence any hermitian) matrix is unitarily diagonalizable, i.e. a unitary matrixMathworldPlanetmath U exists such that A=U⁒Λ⁒UH with Ξ›=d⁒i⁒a⁒g⁒(Ξ»1,Ξ»2,β‹―,Ξ»n). So, since all eigenvalues of a hermitian matrix are real, it’s possible to write:

𝐱H⁒A⁒𝐱 =𝐱 UH⁒Λ⁒UH⁒𝐱=(UH⁒𝐱)H⁒Λ⁒(UH⁒𝐱)=𝐲H⁒Λ⁒𝐲=βˆ‘i=1nΞ»i⁒|yi|2≀λmaxβ’βˆ‘i=1n|yi|2=
= λmax⁒𝐲H⁒𝐲=λmax⁒(UH⁒𝐱)H⁒(UH⁒𝐱)=λmax⁒(𝐱H⁒U⁒UH⁒𝐱)=λmax⁒(𝐱H⁒𝐱)



But, having defined A⁒𝐯M=λmax⁒𝐯M, we have:


so that

Ξ»max=maxβˆ₯𝐱βˆ₯β‰ πŸŽβ‘π±H⁒A⁒𝐱𝐱H⁒𝐱

In a much similar way, we obtain

Ξ»min=minβˆ₯𝐱βˆ₯β‰ πŸŽβ‘π±H⁒A⁒𝐱𝐱H⁒𝐱

2) The above relations yield the following noteworthing bounds for the diagonal entries of a hermitian matrix:


In fact, having defined


and observing that ai⁒i=𝐞iH⁒A⁒𝐞i𝐞iH⁒𝐞i=R⁒(𝐞i), we have:

Ξ»min=minβˆ₯𝐱βˆ₯β‰ πŸŽβ‘R⁒(𝐱)≀R⁒(𝐞i)≀maxβˆ₯𝐱βˆ₯β‰ πŸŽβ‘R⁒(𝐱)=Ξ»max⁒.
Title Rayleigh-Ritz theorem
Canonical name RayleighRitzTheorem
Date of creation 2013-03-22 15:37:18
Last modified on 2013-03-22 15:37:18
Owner gufotta (12050)
Last modified by gufotta (12050)
Numerical id 8
Author gufotta (12050)
Entry type Theorem
Classification msc 15A18