Rayleigh-Ritz theorem
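Let $A\in\mathbb{C}^{n\times n}$ be a Hermitian matrix. Then its
eigenvectors
are the critical points (vectors) of the "Rayleigh quotient",
which is the real function $R:\mathbb{C}^{n}\setminus\{\mathbf{0}\}\to\mathbb{R}$
$$R(\mathbf{x})=\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}},\qquad \|\mathbf{x}\|\neq 0$$
and the eigenvalues of $A$ are the values of $R$ at those critical points.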
As a consequence, we have:
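$$\lambda_{\max}=\max_{\|\mathbf{x}\|\neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$$
and
$$\lambda_{\min}=\min_{\|\mathbf{x}\|\neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$$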
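As an illustration of the statement, the following sketch evaluates the Rayleigh quotient of a small Hermitian matrix at random nonzero vectors and at two eigenvectors. The matrix `A`, its eigenpairs, and the helper name `rayleigh_quotient` are assumptions chosen for this example (they are not part of the entry); the eigenvalues 1 and 3 can be checked by hand.

```python
import random

def rayleigh_quotient(A, x):
    """Return R(x) = (x^H A x) / (x^H x) for a Hermitian A given as a list of rows."""
    n = len(x)
    Ax = [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]
    num = sum(x[i].conjugate() * Ax[i] for i in range(n))
    den = sum(abs(xi) ** 2 for xi in x)
    # For Hermitian A the numerator is real, so only the real part is kept.
    return (num / den).real

# Illustrative Hermitian matrix (assumption of this example); eigenvalues are 1 and 3.
A = [[2, 1j], [-1j, 2]]
v_max = [1j, 1]    # eigenvector for the eigenvalue 3
v_min = [-1j, 1]   # eigenvector for the eigenvalue 1

random.seed(0)
samples = [
    [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
    for _ in range(1000)
]
values = [rayleigh_quotient(A, x) for x in samples]

print("R at eigenvectors:", rayleigh_quotient(A, v_min), rayleigh_quotient(A, v_max))
print("range over random samples:", min(values), max(values))
```

Random sampling only suggests the bounds; it does not prove them. The extreme values are attained exactly at the eigenvectors.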
Proof:
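First of all, let's observe that for a Hermitian matrix the number $\mathbf{x}^{H}A\mathbf{x}$ is real. Indeed, $\langle\mathbf{x},A\mathbf{x}\rangle=\mathbf{x}^{H}A\mathbf{x}=(A^{H}\mathbf{x})^{H}\mathbf{x}=(A\mathbf{x})^{H}\mathbf{x}=\langle A\mathbf{x},\mathbf{x}\rangle=\langle\mathbf{x},A\mathbf{x}\rangle^{*}$, so $\langle\mathbf{x},A\mathbf{x}\rangle=\mathbf{x}^{H}A\mathbf{x}$ equals its own complex conjugate and is therefore real. Since $\mathbf{x}^{H}\mathbf{x}$ is real and positive, the Rayleigh quotient is real as well.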
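The realness of the quadratic form can be checked on a concrete vector. The sketch below compares a Hermitian matrix with a symmetric, non-Hermitian one; both matrices, the vector `x`, and the helper `quadratic_form` are hypothetical choices made for this example.

```python
def quadratic_form(M, x):
    """Return x^H M x as a Python complex number (M is a list of rows)."""
    n = len(x)
    Mx = [sum(M[i][k] * x[k] for k in range(n)) for i in range(n)]
    return sum(x[i].conjugate() * Mx[i] for i in range(n))

A = [[2, 1j], [-1j, 2]]  # Hermitian: equal to its conjugate transpose
B = [[2, 1j], [1j, 2]]   # symmetric but not Hermitian
x = [1 + 2j, 3 - 1j]

q_hermitian = quadratic_form(A, x)
q_other = quadratic_form(B, x)

print("Hermitian case:    ", q_hermitian)  # imaginary part is zero
print("non-Hermitian case:", q_other)      # imaginary part is nonzero
```

The second matrix shows that the hypothesis $A=A^{H}$ is actually used: without it the quadratic form may be genuinely complex.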
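Let's now compute the critical points $\bar{\mathbf{x}}$ of the Rayleigh quotient, i.e. let's solve the system of equations $\frac{dR(\bar{\mathbf{x}})}{d\mathbf{x}}=\mathbf{0}^{T}$. Let's write $\mathbf{x}=\mathbf{x}^{(R)}+j\mathbf{x}^{(I)}$, where $\mathbf{x}^{(R)}$ and $\mathbf{x}^{(I)}$ are respectively the real and imaginary parts of $\mathbf{x}$. We have:
$$\frac{dR(\mathbf{x})}{d\mathbf{x}}=\frac{dR(\mathbf{x})}{d\mathbf{x}^{(R)}}+j\frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}$$
so that we must have:
$$\frac{dR(\bar{\mathbf{x}})}{d\mathbf{x}^{(R)}}=\frac{dR(\bar{\mathbf{x}})}{d\mathbf{x}^{(I)}}=\mathbf{0}^{T}$$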
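Using the quotient rule, we obtain:
$$\frac{dR(\mathbf{x})}{d\mathbf{x}^{(R)}}=\frac{d}{d\mathbf{x}^{(R)}}\left(\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}\right)=\frac{\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(R)}}\,\mathbf{x}^{H}\mathbf{x}-\mathbf{x}^{H}A\mathbf{x}\,\frac{d(\mathbf{x}^{H}\mathbf{x})}{d\mathbf{x}^{(R)}}}{(\mathbf{x}^{H}\mathbf{x})^{2}}=\frac{\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(R)}}-R(\mathbf{x})\frac{d(\mathbf{x}^{H}\mathbf{x})}{d\mathbf{x}^{(R)}}}{\mathbf{x}^{H}\mathbf{x}}.$$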
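Applying matrix calculus rules, we find:
$$\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(R)}}=\mathbf{x}^{H}A\frac{d\mathbf{x}}{d\mathbf{x}^{(R)}}+\mathbf{x}^{T}A^{T}\frac{d\mathbf{x}^{*}}{d\mathbf{x}^{(R)}}=\mathbf{x}^{H}A+\mathbf{x}^{T}A^{T}=\mathbf{x}^{H}A+(\mathbf{x}^{H}A^{H})^{*}$$
and since $A=A^{H}$,
$$\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(R)}}=\mathbf{x}^{H}A+(\mathbf{x}^{H}A)^{*}=2\Re(\mathbf{x}^{H}A).$$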
In a similar way, we get:
$$\frac{d(\mathbf{x}^{H}\mathbf{x})}{d\mathbf{x}^{(R)}}=2\Re(\mathbf{x}^{H}).$$
Substituting, we obtain:
$$\frac{dR(\mathbf{x})}{d\mathbf{x}^{(R)}}=2\,\frac{\Re(\mathbf{x}^{H}A)-R(\mathbf{x})\Re(\mathbf{x}^{H})}{\mathbf{x}^{H}\mathbf{x}}$$
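and, after a transposition, equating to the null column vector,
$$\begin{aligned}
\mathbf{0}&=\left(\Re(\bar{\mathbf{x}}^{H}A)-R(\bar{\mathbf{x}})\Re(\bar{\mathbf{x}}^{H})\right)^{T}\\
&=\Re(A^{T}\bar{\mathbf{x}}^{*})-R(\bar{\mathbf{x}})\Re(\bar{\mathbf{x}}^{*})=\Re((A^{H}\bar{\mathbf{x}})^{*})-R(\bar{\mathbf{x}})\Re(\bar{\mathbf{x}}^{*})\\
&=\Re((A\bar{\mathbf{x}})^{*})-R(\bar{\mathbf{x}})\Re(\bar{\mathbf{x}}^{*})=\Re(A\bar{\mathbf{x}})-R(\bar{\mathbf{x}})\Re(\bar{\mathbf{x}})
\end{aligned}$$
and, since $R(\bar{\mathbf{x}})$ is real,
$$\Re\left(A\bar{\mathbf{x}}-R(\bar{\mathbf{x}})\bar{\mathbf{x}}\right)=\mathbf{0}$$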
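Let's then evaluate $\frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}$:
$$\frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}=\frac{d}{d\mathbf{x}^{(I)}}\left(\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}\right)=\frac{\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(I)}}\,\mathbf{x}^{H}\mathbf{x}-\mathbf{x}^{H}A\mathbf{x}\,\frac{d(\mathbf{x}^{H}\mathbf{x})}{d\mathbf{x}^{(I)}}}{(\mathbf{x}^{H}\mathbf{x})^{2}}=\frac{\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(I)}}-R(\mathbf{x})\frac{d(\mathbf{x}^{H}\mathbf{x})}{d\mathbf{x}^{(I)}}}{\mathbf{x}^{H}\mathbf{x}}.$$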
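Applying matrix calculus rules again, we find:
$$\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(I)}}=\mathbf{x}^{H}A\frac{d\mathbf{x}}{d\mathbf{x}^{(I)}}+\mathbf{x}^{T}A^{T}\frac{d\mathbf{x}^{*}}{d\mathbf{x}^{(I)}}=j\mathbf{x}^{H}A-j\mathbf{x}^{T}A^{T}=j\left(\mathbf{x}^{H}A-(\mathbf{x}^{H}A^{H})^{*}\right)$$
and since $A=A^{H}$,
$$\frac{d(\mathbf{x}^{H}A\mathbf{x})}{d\mathbf{x}^{(I)}}=j\left(\mathbf{x}^{H}A-(\mathbf{x}^{H}A)^{*}\right)=j\left(2j\Im(\mathbf{x}^{H}A)\right)=-2\Im(\mathbf{x}^{H}A).$$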
In a similar way, we get:
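$$\frac{d(\mathbf{x}^{H}\mathbf{x})}{d\mathbf{x}^{(I)}}=j\mathbf{x}^{H}-j\mathbf{x}^{T}=j\left(\mathbf{x}^{H}-(\mathbf{x}^{H})^{*}\right)=j\left(2j\Im(\mathbf{x}^{H})\right)=-2\Im(\mathbf{x}^{H}).$$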
Substituting, we obtain:
$$\frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}=-2\,\frac{\Im(\mathbf{x}^{H}A)-R(\mathbf{x})\Im(\mathbf{x}^{H})}{\mathbf{x}^{H}\mathbf{x}}$$
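and, after a transposition, equating to the null column vector,
$$\begin{aligned}
\mathbf{0}&=\left(\Im(\bar{\mathbf{x}}^{H}A)-R(\bar{\mathbf{x}})\Im(\bar{\mathbf{x}}^{H})\right)^{T}\\
&=\Im(A^{T}\bar{\mathbf{x}}^{*})-R(\bar{\mathbf{x}})\Im(\bar{\mathbf{x}}^{*})=\Im((A^{H}\bar{\mathbf{x}})^{*})-R(\bar{\mathbf{x}})\Im(\bar{\mathbf{x}}^{*})\\
&=\Im((A\bar{\mathbf{x}})^{*})-R(\bar{\mathbf{x}})\Im(\bar{\mathbf{x}}^{*})=-\Im(A\bar{\mathbf{x}})+R(\bar{\mathbf{x}})\Im(\bar{\mathbf{x}})
\end{aligned}$$
and, since $R(\bar{\mathbf{x}})$ is real,
$$\Im\left(A\bar{\mathbf{x}}-R(\bar{\mathbf{x}})\bar{\mathbf{x}}\right)=\mathbf{0}$$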
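In conclusion, the real and imaginary parts of $A\bar{\mathbf{x}}-R(\bar{\mathbf{x}})\bar{\mathbf{x}}$ both vanish, so a stationary vector $\bar{\mathbf{x}}$ for
the Rayleigh quotient satisfies the complex eigenvalue equation
$$A\bar{\mathbf{x}}-R(\bar{\mathbf{x}})\bar{\mathbf{x}}=\mathbf{0}$$
Conversely, if $A\bar{\mathbf{x}}=\lambda\bar{\mathbf{x}}$ with $\bar{\mathbf{x}}\neq\mathbf{0}$, then $R(\bar{\mathbf{x}})=\lambda$ and both derivatives vanish at $\bar{\mathbf{x}}$, whence the thesis. $\square$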
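The stationarity condition derived in the proof can be checked numerically. The sketch below approximates the partial derivatives of $R$ with respect to the real and imaginary parts of each component by central differences; the matrix `A`, the test vectors, the step `h`, and the helper names are assumptions of this example, and finite differences are used here in place of the analytic derivatives.

```python
def rayleigh_quotient(A, x):
    """Return R(x) = (x^H A x) / (x^H x) for a Hermitian A given as a list of rows."""
    n = len(x)
    Ax = [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]
    num = sum(x[i].conjugate() * Ax[i] for i in range(n))
    den = sum(abs(xi) ** 2 for xi in x)
    return (num / den).real

def numerical_gradient(A, x, h=1e-6):
    """Central differences of R w.r.t. the real and imaginary part of each component."""
    grad_re, grad_im = [], []
    for i in range(len(x)):
        for step, out in ((h, grad_re), (1j * h, grad_im)):
            x_plus, x_minus = list(x), list(x)
            x_plus[i] = x[i] + step
            x_minus[i] = x[i] - step
            diff = rayleigh_quotient(A, x_plus) - rayleigh_quotient(A, x_minus)
            out.append(diff / (2 * h))
    return grad_re, grad_im

def residual(A, x):
    """Return the vector A x - R(x) x, which vanishes exactly at eigenvectors."""
    r = rayleigh_quotient(A, x)
    n = len(x)
    return [sum(A[i][k] * x[k] for k in range(n)) - r * x[i] for i in range(n)]

A = [[2, 1j], [-1j, 2]]  # illustrative Hermitian matrix with eigenvalues 1 and 3
eigvec = [1j, 1]         # eigenvector for the eigenvalue 3
generic = [1, 0]         # not an eigenvector of A

g_re_e, g_im_e = numerical_gradient(A, eigvec)
g_re_g, g_im_g = numerical_gradient(A, generic)

print("gradient at eigenvector:   ", g_re_e, g_im_e)  # all entries close to zero
print("gradient at generic vector:", g_re_g, g_im_g)  # at least one entry far from zero
print("residual at eigenvector:   ", residual(A, eigvec))
```

At the generic vector the derivative with respect to the imaginary part of the second component is close to $-2$, in agreement with the formula $-2\left(\Im(\mathbf{x}^{H}A)-R(\mathbf{x})\Im(\mathbf{x}^{H})\right)/\mathbf{x}^{H}\mathbf{x}$ obtained above.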
Remarks:
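1) The two relations $\lambda_{\max}=\max_{\|\mathbf{x}\|\neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$ and $\lambda_{\min}=\min_{\|\mathbf{x}\|\neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$ can also be obtained in a simpler way.
By Schur's canonical form theorem, any normal (and hence any Hermitian)
matrix is unitarily diagonalizable, i.e. a unitary matrix
$U$ exists such
that $A=U\Lambda U^{H}$ with $\Lambda=\operatorname{diag}(\lambda_{1},\lambda_{2},\cdots,\lambda_{n})$. So, since all eigenvalues of a Hermitian matrix are
real, it's possible to write, with $\mathbf{y}=U^{H}\mathbf{x}$:
$$\begin{aligned}
\mathbf{x}^{H}A\mathbf{x}&=\mathbf{x}^{H}U\Lambda U^{H}\mathbf{x}=(U^{H}\mathbf{x})^{H}\Lambda(U^{H}\mathbf{x})=\mathbf{y}^{H}\Lambda\mathbf{y}=\sum_{i=1}^{n}\lambda_{i}|y_{i}|^{2}\leq\lambda_{\max}\sum_{i=1}^{n}|y_{i}|^{2}\\
&=\lambda_{\max}\mathbf{y}^{H}\mathbf{y}=\lambda_{\max}(U^{H}\mathbf{x})^{H}(U^{H}\mathbf{x})=\lambda_{\max}(\mathbf{x}^{H}UU^{H}\mathbf{x})=\lambda_{\max}(\mathbf{x}^{H}\mathbf{x})
\end{aligned}$$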
whence
$$\lambda_{\max}\geq\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$$
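But, letting $\mathbf{v}_{M}$ be an eigenvector associated with $\lambda_{\max}$, i.e. $A\mathbf{v}_{M}=\lambda_{\max}\mathbf{v}_{M}$, we have:
$$\frac{\mathbf{v}_{M}^{H}A\mathbf{v}_{M}}{\mathbf{v}_{M}^{H}\mathbf{v}_{M}}=\frac{\mathbf{v}_{M}^{H}\lambda_{\max}\mathbf{v}_{M}}{\mathbf{v}_{M}^{H}\mathbf{v}_{M}}=\lambda_{\max}$$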
so that
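$$\lambda_{\max}=\max_{\|\mathbf{x}\|\neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$$
In a similar way, we obtain
$$\lambda_{\min}=\min_{\|\mathbf{x}\|\neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$$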
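The diagonalization argument of this remark can be traced on a small case. In the sketch below the unitary matrix `U`, the eigenvalues `lam`, and the test vector `x` are assumptions chosen so that $A=U\Lambda U^{H}$ can be verified by hand; the code checks the identity $\mathbf{x}^{H}A\mathbf{x}=\sum_{i}\lambda_{i}|y_{i}|^{2}$ with $\mathbf{y}=U^{H}\mathbf{x}$ and the resulting upper bound.

```python
import math

A = [[2, 1j], [-1j, 2]]  # illustrative Hermitian matrix
lam = [3.0, 1.0]         # its eigenvalues, in the column order of U
s = 1 / math.sqrt(2)
U = [[1j * s, -1j * s],  # columns are orthonormal eigenvectors of A
     [s, s]]

x = [1 + 2j, 3 - 1j]
n = len(x)

# Left-hand side: x^H A x computed directly.
Ax = [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]
lhs = sum(x[i].conjugate() * Ax[i] for i in range(n)).real

# Right-hand side: y = U^H x, then the weighted sum of |y_i|^2.
y = [sum(U[k][i].conjugate() * x[k] for k in range(n)) for i in range(n)]
rhs = sum(lam[i] * abs(y[i]) ** 2 for i in range(n))

norm_x_sq = sum(abs(xi) ** 2 for xi in x)
norm_y_sq = sum(abs(yi) ** 2 for yi in y)

print("x^H A x            =", lhs)
print("sum lam_i |y_i|^2  =", rhs)
print("lam_max * x^H x    =", max(lam) * norm_x_sq)
```

Because $U$ is unitary, `norm_y_sq` equals `norm_x_sq` up to rounding, which is the step $\mathbf{y}^{H}\mathbf{y}=\mathbf{x}^{H}\mathbf{x}$ used in the chain of equalities.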
2) The above relations yield the following noteworthy bounds for the diagonal entries of a Hermitian matrix:
$$\lambda_{\min}\leq a_{ii}\leq\lambda_{\max}$$
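In fact, having defined
$$\mathbf{e}_{i}=[\underbrace{0,0,\ldots,0}_{i-1},1,0,\ldots,0]^{T}$$
and observing that $a_{ii}=\frac{\mathbf{e}_{i}^{H}A\mathbf{e}_{i}}{\mathbf{e}_{i}^{H}\mathbf{e}_{i}}=R(\mathbf{e}_{i})$, we have:
$$\lambda_{\min}=\min_{\|\mathbf{x}\|\neq 0}R(\mathbf{x})\leq R(\mathbf{e}_{i})\leq\max_{\|\mathbf{x}\|\neq 0}R(\mathbf{x})=\lambda_{\max}.$$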
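A short check of the diagonal bounds, under the assumption of a hand-built block-diagonal Hermitian matrix whose eigenvalues (1, 3 and 5) are known in advance; the matrix and the helper names are illustrative only.

```python
def rayleigh_quotient(A, x):
    """Return R(x) = (x^H A x) / (x^H x) for a Hermitian A given as a list of rows."""
    n = len(x)
    Ax = [sum(A[i][k] * x[k] for k in range(n)) for i in range(n)]
    num = sum(x[i].conjugate() * Ax[i] for i in range(n))
    den = sum(abs(xi) ** 2 for xi in x)
    return (num / den).real

def unit_vector(n, i):
    """Return the i-th canonical basis vector of length n (0-based index)."""
    return [1 if k == i else 0 for k in range(n)]

# Block-diagonal Hermitian matrix: the 2x2 block has eigenvalues 1 and 3,
# the 1x1 block contributes the eigenvalue 5.
A = [[2, 1j, 0],
     [-1j, 2, 0],
     [0, 0, 5]]
lam_min, lam_max = 1.0, 5.0

diag_via_quotient = [rayleigh_quotient(A, unit_vector(3, i)) for i in range(3)]

for i, r in enumerate(diag_via_quotient):
    print(f"a_{i+1}{i+1} = R(e_{i+1}) = {r}, within [{lam_min}, {lam_max}]")
```

The third diagonal entry shows that the bound can be attained: there $\mathbf{e}_{3}$ is itself an eigenvector, so $a_{33}=\lambda_{\max}$.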
Title | Rayleigh-Ritz theorem |
---|---|
Canonical name | RayleighRitzTheorem |
Date of creation | 2013-03-22 15:37:18 |
Last modified on | 2013-03-22 15:37:18 |
Owner | gufotta (12050) |
Last modified by | gufotta (12050) |
Numerical id | 8 |
Author | gufotta (12050) |
Entry type | Theorem |
Classification | msc 15A18 |