PlanetMath (more info)
 Math for the people, by the people.
Encyclopedia | Requests | Forums | Docs | Wiki | Random | RSS  
Login
create new user
name:
pass:
forget your password?
Main Menu
Owner confidence rating: Low Entry average rating: No information on entry rating
Rayleigh-Ritz theorem (Theorem)

Let $ A\in \mathbf{C}^{n\times n}$ be a Hermitian matrix. Then its eigenvectors are the critical points (vectors) of the "Rayleigh quotient", which is the real function $ R:\mathbb{C}^{n}\backslash \{\mathbf{0}% \}\rightarrow \mathbb{R}$

$\displaystyle R(\mathbf{x})=\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}},\Vert \mathbf{x}\Vert \neq 0$    

and its eigenvalues are its values at such critical points.

As a consequence, we have:

$\displaystyle \lambda_{max}=\max_{\Vert \mathbf{x}\Vert \neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$    

and
$\displaystyle \lambda_{min}=\min_{\Vert \mathbf{x}\Vert \neq 0}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$    

Proof:

First of all, let's observe that for a hermitian matrix, the number $ \mathbf{x}^{H}A\mathbf{x}$ is a real one (actually, $ <\mathbf{x},A\mathbf{x}>=$ $ % \mathbf{x}^{H}A\mathbf{x=(}A^{H}\mathbf{x)}^{H}\mathbf{x=(}A\mathbf{x)}^{H}% \mathbf{x=}<A\mathbf{x},\mathbf{x>=}<\mathbf{x},A\mathbf{x>}^{\ast }$, whence $ <\mathbf{x},A\mathbf{x>=x}^{H}A\mathbf{x}$ is real), so that the Rayleigh quotient is real as well.

Let's now compute the critical points $ \overline{\mathbf{x}}$ of the Rayleigh quotient, i.e. let's solve the equations system $ \frac{dR(\overline{% \mathbf{x}})}{d\mathbf{x}}=\mathbf{0}^{T}$. Let's write $ \mathbf{x=x}^{(R)}+j\mathbf{x}^{(I)}$, $ \mathbf{x}^{(R)}$ and $ \mathbf{x}^{(I)}$ being respectively the real and imaginary part of $ \mathbf{x}$. We have:

$\displaystyle \frac{dR(\mathbf{x})}{d\mathbf{x}}=\frac{dR(\mathbf{x})}{d\mathbf{x}^{(R)}}+j\frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}$    

so that we must have:

$\displaystyle \frac{dR(\overline{\mathbf{x}})}{d\mathbf{x}^{(R)}}=\frac{dR(\overline{\mathbf{x}})}{d\mathbf{x}^{(I)}}=\mathbf{0}^{T}$    

Using derivatives rules, we obtain:


$\displaystyle \frac{dR(\mathbf{x})}{d\mathbf{x}^{(R)}}$ $\displaystyle =$ $\displaystyle \frac{d}{d\mathbf{x}^{(R)}} \left( \frac{\mathbf{x}^{H}A\mathbf{x... ...}^{H}\mathbf{x}\right) }{d\mathbf{x} ^{(R)}}}{\mathbf{x}^{H}\mathbf{x}}\text{.}$  

Applying matrix calculus rules, we find:

$\displaystyle \frac{d\left( \mathbf{x}^{H}A\mathbf{x}\right) }{d\mathbf{x}^{(R)}}$ $\displaystyle =$ $\displaystyle % \mathbf{x}^{H}A\frac{d\mathbf{x}}{d\mathbf{x}^{(R)}}+\mathbf{x}... ...athbf{x}^{T}A^{T}=\mathbf{x}^{H}A+\left( \mathbf{x}% ^{H}A^{H}\right) ^{\ast }=$  

and since $ A=A^{H}$,

$\displaystyle =\mathbf{x}^{H}A+\left( \mathbf{x}^{H}A\right) ^{\ast }=2\Re\left( \mathbf{x}^{H}A\right)$   .    

In a similar way, we get:

$\displaystyle \frac{d\left( \mathbf{x}^{H}\mathbf{x}\right) }{d\mathbf{x}^{(R)}}=2\Re \left( \mathbf{x}^{H}\right)$   .    

Substituting, we obtain:

$\displaystyle \frac{dR(\mathbf{x})}{d\mathbf{x}^{(R)}}=2\frac{\Re\left( \mathbf... ...ight) -R(\mathbf{x})\Re\left( \mathbf{x}^{H}\right) }{\mathbf{x}^{H}\mathbf{x}}$    

and, after a transposition, equating to the null column vector,

$\displaystyle \mathbf{0}$ $\displaystyle =$ $\displaystyle \left( \Re\left( \overline{\mathbf{x}}^{H}A\right) -R(% \overline{\mathbf{x}})\Re\left( \overline{\mathbf{x}}^{H}\right) \right) ^{T}=$  
  $\displaystyle =$ $\displaystyle \Re\left( A^{T}\overline{\mathbf{x}}^{\ast }\right) -R(\overline{... ...ht) -R(\overline{% \mathbf{x}})\Re\left( \overline{\mathbf{x}}^{\ast }\right) =$  
  $\displaystyle =$ $\displaystyle \Re\left( (A\overline{\mathbf{x}}\mathbf{)}^{\ast }\right) -R(% \... ...f{x}}\mathbf{)}-R(\overline{\mathbf{x}})\Re \left( \overline{\mathbf{x}}\right)$  


and, since $ R(\mathbf{x})$ is real,
$\displaystyle \Re(A\overline{\mathbf{x}}-R(\overline{\mathbf{x}})\overline{\mathbf{x}}\mathbf{)=0}$    

Let's then evaluate $ \frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}$:

$\displaystyle \frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}$ $\displaystyle =$ $\displaystyle \frac{d}{d\mathbf{x}^{(I)}}% \left( \frac{\mathbf{x}^{H}A\mathbf{... ...^{H}\mathbf{x}\right) }{d\mathbf{x}% ^{(I)}}}{\mathbf{x}^{H}\mathbf{x}}\text{.}$  

Applying again matrix calculus rules, we find:

$\displaystyle \frac{d\left( \mathbf{x}^{H}A\mathbf{x}\right) }{d\mathbf{x}^{(I)}}$ $\displaystyle =$ $\displaystyle % \mathbf{x}^{H}A\frac{d\mathbf{x}}{d\mathbf{x}^{(I)}}+\mathbf{x}... ...thbf{x}^{T}A^{T}=j(\mathbf{x}^{H}A-\left( \mathbf{x}^{H}A^{H}\right) ^{\ast })=$  

and since $ A=A^{H}$,

$\displaystyle =j(\mathbf{x}^{H}A-\left( \mathbf{x}^{H}A\right) ^{\ast })=j(2j\Im\left( \mathbf{x}^{H}A\right) )=-2\Im\left( \mathbf{x}^{H}A\right)$   .    

In a similar way, we get:

$\displaystyle \frac{d\left( \mathbf{x}^{H}\mathbf{x}\right) }{d\mathbf{x}^{(I)}... ...{H}\right) ^{\ast })=j(2j\Im(\mathbf{x}^{H}))=-2\Im\left( \mathbf{x}^{H}\right)$   .    

Substituting, we obtain:

$\displaystyle \frac{dR(\mathbf{x})}{d\mathbf{x}^{(I)}}=-2\frac{\Im\left( \mathb... ...ight) -R(\mathbf{x})\Im\left( \mathbf{x}^{H}\right) }{\mathbf{x}^{H}\mathbf{x}}$    

and, after a transposition, equating to the null column vector,

$\displaystyle \mathbf{0}$ $\displaystyle =$ $\displaystyle \left( \Im\left( \overline{\mathbf{x}}^{H}A\right) -R(% \overline{\mathbf{x}})\Im\left( \overline{\mathbf{x}}^{H}\right) \right) ^{T}=$  
  $\displaystyle =$ $\displaystyle \Im\left( A^{T}\overline{\mathbf{x}}^{\ast }\right) -R(\overline{... ...ht) -R(\overline{% \mathbf{x}})\Im\left( \overline{\mathbf{x}}^{\ast }\right) =$  
  $\displaystyle =$ $\displaystyle \Im\left( (A\overline{\mathbf{x}}\mathbf{)}^{\ast }\right) -R(% \... ...f{x}}\mathbf{)}+R(\overline{\mathbf{x}})\Im \left( \overline{\mathbf{x}}\right)$  

and, since $ R(\mathbf{x})$ is real,

$\displaystyle \Im(A\overline{\mathbf{x}}-R(\overline{\mathbf{x}})\overline{\mathbf{x} }\mathbf{)=0}$    

In conclusion, we have that a stationary vector $ \overline{\mathbf{x}}$ for the Rayleigh quotient satisfies the complex eigenvalue equation

$\displaystyle A\overline{\mathbf{x}}-R(\overline{\mathbf{x}})\overline{\mathbf{x}}=\mathbf{0}$    

whence the thesis.$ \square $

Remarks:

1) The two relations $ \lambda _{\max }=\max_{\Vert \mathbf{x}\Vert\neq 0} \frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$ and $ \lambda _{\min }=\min_{\Vert \mathbf{x\Vert \neq 0}}\frac{\mathbf{x}^{H}A\mathbf{x}}{% \mathbf{x}^{H}\mathbf{x}}$ can also be obtained in a simpler way. By Schur's canonical form theorem, any normal (and hence any hermitian) matrix is unitarily diagonalizable, i.e. a unitary matrix $ U$ exists such that $ A=U\Lambda U^{H}$ with $ \Lambda =diag(\lambda _{1},\lambda _{2},\cdots,\lambda _{n})$. So, since all eigenvalues of a hermitian matrix are real, it's possible to write:

$\displaystyle \mathbf{x}^{H}A\mathbf{x}$ $\displaystyle \mathbf{=x}$ $\displaystyle ^{H}U\Lambda U^{H}\mathbf{x=}\left( U^{H}\mathbf{x}\right) ^{H}\L... ...t\vert ^{2}\leq \lambda _{\max }\sum_{i=1}^{n}\left\vert y_{i}\right\vert ^{2}=$  
  $\displaystyle =$ $\displaystyle \lambda _{\max }\mathbf{y}^{H}\mathbf{y=}\lambda _{\max }\left( U... ...thbf{x}\right) \mathbf{=}\lambda _{\max }\left( \mathbf{x}^{H}\mathbf{x}\right)$  

whence
$\displaystyle \lambda _{\max }\geq \frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$    

But, having defined $ A\mathbf{v}_{M}=\lambda _{\max }\mathbf{v}_{M}$, we have:
$\displaystyle \frac{\mathbf{v}_{M}^{H}A\mathbf{v}_{M}}{\mathbf{v}_{M}^{H}\mathb... ...mbda _{\max }\mathbf{v}_{M}}{\mathbf{v}_{M}^{H}\mathbf{v}_{M}}=\lambda _{\max }$    

so that
$\displaystyle \lambda _{\max }=\max_{\Vert \mathbf{x\Vert \neq 0}}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$    

In a much similar way, we obtain
$\displaystyle \lambda _{\min }=\min_{\Vert \mathbf{x\Vert \neq 0}}\frac{\mathbf{x}^{H}A\mathbf{x}}{\mathbf{x}^{H}\mathbf{x}}$    

2) The above relations yield the following noteworthing bounds for the diagonal entries of a hermitian matrix:

$\displaystyle \lambda _{\min }\leq a_{ii}\leq \lambda _{\max }$    

In fact, having defined
$\displaystyle \mathbf{e}_{i}=[\overbrace{0,0,...,0,}^{i-1}1,0,...,0]^{T}$    

and observing that $ a_{ii}=\frac{\mathbf{e}_{i}^{H}A\mathbf{e}_{i}}{\mathbf{e}_{i}^{H}\mathbf{e}_{i}}=R(\mathbf{e}_{i})$, we have:
$\displaystyle \lambda _{\min }=\min_{\Vert \mathbf{x\Vert \neq 0}}R(\mathbf{x})... ...f{e}_{i})\leq \max_{\Vert \mathbf{x\Vert \neq 0}}R(\mathbf{x})=\lambda _{\max }$.    



"Rayleigh-Ritz theorem" is owned by gufotta.
(view preamble)

View style:

Log in to rate this entry.
(view current ratings)

Cross-references: diagonal, bounds, eigenvalues of a Hermitian matrix are real, unitary matrix, unitarily diagonalizable, normal, canonical, relations, complex, stationary, conclusion, column vector, null, transposition, similar, Calculus, matrix, derivatives, imaginary part, equations, real, number, proof, consequence, real function, Rayleigh quotient, vectors, critical points, eigenvectors, Hermitian matrix
There is 1 reference to this entry.

This is version 5 of Rayleigh-Ritz theorem, born on 2005-12-30, modified 2006-07-03.
Object id is 7546, canonical name is RayleighRitzTheorem.
Accessed 6394 times total.

Classification:
AMS MSC15A18 (Linear and multilinear algebra; matrix theory :: Eigenvalues, singular values, and eigenvectors)

Pending Errata and Addenda
None.
[ View all 1 ]
Discussion
Style: Expand: Order:
forum policy

No messages.

Interact
post | correct | update request | prove | add result | add corollary | add example | add (any)