Fisher information matrix
(Definition)
Given a statistical model $\lbrace f_{X}(\boldsymbol{x}\mid\boldsymbol{\theta})\rbrace$ of a random vector ${X}$, the Fisher information matrix, $I$, is the variance (that is, the covariance matrix) of the score function $U$: $$I=\operatorname{Var}[U].$$ If there is only one parameter involved, then $I$ is simply called the Fisher information or information of $f_{X}(\boldsymbol{x}\mid\theta)$.
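As a small illustration of the one-parameter case, the following sketch computes $\operatorname{Var}[U]$ exactly for a Bernoulli model with success probability $p$. The Bernoulli model and the function names are assumptions chosen for illustration only; they do not appear in the entry itself. The result is compared with the well-known closed form $1/(p(1-p))$.

```python
def bernoulli_score(x, p):
    """Score U = d/dp log f(x | p) for f(x | p) = p**x * (1 - p)**(1 - x)."""
    return x / p - (1 - x) / (1 - p)


def bernoulli_fisher_information(p):
    """Var[U], computed exactly by summing over the two outcomes x = 0 and x = 1."""
    probs = {0: 1 - p, 1: p}
    # The mean of the score is zero in theory; it is computed here
    # so that the variance formula is applied literally.
    mean_u = sum(prob * bernoulli_score(x, p) for x, prob in probs.items())
    return sum(prob * (bernoulli_score(x, p) - mean_u) ** 2
               for x, prob in probs.items())


p = 0.3  # hypothetical parameter value
info = bernoulli_fisher_information(p)
closed_form = 1 / (p * (1 - p))
print(info, closed_form)
```

Both printed numbers should agree up to floating-point rounding, since the sum runs over the whole sample space rather than over simulated draws.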
Remarks
- If $f_{X}(\boldsymbol{x}\mid\boldsymbol{\theta})$ belongs to the exponential family, then $\operatorname{E}[U]=0$ and hence $I=\operatorname{E}\big[U^{\operatorname{T}}U\big]$, where $U$ is written as a row vector. Furthermore, with some regularity conditions imposed, we have $$I=-\operatorname{E}\Big[\frac{\partial U}{\partial\boldsymbol{\theta}}\Big].$$
- As an example, the normal distribution, $N(\mu,\sigma^2)$, belongs to the exponential family and its log-likelihood function $\ell(\boldsymbol{\theta}\mid x)$ is $$-\frac{1}{2}\operatorname{ln}(2\pi\sigma^2)-\frac{(x-\mu)^2}{2\sigma^2},$$ where $\boldsymbol{\theta}=(\mu,\sigma^2)$. Then the score function $U(\boldsymbol{\theta})$ is given by $$\Big(\frac{\partial\ell}{\partial\mu},\frac{\partial\ell}{\partial\sigma^2}\Big) = \Big(\frac{x-\mu}{\sigma^2},\frac{(x-\mu)^2}{2\sigma^4}-\frac{1}{2\sigma^2}\Big).$$ Taking the derivative with respect to $\boldsymbol{\theta}$, we have $$\frac{\partial U}{\partial\boldsymbol{\theta}}= \begin{pmatrix} \displaystyle{\frac{\partial U_1}{\partial\mu}} & \displaystyle{\frac{\partial U_2}{\partial\mu}} \\ \ \\ \displaystyle{\frac{\partial U_1}{\partial\sigma^2}} & \displaystyle{\frac{\partial U_2}{\partial\sigma^2}} \\ \end{pmatrix}= \begin{pmatrix} \displaystyle{\frac{-1}{\sigma^2}} & \displaystyle{-\frac{x-\mu}{\sigma^4}} \\ \ \\ \displaystyle{-\frac{x-\mu}{\sigma^4}} & \displaystyle{\frac{1}{2\sigma^4}-\frac{(x-\mu)^2}{\sigma^6}} \end{pmatrix}.$$ Since $\operatorname{E}[x-\mu]=0$ and $\operatorname{E}[(x-\mu)^2]=\sigma^2$, the Fisher information matrix $I$ is $$-\operatorname{E}\Big[\frac{\partial U}{\partial\boldsymbol{\theta}}\Big]=\frac{1}{2\sigma^4} \begin{pmatrix} 2\sigma^2 & 0 \\ 0 & 1 \end{pmatrix}.$$
- Now, in a linear regression model with normally distributed errors of constant variance $\sigma^2$, it can be shown that the Fisher information matrix $I$ for the regression coefficients is $$\frac{1}{\sigma^2}\textbf{X}^{\operatorname{T}}\textbf{X},$$ where $\textbf{X}$ is the design matrix of the regression model.
- In general, the Fisher information measures how much "information" is known about a parameter $\theta$. If $T$ is an unbiased estimator of $\theta$, it can be shown that $$\operatorname{Var}\big[T(X)\big]\ge\frac{1}{I(\theta)}.$$ This is known as the Cramer-Rao inequality, and the number $1/I(\theta)$ is known as the Cramer-Rao lower bound. The smaller the variance of the estimate of $\theta$, the more information we have on $\theta$. If there is more than one parameter, the above can be generalized by saying that $$\operatorname{Var}\big[T(X)\big]-I(\boldsymbol{\theta})^{-1}$$ is positive semidefinite, where $\operatorname{Var}\big[T(X)\big]$ is the covariance matrix of the unbiased estimator $T$ of $\boldsymbol{\theta}$ and $I$ is the Fisher information matrix.
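The normal-distribution remark above can be checked numerically. The sketch below is a Monte Carlo estimate, not a proof: it simulates draws from $N(\mu,\sigma^2)$, averages $U^{\operatorname{T}}U$ and $-\partial U/\partial\boldsymbol{\theta}$, and compares both with $\operatorname{diag}\big(1/\sigma^2,\,1/(2\sigma^4)\big)$, the matrix obtained by taking expectations entry by entry. The parameter values, sample size, seed, and function names are all assumptions made for illustration.

```python
import random


def score(x, mu, sigma2):
    """Score U = (dl/dmu, dl/dsigma2) for one observation from N(mu, sigma2)."""
    d = x - mu
    return (d / sigma2, d * d / (2 * sigma2 ** 2) - 1 / (2 * sigma2))


def score_derivative(x, mu, sigma2):
    """2x2 matrix dU/dtheta of second derivatives of the log-likelihood."""
    d = x - mu
    off = -d / sigma2 ** 2
    return ((-1 / sigma2, off),
            (off, 1 / (2 * sigma2 ** 2) - d * d / sigma2 ** 3))


def estimate_information(mu, sigma2, n, seed=0):
    """Monte Carlo estimates of E[U^T U] and -E[dU/dtheta]."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    outer = [[0.0, 0.0], [0.0, 0.0]]
    neg_hess = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n):
        x = rng.gauss(mu, sigma)
        u = score(x, mu, sigma2)
        h = score_derivative(x, mu, sigma2)
        for i in range(2):
            for j in range(2):
                # E[U] = 0 here, so E[U^T U] coincides with Var[U].
                outer[i][j] += u[i] * u[j] / n
                neg_hess[i][j] -= h[i][j] / n
    return outer, neg_hess


mu, sigma2 = 1.0, 2.0  # hypothetical parameter values
outer, neg_hess = estimate_information(mu, sigma2, n=100_000)
exact = [[1 / sigma2, 0.0], [0.0, 1 / (2 * sigma2 ** 2)]]
print(outer)
print(neg_hess)
print(exact)
```

The two estimates should agree with the exact matrix up to sampling noise. The first diagonal entry also illustrates the Cramer-Rao lower bound: a single observation $x$ is an unbiased estimator of $\mu$ with variance $\sigma^2$, which equals the reciprocal of the information $1/\sigma^2$ about $\mu$.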
"Fisher information matrix" is owned by CWoo.
Other names: information matrix
Also defines: Fisher information, information, Cramer-Rao inequality, Cramer-Rao lower bound
This is version 11 of Fisher information matrix, born on 2004-07-27, modified 2007-04-21.
Classification (AMS MSC):
- 62A01 (Statistics :: Foundational and philosophical topics)
- 62B10 (Statistics :: Sufficiency and information :: Information-theoretic topics)
- 62H99 (Statistics :: Multivariate analysis :: Miscellaneous)