score function
Given a statistical model $\lbrace f_{\mathbf{X}}(\boldsymbol{x}\mid\boldsymbol{\theta}) : \boldsymbol{\theta} \in \Theta \rbrace$ with log-likelihood function $\ell(\boldsymbol{\theta}\mid\boldsymbol{x})$ , where $\boldsymbol{\theta}=(\theta_1,\ldots,\theta_k)$ , the score function $U$ is defined to be the gradient of $\ell$ with respect to $\boldsymbol{\theta}$ : $$U(\boldsymbol{\theta})=\nabla\ell=\frac{\partial\ell}{\partial\boldsymbol{\theta}}.$$ Since the score function $U$ also depends on the random vector $\boldsymbol{x}$ , $U$ is itself a random vector. Setting $U$ to $0$ gives a system of $k$ equation(s), known as the likelihood equation(s): $$U(\boldsymbol{\theta})=\Big(\frac{\partial\ell}{\partial\theta_1},\ldots,\frac{\partial\ell}{\partial\theta_k}\Big)= (0,\ldots,0).$$ If $\boldsymbol{\theta}=\theta$ is one-dimensional ( $k=1$ ), then the score function is simply referred to as the score of $\theta$ .
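As an illustration of the case $k=2$ , the following minimal sketch evaluates the score vector for a normal model with parameter vector $(\mu,\sigma^2)$ . The normal model, the function name `normal_score`, and the data are assumptions chosen for illustration and do not come from the entry itself. The sketch checks that both components of the score vanish at the sample mean and the (uncorrected) sample variance, so these values solve the likelihood equations.

```python
def normal_score(mu, sigma2, xs):
    """Score vector (dl/dmu, dl/dsigma2) for an i.i.d. normal sample xs.

    Uses l = -(n/2) ln(2 pi sigma2) - sum((x - mu)^2) / (2 sigma2).
    """
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)          # sum of squared deviations
    d_mu = sum(x - mu for x in xs) / sigma2       # partial derivative in mu
    d_sigma2 = -n / (2 * sigma2) + ss / (2 * sigma2 ** 2)  # partial in sigma2
    return (d_mu, d_sigma2)


xs = [2.1, 3.4, 1.9, 4.0, 2.6]                    # made-up observations
mu_hat = sum(xs) / len(xs)                        # sample mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / len(xs)  # divides by n, not n - 1

score_at_mle = normal_score(mu_hat, sigma2_hat, xs)   # both components ~ 0
score_elsewhere = normal_score(0.0, 1.0, xs)          # nonzero away from the solution
print(score_at_mle, score_elsewhere)
```

Away from the solution the score is nonzero, and its components indicate the direction in which the log-likelihood increases.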
The maximum likelihood estimate (MLE) $\hat{\boldsymbol{\theta}}$ of the parameter vector $\boldsymbol{\theta}$ can usually be found by solving the likelihood equations. The likelihood equations may also be formed by setting the gradient of the plain likelihood function to zero; because the logarithm is strictly increasing, both functions are maximized at the same points. The use of the log function often facilitates the algebra, as many distributions are exponential in nature. For some distributions it may also be necessary to test that the solution to the likelihood equations is really a maximum, as opposed to a minimum or a point of inflection.
Example. $n$ independent observations are made from a random variable $X$ with a Poisson distribution with parameter $\lambda$ . The observed values are $x_1,\ldots,x_n$ . The log-likelihood of the joint probability mass function is $$\ell(\lambda\mid\boldsymbol{x})=\sum_{i=1}^{n}\big(-\lambda+x_i\ln(\lambda)-\ln(x_i!)\big)$$ and so the score function is $$U(\lambda)=\frac{d\ell}{d\lambda}=\sum_{i=1}^{n}\Big(-1+\frac{x_i}{\lambda}\Big)=-n+\frac{n\overline{x}}{\lambda},$$ where $n\overline{x}=\sum x_i$ . To find the MLE of $\lambda$ , we set $U=0$ and solve for $\lambda$ , which gives $\hat{\lambda}=\overline{x}$ . Since $dU/d\lambda=-n\overline{x}/\lambda^2<0$ whenever $\overline{x}>0$ , this solution is indeed a maximum.
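The Poisson example can be checked numerically. The sketch below is illustrative only: the function names `poisson_score` and `poisson_score_derivative` and the counts in `xs` are hypothetical. It evaluates the score $U(\lambda)=\sum(-1+x_i/\lambda)$ at the sample mean and on either side of it, and also evaluates the derivative $dU/d\lambda=-\sum x_i/\lambda^2$ , whose negative sign indicates a maximum.

```python
def poisson_score(lam, xs):
    """Score U(lambda) = sum(-1 + x_i / lambda) for i.i.d. Poisson counts xs."""
    return sum(-1 + x / lam for x in xs)


def poisson_score_derivative(lam, xs):
    """Derivative dU/dlambda = -sum(x_i) / lambda^2."""
    return -sum(xs) / lam ** 2


xs = [3, 0, 2, 4, 1]                 # hypothetical observed counts
lam_hat = sum(xs) / len(xs)          # candidate MLE: the sample mean

u_at_mle = poisson_score(lam_hat, xs)               # vanishes at the sample mean
u_below = poisson_score(lam_hat / 2, xs)            # positive: log-likelihood still rising
u_above = poisson_score(lam_hat * 2, xs)            # negative: log-likelihood falling
curvature = poisson_score_derivative(lam_hat, xs)   # negative: a maximum

print(lam_hat, u_at_mle, u_below, u_above, curvature)
```

The sign change of the score from positive to negative around the sample mean is another way of seeing that the root of the likelihood equation is a maximum rather than a minimum or a point of inflection.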
