likelihood function
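Let $\mathbf{X}=(X_1,\ldots,X_n)$ be a random vector and

$$\{f_{\mathbf{X}}(\boldsymbol{x}\mid\boldsymbol{\theta}):\boldsymbol{\theta}\in\Theta\}$$

a statistical model parametrized by $\boldsymbol{\theta}=(\theta_1,\ldots,\theta_k)$, the parameter vector in the parameter space $\Theta$. The likelihood function is a map $L:\Theta\to\mathbb{R}$ given by

$$L(\boldsymbol{\theta}\mid\boldsymbol{x})=f_{\mathbf{X}}(\boldsymbol{x}\mid\boldsymbol{\theta}).$$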
In other words, the likelihood function has the same functional form as a probability density function. However, the emphasis is changed from $\boldsymbol{x}$ to $\boldsymbol{\theta}$: the pdf is a function of the $x$'s with the parameters $\theta$'s held constant, whereas $L$ is a function of the parameters $\theta$'s with the $x$'s held constant.
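The change of emphasis can be made concrete with a small Python sketch. Everything in it is an illustrative assumption rather than part of the definition: a single observation from a normal model with known standard deviation 1, the made-up value 1.3, and the function names. The same two-argument function is read as a pdf when the parameter is fixed and as a likelihood when the observation is fixed.

```python
import math
from functools import partial

def normal_density(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# As a pdf: the parameter is held fixed (mu = 0) and x varies.
pdf = partial(normal_density, mu=0.0)

# As a likelihood: the observation is held fixed (x = 1.3, a made-up value)
# and mu varies.
likelihood = partial(normal_density, 1.3)

# Both views give the same number at any common (x, mu) pair.
same_value = math.isclose(pdf(1.3), likelihood(0.0))
```

Here `likelihood(mu)` is largest when `mu` equals the observed value, which is what one would expect for a single normal observation.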
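When there is no confusion, $L(\boldsymbol{\theta}\mid\boldsymbol{x})$ is abbreviated to $L(\boldsymbol{\theta})$.
A parameter vector $\hat{\boldsymbol{\theta}}$ such that $L(\hat{\boldsymbol{\theta}})\geq L(\boldsymbol{\theta})$ for all $\boldsymbol{\theta}\in\Theta$ is called a maximum likelihood estimate, or MLE, of $\boldsymbol{\theta}$.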
Many density functions are exponential in nature, so it is often easier to compute the MLE by maximizing the natural logarithm of $L$, known as the log-likelihood function:

$$\ell(\boldsymbol{\theta}\mid\boldsymbol{x})=\ln(L(\boldsymbol{\theta}\mid\boldsymbol{x})).$$

Because the logarithm is strictly increasing, $L$ and $\ell$ attain their maxima at the same $\boldsymbol{\theta}$.
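The monotonicity argument can be checked numerically. The following Python sketch is illustrative only: the coin-toss data (7 heads in 10 tosses), the grid of candidate values, and the function names are assumptions made for the example. It evaluates a likelihood and its logarithm on the same grid and records where each one peaks.

```python
import math

def likelihood(p, m, n):
    """Likelihood of head-probability p given m heads in n independent tosses."""
    return math.comb(n, m) * p**m * (1 - p) ** (n - m)

def log_likelihood(p, m, n):
    """Natural log of the likelihood; valid for 0 < p < 1."""
    return math.log(math.comb(n, m)) + m * math.log(p) + (n - m) * math.log(1 - p)

def argmax_on_grid(func, grid):
    """Return the grid point at which func is largest."""
    return max(grid, key=func)

# Hypothetical data: 7 heads in 10 tosses.
m, n = 7, 10
grid = [k / 100 for k in range(1, 100)]  # 0.01, 0.02, ..., 0.99

p_from_L = argmax_on_grid(lambda p: likelihood(p, m, n), grid)
p_from_logL = argmax_on_grid(lambda p: log_likelihood(p, m, n), grid)
```

Both searches should select the same grid point, since taking logarithms preserves the ordering of the likelihood values. A grid search is used here only to keep the sketch short; it is not meant as a recommended optimization method.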
Examples:
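1. A coin is tossed $n$ times and $m$ heads are observed. Assume that the probability of a head on any single toss is $\pi$. What is the MLE of $\pi$?

Solution: Define the outcome of a toss to be 0 if a tail is observed and 1 if a head is observed, and let $X_i$ be the outcome of the $i$th toss. For any single toss, the density function is $\pi^{x}(1-\pi)^{1-x}$, where $x\in\{0,1\}$. Assume that the tosses are independent events. Then the probability of observing $m=\sum x_i$ heads in $n$ tosses is

$$\binom{n}{\sum x_i}\pi^{\sum x_i}(1-\pi)^{\sum(1-x_i)}=\binom{n}{m}\pi^{m}(1-\pi)^{n-m},$$

which, viewed as a function of $\pi$, is the likelihood function $L(\pi)$. The binomial coefficient appears because only the number of heads is recorded; it does not depend on $\pi$ and so has no effect on the maximizer. Therefore, the log-likelihood function has the form

$$\ell(\pi\mid\boldsymbol{x})=\ell(\pi)=\ln\binom{n}{m}+m\ln(\pi)+(n-m)\ln(1-\pi).$$

Using standard calculus (setting $\ell'(\pi)=\frac{m}{\pi}-\frac{n-m}{1-\pi}$ equal to zero and solving for $\pi$), we get that the MLE of $\pi$ is

$$\hat{\pi}=\frac{m}{n}=\bar{x}.$$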
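2. Suppose a sample of $n$ data points $X_i$ is collected. Assume that $X_i\sim N(\mu,\sigma^2)$ and that the $X_i$'s are independent of each other. What is the MLE of the parameter vector $\boldsymbol{\theta}=(\mu,\sigma^2)$?

Solution: The joint pdf of the $X_i$, and hence the likelihood function, is

$$L(\boldsymbol{\theta}\mid\boldsymbol{x})=\frac{1}{\sigma^{n}(2\pi)^{n/2}}\exp\left(-\frac{\sum(x_i-\mu)^2}{2\sigma^2}\right).$$

The log-likelihood function is

$$\ell(\boldsymbol{\theta}\mid\boldsymbol{x})=-\frac{\sum(x_i-\mu)^2}{2\sigma^2}-\frac{n}{2}\ln(\sigma^2)-\frac{n}{2}\ln(2\pi).$$

Taking the first derivative (gradient) with respect to $(\mu,\sigma^2)$, we get

$$\frac{\partial\ell}{\partial\boldsymbol{\theta}}=\left(\frac{\sum(x_i-\mu)}{\sigma^2},\ \frac{\sum(x_i-\mu)^2}{2\sigma^4}-\frac{n}{2\sigma^2}\right).$$

Setting $\frac{\partial\ell}{\partial\boldsymbol{\theta}}=\boldsymbol{0}$ (see score function) and solving for $\boldsymbol{\theta}=(\mu,\sigma^2)$, we have

$$\hat{\boldsymbol{\theta}}=(\hat{\mu},\hat{\sigma}^2)=\left(\bar{x},\ \frac{n-1}{n}s^2\right),$$

where $\bar{x}=\sum x_i/n$ is the sample mean and $s^2=\sum(x_i-\bar{x})^2/(n-1)$ is the sample variance. Finally, we verify that $\hat{\boldsymbol{\theta}}$ is indeed the MLE of $\boldsymbol{\theta}$ by checking the second derivatives: the mixed partial derivative $-\sum(x_i-\mu)/\sigma^4$ vanishes at $\hat{\boldsymbol{\theta}}$, so it suffices to check that the second derivative with respect to each parameter is negative there.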
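The closed-form result in the second example can be sanity-checked numerically. The Python sketch below uses only the standard library; the sample values, the perturbation sizes, and the function names are assumptions chosen for illustration. It computes $(\bar{x},\frac{n-1}{n}s^2)$ and compares the log-likelihood there with its value at a few nearby parameter vectors.

```python
import math
import statistics

def normal_log_likelihood(mu, sigma2, xs):
    """Log-likelihood of N(mu, sigma2) for independent observations xs."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -ss / (2 * sigma2) - (n / 2) * math.log(sigma2) - (n / 2) * math.log(2 * math.pi)

def normal_mle(xs):
    """Closed-form MLE: (sample mean, (n-1)/n times the sample variance)."""
    n = len(xs)
    x_bar = statistics.fmean(xs)
    s2 = statistics.variance(xs)  # sample variance, divisor n - 1
    return x_bar, (n - 1) / n * s2

# Hypothetical sample, chosen only for illustration.
xs = [2.1, 3.4, 1.9, 4.0, 2.8, 3.3]
mu_hat, sigma2_hat = normal_mle(xs)

# Log-likelihood at the MLE and at nearby perturbed parameter vectors.
best = normal_log_likelihood(mu_hat, sigma2_hat, xs)
nearby = [
    normal_log_likelihood(mu_hat + d_mu, sigma2_hat + d_s2, xs)
    for d_mu in (-0.1, 0.0, 0.1)
    for d_s2 in (-0.1, 0.0, 0.1)
    if (d_mu, d_s2) != (0.0, 0.0)
]
```

Every entry of `nearby` should be smaller than `best`. The variance estimate also coincides with `statistics.pvariance(xs)`, which divides by $n$ rather than $n-1$. A local comparison like this does not replace the second-derivative check in the text; it only illustrates it.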
Title | likelihood function |
---|---|
Canonical name | LikelihoodFunction |
Date of creation | 2013-03-22 14:27:58 |
Last modified on | 2013-03-22 14:27:58 |
Owner | CWoo (3771) |
Last modified by | CWoo (3771) |
Numerical id | 13 |
Author | CWoo (3771) |
Entry type | Definition |
Classification | msc 62A01 |
Synonym | likelihood statistic |
Synonym | likelihood |
Defines | maximum likelihood estimate |
Defines | MLE |
Defines | log-likelihood function |