generalized linear model
Given a random vector $\textbf{Y}$, called the response variable, a generalized linear model, or GLM for short, is a statistical model $\{f_{\textbf{Y}}(\boldsymbol{y}\mid\boldsymbol{\theta})\}$ such that
1. the components of $\textbf{Y}$ are mutually independent,
2. $f_{Y_{i}}(y_{i}\mid\theta_{i})$ belongs to the exponential family of distributions and has the following canonical form:
$f_{Y_{i}}(y_{i}\mid\theta_{i})=\exp[y_{i}\theta_{i}-b(\theta_{i})+c(y_{i})],$ where the parameter $\theta_{i}$ is called the canonical parameter and $b(\theta_{i})$ is called the cumulant function.
3. for each component or variate $Y_{i}$, with a corresponding set of $p$ covariates $X_{ij}$, there exists a monotone differentiable function $g$, called the link function, such that
$g(\operatorname{E}[Y_{i}])=\textbf{X}_{i}^{\operatorname{T}}\boldsymbol{\beta},$ where $\textbf{X}_{i}^{\operatorname{T}}=(X_{i1},\ldots,X_{ip})$, and $\boldsymbol{\beta}=(\beta_{1},\ldots,\beta_{p})^{\operatorname{T}}$ is a parameter vector.
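To make the three conditions concrete, here is a minimal numerical sketch (assuming Python with NumPy; the data and variable names are made up, not part of the entry): a Poisson response with the canonical log link $g(\mu)=\ln\mu$, fitted by iteratively reweighted least squares (IRLS), the standard fitting algorithm for GLMs.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 2
X = rng.normal(size=(n, p))
beta_true = np.array([0.5, -0.3])

# log link: g(E[Y_i]) = ln(mu_i) = X_i^T beta
mu = np.exp(X @ beta_true)
y = rng.poisson(mu)

# IRLS for the Poisson GLM with canonical (log) link
beta = np.zeros(p)
for _ in range(25):
    eta = X @ beta
    mu = np.exp(eta)
    z = eta + (y - mu) / mu      # working response
    W = mu                        # weights; for the canonical link W = V(mu)
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
```

At convergence the score equations $\textbf{X}^{\operatorname{T}}(\boldsymbol{y}-\boldsymbol{\mu})=\boldsymbol{0}$ hold, which characterizes the maximum likelihood fit under the canonical link.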
In practice, an extra parameter called the dispersion parameter, $\phi$, is introduced to the model to account for a phenomenon known as overdispersion. The GLM then takes the form:
$f_{Y_{i}}(y_{i}\mid\theta_{i},\phi)=\exp\left[\frac{y_{i}\theta_{i}-b(\theta_{i})}{a(\phi)}+c(y_{i},\phi)\right]$
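The effect $\phi$ is meant to capture can be seen numerically. A sketch assuming Python with NumPy and simulated data: negative binomial counts are overdispersed relative to a Poisson model, and the Pearson statistic gives a moment estimate of $\phi$ under the Poisson variance function $V(\mu)=\mu$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
target_mean = 4.0

# negative binomial counts: mean 4, variance 12, i.e. overdispersed
# relative to a Poisson model, for which Var[Y] = E[Y]
y = rng.negative_binomial(2, 2 / (2 + target_mean), size=n)

# Pearson estimate of phi under V(mu) = mu; with an intercept-only
# fit, mu_hat is the sample mean and the degrees of freedom are n - 1
mu_hat = y.mean()
phi_hat = np.sum((y - mu_hat) ** 2 / mu_hat) / (n - 1)
```

Here `phi_hat` comes out well above 1 (close to 3, the true variance-to-mean ratio), signalling that a plain Poisson model understates the variability.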
Remarks

Below is a table of canonical parameters and cumulant functions for some well-known distributions from the exponential family:
distribution | notation | canonical parameter $\theta$ | cumulant function $b(\theta)$
Normal | $N(\mu,\sigma^{2})$ | $\mu$ | $\displaystyle{\frac{\theta^{2}}{2}}$
Poisson | $\operatorname{Poisson}(\mu)$ | $\ln\mu$ | $e^{\theta}$
Binomial | $\operatorname{Bin}(m,\pi)$ | $\operatorname{logit}(\pi)$ | $\ln(1+e^{\theta})$
Gamma | $\operatorname{Gamma}(\alpha,\lambda)$ | $-\lambda$ | $-\ln(-\theta)$
The GLM is a direct generalization of the general linear model, which includes linear regression models, ANOVA, and ANCOVA. The link function for the general linear model is the identity function $g(\mu)=\mu$.
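With the identity link, fitting the GLM reduces to ordinary least squares; a small sketch with simulated data (assuming Python with NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.0, 2.0]) + rng.normal(size=200)  # E[Y] = X beta

# identity link g(mu) = mu, so the GLM fit is ordinary least squares
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```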

For a GLM, $\operatorname{E}[Y]=b^{\prime}(\theta)$ and $\operatorname{Var}[Y]=b^{\prime\prime}(\theta)$. When expressed in terms of $\mu=\operatorname{E}[Y]$, $b^{\prime\prime}(\theta)$ is known as the variance function $V(\mu)$. Below are some examples of variance functions:
distribution | notation | variance function
Normal | $N(\mu,\sigma^{2})$ | $1$
Poisson | $\operatorname{Poisson}(\mu)$ | $\mu$
Binomial | $\operatorname{Bin}(m,\pi)$ | $\pi(1-\pi)$
Gamma | $\operatorname{Gamma}(\alpha,\lambda)$ | $\displaystyle{\frac{1}{\lambda^{2}}}$
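The relations $\operatorname{E}[Y]=b^{\prime}(\theta)$ and $\operatorname{Var}[Y]=b^{\prime\prime}(\theta)$ can be checked numerically for the binomial case; a sketch assuming Python with NumPy, using central finite differences in place of symbolic derivatives:

```python
import numpy as np

def b(theta):
    # binomial cumulant function, with canonical parameter theta = logit(pi)
    return np.log1p(np.exp(theta))

pi = 0.3
theta = np.log(pi / (1 - pi))   # logit(0.3)

h = 1e-5
# b'(theta) should recover the mean pi
mean = (b(theta + h) - b(theta - h)) / (2 * h)
# b''(theta) should recover the variance function pi * (1 - pi)
var = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2
```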
The logistic regression model, where the response variable $Y$ is categorical in nature, is a special case of the GLM. Possible link functions include the logit function, $\operatorname{logit}(\pi)=\ln(\operatorname{odds}(\pi))$; the inverse cumulative normal distribution function, or probit function, $\Phi^{-1}(\pi)$; and the complementary log-log function, $\ln(-\ln(1-\pi))$. Here the parameter $\pi$ lies between 0 and 1 and is usually interpreted as the probability of occurrence of a certain event.
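The three link functions can be compared numerically; a sketch assuming Python with NumPy and SciPy:

```python
import numpy as np
from scipy.stats import norm

p = np.array([0.1, 0.5, 0.9])

logit = np.log(p / (1 - p))        # log odds
probit = norm.ppf(p)               # inverse standard normal CDF
cloglog = np.log(-np.log(1 - p))   # complementary log-log

# all three map (0, 1) onto the real line; logit and probit are
# antisymmetric about pi = 0.5, while cloglog is not
```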

The log-linear model, where the response variable $Y$ has a Poisson distribution, is also a special case of the GLM, with link function the natural logarithm of the parameter $\mu$ in question. The Poisson distribution is typically used to model count or frequency data.
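A minimal sketch of a log-linear model with one binary factor (assuming Python with NumPy; the rates and group sizes are made up); for this saturated two-group model the maximum likelihood estimates reduce to the observed group means:

```python
import numpy as np

rng = np.random.default_rng(3)

# Poisson counts in two groups with rates 3 and 6;
# model: ln(mu) = b0 + b1 * group
y0 = rng.poisson(3.0, size=200)
y1 = rng.poisson(6.0, size=200)

b0 = np.log(y0.mean())           # log rate in group 0
b1 = np.log(y1.mean()) - b0      # log rate ratio, approximately ln 2
```

Exponentiating `b1` gives the rate ratio between the groups, which is how log-linear coefficients are usually read.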
Mathematics Subject Classification
62J12