generalized linear model
(Definition)
Given a random vector $\boldsymbol{Y}$, called the response variable, a generalized linear model, or GLM for short, is a statistical model $\lbrace f_{\boldsymbol{Y}}(\boldsymbol{y}\mid\boldsymbol{\theta})\rbrace$ such that
- the components $Y_i$ of $\boldsymbol{Y}$ are mutually independent,
- $f_{Y_i}(y_i\mid\theta_i)$ belongs to the exponential family of distributions and has the following canonical form: $$f_{Y_i}(y_i\mid\theta_i)=\operatorname{exp}[y_i\theta_i-b(\theta_i)+c(y_i)],$$ where the parameter $\theta_i$ is called the canonical parameter and $b(\theta_i)$ is called the cumulant function,
- for each component or variate $Y_i$, with a corresponding set of $p$ covariates $X_{ij}$, there exists a monotone differentiable function $g$, called the link function, such that $$g(\operatorname{E}[Y_i])={\boldsymbol{X}_i}^{\operatorname{T}}\boldsymbol{\beta},$$ where ${\boldsymbol{X}_i}^{\operatorname{T}}=(X_{i1},\ldots,X_{ip})$, and $\boldsymbol{\beta}=(\beta_1,\ldots,\beta_p)^{\operatorname{T}}$ is a parameter vector.
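As an illustration of the second and third conditions, here is a worked sketch for a Poisson response with mean $\mu_i$. The choice of the logarithm as link function is an assumption made for the example (it is the canonical choice, not the only admissible one). The Poisson probability function can be rewritten as
$$f_{Y_i}(y_i\mid\mu_i)=\frac{e^{-\mu_i}\mu_i^{y_i}}{y_i!}=\operatorname{exp}[y_i\operatorname{ln}\mu_i-\mu_i-\operatorname{ln}(y_i!)],$$
which matches the canonical form with $\theta_i=\operatorname{ln}\mu_i$, $b(\theta_i)=\operatorname{exp}(\theta_i)$ and $c(y_i)=-\operatorname{ln}(y_i!)$. Taking $g=\operatorname{ln}$ then gives the linear predictor
$$\operatorname{ln}(\operatorname{E}[Y_i])=X_{i1}\beta_1+\cdots+X_{ip}\beta_p.$$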
In practice, an extra parameter called the dispersion parameter, $\phi$, is introduced into the model to account for a phenomenon known as overdispersion. The GLM then takes the form: $$f_{Y_i}(y_i\mid\theta_i)=\operatorname{exp}\left[\frac{y_i\theta_i-b(\theta_i)}{a(\phi)}+c(y_i,\phi)\right].$$
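To see how the dispersion form works in a familiar case, the following worked sketch takes $Y_i\sim N(\mu_i,\sigma^2)$ and assumes the identification $a(\phi)=\phi=\sigma^2$, which is one common convention. Expanding the square in the normal density gives
$$\frac{1}{\sqrt{2\pi\sigma^2}}\operatorname{exp}\left[-\frac{(y_i-\mu_i)^2}{2\sigma^2}\right]=\operatorname{exp}\left[\frac{y_i\mu_i-\mu_i^2/2}{\sigma^2}-\frac{y_i^2}{2\sigma^2}-\frac{1}{2}\operatorname{ln}(2\pi\sigma^2)\right],$$
so that $\theta_i=\mu_i$, $b(\theta_i)=\theta_i^2/2$ and $c(y_i,\phi)=-\frac{y_i^2}{2\phi}-\frac{1}{2}\operatorname{ln}(2\pi\phi)$.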
Remarks
- Below is a table of canonical parameters and cumulant functions for some well-known distributions from the exponential family:
| distribution | notation | canonical parameter $\theta$ | cumulant function $b(\theta)$ |
|---|---|---|---|
| Normal | $N(\mu,\sigma^2)$ | $\mu$ | $\displaystyle{\frac{\theta^2}{2}}$ |
| Poisson | $\operatorname{Poisson}(\mu)$ | $\operatorname{ln}\mu$ | $\operatorname{exp}(\theta)$ |
| Binomial | $\operatorname{Bin}(m,\pi)$ | $\operatorname{logit}(\pi)$ | $\operatorname{ln}(1+e^{\theta})$ |
| Gamma | $\operatorname{Gamma}(\alpha,\lambda)$ | $-\lambda$ | $-\operatorname{ln}(-\theta)$ |
- The GLM is a direct generalization of the general linear model, which includes linear regression models, ANOVA and ANCOVA. The link function for the general linear model is the identity function $g(\mu)=\mu$.
- For a GLM in canonical form, $\operatorname{E}[Y]=b^{\prime}(\theta)$ and $\operatorname{Var}[Y]=b^{\prime\prime}(\theta)$; when a dispersion parameter is present, the variance becomes $a(\phi)b^{\prime\prime}(\theta)$. The function $b^{\prime\prime}(\theta)$, when expressed in terms of $\mu=\operatorname{E}[Y]$, is known as the variance function $V(\mu)$. Below are some examples of variance functions:
| distribution | notation | variance function |
|---|---|---|
| Normal | $N(\mu,\sigma^2)$ | $1$ |
| Poisson | $\operatorname{Poisson}(\mu)$ | $\mu$ |
| Binomial | $\operatorname{Bin}(m,\pi)$ | $\pi(1-\pi)$ |
| Gamma | $\operatorname{Gamma}(\alpha,\lambda)$ | $\displaystyle{\frac{1}{\lambda^2}}$ |
- The logistic regression model, where the response variable $Y$ is categorical in nature, is a special case of GLM. Possible link functions are the logit function, $\operatorname{logit}(\pi)=\operatorname{ln}(\operatorname{odds}(\pi))$; the probit function $\Phi^{-1}(\pi)$, which is the inverse of the cumulative normal distribution function; and the complementary-log-log function, $\operatorname{ln}(-\operatorname{ln}(1-\pi))$. Here the parameter $\pi$ lies between 0 and 1 and is usually measured as the relative frequency with which a certain event occurs.
- The log-linear model, where the response variable $Y$ has a Poisson distribution, is also a special case of GLM, with the natural logarithm of the mean parameter $\mu$ as link function. The Poisson distribution is typically used to model count or frequency data.
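The log-linear model in the last remark can be fitted numerically. The following is a minimal sketch rather than a reference implementation: it assumes a single covariate plus an intercept, and it uses iteratively reweighted least squares, a standard fitting method for GLMs that this entry does not itself describe. The function name `fit_poisson_loglinear` and the data are hypothetical, chosen only for illustration.

```python
import math

def fit_poisson_loglinear(x, y, n_iter=25, tol=1e-10):
    """Fit ln(mu_i) = b0 + b1 * x_i by iteratively reweighted least squares.

    Hypothetical helper for illustration; assumes sum(y) > 0 and that
    x takes at least two distinct values.
    """
    # Start from the intercept-only fit: b0 = ln(mean(y)), b1 = 0.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]      # linear predictor
        mu = [math.exp(e) for e in eta]       # inverse of the log link
        # Working response; for the log link the working weights equal
        # the variance function V(mu) = mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Entries of the 2x2 weighted normal equations.
        s_w = sum(mu)
        s_wx = sum(m * xi for m, xi in zip(mu, x))
        s_wxx = sum(m * xi * xi for m, xi in zip(mu, x))
        s_wz = sum(m * zi for m, zi in zip(mu, z))
        s_wxz = sum(m * xi * zi for m, xi, zi in zip(mu, x, z))
        # Solve the 2x2 system by Cramer's rule.
        det = s_w * s_wxx - s_wx ** 2
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Made-up illustrative counts, not taken from any real data set.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 2, 4, 7, 12]
b0, b1 = fit_poisson_loglinear(x, y)
print(b0, b1)
```

At convergence the fitted means $\hat{\mu}_i=\operatorname{exp}(b_0+b_1x_i)$ satisfy $\sum_i(y_i-\hat{\mu}_i)=0$ and $\sum_i x_i(y_i-\hat{\mu}_i)=0$, which gives a simple way to check the result.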
"generalized linear model" is owned by CWoo.
This is version 14 of generalized linear model, born on 2004-07-27, modified 2007-12-18.
Classification: AMS MSC 62J12 (Statistics :: Linear inference, regression :: Generalized linear models)