generalized linear model

Given a random vector $\textbf{Y}$, called the response variable, a generalized linear model (GLM for short) is a statistical model $\{f_{\textbf{Y}}(\boldsymbol{y}\mid\boldsymbol{\theta})\}$ such that

1. the components of $\textbf{Y}$ are mutually independent,

2. $f_{Y_{i}}(y_{i}\mid\theta_{i})$ belongs to the exponential family of distributions and has the following canonical form:

 $f_{Y_{i}}(y_{i}\mid\theta_{i})=\operatorname{exp}[y_{i}\theta_{i}-b(\theta_{i})+c(y_{i})],$

where the parameter $\theta_{i}$ is called the canonical parameter and $b(\theta_{i})$ is called the cumulant function.

3. for each component or variate $Y_{i}$, with a corresponding set of $p$ covariates $X_{ij}$, there exists a monotone differentiable function $g$, called the link function, such that

 $g(\operatorname{E}[Y_{i}])={\textbf{X}_{i}}^{\operatorname{T}}\boldsymbol{\beta},$

where ${\textbf{X}_{i}}^{\operatorname{T}}=(X_{i1},\ldots,X_{ip})$, and $\boldsymbol{\beta}=(\beta_{1},\ldots,\beta_{p})^{\operatorname{T}}$ is a parameter vector.
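The three conditions can be illustrated with a small numerical sketch. It assumes a Poisson response with the log link, for which $\theta_{i}=\log\mu_{i}$, $b(\theta)=e^{\theta}$ and $c(y)=-\log y!$; the design matrix, the coefficients and the function names are made up for illustration and do not come from the entry. The sketch checks that the log-likelihood computed from the canonical form agrees with the one computed directly from the Poisson probability function.

```python
import math

def poisson_log_pmf(y, mu):
    # Direct Poisson log-probability: y*log(mu) - mu - log(y!)
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def canonical_log_density(y, theta, b, c):
    # Log of the canonical form exp[y*theta - b(theta) + c(y)]
    return y * theta - b(theta) + c(y)

def glm_log_likelihood(ys, X, beta):
    # Condition 1: the Y_i are independent, so the joint log-density is a sum.
    total = 0.0
    for y, x in zip(ys, X):
        eta = sum(xj * bj for xj, bj in zip(x, beta))  # linear predictor X_i^T beta
        # Condition 3 with the log link g(mu) = log(mu), so mu = exp(eta).
        # For the Poisson family theta = log(mu), hence theta equals eta here.
        theta = eta
        # Condition 2: canonical form with b(theta) = exp(theta), c(y) = -log(y!)
        total += canonical_log_density(
            y, theta, math.exp, lambda v: -math.lgamma(v + 1)
        )
    return total

# Hypothetical data: an intercept column plus one covariate.
X = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
beta = (0.5, 0.3)
ys = [1, 2, 4]

canonical = glm_log_likelihood(ys, X, beta)
direct = sum(
    poisson_log_pmf(y, math.exp(beta[0] + beta[1] * x[1])) for y, x in zip(ys, X)
)
print(canonical, direct)
```

Because the log link makes the linear predictor coincide with the canonical parameter, it is usually called the canonical link for the Poisson family.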

In practice, an extra parameter $\phi$, called the dispersion parameter, is introduced into the model to account for a phenomenon known as overdispersion. The density in the GLM then takes the form:

 $f_{Y_{i}}(y_{i}\mid\theta_{i})=\operatorname{exp}\left[\frac{y_{i}\theta_{i}-b(\theta_{i})}{a(\phi)}+c(y_{i},\phi)\right].$
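As a hedged illustration of the role of $\phi$, the normal distribution with mean $\mu$ and variance $\sigma^{2}$ fits this form with $\theta=\mu$, $b(\theta)=\theta^{2}/2$, $a(\phi)=\phi=\sigma^{2}$ and $c(y,\phi)=-y^{2}/(2\phi)-\frac{1}{2}\log(2\pi\phi)$. This is the standard textbook parametrization rather than something stated in the entry, and the function names and numerical values below are arbitrary. The sketch checks that the two expressions for the log-density agree.

```python
import math

def dispersion_log_density(y, theta, phi, b, a, c):
    # Log of exp[(y*theta - b(theta)) / a(phi) + c(y, phi)]
    return (y * theta - b(theta)) / a(phi) + c(y, phi)

def normal_log_pdf(y, mu, sigma2):
    # Usual Gaussian log-density with mean mu and variance sigma2
    return -0.5 * math.log(2 * math.pi * sigma2) - (y - mu) ** 2 / (2 * sigma2)

# Assumed pieces for Normal(mu, sigma^2): theta = mu, phi = sigma^2
b = lambda theta: theta ** 2 / 2
a = lambda phi: phi
c = lambda y, phi: -y ** 2 / (2 * phi) - 0.5 * math.log(2 * math.pi * phi)

y, mu, sigma2 = 1.3, 0.4, 2.5  # arbitrary illustrative values
lhs = dispersion_log_density(y, mu, sigma2, b, a, c)
rhs = normal_log_pdf(y, mu, sigma2)
print(lhs, rhs)
```

Here $a(\phi)$ only rescales the term involving $\theta_{i}$, so the mean is unaffected while the variance grows with $\phi$.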

Remarks

• Below is a table of canonical parameters and cumulant functions for some well-known distributions from the exponential family:
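As a rough illustration, the sketch below lists the canonical parameter and cumulant function for three familiar families under their usual textbook parametrizations. The entries are an assumption on that basis and are not copied from the entry. Each pair is checked numerically against the standard identity $\operatorname{E}[Y]=b'(\theta)$, which is also not stated above, using a finite-difference derivative.

```python
import math

def derivative(f, x, h=1e-6):
    # Central finite-difference approximation to f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

# (family, canonical parameter as a function of the mean, cumulant function b)
families = [
    ("Bernoulli", lambda p: math.log(p / (1 - p)), lambda t: math.log(1 + math.exp(t))),
    ("Poisson", lambda mu: math.log(mu), lambda t: math.exp(t)),
    ("Normal (unit variance)", lambda mu: mu, lambda t: t ** 2 / 2),
]

mean = 0.3  # a value that is a valid mean for all three families
errors = {}
for name, theta_of_mean, b in families:
    theta = theta_of_mean(mean)
    recovered = derivative(b, theta)  # should reproduce the mean
    errors[name] = abs(recovered - mean)
    print(name, theta, recovered)
```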

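Title: generalized linear model
Canonical name: GeneralizedLinearModel
Date of creation: 2013-03-22 14:30:11
Last modified on: 2013-03-22 14:30:11
Owner: CWoo (3771)
Last modified by: CWoo (3771)
Numerical id: 17
Author: CWoo (3771)
Entry type: Definition
Classification: msc 62J12
Defines: GLM, link function, canonical parameter, cumulant function, variance function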