regression model

In statistical modeling of $N$ data observations ($N<\infty$), two types of variables are usually defined. One is the response variable or variate, usually denoted by $Y$, and the other is the explanatory variable or covariate $X$. While there is only one response variable, there may be one or more explanatory variables. The response variable is considered random, whereas the explanatory variable(s) may or may not be random.

Based on the above setup, a univariate regression model, or simply regression model, is a statistical model with the following assumptions:

1. all of the variables, random or not, are continuous in nature (as opposed to categorical in nature)

2. the response variable $Y$ can be expressed as the sum of a function $f(\textbf{X})$, called the regression function, where $\textbf{X}$ represents the row vector of explanatory variables, and an error term $\varepsilon$:

 $Y=f(\textbf{X})+\varepsilon=f(X_{1},\ldots,X_{p})+\varepsilon$

 where $p$ is the number of explanatory variables. $f(\textbf{X})$ is called the systematic component, and $\varepsilon$ is the random error component.

3. the error component and the systematic component are independent

4. the random error variables $\varepsilon_{i}$ for the $N$ observations are iid normal with mean 0 and variance $\sigma^{2}$
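The assumptions above can be illustrated by simulating data from such a model. The following is a minimal sketch, assuming NumPy is available; the particular regression function `f`, the sample size, the covariate range, and the value of `sigma` are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_regression(f, X, sigma, rng):
    """Draw Y = f(X) + eps, with eps iid N(0, sigma^2), one value per row of X."""
    eps = rng.normal(loc=0.0, scale=sigma, size=X.shape[0])
    return f(X) + eps

def f(X):
    # Hypothetical regression function of p = 2 continuous covariates.
    return 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1]

rng = np.random.default_rng(0)
N, p = 1000, 2
X = rng.uniform(-1.0, 1.0, size=(N, p))  # continuous covariates (assumption 1)
sigma = 0.3                              # error standard deviation (assumption 4)

Y = simulate_regression(f, X, sigma, rng)

# The errors are drawn without reference to X, so the error component is
# independent of the systematic component f(X) (assumption 3).
residuals = Y - f(X)
```

Here `residuals` recovers the simulated errors exactly, so its sample mean should be close to 0 and its sample standard deviation close to `sigma`.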

Any unknown parameters appearing in the regression function $f$, other than the covariates, are called regression coefficients.

Remarks

• The conditional distribution of $Y$ given $\textbf{X}$ is normal, or Gaussian, with mean $\mu=\operatorname{E}\big[Y\mid\textbf{X}=\boldsymbol{x}\big]=\operatorname{E}\big[Y\mid X_{1}=x_{1},\ldots,X_{p}=x_{p}\big]$ and variance $\sigma^{2}$. In addition, the random variables $Y_{i}$ corresponding to the responses are independent.

• Sometimes, Condition 4 above is dropped to encompass a wider class of regression models. Models that satisfy Condition 4 are generally called normal, or Gaussian, regression models; those that do not are called non-normal regression models. Some well-known non-normal regression models are logistic regression for binary data and Poisson regression for count data.

• A regression model can be classified by the number of explanatory variables. If there is only one explanatory variable, it is called a simple regression model. Otherwise, it is a multiple regression model.

• A regression model can also be classified by the form of the regression function $f$. If $f$ can be expressed as a linear combination of the regression coefficients:

 $f(\textbf{X})=\beta_{0}z_{0}(\textbf{X})+\cdots+\beta_{k}z_{k}(\textbf{X}),$

where the functions $z_{i}(\textbf{X})$ do not contain any regression coefficients, then the model is called a linear regression model. Two examples of linear regression models are:

 $Y=\beta_{0}+\beta_{1}X_{1}+\beta_{2}X_{2}+\beta_{3}X_{1}X_{2}+\varepsilon$

and

 $Y=\beta_{0}+\beta_{1}X+\cdots+\beta_{k}X^{k}+\varepsilon$

The second is called a polynomial regression model. Linear regression models belong to a more general class of statistical models called the general linear model, where explanatory variables are no longer restricted to continuous ones. When $f$ cannot be expressed linearly in terms of the regression coefficients, the model is known as a non-linear regression model. An example of a non-linear regression model is

 $Y=\beta_{0}+\frac{1}{\beta_{1}+\beta_{2}X}+\varepsilon$

• The univariate regression model can be generalized to what is known as the multivariate regression model, where at least two response variables are considered.
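The linear form $f(\textbf{X})=\beta_{0}z_{0}(\textbf{X})+\cdots+\beta_{k}z_{k}(\textbf{X})$ from the remarks can be sketched in code by stacking the functions $z_{i}$ as columns of a matrix. The example below uses the interaction model $Y=\beta_{0}+\beta_{1}X_{1}+\beta_{2}X_{2}+\beta_{3}X_{1}X_{2}+\varepsilon$. Estimating the coefficients by ordinary least squares is an assumption of this sketch, since the entry itself does not prescribe an estimation method; the coefficient values, sample size, and the helper name `design_matrix` are likewise hypothetical.

```python
import numpy as np

def design_matrix(X, basis):
    """Stack z_0(X), ..., z_k(X) as columns; each z_i maps an (N, p) array to length N."""
    return np.column_stack([z(X) for z in basis])

# Basis functions z_i for Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 + eps.
# None of them contains a regression coefficient, so the model is linear.
interaction_basis = [
    lambda X: np.ones(X.shape[0]),   # z_0 = 1
    lambda X: X[:, 0],               # z_1 = X1
    lambda X: X[:, 1],               # z_2 = X2
    lambda X: X[:, 0] * X[:, 1],     # z_3 = X1 * X2
]

rng = np.random.default_rng(1)
N = 2000
X = rng.uniform(-1.0, 1.0, size=(N, 2))
beta_true = np.array([1.0, 2.0, -0.5, 0.75])  # hypothetical coefficients

Z = design_matrix(X, interaction_basis)
Y = Z @ beta_true + rng.normal(0.0, 0.1, size=N)  # iid normal errors

# Ordinary least squares estimate of the regression coefficients.
beta_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
```

A polynomial regression model fits the same pattern with basis functions $z_{i}(X)=X^{i}$. The non-linear example above does not, because $\beta_{1}$ and $\beta_{2}$ appear inside the denominator and cannot be separated into coefficient-free functions $z_{i}$.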

Title: regression model
Canonical name: RegressionModel
Date of creation: 2013-03-22 14:30:31
Last modified on: 2013-03-22 14:30:31
Owner: CWoo (3771)
Last modified by: CWoo (3771)
Numerical id: 10
Author: CWoo (3771)
Entry type: Definition
Classification: msc 62J02, msc 62J05
Synonym: univariate regression model
Related topic: LinearLeastSquaresFit
Defines: regression function, regression coefficient, simple regression model, multiple regression model, linear regression model, polynomial regression model, non-linear regression model