regression model


In statistical modeling of $N$ data observations ($N < \infty$), two types of variables are usually defined. One is the response variable, or variate, usually denoted by $Y$, and the other is the explanatory variable, or covariate, $X$. While there is only one response variable, there may be one or more explanatory variables. The response variable is considered random, whereas the explanatory variable(s) may or may not be random.

Based on the above setup, a univariate regression model, or simply regression model, is a statistical model with the following assumptions:

  1. all of the variables, random or not, are continuous in nature (as opposed to categorical in nature)

  2. the response variable $Y$ can be expressed as the sum of a function $f(\mathbf{X})$, called the regression function, where $\mathbf{X}$ is the row vector of explanatory variables, and an error term $\varepsilon$:

    $$Y = f(\mathbf{X}) + \varepsilon = f(X_1, \ldots, X_p) + \varepsilon$$

    where $p$ is the number of explanatory variables. $f(\mathbf{X})$ is called the systematic component, and $\varepsilon$ is the random error component.

  3. the error component and the systematic component are independent

  4. the random errors $\varepsilon_i$ for the $N$ observations are iid normal with mean $0$ and variance $\sigma^2$

Any unknown parameters appearing in the regression function $f$, other than the covariates, are called regression coefficients.
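
The assumptions above can be illustrated by simulating data from a model that satisfies them. The following is a minimal sketch, not part of the original entry: the particular regression function, the uniform distribution of the covariates, and the names `regression_function` and `simulate` are all hypothetical choices made for illustration.

```python
import random


def regression_function(x1, x2, beta):
    """Hypothetical f(X1, X2) = b0 + b1*X1 + b2*X2 + b3*X1*X2 (illustrative choice)."""
    b0, b1, b2, b3 = beta
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def simulate(n, beta, sigma, seed=0):
    """Draw n observations (x1, x2, y) with y = f(x1, x2) + eps."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Continuous covariates (Condition 1); uniform on [-1, 1] is an assumption.
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)
        # Error drawn independently of the covariates, iid N(0, sigma^2)
        # (Conditions 3 and 4).
        eps = rng.gauss(0.0, sigma)
        # Systematic component plus random error (Condition 2).
        y = regression_function(x1, x2, beta) + eps
        data.append((x1, x2, y))
    return data


# Example: 2000 observations with hypothetical coefficients and sigma = 0.5.
beta = (1.0, 2.0, -0.5, 0.3)
observations = simulate(2000, beta, sigma=0.5, seed=1)
```

In this sketch the entries of `beta` play the role of the regression coefficients: they are the only quantities in $f$ besides the covariates, and in practice they would be unknown and estimated from the observations.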

Remarks

  • The conditional distribution of $Y$ given $\mathbf{X} = \mathbf{x}$ is normal, or Gaussian, with mean $\mu = E[Y \mid \mathbf{X} = \mathbf{x}] = E[Y \mid X_1 = x_1, \ldots, X_p = x_p]$ and variance $\sigma^2$. In addition, the random variables $Y_i$ corresponding to the responses are independent.

  • Sometimes, Condition 4 above is dropped to encompass a wider class of regression models. Models that satisfy Condition 4 are generally called normal, or Gaussian, regression models. Models that do not are classified under the non-linear regression models discussed below. Some well-known non-normal regression models are logistic regression for binary data and Poisson regression for count data.

  • A regression model can be classified by the number of explanatory variables. If there is only one explanatory variable, it is called a simple regression model. Otherwise, it is a multiple regression model.

  • A regression model can also be classified by the form of the regression function $f$. If $f$ can be expressed as a linear combination of the regression coefficients:

    $$f(\mathbf{X}) = \beta_0 z_0(\mathbf{X}) + \cdots + \beta_k z_k(\mathbf{X}),$$

    where the functions $z_i(\mathbf{X})$ do not contain any regression coefficients, then the model is called a linear regression model. Two examples of linear regression models are:

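    $$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon$$

    and

    $$Y = \beta_0 + \beta_1 X + \cdots + \beta_k X^k + \varepsilon$$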

    The second is called a polynomial regression model. Linear regression models belong to a more general class of statistical models called general linear models, in which the explanatory variables are no longer restricted to be continuous. When $f$ cannot be expressed linearly in terms of the regression coefficients, the model is known as a non-linear regression model. An example of a non-linear regression model is

    $$Y = \beta_0 + \frac{1}{\beta_1 + \beta_2 X} + \varepsilon$$

  • The univariate regression model can be generalized to what is known as the multivariate regression model, where at least two response variables are considered.
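
Because a linear regression model is linear in its coefficients, the coefficients can be estimated by solving a linear system. The following is a minimal sketch, assuming ordinary least squares via the normal equations, a technique the entry itself does not describe. It uses the first example model from the remark on linear regression models, with basis functions $z_0 = 1$, $z_1 = X_1$, $z_2 = X_2$, $z_3 = X_1 X_2$. The function names, the coefficient values, and the simulated data are hypothetical.

```python
import random


def design_row(x1, x2):
    """Basis functions z_0..z_3 for Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 + eps."""
    return [1.0, x1, x2, x1 * x2]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, ys):
    """Estimate beta from the normal equations (Z^T Z) beta = Z^T y."""
    k = len(rows[0])
    ztz = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    zty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ztz, zty)


# Simulated data with hypothetical coefficients and N(0, 0.1^2) errors.
rng = random.Random(0)
true_beta = [1.0, 2.0, -0.5, 0.3]
xs = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(500)]
rows = [design_row(x1, x2) for x1, x2 in xs]
ys = [sum(b * z for b, z in zip(true_beta, row)) + rng.gauss(0.0, 0.1)
      for row in rows]
beta_hat = least_squares(rows, ys)  # estimates of the regression coefficients
```

The same code fits the polynomial model if `design_row` returns the powers of a single covariate instead. The non-linear example does not fit this pattern, since its coefficients cannot be separated into a linear system. For anything beyond small, well-conditioned problems, a library least squares routine would normally be preferable to hand-written normal equations.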

Title regression model
Canonical name RegressionModel
Date of creation 2013-03-22 14:30:31
Last modified on 2013-03-22 14:30:31
Owner CWoo (3771)
Last modified by CWoo (3771)
Numerical id 10
Author CWoo (3771)
Entry type Definition
Classification msc 62J02
Classification msc 62J05
Synonym univariate regression model
Related topic LinearLeastSquaresFit
Defines regression function
Defines regression coefficient
Defines simple regression model
Defines multiple regression model
Defines linear regression model
Defines polynomial regression model
Defines non-linear regression model