% PlanetMath entry: variance
\PMlinkescapeword{measure}
\PMlinkescapeword{variation}
\subsection*{Definition}
The \emph{variance} of a real-valued random variable $X$ is
\[
\Var X = \E\bigl[ (X - m)^2 \bigr]\,, \quad m = \E X\,,
\]
provided that both expectations $\E X$ and $\E[(X-m)^2]$ exist.
The variance of $X$ is often denoted by $\sigma^2(X)$, $\sigma^2_X$,
or simply $\sigma^2$.
The exponent on $\sigma$ is there so that the number
$\sigma = \sqrt{\sigma^2}$
is measured in the same units as the random variable $X$
itself.
The quantity $\sigma = \sqrt{\Var X}$ is called the \emph{standard deviation}
of $X$;
because its units match those of $X$,
the standard deviation, rather than the variance, is usually the quantity quoted
to describe an empirical probability distribution.
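As a simple illustration of the definition, suppose $X$ is a Bernoulli random variable that takes the value $1$ with probability $p$ and the value $0$ with probability $1-p$ (the symbol $p$ is introduced only for this example). Then $m = \E X = p$, and
\[
\Var X = (1-p)^2 \, p + (0-p)^2 \, (1-p) = p(1-p)\,, \quad \sigma = \sqrt{p(1-p)}\,.
\]
For a fair coin, $p = 1/2$, this gives $\Var X = 1/4$ and $\sigma = 1/2$.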
\subsection*{Usage}
The variance is a measure of the dispersion or variation
of a random variable
about its mean $m$.
It is not the best measure of dispersion for every random variable,
but compared to other measures,
such as the absolute mean deviation, $\E[ \abs{X-m} ]$,
the variance is the most tractable analytically.
The variance is closely related to the $\Le^2$ norm for
random variables over a probability space.
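One way to make this relation concrete, assuming $X$ is square-integrable and writing $\| Y \|_2 = \sqrt{\E[Y^2]}$ for the $\Le^2$ norm (this notation is an assumption made here for illustration), is
\[
\Var X = \E\bigl[ (X-m)^2 \bigr] = \| X - m \|_2^2\,, \quad \sigma = \| X - m \|_2\,,
\]
so the standard deviation is the $\Le^2$ distance from $X$ to the constant $m$.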
\subsection*{Properties}
\begin{enumerate}
\item
The variance of $X$ is the second moment of $X$ minus
the square of the first moment:
\[
\Var X = \E[X^2] - \E[X]^2\,.
\]
This formula is often used to calculate variance analytically.
\item
Variance is not a linear function. It scales quadratically,
and is not affected by shifts in the mean of the distribution:
\[
\Var[ aX + b ] = a^2 \Var X\,, \quad \text{ for any $a, b \in \real$.}
\]
\item
A random variable $X$ is constant almost surely if and only
if $\Var X = 0$.
\item
The variance can also be characterized as
the minimum of expected squared deviation of a random variable from any point:
\[
\Var X = \inf_{a \in \real} \E[(X-a)^2]\,.
\]
\item
For any two random variables $X$ and $Y$ whose variances exist,
the variance of the linear combination $aX + bY$
can be expressed in terms of their covariance:
\[
\Var[aX+bY] = a^2 \Var X + b^2 \Var Y + 2ab \Cov[X,Y]\,,
\]
where $\Cov[X,Y] = \E[(X-\E X)(Y-\E Y)]$,
and $a, b \in \real$.
\item
For a random variable $X$, with actual observations $x_1, \dotsc, x_n$,
its variance is often estimated
empirically with the \emph{sample variance}:
\[
\Var X \approx s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2\,,
\quad
\bar{x} = \frac{1}{n} \sum_{j=1}^n x_j\,.
\]
\end{enumerate}
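The first, second and fourth properties follow from short computations, sketched here under the assumption that $\E[X^2]$ is finite. Expanding the square and using linearity of expectation, with $m = \E X$,
\[
\E[(X-m)^2] = \E[X^2] - 2m \, \E X + m^2 = \E[X^2] - \E[X]^2\,.
\]
Since $\E[aX+b] = am + b$,
\[
\Var[aX+b] = \E\bigl[ (aX + b - am - b)^2 \bigr] = a^2 \, \E[(X-m)^2] = a^2 \Var X\,.
\]
For any $a \in \real$, writing $X - a = (X-m) + (m-a)$ and using $\E[X-m] = 0$,
\[
\E[(X-a)^2] = \Var X + (m-a)^2 \geq \Var X\,,
\]
with equality exactly when $a = m$, so the infimum is attained at the mean.
As a numerical illustration of the sample variance, take the hypothetical observations $2, 4, 4, 6$ (made up for this example), so that $n = 4$, $\bar{x} = 4$, and
\[
s^2 = \frac{1}{3} \bigl( (2-4)^2 + (4-4)^2 + (4-4)^2 + (6-4)^2 \bigr) = \frac{8}{3}\,.
\]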