\PMlinkescapeword{measure}
\PMlinkescapeword{variation}

\subsection*{Definition}
The \emph{variance} of a real-valued random variable $X$ is
\[
\Var X = \E\bigl[ (X - m)^2 \bigr]\,, \quad m = \E X\,,
\]
provided that both expectations $\E X$ and $\E[(X-m)^2]$ exist.

The variance of $X$ is often denoted by $\sigma^2(X)$, $\sigma^2_X$,
or simply $\sigma^2$.
The exponent on $\sigma$ is there so that the number
$\sigma = \sqrt{\sigma^2}$
is measured in the same units as the random variable $X$
itself.

The quantity $\sigma = \sqrt{\Var X}$ is called the \emph{standard deviation}
of $X$;
because its units match those of $X$,
the standard deviation, rather than the variance, is usually the quantity quoted
to describe an empirical probability distribution.

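
As a small illustration of the definition (an example chosen here, not part of the definition itself),
suppose $X$ takes the value $1$ with probability $p$ and the value $0$ with probability $1-p$.
Then $m = \E X = p$, and
\[
\Var X = (1-p)^2\, p + (0-p)^2\, (1-p) = p(1-p)\,,
\qquad
\sigma = \sqrt{p(1-p)}\,.
\]
For a fair coin, $p = 1/2$, this gives $\Var X = 1/4$ and $\sigma = 1/2$.
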
\subsection*{Usage}

The variance is a measure of the dispersion or variation
of a random variable
about its mean $m$.

It is not the best measure of dispersion for every random variable,
but compared to other measures,
such as the mean absolute deviation $\E[ \abs{X-m} ]$,
the variance is the most tractable analytically.

The variance is closely related to the $\Le^2$ norm for
random variables over a probability space.

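
The relation to the $\Le^2$ norm can be spelled out as follows.
Here we assume the usual convention, with notation introduced only for this remark,
that $\|Y\|_2 = \bigl(\E[Y^2]\bigr)^{1/2}$ for a square-integrable random variable $Y$.
With $m = \E X$,
\[
\Var X = \| X - m \|_2^2\,, \qquad \sigma = \| X - m \|_2\,,
\]
so the standard deviation is the $\Le^2$ distance from $X$ to the constant $m$.
The Cauchy--Schwarz inequality then compares the two measures of dispersion mentioned above:
\[
\E[ \abs{X-m} ] \leq \bigl( \E[(X-m)^2] \bigr)^{1/2} = \sigma\,.
\]
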
\subsection*{Properties}

\begin{enumerate}
\item
The variance of $X$ is the second moment of $X$ minus
the square of the first moment:
\[
\Var X = \E[X^2] - (\E[X])^2\,.
\]
This formula is often used to calculate the variance analytically.

\item
Variance is not a linear function. It scales quadratically,
and is not affected by shifts in the mean of the distribution:
\[
\Var[ aX + b ] = a^2 \Var X\,, \quad \text{ for any $a, b \in \real$.}
\]

\item
A random variable $X$ is constant almost surely if and only
if $\Var X = 0$.

\item
The variance can also be characterized as
the minimum expected squared deviation of a random variable from a point:
\[
\Var X = \inf_{a \in \real} \E[(X-a)^2]\,.
\]

\item
For any two random variables $X$ and $Y$ whose variances exist,
the variance of the linear combination $aX + bY$
can be expressed in terms of their covariance:
\[
\Var[aX+bY] = a^2 \Var X + b^2 \Var Y + 2ab \Cov[X,Y]\,,
\]
where $\Cov[X,Y] = \E[(X-\E X)(Y-\E Y)]$,
and $a, b \in \real$.

\item
For a random variable $X$ with observations $x_1, \dotsc, x_n$,
the variance is often estimated
empirically by the \emph{sample variance}:
\[
\Var X \approx s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2\,,
\quad
\bar{x} = \frac{1}{n} \sum_{j=1}^n x_j\,.
\]

\end{enumerate}
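
The first and fourth properties can be checked with one short computation,
sketched here under the assumption that $\E[X^2]$ is finite.
Write $m = \E X$. For any $a \in \real$, expanding the square around $m$ gives
\[
\E[(X-a)^2] = \E[(X-m)^2] + 2(m-a)\,\E[X-m] + (m-a)^2 = \Var X + (m-a)^2\,,
\]
since $\E[X-m] = 0$.
Taking $a = 0$ yields $\E[X^2] = \Var X + m^2$, which is the first property.
The right-hand side is smallest exactly when $a = m$,
so the infimum in the fourth property is attained at the mean.

As a numerical illustration of the sample variance, with data invented for this example,
take the observations $2, 4, 4, 6$, so that $n = 4$ and $\bar{x} = 4$. Then
\[
s^2 = \frac{1}{3} \bigl( (2-4)^2 + (4-4)^2 + (4-4)^2 + (6-4)^2 \bigr) = \frac{8}{3}\,.
\]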