# mutual information

Let $(\Omega,\mathcal{F},\mu)$ be a discrete probability space, and let $X$ and $Y$ be discrete random variables on $\Omega$.

The *mutual information* $I[X;Y]$, read as “the mutual information of $X$ and $Y$,” is defined as

$$I[X;Y]=\sum_{x\in\Omega}\sum_{y\in\Omega}\mu(X=x,Y=y)\log\frac{\mu(X=x,Y=y)}{\mu(X=x)\mu(Y=y)}=D(\mu(x,y)\,\|\,\mu(x)\mu(y)),$$

where $D$ denotes the relative entropy.

Mutual information, or just information, is measured in bits if the logarithm is to the base 2, and in “nats” when using the natural logarithm.
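As a concrete illustration (not part of the original article), the definition can be evaluated directly from a joint probability mass function. The helper below is a hypothetical sketch: it takes the joint pmf as a dictionary `{(x, y): probability}`, forms the marginals, and sums the defining expression; `base=2` gives bits and `base=math.e` gives nats.

```python
import math

def mutual_information(joint, base=2.0):
    """I[X;Y] for a joint pmf given as {(x, y): p}.

    base=2 measures in bits; base=math.e measures in nats.
    """
    # Marginal distributions mu(X=x) and mu(Y=y).
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # Sum mu(x,y) * log( mu(x,y) / (mu(x) mu(y)) ) over pairs with p > 0.
    return sum(
        p * math.log(p / (px[x] * py[y]), base)
        for (x, y), p in joint.items()
        if p > 0
    )

# Two perfectly correlated fair bits: knowing X determines Y,
# so X carries one full bit of information about Y.
joint = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(joint))  # 1.0 (bits)
```

Terms with zero joint probability are skipped, following the usual convention $0\log 0=0$.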

# Discussion

The most obvious characteristic of mutual information is that it depends on both $X$ and $Y$. There is no information in a vacuum—information is always *about* something. In this case, $I[X;Y]$ is the information in $X$ about $Y$. As its name suggests, mutual information is symmetric, $I[X;Y]=I[Y;X]$, so any information $X$ carries about $Y$, $Y$ also carries about $X$.

The definition in terms of relative entropy gives a useful interpretation of $I[X;Y]$ as a kind of “distance” between the joint distribution $\mu(x,y)$ and the product distribution $\mu(x)\mu(y)$. Recall, however, that relative entropy is not a true distance, so this is only a conceptual tool. Still, it captures another intuitive notion of information. Remember that for $X,Y$ independent, $\mu(x,y)=\mu(x)\mu(y)$. The relative entropy “distance” is then zero, and we have $I[X;Y]=0$, as one would expect for independent random variables.
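The independence case can be checked numerically (a small illustration added here, not from the original): for the product distribution of two fair coins, every term of the relative-entropy sum has $\log\frac{\mu(x,y)}{\mu(x)\mu(y)}=\log 1=0$.

```python
import math

# Joint pmf of two independent fair coins: mu(x, y) = mu(x) * mu(y) = 1/4.
joint = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
marg = {0: 0.5, 1: 0.5}

# D(mu(x,y) || mu(x) mu(y)): every log ratio is log 1 = 0.
mi = sum(
    p * math.log(p / (marg[x] * marg[y]), 2)
    for (x, y), p in joint.items()
)
print(mi)  # 0.0
```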

A number of useful expressions, most apparent from the definition, relate mutual information to the entropy $H$:

$$\begin{aligned}
0\leq I[X;Y] &\leq H[X] && (1)\\
I[X;Y] &= H[X]-H[X|Y] && (2)\\
I[X;Y] &= H[X]+H[Y]-H[X,Y] && (3)\\
I[X;X] &= H[X] && (4)
\end{aligned}$$

Recall that the entropy $H[X]$ quantifies our uncertainty about $X$. The last line justifies the description of entropy as “self-information.”
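Identities (3) and (4) can be verified on a small example (an illustration added here, with a made-up joint pmf of two correlated bits):

```python
import math

def entropy(pmf, base=2.0):
    """Shannon entropy H of a pmf given as {outcome: probability}."""
    return -sum(p * math.log(p, base) for p in pmf.values() if p > 0)

# A hypothetical joint pmf of two correlated (not independent) bits.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {0: 0.5, 1: 0.5}  # marginal of X
py = {0: 0.5, 1: 0.5}  # marginal of Y

# I[X;Y] computed straight from the definition.
mi = sum(
    p * math.log(p / (px[x] * py[y]), 2)
    for (x, y), p in joint.items()
    if p > 0
)

# Identity (3): I[X;Y] = H[X] + H[Y] - H[X,Y].
assert abs(mi - (entropy(px) + entropy(py) - entropy(joint))) < 1e-12

# Identity (4): I[X;X] = H[X]; the joint pmf of (X, X) sits on the diagonal.
diag = {(x, x): p for x, p in px.items()}
mi_xx = sum(
    p * math.log(p / (px[x] * px[y]), 2) for (x, y), p in diag.items()
)
assert abs(mi_xx - entropy(px)) < 1e-12
```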

# Historical Notes

Mutual information, or simply information, was introduced by Shannon in his landmark 1948 paper “A Mathematical Theory of Communication.”

## Mathematics Subject Classification

94A17
