# mutual information

Let $(\Omega,\mathcal{F},\mu)$ be a discrete probability space, and let $X$ and $Y$ be discrete random variables on $\Omega$.

The mutual information $I[X;Y]$, read as “the mutual information of $X$ and $Y$,” is defined as

$$
\begin{aligned}
I[X;Y] &= \sum_{x}\sum_{y}\mu(X=x,Y=y)\log\frac{\mu(X=x,Y=y)}{\mu(X=x)\,\mu(Y=y)} \\
&= D(\mu(x,y)\,\|\,\mu(x)\mu(y)),
\end{aligned}
$$

where the sums run over the values taken by $X$ and $Y$, terms with $\mu(X=x,Y=y)=0$ are taken to be zero, and $D$ denotes the relative entropy.

Mutual information, or just information, is measured in bits if the logarithm is to the base 2, and in “nats” when using the natural logarithm.
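As a minimal sketch of the definition, the double sum can be evaluated directly once the joint distribution is stored as a table. The function name `mutual_information`, the array layout (rows for values of $X$, columns for values of $Y$), and the example joint distribution are assumptions made here for illustration; they are not part of the entry.

```python
import numpy as np

def mutual_information(joint, base=2.0):
    """Mutual information of a joint pmf given as a 2-D array.

    joint[i, j] is assumed to hold mu(X = x_i, Y = y_j); entries must be
    non-negative and sum to 1. base=2 gives bits, base=np.e gives nats.
    """
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)   # marginal of X, shape (n, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, m)
    product = px * py                       # mu(x) mu(y), shape (n, m)
    support = joint > 0                     # terms with mu(x, y) = 0 contribute 0
    ratio = joint[support] / product[support]
    return float(np.sum(joint[support] * np.log(ratio)) / np.log(base))

# Hypothetical example: X is a fair bit, Y is X flipped with probability 0.1.
joint = np.array([[0.45, 0.05],
                  [0.05, 0.45]])
bits = mutual_information(joint, base=2)     # about 0.531 bits
nats = mutual_information(joint, base=np.e)  # same quantity, in nats
```

Changing the base only rescales the result: the value in nats equals the value in bits multiplied by $\ln 2$.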

## Discussion

The most obvious characteristic of mutual information is that it depends on both $X$ and $Y$. There is no information in a vacuum; information is always about something. In this case, $I[X;Y]$ is the information in $X$ about $Y$. As its name suggests, mutual information is symmetric, $I[X;Y]=I[Y;X]$, so any information $X$ carries about $Y$, $Y$ also carries about $X$.

The definition in terms of relative entropy gives a useful interpretation of $I[X;Y]$ as a kind of “distance” between the joint distribution $\mu(x,y)$ and the product distribution $\mu(x)\mu(y)$. Relative entropy is not a true distance (it is not symmetric and does not satisfy the triangle inequality), so this is only a conceptual tool. It does, however, capture another intuitive notion of information. If $X$ and $Y$ are independent, then $\mu(x,y)=\mu(x)\mu(y)$, so the relative entropy “distance” is zero and $I[X;Y]=0$, as one would expect for independent random variables. Conversely, since relative entropy vanishes only when its two arguments coincide, $I[X;Y]=0$ implies that $X$ and $Y$ are independent.
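The two points in the paragraph above can be checked numerically. The following sketch, with a hypothetical helper `relative_entropy` and made-up distributions, shows that $D(p\|q)$ and $D(q\|p)$ generally differ, and that the relative entropy between a joint distribution and the product of its marginals is zero when the joint is built from independent variables.

```python
import numpy as np

def relative_entropy(p, q, base=2.0):
    """D(p || q) for pmfs on the same finite set; assumes q > 0 wherever p > 0."""
    p = np.asarray(p, dtype=float).ravel()
    q = np.asarray(q, dtype=float).ravel()
    support = p > 0                          # 0 log 0 is taken to be 0
    return float(np.sum(p[support] * np.log(p[support] / q[support])) / np.log(base))

# Asymmetry: swapping the arguments changes the value.
p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
d_pq = relative_entropy(p, q)
d_qp = relative_entropy(q, p)

# Independence: the joint equals the product of its marginals, so D is zero
# (up to floating-point rounding).
px = np.array([0.3, 0.7])
py = np.array([0.2, 0.5, 0.3])
joint = np.outer(px, py)
product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
mi_independent = relative_entropy(joint, product)
```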

A number of useful expressions, most apparent from the definition, relate mutual information to the entropy $H$:

$$
\begin{aligned}
0\leq I[X;Y] &\leq H[X] && (1)\\
I[X;Y] &= H[X]-H[X|Y] && (2)\\
I[X;Y] &= H[X]+H[Y]-H[X,Y] && (3)\\
I[X;X] &= H[X] && (4)
\end{aligned}
$$

Recall that the entropy $H[X]$ quantifies our uncertainty about $X$. The last line justifies the description of entropy as “self-information.”
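The identities (1) through (4) can be sanity-checked on a small example. The sketch below assumes a hypothetical joint distribution and a helper `entropy` written for this illustration; it is a numerical check under those assumptions, not a proof.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a pmf given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]                      # 0 log 0 is taken to be 0
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint pmf: rows index values of X, columns index values of Y.
joint = np.array([[0.30, 0.10, 0.05],
                  [0.05, 0.20, 0.30]])
px, py = joint.sum(axis=1), joint.sum(axis=0)

# Identity (3): I[X;Y] = H[X] + H[Y] - H[X,Y]
mi = entropy(px) + entropy(py) - entropy(joint)

# Identity (2): H[X|Y] = sum_y mu(y) H[X | Y = y], then I[X;Y] = H[X] - H[X|Y]
h_x_given_y = sum(py[j] * entropy(joint[:, j] / py[j]) for j in range(len(py)))
mi_from_conditional = entropy(px) - h_x_given_y

# Identity (4): the joint pmf of (X, X) is diagonal, so H[X,X] = H[X]
joint_xx = np.diag(px)
mi_self = 2 * entropy(px) - entropy(joint_xx)
```

Here `mi` and `mi_from_conditional` agree up to rounding, `mi` lies between 0 and `entropy(px)` as in (1), and `mi_self` equals `entropy(px)`.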

## Historical Notes

Mutual information, or simply information, was introduced by Shannon in his landmark 1948 paper “A Mathematical Theory of Communication.”

| Field | Value |
| --- | --- |
| Title | mutual information |
| Canonical name | MutualInformation |
| Date of creation | 2013-03-22 12:37:35 |
| Last modified on | 2013-03-22 12:37:35 |
| Owner | drummond (72) |
| Last modified by | drummond (72) |
| Numerical id | 4 |
| Author | drummond (72) |
| Entry type | Definition |
| Classification | msc 94A17 |
| Synonym | information |
| Related topic | RelativeEntropy, Entropy, ShannonsTheoremEntropy, DynamicStream |
| Defines | information, mutual information |