# Mutual Information

Let $(\Omega,\mathcal{F},\mu)$ be a discrete probability space, and let $X$ and $Y$ be discrete random variables on $\Omega$.

The *mutual information* $I[X;Y]$, read as “the mutual information of $X$ and $Y$,” is defined as

$$
\begin{aligned}
I[X;Y] &= \sum_{x\in\Omega}\sum_{y\in\Omega}\mu(X=x,Y=y)\log\frac{\mu(X=x,Y=y)}{\mu(X=x)\,\mu(Y=y)} \\
&= D(\mu(x,y)\,\|\,\mu(x)\mu(y)),
\end{aligned}
$$

where $D$ denotes the relative entropy.
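The double sum can be evaluated directly for a small joint distribution. The following is a minimal Python sketch, not part of the original entry: the function name `mutual_information`, the nested-list representation of the joint distribution, and the two example tables are assumptions made for illustration. Terms with zero joint probability are skipped, following the usual convention $0\log 0=0$.

```python
import math

def mutual_information(joint, base=2.0):
    """Mutual information of a joint pmf, joint[i][j] = P(X = x_i, Y = y_j).

    Hypothetical helper for illustration; the result is in bits when base is 2.
    """
    # Marginal distributions: row sums give P(X = x_i), column sums give P(Y = y_j).
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:  # convention: 0 * log 0 = 0
                total += pxy * math.log(pxy / (px[i] * py[j]), base)
    return total

# Assumed example 1: X is a fair bit and Y is an exact copy of X.
copy_joint = [[0.5, 0.0],
              [0.0, 0.5]]

# Assumed example 2: Y equals X except that it is flipped with probability 0.2.
noisy_joint = [[0.4, 0.1],
               [0.1, 0.4]]

mi_copy = mutual_information(copy_joint)    # one full bit
mi_noisy = mutual_information(noisy_joint)  # about 0.278 bits
print(mi_copy, mi_noisy)
```

In the first table, observing $Y$ determines $X$ completely, so the mutual information is the full bit carried by $X$; in the second, the noise reduces it to roughly 0.278 bits.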

Mutual information, or just information, is measured in bits if the logarithm is to the base 2, and in “nats” when using the natural logarithm.
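The two units differ only by a constant factor, because $\log_2 t=\ln t/\ln 2$. A small sketch of the conversion follows; the helper names `nats_to_bits` and `bits_to_nats` are hypothetical and not taken from the original entry.

```python
import math

def nats_to_bits(nats):
    """Convert an information quantity from nats to bits (divide by ln 2)."""
    return nats / math.log(2.0)

def bits_to_nats(bits):
    """Convert an information quantity from bits to nats (multiply by ln 2)."""
    return bits * math.log(2.0)

one_bit_in_nats = bits_to_nats(1.0)  # ln 2, roughly 0.693 nats
print(one_bit_in_nats)
```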

## Discussion

The most obvious characteristic of mutual information is that it depends on both $X$ and $Y$. There is no information in a vacuum; information is always *about* something. In this case, $I[X;Y]$ is the information in $X$ about $Y$. As its name suggests, mutual information is symmetric, $I[X;Y]=I[Y;X]$, so any information $X$ carries about $Y$, $Y$ also carries about $X$.

The definition in terms of relative entropy gives a useful interpretation of $I[X;Y]$ as a kind of “distance” between the joint distribution $\mu(x,y)$ and the product distribution $\mu(x)\mu(y)$. Relative entropy is not a true distance (it is not symmetric and does not satisfy the triangle inequality), so this is only a conceptual tool. It does, however, capture another intuitive notion of information: when $X$ and $Y$ are independent, $\mu(x,y)=\mu(x)\mu(y)$, so the relative entropy “distance” is zero and $I[X;Y]=0$, as one would expect for independent random variables.
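Both observations in this paragraph can be checked numerically. The sketch below is illustrative only: the helper `relative_entropy` and all example distributions are assumptions, not part of the original entry. It assumes $q>0$ wherever $p>0$, which always holds when $q$ is the product of the marginals of $p$.

```python
import math

def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits for two pmfs given as equal-length lists.

    Hypothetical helper; assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Assumed marginals for two independent random variables X and Y.
px = [0.3, 0.7]
py = [0.6, 0.4]

# For independent variables the joint pmf is the product of the marginals
# (both flattened here into a single list over the pairs (x, y)).
joint_independent = [a * b for a in px for b in py]
product_of_marginals = [a * b for a in px for b in py]
d_independent = relative_entropy(joint_independent, product_of_marginals)  # zero

# Relative entropy is not symmetric, so it is not a true distance.
p = [0.5, 0.5]
q = [0.9, 0.1]
d_pq = relative_entropy(p, q)
d_qp = relative_entropy(q, p)
print(d_independent, d_pq, d_qp)
```

The two directions `d_pq` and `d_qp` come out different, which is why the “distance” reading is only a heuristic.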

A number of useful expressions, most apparent from the definition, relate mutual information to the entropy $H$:

$$
\begin{aligned}
0\le I[X;Y] &\le H[X] && (1)\\
I[X;Y] &= H[X]-H[X|Y] && (2)\\
I[X;Y] &= H[X]+H[Y]-H[X,Y] && (3)\\
I[X;X] &= H[X] && (4)
\end{aligned}
$$

Recall that the entropy $H[X]$ quantifies our uncertainty about $X$, so identity (2) says that $I[X;Y]$ is the reduction in uncertainty about $X$ gained by knowing $Y$. Identity (4) justifies the description of entropy as “self-information.”
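Relations (1) through (4) can be sanity-checked on a concrete example. The following sketch uses an assumed $2\times 2$ joint distribution; the helper names `entropy` and `mutual_information` are hypothetical. The conditional entropy $H[X|Y]$ is computed directly as the average over $y$ of the entropy of $X$ given $Y=y$, so that (2) and (3) are tested independently of each other.

```python
import math

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)

def mutual_information(joint):
    """I[X;Y] in bits from the defining double sum; joint[i][j] = P(X=x_i, Y=y_j)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log2(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0.0
    )

# Assumed example joint distribution (rows index X, columns index Y).
joint = [[0.3, 0.2],
         [0.1, 0.4]]
px = [sum(row) for row in joint]
py = [sum(col) for col in zip(*joint)]

h_x = entropy(px)
h_y = entropy(py)
h_xy = entropy([p for row in joint for p in row])

# H[X|Y]: average over y of the entropy of the conditional pmf of X given Y = y.
h_x_given_y = sum(
    py[j] * entropy([joint[i][j] / py[j] for i in range(len(px))])
    for j in range(len(py))
)

mi = mutual_information(joint)

# Joint distribution of the pair (X, X): all mass sits on the diagonal.
joint_xx = [[px[i] if i == j else 0.0 for j in range(len(px))]
            for i in range(len(px))]
mi_xx = mutual_information(joint_xx)

tol = 1e-12
print("(1)", -tol <= mi <= h_x + tol)
print("(2)", abs(mi - (h_x - h_x_given_y)) < tol)
print("(3)", abs(mi - (h_x + h_y - h_xy)) < tol)
print("(4)", abs(mi_xx - h_x) < tol)
```

A numerical check of this kind is not a proof, but it is a quick way to catch sign or indexing mistakes when implementing these quantities.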

## Historical Notes

Mutual information, or simply information, was introduced by Shannon in his landmark 1948 paper “A Mathematical Theory of Communication.”

| Field | Value |
|---|---|
| Title | mutual information |
| Canonical name | MutualInformation |
| Date of creation | 2013-03-22 12:37:35 |
| Last modified on | 2013-03-22 12:37:35 |
| Owner | drummond (72) |
| Last modified by | drummond (72) |
| Numerical id | 4 |
| Author | drummond (72) |
| Entry type | Definition |
| Classification | msc 94A17 |
| Synonym | information |
| Related topic | RelativeEntropy |
| Related topic | Entropy |
| Related topic | ShannonsTheoremEntropy |
| Related topic | DynamicStream |
| Defines | information |
| Defines | mutual information |