In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the “amount of information” (in units such as shannons (bits), nats or hartleys) obtained about one random variable by observing the other random variable. The concept of mutual information is intimately linked to that of entropy of a random variable, a fundamental notion in information theory that quantifies the expected “amount of information” held in a random variable.

The mutual information can be defined as:

\begin{equation*} I(X;Y) = D_{KL}( P_{X,Y} || P_{X} P_{Y} ) \end{equation*}

where $D_{KL}$ is the KL divergence, $P_{X,Y}$ is the joint distribution of $X$ and $Y$, and $P_X P_Y$ is the product of their marginal distributions.

If the two variables are independent, then $P_{X,Y} = P_X P_Y$, which means we cannot obtain any information about $X$ by observing $Y$, and consequently the KL divergence is 0. Thus mutual information actually measures the "distance" between $P_{X,Y}$ and $P_X P_Y$.

Specifically, for discrete distributions:

\begin{equation*} I(X;Y) = \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} P_{X,Y}(x,y) \log \frac{P_{X,Y}(x,y)}{P_X(x)\, P_Y(y)} \end{equation*}
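To make the discrete case concrete, here is a minimal sketch in Python (the `mutual_information` helper and the example joint tables are my own illustration, not part of any standard library). It computes the double sum above, i.e. $D_{KL}(P_{X,Y} \| P_X P_Y)$ in nats, and shows that an independent joint table gives exactly 0, matching the remark above.

```python
import numpy as np

def mutual_information(p_xy):
    """MI (in nats) of a discrete joint distribution given as a 2-D array of probabilities.

    Computes D_KL(P_XY || P_X P_Y) = sum_xy P(x,y) log( P(x,y) / (P(x) P(y)) ).
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X (rows)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y (columns)
    mask = p_xy > 0                          # skip zero cells (0 log 0 := 0)
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask])))

# Dependent joint distribution: X and Y tend to agree, so MI > 0.
p_dependent = [[0.4, 0.1],
               [0.1, 0.4]]
print(mutual_information(p_dependent))    # ~0.193 nats

# Independent joint distribution: P(x, y) = P(x) P(y), so MI = 0.
p_independent = [[0.25, 0.25],
                 [0.25, 0.25]]
print(mutual_information(p_independent))  # 0.0
```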

For continuous distributions:

\begin{equation*} I(X;Y) = \int_{\mathcal{Y}} \int_{\mathcal{X}} p_{X,Y}(x,y) \log \frac{p_{X,Y}(x,y)}{p_X(x)\, p_Y(y)} \, dx \, dy \end{equation*}

where $p_{X,Y}$ is the joint density and $p_X$, $p_Y$ are the marginal densities.
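As a standard closed-form instance of the continuous case (the derivation is not carried out here), if $X$ and $Y$ are jointly Gaussian with correlation coefficient $\rho$, evaluating the integral gives

\begin{equation*} I(X;Y) = -\frac{1}{2} \ln\!\left(1 - \rho^2\right) \end{equation*}

so the mutual information is 0 when $\rho = 0$ and grows without bound as $|\rho| \to 1$.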

The relation between mutual information, entropy, and conditional entropy is:

\begin{equation*} I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y) \end{equation*}
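This identity is easy to check numerically. The sketch below (again my own illustration, reusing the joint table from the earlier snippet) computes $H(X)$, $H(Y)$ and $H(X,Y)$ directly from the definition of Shannon entropy; the value of $H(X) + H(Y) - H(X,Y)$ agrees with the KL-divergence computation above.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a probability vector or array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]                          # 0 log 0 := 0
    return float(-np.sum(p * np.log(p)))

p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

h_x  = entropy(p_xy.sum(axis=1))          # H(X), from the row marginal
h_y  = entropy(p_xy.sum(axis=0))          # H(Y), from the column marginal
h_xy = entropy(p_xy)                      # H(X, Y), from the joint table

# I(X;Y) = H(X) + H(Y) - H(X,Y); equivalently H(X) - H(X|Y) with H(X|Y) = H(X,Y) - H(Y).
print(h_x + h_y - h_xy)                   # ~0.193 nats, matching mutual_information(p_xy) above
```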