# proof of Birkhoff-von Neumann theorem

First, we prove the following lemma:
Lemma:
A convex combination  of doubly stochastic matrices is doubly stochastic. Proof:
Let $\{A_{i}\}_{i=1}^{m}$ be a collection of $n\times n$ doubly-stochastic matrices, and suppose $\{\lambda_{i}\}_{i=1}^{m}$ is a collection of scalars satisfying $\sum_{i=1}^{m}\lambda_{i}=1$ and $\lambda_{i}\geq 0$ for each $i=1,\ldots,m$. We claim that $A=\sum_{i=1}^{m}\lambda_{i}A_{i}$ is doubly stochastic.
Take any $i\in\{1,\ldots,m\}$. Since $A_{i}$ is doubly stochastic, each of its rows and columns sum to 1. Thus each of the rows and columns of $\lambda_{i}A_{i}$ sum to $\lambda_{i}$.
By the definition of elementwise summation, given matrices $N=M_{1}+M_{2}$, the sum of the entries in the $i$th column of $N$ is clearly the sum of the sums of entries of the $i$th columns of $M_{1}$ and $M_{2}$ respectively. A similar result holds for the $j$th row.
Hence the sum of the entries in the $i$th column of $A$ is the sum of the sums of entries of the $i$th columns of $\lambda_{k}A_{k}$ for each $i$, that is, $\sum_{k=1}^{m}\lambda_{k}=1$. The sum of entries of the $j$th row of $A$ is the same. Hence $A$ is doubly stochastic.
Observe that since a permutation matrix  has a single nonzero entry, equal to 1, in each row and column, so the sum of entries in any row or column must be 1. So a permutation matrix is doubly stochastic, and on applying the lemma, we see that a convex combination of permutation matrices is doubly stochastic. This provides one direction of our proof; we now prove the more difficult direction: suppose $B$ is doubly-stochastic. Define a weighted graph $G=(V,E)$ with vertex set $V=\{r_{1},\ldots,r_{n},c_{1},\ldots,c_{n}\}$, edge set $E$, where $e_{ij}=(r_{i},c_{i})\in E$ if $B_{ij}\neq 0$, and edge weight $\omega$, where $\omega(e_{ij})=B_{ij}$. Clearly $G$ is a bipartite graph  , with partitions  $R=\{r_{1},\ldots,r_{n}\}$ and $C=\{c_{1},\ldots,c_{n}\}$, since the only edges in $E$ are between $r_{i}$ and $c_{j}$ for some $i,j\in\{1,\ldots,n\}$. Furthermore, since $B_{ij}\geq 0$, then $\omega(e)>0$ for every $e\in E$. For any $A\subset V$ define $N(A)$, the neighbourhood of $A$, to be the set of vertices $u\in V$ such that there is some $v\in A$ such that $(u,v)\in E$. We claim that, for any $v\in V$, $\sum_{u\in N(\{v\})}\omega(u,v)=1$. Take any $v\in V$; either $v\in R$ or $v\in C$. Since $G$ is bipartite, $v\in R$ implies $N(\{v\})\subset C$, and $v\in C$ implies $N(\{v\})\subset R$. Now,

 $\begin{array}[]{c}v=r_{i}\implies\sum_{u\in N(r_{i})}\omega(r_{% i},u)=\sum_{\lx@stackrel{{j=1,}}{{e_{ij}\in E}}}^{n}\omega(r_{i},% c_{j})=\sum_{\lx@stackrel{{j=1,}}{{B_{ij}\neq 0}}}^{n}B_{ij}=\sum% _{j=1}^{n}B_{ij}=1}\\ v=c_{j}\implies\sum_{u\in N(c_{j})}\omega(u,c_{j})=\sum_{% \lx@stackrel{{i=1,}}{{e_{ij}\in E}}}^{n}\omega(r_{i},c_{j})=\sum_% {\lx@stackrel{{i=1,}}{{B_{ij}\neq 0}}}^{n}B_{ij}=\sum_{i=1}^{n}B_% {ij}=1}\end{array$

since $B$ is doubly stochastic. Now, take any $A\subset R$. We have

 $\sum_{\lx@stackrel{{v\in A}}{{w\in N(A)}}}\omega(v,w)=\sum_{v\in A% }\sum_{w\in N(\{v\})}\omega(v,w)=\sum_{v\in A}1=|A|$

Let $B=N(A)$. But then clearly $A\subset N(B)$, by definition of neighbourhood. So

 $|N(A)|=|B|=\sum_{\lx@stackrel{{v\in B}}{{w\in N(B)}}}\omega(v,w)% \geq\sum_{\lx@stackrel{{v\in B}}{{w\in A}}}\omega(v,w)=\sum_{% \lx@stackrel{{w\in A}}{{v\in N(A)}}}\omega(v,w)=|A|$

So $|N(A)|\geq|A|$. We may therefore apply the graph-theoretic version of Hall’s marriage theorem  to $G$ to conclude that $G$ has a perfect matching. So let $M\subset E$ be a perfect matching for $G$. Define an $n\times n$ matrix $P$ by

 $P_{ij}=\left\{\begin{array}[]{ll}1&\text{if }e_{ij}\in M\\ 0&\text{otherwise}\end{array}\right.$

Note that $B_{ij}=0$ implies $P_{ij}=0$: if $B_{ij}=0$, then $(r_{i},c_{j})\not\in E$, so $(r_{i},c_{j})\not\in M$, which implies $P_{ij}=0$. Further, we claim that $P$ is a permutation matrix:
Let $i$ be any row of $P$. Since $M$ is a perfect matching of $G$, there exists $e_{0}\in M$ such that $r_{i}$ is an end of $e_{0}$. Let the other end be $c_{j}$ for some $i$; then $P_{ij}=1$.
Suppose $i_{i},i_{2}\in\{1,\ldots,n\}$ with $i_{1}\neq i_{2}$ and $P_{i_{1},j}=P_{i_{2},j}=1$ for some $j$. This implies $(r_{i_{1}},c_{j}),(r_{i_{2}},c_{j})\in M$, but this implies the vertex $i$ is the end of two distinct edges, which contradicts the fact that $M$ is a matching.
Hence, for each row and column of $P$, there is exactly one nonzero entry, whose value is 1. So $P$ is a permutation matrix. Define $\displaystyle{\lambda=\min_{i,j\in\{1,\ldots,n\}}\{B_{ij}\mid P_{ij}\neq 0\}}$. We see that $\lambda>0$ since $B_{ij}\geq 0$, and $P_{ij}\neq 0\implies B_{ij}\neq 0$. Further, $\lambda=B_{pq}$ for some $p,q$. Let $D=B-\lambda P$. If $D=0$, then $\lambda=1$ and $B$ is a permutation matrix, so we are done. Otherwise, note that $D$ is nonnegative; this is clear since $\lambda P_{ij}\leq\lambda\leq B_{ij}$ for any $B_{ij}\neq 0$. Notice that $D_{pq}=B_{pq}-\lambda P_{pq}=\lambda-\lambda\cdot 1=0$. Note that since every row and column of $B$ and $P$ sums to 1, that every row and column of $D=B-\lambda P$ sums to $1-\lambda$. Define $B^{\prime}=\frac{1}{1-\lambda}D$. Then every row and column of $B^{\prime}$ sums to 1, so $B^{\prime}$ is doubly stochastic. Rearranging, we have $B=\lambda P+(1-\lambda)B^{\prime}$. Clearly $B_{ij}=0$ implies that $P_{ij}=0$ which implies that $B^{\prime}_{ij}=0$, so the zero entries of $B^{\prime}$ are a superset of those of $B^{\prime}$. But notice that $B^{\prime}_{pq}=\frac{1}{1-\lambda}D_{pq}=0$, so the zero entries of $B^{\prime}$ are a strict superset of those of $B$. We have decomposed $B$ into a convex combination of a permutation matrix and another doubly stochastic matrix with strictly more zero entries than $B$. Thus we may apply this procedure repeatedly on the doubly stochastic matrix obtained from the previous step, and the number of zero entries will increase with each step. Since $B$ has at most $n^{2}$ nonzero entries, we will obtain a convex combination of permutation matrices in at most $n^{2}$ steps. Thus $B$ is indeed expressible as a convex combination of permutation matrices.

## References

G. Birkhoff, “Tres observaciones sobre el algebra lineal,” Univ. Nac. Tucumán Rev, Ser. A, no. 5, pp147–151. (1946)

Title proof of Birkhoff-von Neumann theorem ProofOfBirkhoffvonNeumannTheorem 2013-03-22 13:10:03 2013-03-22 13:10:03 Andrea Ambrosio (7332) Andrea Ambrosio (7332) 14 Andrea Ambrosio (7332) Proof msc 15A51