proof of Birkhoff-von Neumann theorem
First, we prove the following lemma:
Lemma:
A convex combination of doubly stochastic matrices is doubly stochastic.
Proof:
Let {Ai}mi=1 be a collection of n×n doubly-stochastic matrices, and suppose {λi}mi=1 is a collection of scalars satisfying
∑mi=1λi=1 and λi≥0 for each i=1,…,m.
We claim that A=∑mi=1λiAi is doubly stochastic.
Take any i∈{1,…,m}. Since Ai is doubly stochastic, each of its rows and columns sum to 1. Thus each of the rows and columns of λiAi sum to λi.
By the definition of elementwise summation, given matrices N=M1+M2, the sum of the entries in the ith column of N is clearly the sum of the sums of entries of the ith columns of M1 and M2 respectively. A similar result holds for the jth row.
Hence the sum of the entries in the ith column of A is the sum of the sums of entries of the ith columns of λkAk for each i, that is, ∑mk=1λk=1. The sum of entries of the jth row of A is the same. Hence A is doubly stochastic.
Observe that since a permutation matrix has a single nonzero entry, equal to 1, in each row and column, so the sum of entries in any row or column must be 1. So a permutation matrix is doubly stochastic, and on applying the lemma, we see that a convex combination of permutation matrices is doubly stochastic.
This provides one direction of our proof; we now prove the more difficult direction: suppose B is doubly-stochastic. Define a weighted graph G=(V,E) with vertex set V={r1,…,rn,c1,…,cn}, edge set E, where eij=(ri,ci)∈E if Bij≠0, and edge weight ω, where ω(eij)=Bij.
Clearly G is a bipartite graph
, with partitions
R={r1,…,rn} and C={c1,…,cn}, since the only edges in E are between ri and cj for some i,j∈{1,…,n}. Furthermore, since Bij≥0, then ω(e)>0 for every e∈E.
For any A⊂V define N(A), the neighbourhood of A, to be the set of vertices u∈V such that there is some v∈A such that (u,v)∈E.
We claim that, for any v∈V, ∑u∈N({v})ω(u,v)=1. Take any v∈V; either v∈R or v∈C. Since G is bipartite, v∈R implies N({v})⊂C, and v∈C implies N({v})⊂R. Now,
v=ri⟹∑u∈N(ri)ω(ri,u)=n∑j=1,eij∈Eω(ri,cj)=n∑j=1,Bij≠0Bij=n∑j=1Bij=1v=cj⟹∑u∈N(cj)ω(u,cj)=n∑i=1,eij∈Eω(ri,cj)=n∑i=1,Bij≠0Bij=n∑i=1Bij=1 |
since B is doubly stochastic. Now, take any A⊂R. We have
∑v∈Aw∈N(A)ω(v,w)=∑v∈A∑w∈N({v})ω(v,w)=∑v∈A1=|A| |
Let B=N(A). But then clearly A⊂N(B), by definition of neighbourhood. So
|N(A)|=|B|=∑v∈Bw∈N(B)ω(v,w)≥∑v∈Bw∈Aω(v,w)=∑w∈Av∈N(A)ω(v,w)=|A| |
So |N(A)|≥|A|. We may therefore apply the graph-theoretic version of Hall’s marriage theorem to G to conclude that G has a perfect matching.
So let M⊂E be a perfect matching for G. Define an n×n matrix P by
Pij={1if eij∈M0otherwise |
Note that Bij=0 implies Pij=0: if Bij=0, then
(ri,cj)∉E, so (ri,cj)∉M, which implies Pij=0.
Further, we claim that P is a permutation matrix:
Let i be any row of P. Since M is a perfect matching of G, there exists e0∈M such that ri is an end of e0. Let the other end be cj for some i; then Pij=1.
Suppose ii,i2∈{1,…,n} with i1≠i2 and Pi1,j=Pi2,j=1 for some j. This implies (ri1,cj),(ri2,cj)∈M, but this implies the vertex i is the end of two distinct edges, which contradicts the fact that M is a matching.
Hence, for each row and column of P, there is exactly one nonzero entry, whose value is 1. So P is a permutation matrix.
Define λ=mini,j∈{1,…,n}{Bij∣Pij≠0}. We see that λ>0 since Bij≥0, and Pij≠0⟹Bij≠0. Further, λ=Bpq for some p,q.
Let D=B-λP. If D=0, then λ=1 and B is a permutation matrix, so we are done. Otherwise, note that D is nonnegative; this is clear since λPij≤λ≤Bij for any Bij≠0.
Notice that Dpq=Bpq-λPpq=λ-λ⋅1=0.
Note that since every row and column of B and P sums to 1, that every row and column of D=B-λP sums to 1-λ. Define B′=11-λD. Then every row and column of B′ sums to 1, so B′ is doubly stochastic.
Rearranging, we have B=λP+(1-λ)B′. Clearly Bij=0 implies that Pij=0 which implies that B′ij=0, so the zero entries of B′ are a superset of those of B′. But notice that B′pq=11-λDpq=0, so the zero entries of B′ are a strict superset of those of B.
We have decomposed B into a convex combination of a permutation matrix and another doubly stochastic matrix with strictly more zero entries than B. Thus we may apply this procedure repeatedly on the doubly stochastic matrix obtained from the previous step, and the number of zero entries will increase with each step. Since B has at most n2 nonzero entries, we will obtain a convex combination of permutation matrices in at most n2 steps.
Thus B is indeed expressible as a convex combination of permutation matrices.
References
G. Birkhoff, “Tres observaciones sobre el algebra lineal,” Univ. Nac. Tucumán Rev, Ser. A, no. 5, pp147–151. (1946)
Title | proof of Birkhoff-von Neumann theorem |
---|---|
Canonical name | ProofOfBirkhoffvonNeumannTheorem |
Date of creation | 2013-03-22 13:10:03 |
Last modified on | 2013-03-22 13:10:03 |
Owner | Andrea Ambrosio (7332) |
Last modified by | Andrea Ambrosio (7332) |
Numerical id | 14 |
Author | Andrea Ambrosio (7332) |
Entry type | Proof |
Classification | msc 15A51 |