# simultaneous triangularisation of commuting matrices over any field

Let $\mathbf{e}_{i}$ denote the (column) vector whose $i$th position is $1$ and where all other positions are $0$. Denote by $[n]$ the set $\{1,\ldots,n\}$. Denote by $\mathrm{M}_{n}(\mathcal{K})$ the set of all $n\times n$ matrices over $\mathcal{K}$, and by $\mathrm{GL}_{n}(\mathcal{K})$ the set of all invertible elements of $\mathrm{M}_{n}(\mathcal{K})$. Let $d_{i}$ be the function which extracts the $i$th diagonal element of a matrix, i.e., $d_{i}(A)=\mathbf{e}_{i}^{\mathrm{T}}\!A\mathbf{e}_{i}$.
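The notation $d_{i}(A)=\mathbf{e}_{i}^{\mathrm{T}}\!A\mathbf{e}_{i}$ can be sketched in code as follows (a minimal illustration with NumPy; the helper `d` and the sample matrix are hypothetical, and indices are 1-based to match the text):

```python
import numpy as np

# d_i extracts the i-th diagonal entry of A as e_i^T A e_i.
# The helper uses 1-based indices to match the text.
def d(i, A):
    e = np.zeros(A.shape[0])
    e[i - 1] = 1.0
    return e @ A @ e

A = np.array([[5.0, 1.0],
              [0.0, 7.0]])
assert d(1, A) == 5.0 and d(2, A) == 7.0
```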

###### Theorem.

Let $\mathcal{K}$ be a field, let $A_{1},\ldots,A_{r}\in\mathrm{M}_{n}(\mathcal{K})$ be pairwise commuting matrices, and let $\mathcal{L}$ be a field extension of $\mathcal{K}$ in which the characteristic polynomials of all $A_{k}$ split (http://planetmath.org/SplittingField). Then there exists some $P\in\mathrm{GL}_{n}(\mathcal{L})$ such that

1. $P^{-1}A_{k}P$ is upper triangular for all $k=1,\ldots,r$, and

2. if $i,j,l\in[n]$ are such that $i\leqslant l\leqslant j$ and $d_{i}(P^{-1}A_{k}P)=d_{j}(P^{-1}A_{k}P)$ for all $k=1,\ldots,r$, then $d_{l}(P^{-1}A_{k}P)=d_{j}(P^{-1}A_{k}P)$ for all $k=1,\ldots,r$ as well.

The proof relies on two lemmas.

###### Lemma 1.

Let $\mathcal{K}$ be a field, let $A_{1},\ldots,A_{r}\in\mathrm{M}_{n}(\mathcal{K})$ be pairwise commuting matrices, and let $\mathcal{L}$ be a field extension of $\mathcal{K}$ in which the characteristic polynomials of all $A_{k}$ split. Then there exists some nonzero $\mathbf{u}\in\mathcal{L}^{n}$ which is an eigenvector of $A_{k}$ for all $k=1,\ldots,r$.
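For instance, over $\mathcal{L}=\mathbb{C}$ every characteristic polynomial splits, and the lemma can be checked on a small hypothetical example (here $A_{2}$ is a polynomial in $A_{1}$, so the two commute by construction, and the eigenspace of $A_{1}$ is one-dimensional, forcing the common eigenvector):

```python
import numpy as np

# Two commuting matrices: A2 is a polynomial in A1, so A1 A2 = A2 A1.
A1 = np.array([[2.0, 1.0],
               [0.0, 2.0]])
A2 = np.eye(2) + 3 * A1          # A2 = I + 3 A1

# u = e_1 is an eigenvector of A1 (eigenvalue 2) and of A2
# (eigenvalue 1 + 3*2 = 7): a common eigenvector as in Lemma 1.
u = np.array([1.0, 0.0])

assert np.allclose(A1 @ A2, A2 @ A1)
assert np.allclose(A1 @ u, 2.0 * u)
assert np.allclose(A2 @ u, 7.0 * u)
```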

###### Lemma 2.

For any field $\mathcal{L}$, any sequence $R_{1},\ldots,R_{r}\in\mathrm{M}_{n}(\mathcal{L})$ of upper triangular pairwise commuting matrices, and every row index $i\in[n]$, there exists $\mathbf{v}\in\mathcal{L}^{n}\setminus\{0\}$ such that

 $R_{k}\mathbf{v}=d_{i}(R_{k})\mathbf{v}\quad\text{for all }k\in[r]\text{.}$
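A small hypothetical instance of the lemma with $n=r=2$ and $i=2$: the matrices below are upper triangular and commute, and the vector $\mathbf{v}=(1,1)^{\mathrm{T}}$ satisfies $R_{k}\mathbf{v}=d_{2}(R_{k})\mathbf{v}$ for $k=1,2$.

```python
import numpy as np

# Upper triangular commuting R1, R2; the common eigenvector v has
# eigenvalues d_2(R1) = 2 and d_2(R2) = 3, the second diagonal entries.
R1 = np.array([[1.0, 1.0],
               [0.0, 2.0]])
R2 = np.array([[1.0, 2.0],
               [0.0, 3.0]])
v  = np.array([1.0, 1.0])

assert np.allclose(R1 @ R2, R2 @ R1)   # R1 and R2 commute
assert np.allclose(R1 @ v, 2.0 * v)    # R1 v = d_2(R1) v
assert np.allclose(R2 @ v, 3.0 * v)    # R2 v = d_2(R2) v
```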
###### Proof.

This is by induction on $n$. The induction hypothesis is that given pairwise commuting matrices $A_{1},\ldots,A_{r}\in\mathrm{M}_{n}(\mathcal{L})$, whose characteristic polynomials all split in $\mathcal{L}$, and a sequence of arbitrary scalars $\mu_{1},\ldots,\mu_{r}\in\mathcal{L}$, there exists some $P\in\mathrm{GL}_{n}(\mathcal{L})$ such that:

1. $P^{-1}A_{k}P$ is upper triangular for all $k=1,\ldots,r$.

2. If some $i,j\in[n]$ are such that $i<j$ and $d_{j}(P^{-1}A_{k}P)=d_{i}(P^{-1}A_{k}P)$ for all $k\in[r]$, then $d_{i+1}(P^{-1}A_{k}P)=d_{i}(P^{-1}A_{k}P)$ for all $k\in[r]$.

3. If some $j\in[n]$ is such that $d_{j}(P^{-1}A_{k}P)=\mu_{k}$ for all $k\in[r]$, then $d_{1}(P^{-1}A_{k}P)=\mu_{k}$ for all $k\in[r]$.

For $n=1$ this hypothesis is trivially fulfilled (all $1\times 1$ matrices are upper triangular). Assume that it holds for $n=m$ and consider the case $n=m+1$.

It is easy to see that condition 1 forces $P\mathbf{e}_{1}$ to be a common eigenvector of all the matrices. If there exists a nonzero vector $\mathbf{u}_{1}\in\mathcal{L}^{n}$ such that $A_{k}\mathbf{u}_{1}=\mu_{k}\mathbf{u}_{1}$ for all $k=1,\ldots,r$, then this is such a common eigenvector, and in that case let $\lambda_{k}=\mu_{k}$ for all $k=1,\ldots,r$. Otherwise, by Lemma 1, there exists a vector $\mathbf{u}_{1}\in\mathcal{L}^{n}\setminus\{\mathbf{0}\}$ such that $A_{k}\mathbf{u}_{1}=\lambda_{k}\mathbf{u}_{1}$ for some $\{\lambda_{k}\}_{k=1}^{r}\subseteq\mathcal{L}$. Either way, one gets a suitable candidate $\mathbf{u}_{1}$ for $P\mathbf{e}_{1}$, together with eigenvalues $\lambda_{1},\ldots,\lambda_{r}$ that will satisfy $d_{1}(P^{-1}A_{k}P)=\lambda_{k}$ for all $k\in[r]$.

Let $\mathbf{u}_{2},\ldots,\mathbf{u}_{n}\in\mathcal{L}^{n}$ be arbitrary vectors such that $\{\mathbf{u}_{i}\}_{i=1}^{n}$ is a basis of $\mathcal{L}^{n}$, and let $U$ be the $n\times n$ matrix whose $i$th column is $\mathbf{u}_{i}$ for $1\leqslant i\leqslant n$. (By imposing extra conditions on the choice of the basis $\{\mathbf{u}_{i}\}_{i=1}^{n}$ at this point, for example requesting that it be orthonormal, one can often prove a stronger claim where the choice of $P$ is restricted to some smaller group of matrices, such as the group of orthogonal matrices, but this requires assuming additional things about the fields $\mathcal{K}$ and $\mathcal{L}$.) Then $U$ is invertible and for each $k$ the first column of $B_{k}=U^{-1}A_{k}U$ is

 $U^{-1}A_{k}U\mathbf{e}_{1}=U^{-1}A_{k}\mathbf{u}_{1}=\lambda_{k}U^{-1}\mathbf{u}_{1}=\lambda_{k}\mathbf{e}_{1}\text{.}$

Furthermore

 $B_{j}B_{k}=U^{-1}A_{j}UU^{-1}A_{k}U=U^{-1}A_{j}A_{k}U=U^{-1}A_{k}A_{j}U=U^{-1}A_{k}UU^{-1}A_{j}U=B_{k}B_{j}$

for all $j$ and $k$.

Now let $A_{k}^{\prime}$ be the matrix formed from rows and columns $2$ through $n$ of $B_{k}$. Since $\det(A_{k}-xI)=\det(B_{k}-xI)=(\lambda_{k}-x)\det(A_{k}^{\prime}-xI)$ by expansion (http://planetmath.org/LaplaceExpansion) along the first column, it follows that the characteristic polynomial of $A_{k}^{\prime}$ splits in $\mathcal{L}$. Furthermore all the $A_{k}^{\prime}$ have side $m=n-1$ and commute pairwise with each other, whence by the induction hypothesis there exists some $P^{\prime}\in\mathrm{GL}_{n-1}(\mathcal{L})$ such that every $P^{\prime-1}A_{k}^{\prime}P^{\prime}$ is upper triangular. Let $P=U\left(\begin{smallmatrix}1&0\\ 0&P^{\prime}\end{smallmatrix}\right)$. Then the submatrix consisting of rows and columns $2$ through $n$ of $P^{-1}A_{k}P$ is equal to $P^{\prime-1}A_{k}^{\prime}P^{\prime}$ and hence contains no nonzero subdiagonal elements. Furthermore the first column of $P^{-1}A_{k}P$ is equal to the first column of $B_{k}$, and thus all the $P^{-1}A_{k}P$ are upper triangular, as claimed.

It also follows from the induction hypothesis that $P$ can be chosen such that $d_{2}(P^{-1}A_{k}P)=d_{1}(P^{\prime-1}A_{k}^{\prime}P^{\prime})=\lambda_{k}=d_{1}(P^{-1}A_{k}P)$ for all $k\in[r]$ whenever there is some $j\geqslant 2$ for which $d_{j}(P^{-1}A_{k}P)=d_{j-1}(P^{\prime-1}A_{k}^{\prime}P^{\prime})=\lambda_{k}=d_{1}(P^{-1}A_{k}P)$ for all $k\in[r]$, and more generally such that if $2\leqslant i<j\leqslant n$ are such that $d_{j}(P^{-1}A_{k}P)=d_{i}(P^{-1}A_{k}P)$ for all $k\in[r]$, then $d_{i+1}(P^{-1}A_{k}P)=d_{i}(P^{-1}A_{k}P)$ for all $k\in[r]$. This verifies condition 2 of the induction hypothesis. For the remaining condition 3, one may first observe that if there is some $i\in[n]$ such that $d_{i}(P^{-1}A_{k}P)=\mu_{k}$ for all $k\in[r]$, then by Lemma 2 there exists a nonzero $\mathbf{v}\in\mathcal{L}^{n}$ such that $P^{-1}A_{k}P\mathbf{v}=\mu_{k}\mathbf{v}$ for all $k\in[r]$. This means $P\mathbf{v}$ fulfills the condition in the choice of $\mathbf{u}_{1}$, and hence $d_{1}(P^{-1}A_{k}P)=\lambda_{k}=\mu_{k}$ as claimed.

The theorem now follows from the principle of induction. ∎
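The recursion in the proof can be sketched numerically over $\mathbb{C}$, where every characteristic polynomial splits. This is only a simplified illustration, not the construction itself: it assumes, hypothetically, that the first matrix has $n$ distinct eigenvalues, so that any eigenvector of it is a common eigenvector of all the commuting matrices (a special case of Lemma 1), and it ignores the diagonal-ordering conditions 2 and 3.

```python
import numpy as np

def simultaneous_triangularise(mats):
    """Numerical sketch of the proof's recursion over C, assuming
    (hypothetically) that mats[0] has n distinct eigenvalues, so every
    eigenvector of mats[0] is a common eigenvector of all the commuting
    matrices. Returns P with P^{-1} A_k P upper triangular for all k."""
    n = mats[0].shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    # Step 1: a common eigenvector u1 (any eigenvector of mats[0] works
    # under the assumption above).
    _, vecs = np.linalg.eig(mats[0].astype(complex))
    u1 = vecs[:, 0]
    # Step 2: extend u1 to a basis of C^n; QR makes U unitary with
    # first column proportional to u1.
    U, _ = np.linalg.qr(np.column_stack([u1, np.eye(n)]))
    # Step 3: conjugate, then recurse on the trailing (n-1)x(n-1)
    # blocks A_k', whose first columns are already cleared.
    Bs = [U.conj().T @ A @ U for A in mats]
    Pp = simultaneous_triangularise([B[1:, 1:] for B in Bs])
    P = np.eye(n, dtype=complex)
    P[1:, 1:] = Pp
    return U @ P

# Two commuting matrices: A2 is a polynomial in A1.
A1 = np.array([[1.0, 2.0], [3.0, 4.0]])
A2 = A1 @ A1 + 2 * A1
P = simultaneous_triangularise([A1, A2])
for A in (A1, A2):
    T = np.linalg.inv(P) @ A @ P
    assert np.allclose(np.tril(T, -1), 0, atol=1e-8)
```

The basis extension step uses a QR factorisation, so the change-of-basis matrix $U$ at each level is unitary; over a general field one would instead complete $\mathbf{u}_{1}$ to an arbitrary basis, exactly as in the proof.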

Title: simultaneous triangularisation of commuting matrices over any field
Canonical name: SimultaneousTriangularisationOfCommutingMatricesOverAnyField
Date of creation: 2013-03-22 15:29:38
Last modified on: 2013-03-22 15:29:38
Owner: lars_h (9802)
Last modified by: lars_h (9802)
Numerical id: 4
Author: lars_h (9802)
Entry type: Theorem
Classification: msc 15A21
Related topic: CommutingMatrices