
Determinant of a block lower triangular matrix

Writer Mia Lopez

I'm trying to prove the following: Let $A$ be a $k\times k$ matrix, let $D$ have size $n\times n$, and $C$ have size $n\times k$. Then,

$$\det\left(\begin{array}{cc} A&0\\ C&D \end{array}\right) = \det(A)\det(D).$$

Can I just say that $AD - 0C = AD$, and I'm done?
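(Not a proof, but a quick numeric sanity check of the claimed identity with NumPy; the sizes $k=3$, $n=4$ are illustrative. Note that with $k\neq n$ the product $AD$ is not even defined, which already shows the naive argument cannot work as stated.)

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 4  # illustrative sizes; A is k x k, D is n x n, C is n x k
A = rng.standard_normal((k, k))
D = rng.standard_normal((n, n))
C = rng.standard_normal((n, k))

# Assemble the block lower triangular matrix [[A, 0], [C, D]].
M = np.block([[A, np.zeros((k, n))], [C, D]])

# The claimed identity: det(M) = det(A) * det(D).
print(np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(D)))
```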


7 Answers


If $A$ is singular, its rows are linearly dependent, hence the rows of the entire matrix are linearly dependent, hence both sides of the equation vanish.

If $A$ is not singular, we have

$$\pmatrix{I&0\\-CA^{-1}&I}\pmatrix{A&0\\C&D}=\pmatrix{A&0\\0&D}\;.$$

The determinants of the two new matrices are perhaps easier to derive from the Laplace expansion than that of the entire matrix. They are $1$ and $\det A \det D$, respectively, and the result follows.

Another way to express this is that if $A$ is not singular you can get rid of $C$ by Gaussian elimination.
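The block elimination step above is easy to verify numerically (a sketch, with random matrices; $A$ is assumed nonsingular, which holds generically here):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 2, 3
A = rng.standard_normal((k, k))  # assumed nonsingular for this step
D = rng.standard_normal((n, n))
C = rng.standard_normal((n, k))

M = np.block([[A, np.zeros((k, n))], [C, D]])
# Block elimination matrix [[I, 0], [-C A^{-1}, I]]: unitriangular, so det = 1.
E = np.block([[np.eye(k), np.zeros((k, n))],
              [-C @ np.linalg.inv(A), np.eye(n)]])

# E @ M should be the block diagonal matrix [[A, 0], [0, D]].
print(np.allclose(E @ M, np.block([[A, np.zeros((k, n))],
                                   [np.zeros((n, k)), D]])))
print(np.isclose(np.linalg.det(E), 1.0))
```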


As @user153012 is asking for a proof in full detail, here is a brute-force approach using an explicit expression of a determinant of an $n$ by $n$ matrix, say $A = (a[i,j])$, $$\det A = \sum_{\sigma\in S_n}\operatorname{sgn}\sigma \prod_i a[{i,\sigma(i)}],$$ where $S_n$ is the symmetric group on $[n] = \{1,\dots, n\}$ and $\operatorname{sgn}\sigma$ denotes the signature of $\sigma$.

In matrix $$B = \left(\begin{array}{cc} A&0\\ C&D \end{array}\right),$$ we have $$b[i,j] = \begin{cases}a[i,j] & \text{if }i \le k, j \le k;\\ d[i-k, j-k] & \text{if }i > k, j > k; \\ 0 & \text{if }i \le k, j > k; \\ c[i-k,j] & \text{otherwise}.\end{cases}$$ Observe in $$\det B = \sum_{\sigma\in S_{n+k}}\operatorname{sgn}\sigma\prod_i b[i, \sigma(i)],$$ if $\sigma(i) = j$ such that $i\le k$ and $j > k$, then the corresponding summand $\prod_i b[i,\sigma(i)]$ is $0$. Any permutation $\sigma\in S_{n+k}$ for which no such $i$ and $j$ exist can be uniquely "decomposed" into two permutations, $\pi$ and $\tau$, where $\pi\in S_k$ and $\tau\in S_n$ such that $\sigma(i) = \pi(i)$ for $i \le k$ and $\sigma(k+i) = k+\tau(i)$ for $i \le n$. Moreover, we have $\operatorname{sgn}\sigma = \operatorname{sgn}\pi\operatorname{sgn}\tau$. Denote the collection of such permutations by $S_n'$. Therefore, we can write $$\begin{eqnarray}\det B &=& \sum_{\sigma\in S_n'}\operatorname{sgn}\sigma\prod_{i=1}^k b[i,\sigma(i)]\prod_{i=k+1}^{k+n} b[i,\sigma(i)] \\ &=& \sum_{\sigma\in S_n'}\operatorname{sgn}\sigma\prod_{i=1}^k a[i,\sigma(i)]\prod_{i=1}^nd[i,\sigma(i+k)-k] \\ & = & \sum_{\pi\in S_k,\tau\in S_n}\operatorname{sgn}\pi\operatorname{sgn}\tau\prod_{i=1}^k a[i,\pi(i)]\prod_{i=1}^nd[i,\tau(i)] \\ &=& \sum_{\pi\in S_k}\operatorname{sgn}\pi\prod_{i=1}^k a[i,\pi(i)]\sum_{\tau\in S_{n}}\operatorname{sgn}\tau\prod_{i=1}^nd[i,\tau(i)] \\ & = & \det A\det D.\end{eqnarray}$$ QED.

Update As @Marc van Leeuwen mentioned in the comments, a similar formula holds for permanents. The proof is essentially the same as the one for the determinant, except that one drops the signatures of the permutations.
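The Leibniz-sum computation above can be checked directly on a small example. The helper `leibniz_det` below is my own illustration (the $O(m!)$ sum is only practical for tiny matrices):

```python
from itertools import permutations

import numpy as np


def leibniz_det(M):
    """Determinant via the Leibniz sum over all permutations (tiny matrices only)."""
    m = len(M)
    total = 0.0
    for perm in permutations(range(m)):
        # Signature via counting inversions.
        inversions = sum(perm[i] > perm[j] for i in range(m) for j in range(i + 1, m))
        sign = -1.0 if inversions % 2 else 1.0
        prod = 1.0
        for i in range(m):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total


rng = np.random.default_rng(2)
k, n = 2, 2
A = rng.standard_normal((k, k))
D = rng.standard_normal((n, n))
C = rng.standard_normal((n, k))
B = np.block([[A, np.zeros((k, n))], [C, D]])

# det(B) = det(A) * det(D), each computed by the permutation sum.
print(np.isclose(leibniz_det(B), leibniz_det(A) * leibniz_det(D)))
```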


This is a fundamental result about determinants, and like most of such results it holds for matrices with entries in any commutative (unitary) ring. It is therefore good to have a proof that does not rely on the coefficients being in a field; I will use the Leibniz formula as definition of the determinant rather than a characterisation as alternating $n$-linear form. (And my apologies to those who find that distasteful; for some purposes using the definition is really best.)

Writing for a permutation $\sigma$ and a matrix $M$ the abbreviation $\def\sg{\operatorname{sg}}M[\sigma]=\sg(\sigma)\prod_iM_{i,\sigma(i)}$, the Leibniz formula says, for any $m\times m$ matrix $M$: $$ \det(M)=\sum_{\sigma\in S_m}M[\sigma]. $$ The result is based on the following simple fact about symmetric groups:

The subgroup of $S_{k+n}$ of permutations permuting the first $k$ elements among each other is canonically isomorphic to $S_k\times S_n$, and if $\sigma\in S_{k+n}$ corresponds to $(\pi,\rho)\in S_k\times S_n$ then $\sg(\sigma)=\sg(\pi)\sg(\rho)$.

In fact this is basically just saying that if the first $k$ elements are permuted among each other, then so are the remaining $n$ elements, and the sign of the whole permutation is the product of the signs of its restrictions to those two subsets.

Now if $M=\bigl(\begin{smallmatrix}A&0\\C&D\end{smallmatrix}\bigr)$, note that $M[\sigma]=0$ unless $\sigma$ permutes the first $k$ indices among each other. From this it follows that $$ \det(M)=\sum_{\sigma\in S_{k+n}}M[\sigma] =\sum_{(\pi,\rho)\in S_k\times S_n}A[\pi]D[\rho] =\left(\sum_{\pi\in S_k}A[\pi]\right)\left(\sum_{\rho\in S_n}D[\rho]\right) =\det A\det D. $$


Alternatively, if one is willing to assume the property $\det(MN)=\det(M)\det(N)$ (not really easier than the one to be proved here, but maybe better known), and if one considers the special cases where either $A=I$ or $D=I$ to be clear (because one can obtain the result in these cases by repeated Laplace expansion by rows respectively by columns), then one can employ any decomposition of the form $$ \pmatrix{A&0\\C&D} = \pmatrix{A&0\\C_0&I} \pmatrix{I&0\\C_1&D} \qquad\text{with $C=C_0+C_1$} $$ (for instance with one of $C_0,C_1$ equal to zero) to obtain the desired result.
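That decomposition is straightforward to verify numerically, here with the instance $C_0=C$, $C_1=0$ (each factor has an identity block, so its determinant reduces to $\det A$ or $\det D$ by repeated Laplace expansion):

```python
import numpy as np

rng = np.random.default_rng(3)
k, n = 2, 3
A = rng.standard_normal((k, k))
D = rng.standard_normal((n, n))
C = rng.standard_normal((n, k))

# The decomposition with C0 = C and C1 = 0.
left = np.block([[A, np.zeros((k, n))], [C, np.eye(n)]])
right = np.block([[np.eye(k), np.zeros((k, n))], [np.zeros((n, k)), D]])
M = np.block([[A, np.zeros((k, n))], [C, D]])

print(np.allclose(left @ right, M))
# Each factor's determinant reduces to one of the diagonal blocks.
print(np.isclose(np.linalg.det(left), np.linalg.det(A)))
print(np.isclose(np.linalg.det(right), np.linalg.det(D)))
```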


Yet another proof, in the case of fields, can be obtained if you are willing to enlarge your field to an algebraically closed one. Then any matrix can be put in (lower) triangular form, with the eigenvalues on the diagonal. In particular, there are invertible matrices $S, T$ of appropriate sizes such that $S^{-1} A S$ and $T^{-1} D T$ are in lower triangular form. Then $$ \begin{bmatrix} S& 0\\ 0 & T\\ \end{bmatrix}^{-1} \begin{bmatrix} A&0\\ C&D \end{bmatrix} \begin{bmatrix} S& 0\\ 0 & T\\ \end{bmatrix} = \begin{bmatrix} S^{-1} A S&0\\ T^{-1} C S&T^{-1} D T \end{bmatrix} $$ is in lower triangular form, and the rest is more or less straightforward. Clearly I rely on the multiplicativity of determinants here, or on the fact that the determinant is invariant under conjugation.
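The block conjugation identity used here can be checked numerically for random $S$, $T$ (invertible with probability one):

```python
import numpy as np

rng = np.random.default_rng(4)
k, n = 2, 3
A = rng.standard_normal((k, k))
D = rng.standard_normal((n, n))
C = rng.standard_normal((n, k))
S = rng.standard_normal((k, k))  # assumed invertible (true generically)
T = rng.standard_normal((n, n))

M = np.block([[A, np.zeros((k, n))], [C, D]])
P = np.block([[S, np.zeros((k, n))], [np.zeros((n, k)), T]])

# P^{-1} M P should have blocks S^{-1}AS, 0, T^{-1}CS, T^{-1}DT.
conj = np.linalg.inv(P) @ M @ P
expected = np.block([[np.linalg.inv(S) @ A @ S, np.zeros((k, n))],
                     [np.linalg.inv(T) @ C @ S, np.linalg.inv(T) @ D @ T]])
print(np.allclose(conj, expected))
```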


Here is an approach that does not rely on any explicit definition of the determinant, nor on any concept of the inverse. Instead, we can start from three basic properties that the determinant function must satisfy:

(1) Det(I) = 1

(2) The Det() function is multilinear in each row (column) separately, with all other rows (columns) held fixed

(3) If the matrix M is not full rank, Det(M)=0

Artin showed that these three properties alone uniquely determine the determinant function (I do not prove this here). The version of property (3) used here is slightly more general than Artin's, but it is equally intuitive and allows me to skip a step. First, you can show that

$$ Det \begin{pmatrix} A & 0 \\ C & D \end{pmatrix} = Det \begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} $$

To show this, we can expand the determinant of the original matrix M as $$ (4) Det(M)= Det(M_1)+Det(M_2) $$ where $M_1$ is the same as the original matrix but with the first k entries of the $k+1^{th}$ row set to 0, and $M_2$ is the same as the original matrix but with the last n-k entries of the $k+1^{th}$ row set to 0. Note that the $k+1^{th}$ rows of $M_1$ and $M_2$ sum to the $k+1^{th}$ row of M, and therefore (4) holds by property (2). Note that $ Det(M_1)=0 $, since the resulting matrix is clearly not full rank (there are now k+1 rows with nonzero entries in only k columns). Therefore we have $Det(M) = Det(M_2)$. We can then repeat this process for each remaining row of the lower block to show that the claim above is true.

Now, let us denote $$ M^*=\begin{pmatrix} A & 0 \\ 0 & D \end{pmatrix} , A^*= \begin{pmatrix} A & 0 \\ 0 & I \end{pmatrix} , D^*= \begin{pmatrix} I & 0 \\ 0 & D \end{pmatrix}. $$ Note that $ M^*=A^*D^* $. I claim that $ Det(M^*) = Det(A^*)Det(D^*) $. To show this, we verify that the function $$(5)\ F(D^*)=F(d^*_1,...,d^*_n)=\frac{Det(A^*D^*)}{Det(A^*)}$$ also satisfies properties (1)-(3). Clearly, if $D^*=I$, then the RHS of (5) reduces to 1, so (1) holds. To show (2), we can write the j-th column of $M^*=A^*D^*$ as $A^*d^*_j$. Since we already know the determinant is multilinear in each column, and $A^*d^*_j$ is a linear function of $d^*_j$, it follows that $F(d^*_1,...,d^*_n)$ is also multilinear in each of the columns $d^*_1,...,d^*_n$. Finally, to show (3), note that if $d^*_i=d^*_j$ for some $i\neq j$, then $A^*d^*_i=A^*d^*_j$, so $A^*D^*$ has two equal columns, is not full rank, and the numerator on the RHS of (5) is 0. Therefore $F(d^*_1,...,d^*_n)=Det(d^*_1,...,d^*_n)$, so $Det(A^*D^*)=Det(A^*)Det(D^*)$, as desired.

In order to finish the proof, one final step is needed: we must show that $Det(A^*)=Det(A)$ and $Det(D^*)=Det(D)$. We use the same approach as above, defining a target function which we show to be equal to the determinant. We do this by showing that the function $C(A)=Det(A^*)$ also satisfies properties (1)-(3) as a function of the columns of A, and therefore must equal the determinant of A. The basic ideas are: if A is the identity then so is $A^*$, so property (1) follows; any linear operation on a row (column) of A is also a linear operation on a row (column) of $A^*$, so (2) holds; and if A is not full rank, then neither is $A^*$, so (3) holds. Therefore the function C(A), which extends A to $A^*$ by adding n-k standard basis vectors and then takes the determinant of the extended matrix, is in fact equal to $Det(A)$.

Given all of this, we immediately get the result, since $$Det(M)=Det(M^*)=Det(A^*D^*)=Det(A^*)Det(D^*)=Det(A)Det(D). $$
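The chain of equalities above (the factorization $M^*=A^*D^*$ and the extension identities $Det(A^*)=Det(A)$, $Det(D^*)=Det(D)$) can be sanity-checked numerically:

```python
import numpy as np

rng = np.random.default_rng(5)
k, m = 2, 3  # A is k x k, D is m x m; total size k + m
A = rng.standard_normal((k, k))
D = rng.standard_normal((m, m))

# A* and D* pad A and D with identity blocks, as in the argument above.
A_star = np.block([[A, np.zeros((k, m))], [np.zeros((m, k)), np.eye(m)]])
D_star = np.block([[np.eye(k), np.zeros((k, m))], [np.zeros((m, k)), D]])
M_star = np.block([[A, np.zeros((k, m))], [np.zeros((m, k)), D]])

print(np.allclose(A_star @ D_star, M_star))                 # M* = A* D*
print(np.isclose(np.linalg.det(A_star), np.linalg.det(A)))  # Det(A*) = Det(A)
print(np.isclose(np.linalg.det(D_star), np.linalg.det(D)))  # Det(D*) = Det(D)
```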


Here is a sketch of what I consider a simpler direct proof (assuming a full set of eigenvectors; it can be adapted for degenerate eigenspaces). Let's assume there is no zero eigenvalue; otherwise it is easy to find a vector that the full matrix sends to zero, so the determinant on both sides must be zero.

We're going to prove that the eigenvalues are the same. Then the fact that the determinant is the product of the eigenvalues finishes the proof.

Consider the eigenvectors of $D$. Now (pre)pad those with zeros (that is, create a longer vector with zeros in the first $k$ entries). These are eigenvectors of the full matrix.

Now consider the eigenvectors of $A$. We're going to postpad these with something, and we have to figure out what. Let $\lambda$ be an eigenvalue of $A$ and $v_1$ a corresponding eigenvector. We'll pad it with $v_2$, so $v$ is made up of the blocks $v_1$ and $v_2$. The full matrix times $v$ has first block $\lambda v_1$; its second block is $C v_1 + D v_2$. Setting this equal to $\lambda v_2$ gives $(D - \lambda I)v_2 = -C v_1$, which is solvable for $v_2$ provided $\lambda$ is not also an eigenvalue of $D$ (the generic case; shared eigenvalues fall under the degenerate situation mentioned above). Now we have an eigenvector of the full matrix corresponding to $\lambda$.

So now because the eigenvalues of the full matrix are the eigenvalues of $D$ and of $A$, the fact that the determinant of a matrix is the product of the eigenvalues finishes the proof.
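A tiny worked instance of the padding construction, with matrices chosen so that the eigenvalue $\lambda=2$ of $A$ is not an eigenvalue of $D$:

```python
import numpy as np

# A has eigenvalue 2 (eigenvector [1, 0]); D = [[5]] does not share it,
# so (D - 2 I) is invertible and v2 can be solved for.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
D = np.array([[5.0]])
C = np.array([[1.0, 1.0]])
M = np.block([[A, np.zeros((2, 1))], [C, D]])

lam = 2.0
v1 = np.array([1.0, 0.0])
# From C v1 + D v2 = lam v2: (D - lam I) v2 = -C v1.
v2 = np.linalg.solve(D - lam * np.eye(1), -C @ v1)
v = np.concatenate([v1, v2])
print(np.allclose(M @ v, lam * v))  # v is an eigenvector of the full matrix
```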


This is mostly a rephrasing of one of the other answers, trying to explain the result in simpler terms.

I will work having in mind the Leibniz formula for the determinant: the determinant equals the sum over all permutations of the product of the elements in each row, picked according to the permutation: $$\det(M)=\sum_\sigma \operatorname{sgn}(\sigma)\prod_j a_{j,\sigma(j)}.$$ When applying this formula to the given matrix, consider which permutations $\sigma$ give a nonvanishing contribution: all permutations $\sigma$ such that $\sigma(j)>k$ for some $j\le k$ do not contribute to the sum.

Indeed, these permutations correspond to including in the multiplication (the multiplication in Leibniz formula I mean) matrix elements picked from the upper-right block of the big matrix. But this block only contains zero elements, therefore all such permutations correspond to vanishing contributions.

Therefore, we need to only consider the permutations $\sigma$ such that $\sigma(j)\le k$ for all $j\le k$. But these permutations then also necessarily satisfy $\sigma(j)>k$ for all $j>k$ ${}^{(1)}$! This means that the permutations contributing to $\det M$ are effectively all possible combinations of a permutation of the upper-left block and a permutation of the lower-right block. One can then factor out the permutations of the upper-left block and thus obtain the result.


(1) Otherwise, we would have $\sigma(j)\le k$ for some $j>k$; together with $\sigma(j)\le k$ for all $j\le k$, this would map $k+1$ indices into $\{1,\dots,k\}$, so by pigeonhole there would be two different $j_0\neq j_1$ with $\sigma(j_0)=\sigma(j_1)$, which cannot happen if $\sigma$ is a permutation.
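The counting argument above can be checked by brute force for small $k$, $n$: the contributing permutations are exactly the $k!\,n!$ combinations of a permutation of the first block with one of the second.

```python
from itertools import permutations
from math import factorial

k, n = 2, 3
# Permutations of {0, ..., k+n-1} that never send an index < k to an index >= k.
kept = [p for p in permutations(range(k + n))
        if all(p[j] < k for j in range(k))]

# Each such permutation splits into a permutation of the first k indices
# and one of the last n, so there are k! * n! of them.
print(len(kept) == factorial(k) * factorial(n))
# The first-block condition forces the second: indices >= k stay >= k.
print(all(all(p[j] >= k for j in range(k, k + n)) for p in kept))
```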
