Section 3.13 The LU-Factorization
The factorization problem for matrices has a number of distinct solutions. In this section we define and study the process of factoring \(\A\in\R^{n\times n}\) into a product \(\A=\L\U\) of a lower-triangular matrix \(\L\) and an upper-triangular matrix \(\U.\) As Theorem 3.13.10 shows, this can be done for any \(\A\in\R^{n\times n}\) provided we first allow suitable permutations of the rows and columns of \(\A.\)
Matrix factorizations are typically called decompositions. In linear algebra, though, the decomposition of a vector into a linear combination of other vectors is a fundamental operation, so as part of our effort to disambiguate notation and terminology we will call \(\A=\L\U\) the LU-factorization of \(\A.\) A quite different factorization, the QR-factorization, will be given in a later section.
A lower-triangular matrix has all of its nonzero entries on or below the diagonal, so it has the form
\begin{equation*}
\L=\left( \begin{array}{ccccc} \ast \amp 0 \amp 0 \amp \cdots \amp 0 \\
\ast \amp \ast \amp 0 \amp \cdots \amp 0 \\ \ast \amp \ast \amp \ast \amp \ddots \amp \vdots \\ \vdots \amp \vdots \amp \vdots \amp \ddots \amp 0 \\
\ast \amp \ast \amp \cdots \amp \cdots \amp \ast \\
\end{array}\right).
\end{equation*}
Definition 3.13.3. The sets \(\boldsymbol{\mathcal{U}}^{n\times n}\) and \(\boldsymbol{\mathcal{L}}^{n\times n}\).
We define the sets \(\mathcal{U}^{n\times n}\subset\R^{n\times n}\) and \(\mathcal{L}^{n\times n}\subset\R^{n\times n}\) to contain all \(n\times n\) upper-triangular matrices and all \(n\times n\) lower-triangular matrices, respectively.
Theorem 3.13.4. Diagonal matrices are triangular.
Let \(\D\) be an \(n\times n\) diagonal matrix. Then \(\D\) is both upper-triangular and lower-triangular.
Theorem 3.13.5. Transposing triangular and diagonal matrices.
Let \(\U\in\R^{n\times n}\) be upper-triangular and let \(\L\in\R^{n\times n}\) be lower-triangular. Then \(\U^T\in\mathcal{L}^{n\times n}\) and \(\L^T\in\mathcal{U}^{n\times n}.\)
Theorem 3.13.6. \(\boldsymbol{\mathcal{U}}^{n\times n}\) and \(\boldsymbol{\mathcal{L}}^{n\times n}\) are closed under linear combinations and products.
The sets \(\mathcal{U}^{n \times n}\subset\R^{n\times n}\) of upper-triangular and \(\mathcal{L}^{n \times n}\subset\R^{n\times n}\) of lower-triangular matrices are closed under linear combinations and under matrix multiplication. Explicitly, if \(\U_1,\U_2\) and \(\L_1,\L_2\) are upper- and lower-triangular matrices, respectively, and \(a,b\in\R\) are scalars, then \(\U_1\U_2\) and \(a\,\U_1+b\,\U_2\) are upper-triangular and \(\L_1\L_2\) and \(a\,\L_1+b\,\L_2\) are lower-triangular.
Theorem 3.13.7. \(\boldsymbol{\mathcal{U}}^{n\times n}\) and \(\boldsymbol{\mathcal{L}}^{n\times n}\) are inverse-closed.
Let \(\U\in\R^{n\times n}\) and \(\L\in\R^{n\times n}\) be invertible upper- and lower-triangular matrices, respectively. Then \(\U^{-1}\) is upper-triangular and \(\L^{-1}\) is lower-triangular.
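These closure properties are easy to check numerically. Below is a minimal sketch, assuming NumPy is available; the helper name `is_upper` is our own, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random triangular matrices with unit diagonal, hence invertible.
# np.triu(..., k=1) / np.tril(..., k=-1) keep only the strictly
# upper / strictly lower entries.
U1 = np.eye(n) + np.triu(rng.standard_normal((n, n)), k=1)
U2 = np.eye(n) + np.triu(rng.standard_normal((n, n)), k=1)
L1 = np.eye(n) + np.tril(rng.standard_normal((n, n)), k=-1)

def is_upper(M, tol=1e-12):
    """True when every entry strictly below the diagonal vanishes."""
    return np.allclose(M, np.triu(M), atol=tol)

# Theorem 3.13.6: products and linear combinations stay triangular.
assert is_upper(U1 @ U2)
assert is_upper(3 * U1 - 2 * U2)

# Theorem 3.13.7: inverses stay triangular of the same kind.
assert is_upper(np.linalg.inv(U1))
assert is_upper(np.linalg.inv(L1).T)  # lower-triangular iff transpose is upper

print("closure checks passed")
```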
Remark 3.13.8. Upper- and lower-triangular elementary matrices.
Revisiting Remark 3.7.9, we observe that
- Every elementary matrix \(\E\) which changes only entries below the diagonal is a lower-triangular matrix, and
- Every elementary matrix \(\E\) which changes only entries above the diagonal is an upper-triangular matrix.
This leads to the following theorem.
Theorem 3.13.9. The LU-Factorization: Approach 1.
Let \(\A\in\R^{n \times n}\) eliminate to \(\U\) via left-multiplication by elementary matrices \(\E_1,\E_2,\ldots,\E_t,\) which change only entries below the diagonal, so that \(\E_t\cdots\E_2\E_1\A=\U.\) Denote \(\L=\E_1^{-1}\E_2^{-1}\cdots\E_t^{-1}.\) Then \(\A=\L\U\) where \(\L\) is lower-triangular and \(\U\) is upper-triangular.
Theorem 3.13.10. The LU-Factorization exists for all square matrices, sort of.
Let \(\A\in\R^{n\times n}.\) Then there exist permutation matrices \(\P_1,\P_2\in\R^{n\times n},\) and matrices \(\U\in\mathcal{U}^{n\times n}\) and \(\L\in\mathcal{L}^{n\times n},\) for which \(\P_1\A\P_2=\L\U.\) If \(A_{11}=0\) and some other entry of \(\A_{\ast 1}\) is nonzero then we may use \(\P_2=\I.\) If no zero pivot is encountered during the elimination then we may use \(\P_1=\P_2=\I.\)
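Numerical libraries compute exactly such a permuted factorization. For instance, SciPy's `scipy.linalg.lu` returns a row permutation \(\P\) and triangular factors with \(\A=\P\L\U,\) equivalently \(\P^T\A=\L\U,\) a special case of the theorem with \(\P_2=\I.\) A quick sketch, assuming SciPy is installed:

```python
import numpy as np
from scipy.linalg import lu

# A matrix whose (1,1) entry is zero, so a row exchange is required.
A = np.array([[0., 2., 1.],
              [1., 1., 1.],
              [2., 3., 4.]])

P, L, U = lu(A)                     # factors satisfy A = P @ L @ U
assert np.allclose(A, P @ L @ U)
assert np.allclose(L, np.tril(L))   # L is (unit) lower-triangular
assert np.allclose(U, np.triu(U))   # U is upper-triangular
```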
Example 3.13.11. LU-factorization.
We apply the elimination of Theorem 3.13.9, carrying \(\I\) along in an augmented matrix to record the row operations:
\begin{equation*}
\begin{array}{llc}\left(\begin{array}{rrr|rrr}1\amp -3\amp 11\amp 1\amp 0\amp 0\\2\amp -5\amp 19\amp 0\amp 1\amp 0\\1\amp -6\amp 7\amp 0\amp 0\amp 1 \end{array} \right) \amp \underset{(-1)R_1+R_3\rightarrow R_3}{\xrightarrow{-2R_1+R_2\rightarrow R_2}}\amp\left(\begin{array}{rrr|rrr}1\amp -3\amp 11\amp 1\amp 0\amp 0\\0\amp 1\amp -3\amp -2\amp 1\amp 0\\0\amp -3\amp -4\amp -1\amp 0\amp 1 \end{array} \right)\\
\amp \xrightarrow{3R_2+R_3\rightarrow R_3}\amp\left(\begin{array}{rrr|rrr}1\amp -3\amp 11\amp 1\amp 0\amp 0\\0\amp 1\amp -3\amp -2\amp 1\amp 0\\0\amp 0\amp -13\amp -7\amp 3\amp 1 \end{array} \right)\end{array}
\end{equation*}
so
\begin{align*}
\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 3\amp 1 \end{array} \right)\!\!\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\-1\amp 0\amp 1 \end{array} \right)\!\!\left(\begin{array}{rrr}1\amp 0\amp 0\\-2\amp 1\amp 0\\0\amp 0\amp 1\end{array}\right)\!\! \left(\begin{array}{rrr}1\amp -3\amp 11\\2\amp -5\amp 19\\1\amp -6\amp 7 \end{array} \right)\amp\!\,=\!\!\left(\begin{array}{rrr}1\amp -3\amp 11\\0\amp 1\amp -3\\0\amp 0\amp -13 \end{array} \right).
\end{align*}
Each of the elementary matrices on the left is invertible, and by Theorem 3.7.7 finding those inverses requires no calculation:
\begin{align*}
\left(\begin{array}{rrr}1\amp 0\amp 0\\-2\amp 1\amp 0\\0\amp 0\amp 1 \end{array} \right)^{-1}\amp =\,\,\left(\begin{array}{rrr}1\amp 0\amp 0\\2\amp 1\amp 0\\0\amp 0\amp 1 \end{array} \right)\\
\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\-1\amp 0\amp 1 \end{array} \right)^{-1}\amp =\,\,\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\1\amp 0\amp 1 \end{array} \right)\\
\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp 3\amp 1 \end{array} \right)^{-1}\,\,\amp =\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp -3\amp 1 \end{array} \right)
\end{align*}
Invoking Theorem 3.13.9 we have
\begin{align*}
\left(\begin{array}{rrr}1\amp -3\amp 11\\2\amp -5\amp 19\\1\amp -6\amp 7 \end{array} \right)\amp=\left(\begin{array}{rrr}1\amp 0\amp 0\\2\amp 1\amp 0\\0\amp 0\amp 1 \end{array} \right)\!\!\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\1\amp 0\amp 1\end{array}\right)\!\!\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp -3\amp 1 \end{array} \right)\!\!\left(\begin{array}{rrr}1\amp -3\amp 11\\0\amp 1\amp -3\\0\amp 0\amp -13 \end{array} \right)\nonumber\\
\amp \,\,=\,\,\left(\begin{array}{rrr}1\amp 0\amp 0\\2\amp 1\amp 0\\1\amp -3\amp 1 \end{array} \right)\left(\begin{array}{rrr}1\amp -3\amp 11\\0\amp 1\amp -3\\0\amp 0\amp -13 \end{array} \right)\,\,=\,\,\L\U\text{.}
\end{align*}
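As a quick numerical sanity check (a minimal sketch assuming NumPy), we can confirm that this \(\L\) and \(\U\) reproduce \(\A\):

```python
import numpy as np

A = np.array([[1, -3, 11],
              [2, -5, 19],
              [1, -6,  7]])
L = np.array([[1,  0, 0],
              [2,  1, 0],
              [1, -3, 1]])
U = np.array([[1, -3,  11],
              [0,  1,  -3],
              [0,  0, -13]])

# L @ U should reproduce A exactly (integer arithmetic here).
assert np.array_equal(L @ U, A)
```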
Remark 3.13.12.
Computing the LU-factorization via Theorem 3.13.9 is straightforward, but it is always a great idea to view a process through a different lens as long as it doesn't cost too much. An alternative approach to computing the LU-factorization is to compute \(\L\) directly from \(\L^{-1};\) it requires minimal work to find the inverse of a lower-triangular matrix (when it exists, of course). This approach is described in the following algorithm.
Algorithm 3.13.13. \(\L\U\)-Factorization: Approach 2.
- Use elimination on \((\A|\I)\) to produce \(\left(\U|\L^{-1}\right)\) (see the simplification in Remark 3.13.14 below).
- Use elimination on \(\left(\L^{-1}|\I\right)\) to produce \((\I|\L).\)
- Write \(\A=\L\U\) explicitly.
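The steps above translate directly into code. Here is a minimal sketch of Approach 2 (the function name `lu_via_augmentation` is our own, assuming NumPy and assuming no zero pivots are encountered, so no row exchanges are needed):

```python
import numpy as np

def lu_via_augmentation(A):
    """Approach 2: eliminate (A|I) to (U|L^{-1}), then invert to get L.

    Assumes every pivot is nonzero, so no row exchanges are needed.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])      # form (A | I)
    for j in range(n - 1):               # eliminate below pivot (j, j)
        for i in range(j + 1, n):
            aug[i] -= (aug[i, j] / aug[j, j]) * aug[j]
    U, L_inv = aug[:, :n], aug[:, n:]    # now holds (U | L^{-1})
    L = np.linalg.inv(L_inv)             # step 2, by any inversion method
    return L, U

A = np.array([[1, -3, 11], [2, -5, 19], [1, -6, 7]])
L, U = lu_via_augmentation(A)
assert np.allclose(L @ U, A)
```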
Remark 3.13.14. A small labor-saver.
Consider the three elementary matrices \(\E_{(a)21},\,\E_{(b)31},\,\E_{(c)32}\in\R^{3\times3},\) used for eliminating below the diagonal in \(\A\in\R^{3\times3}.\) We have
\begin{equation*}
\left(\begin{array}{rrr}1\!\amp \!\!\!0\!\!\amp 0\\0\!\amp \!\!\!1\!\!\amp 0\\0\!\amp \!\!\!c\!\!\amp 1 \end{array}\right)\!\!\left(\begin{array}{rrr}1\!\amp \!\!\!0\!\!\amp 0\\0\!\amp \!\!\!1\!\!\amp 0\\b\!\amp \!\!\!0\!\!\amp 1\end{array}\right)\!\!\left(\begin{array}{rrr}1\!\amp \!\!\!0\!\!\amp 0\\a\!\amp \!\!\!1\!\!\amp 0\\0\!\amp \!\!\!0\!\!\amp 1 \end{array} \right)\!\left(\begin{array}{rrr}A_{11}\amp \!\!A_{12}\!\!\amp A_{13}\\ A_{21}\amp \!\!A_{22}\!\!\amp A_{23}\\A_{31}\amp \!\!A_{32}\!\!\amp A_{33}\end{array} \right)\!=\!\left(\begin{array}{ccc}U_{11}\amp \!\!U_{12}\!\!\amp U_{13}\\0\amp \!\!U_{22}\!\!\amp U_{23}\\0\amp \!\!0\!\!\amp U_{33}\end{array} \right)
\end{equation*}
and, interestingly, the inverses of these elementary matrices, multiplied in the reverse order as in Theorem 3.13.9, combine with no interaction between the coefficients:
\begin{equation*}
\left(\begin{array}{rrr}1\amp 0\amp 0\\-a\amp 1\amp 0\\0\amp 0\amp 1 \end{array} \right)\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\-b\amp 0\amp 1 \end{array} \right) \left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp -c\amp 1 \end{array} \right)=\left(\begin{array}{rrr}1\amp 0\amp 0\\-a\amp 1\amp 0\\-b\amp -c\amp 1 \end{array} \right)=\L.
\end{equation*}
This property holds for matrices larger than \(3\times3\) as well. So rather than eliminating \((\A|\I)\rightarrow\left(\U|\L^{-1}\right)\) and then inverting, we eliminate \(\A\rightarrow\U\) and simply write down \(\L\) by negating the coefficient of each row operation and placing it in the corresponding entry. This allows us to avoid repeatedly rewriting the RHS of an augmented matrix.
But beware! No such nice relation holds for the forward product \(\L^{-1}=\E_{(c)32}\E_{(b)31}\E_{(a)21}\) itself (as one might expect, since the associated matrix multiplication is not commutative). In the \(3\times3\) setting,
\begin{equation*}
\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\0\amp c\amp 1 \end{array} \right)\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\b\amp 0\amp 1 \end{array} \right)\left(\begin{array}{rrr}1\amp 0\amp 0\\a\amp 1\amp 0\\0\amp 0\amp 1 \end{array} \right)=\left(\begin{array}{ccc}1\amp 0\amp 0\\a\amp 1\amp 0\\ac+b\amp c\amp 1 \end{array} \right),
\end{equation*}
so the coefficients do not simply populate \(\L^{-1}\): compare Example 3.13.11, where \(a=-2,\) \(b=-1,\) \(c=3\) and the \((3,1)\) entry of \(\L^{-1}\) is \(ac+b=-7,\) not \(b=-1.\)
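This observation gives the usual labor-saving implementation: record each negated coefficient directly into \(\L\) as the elimination proceeds. A minimal sketch (the function name `lu_record_multipliers` is our own, assuming NumPy and nonzero pivots):

```python
import numpy as np

def lu_record_multipliers(A):
    """Eliminate A -> U, writing each negated coefficient into L as we go.

    The row operation (a)R_j + R_i -> R_i stores -a in L[i, j].
    Assumes every pivot U[j, j] is nonzero when it is used.
    """
    U = np.asarray(A, dtype=float).copy()
    n = U.shape[0]
    L = np.eye(n)
    for j in range(n - 1):
        for i in range(j + 1, n):
            a = -U[i, j] / U[j, j]   # coefficient of the row operation
            U[i] += a * U[j]         # a R_j + R_i -> R_i
            L[i, j] = -a             # the negated coefficient lands in L
    return L, U

A = np.array([[1, -3, 11], [2, -5, 19], [1, -6, 7]])
L, U = lu_record_multipliers(A)
assert np.allclose(L @ U, A)
print(L)   # [[1, 0, 0], [2, 1, 0], [1, -3, 1]], matching Example 3.13.11
```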
Remark 3.13.15. \(\L\D\boldsymbol{\widetilde{U}}\) factorization.
Any upper-triangular matrix \(\U\) can be factored as \(\U=\D\widetilde{\U}\) where \(\D\) is a diagonal matrix with
\begin{equation*}
D_{ii}=\begin{cases}U_{ii}\amp U_{ii}\ne0\\1\amp U_{ii}=0\end{cases}
\end{equation*}
and \(\widetilde{\U}\) is upper triangular with
\begin{equation*}
\widetilde{U}_{ii}=\begin{cases}1\amp U_{ii}\ne0\\0\amp U_{ii}=0\,.\end{cases}
\end{equation*}
In other words, each row of \(\U\) is divided by its diagonal entry when that entry is nonzero: \(\widetilde{\U}_{i\ast}=\U_{i\ast}/D_{ii}.\) If \(\U\) has a zero in a diagonal position (say \(U_{ii}=0\)), then we set \(D_{ii}=1\) and \(\widetilde{\U}_{i\ast}=\U_{i\ast}.\)
So once we have an \(\L\U\)-factorization for \(\A\) we can immediately compute what we call its \(\L\D\widetilde{\U}\)-factorization. Proceeding from \(\L\U\) to \(\L\D\widetilde{\U}\) is easier in practice than we might expect from the notation above.
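The extraction is mechanical, as the following sketch shows (the helper `factor_DU` is our own, assuming NumPy); we reuse the factor \(\U\) from Example 3.13.11:

```python
import numpy as np

def factor_DU(U):
    """Split an upper-triangular U into D @ U_tilde as in Remark 3.13.15."""
    U = np.asarray(U, dtype=float)
    d = np.where(np.diag(U) != 0, np.diag(U), 1.0)  # D_ii = U_ii, or 1 if zero
    D = np.diag(d)
    U_tilde = U / d[:, None]                        # divide each row by D_ii
    return D, U_tilde

U = np.array([[1., -3., 11.], [0., 1., -3.], [0., 0., -13.]])
D, U_tilde = factor_DU(U)
assert np.allclose(D @ U_tilde, U)
assert np.allclose(np.diag(U_tilde), 1)   # here every U_ii is nonzero
```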
Example 3.13.16. \(\L\U\)-factorization and \(\L\D\boldsymbol{\widetilde{U}}\)-factorization.
\begin{align*}
\A\amp=\left(\begin{array}{rrr}3\amp -2\amp -4\\6\amp -6\amp -9\\-15\amp -2\amp 19\end{array}\right)=\left(\begin{array}{rrr}1\amp 0\amp 0\\2\amp 1\amp 0\\-5\amp 6\amp 1\end{array}\right)\left(\begin{array}{rrr}3\amp -2\amp -4\\0\amp -2\amp -1\\0\amp 0\amp 5\end{array}\right)=\L\U\\
\amp=\left(\begin{array}{rrr}1\amp 0\amp 0\\2\amp 1\amp 0\\-5\amp 6\amp 1\end{array}\right)\left(\begin{array}{rrr}3\amp 0\amp 0\\0\amp -2\amp 0\\0\amp 0\amp 5\end{array}\right)\left(\begin{array}{rrr}1\amp -2/3\amp -4/3\\0\amp 1\amp 1/2\\0\amp 0\amp 1\end{array}\right)\,\,=\,\,\L\D\widetilde{\U}
\end{align*}
\begin{align*}
\B\amp=\left(\begin{array}{rrr}2\amp 4\amp 9\\10\amp 17\amp 46\\6\amp 33\amp 20\end{array}\right)=\left(\begin{array}{rrr}1\amp 0\amp 0\\5\amp 1\amp 0\\3\amp -7\amp 1\end{array}\right)\left(\begin{array}{rrr}2\amp 4\amp 9\\0\amp -3\amp 1\\0\amp 0\amp 0\end{array}\right)=\L\U\\
\amp=\left(\begin{array}{rrr}1\amp 0\amp 0\\5\amp 1\amp 0\\3\amp -7\amp 1\end{array}\right)\left(\begin{array}{rrr}2\amp 0\amp 0\\0\amp -3\amp 0\\0\amp 0\amp 1\end{array}\right)\left(\begin{array}{rrr}1\amp 2\amp 9/2\\0\amp 1\amp -1/3\\0\amp 0\amp 0\end{array}\right)\,\,=\,\,\L\D\widetilde{\U}
\end{align*}
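The factorization of \(\B\) can be checked the same way; note how the zero on the diagonal of \(\U\) forces \(D_{33}=1\) and leaves the third row of \(\widetilde{\U}\) zero. A short sketch reusing the `factor_DU` helper from Remark 3.13.15:

```python
import numpy as np

L = np.array([[1., 0., 0.], [5., 1., 0.], [3., -7., 1.]])
U = np.array([[2., 4., 9.], [0., -3., 1.], [0., 0., 0.]])
B = L @ U                                  # reproduces the matrix B above

D, U_tilde = factor_DU(U)                  # helper defined in Remark 3.13.15
assert np.allclose(L @ D @ U_tilde, B)
assert D[2, 2] == 1 and np.all(U_tilde[2] == 0)   # the U_33 = 0 case
```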