Section 7.4 Gram-Schmidt Orthogonalization and QR-Factorization
In this section we explore the notion of starting with a linearly independent set \(\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\}\) and transforming it into an orthonormal basis \(\{\vec{q_1},\vec{q_2},\ldots,\vec{q_n}\}\) of span\(\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\}.\) In addition to the notion of projection, we'll need some additional tools first.
Theorem 7.4.1.
The set \(\{\vec{q_1},\vec{q_2},\ldots,\vec{q_n}\}\subset\R^m\) is an orthonormal set if
\(\vec{q_i}\cdot\vec{q_j}=\begin{cases}1 & i=j\\0 & i\neq j\end{cases}\)
for all \(1\le i,j\le n.\)
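As a quick numerical sketch of this condition (the example vectors here are our own assumption), stack the \(\vec{q_i}\) as the columns of a matrix \(Q;\) the set is orthonormal exactly when \(Q^TQ=\I.\)

```python
import numpy as np

# Columns are an assumed example orthonormal pair q_1, q_2 in R^3.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# Orthonormality: q_i . q_j = 1 when i = j and 0 otherwise,
# which is exactly the statement Q^T Q = I.
print(np.allclose(Q.T @ Q, np.eye(Q.shape[1])))  # True
```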
In light of this definition we may recast our problem in this section as that of creating an orthonormal basis from a linearly independent set. An understanding of the process by which we will accomplish this requires an appreciation for one particularly interesting property of projection matrices, which is illustrated by the next two theorems.
Theorem 7.4.6.
Let \(\P\) be a projection. Then col\((\I-\P)=N(\P).\)
Proof.
We want to demonstrate subset inclusion in each direction. First fix a vector of coefficients \(\b\in\R^m\) and form the linear combination \((\I-\P)\b\) of the columns of \(\I-\P.\) To see if this is in \(N(\P)\) we multiply on the left by \(\P:\)
\(\P(\I-\P)\b=(\P-\P^2)\b=(\P-\P)\b=\vec{0},\)
where we have used the fact that \(\P^2=\P\) for any projection,
so indeed col\((\I-\P)\subseteq N(\P).\) Next, to show that \(N(\P)\subseteq\) col\((\I-\P),\) note that if \(\b\in N(\P)\) so that \(\P\b=\vec{0},\) then \((\I-\P)\b=\b\) which tells us that \(\b\in\) col\((\I-\P)\) since there is a coefficient vector (namely \(\b\) itself) from which we can form \(\b\) as a linear combination of the columns of \((\I-\P).\) Thus \(N(\P)\) is a subset of col\((\I-\P)\) and hence col\((\I-\P)=N(\P)\) as desired.
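A small numpy check of this equality (the specific projection below, \(\P=\A(\A^T\A)^{-1}\A^T,\) is an assumed concrete example of a projection matrix):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed example: P = A (A^T A)^{-1} A^T projects onto col(A).
A = rng.standard_normal((4, 2))
P = A @ np.linalg.inv(A.T @ A) @ A.T
I = np.eye(4)

# col(I - P) is contained in N(P): P annihilates every column of I - P.
print(np.allclose(P @ (I - P), 0.0))  # True

# The dimensions match too: rank(I - P) = dim N(P) = m - rank(P).
m = 4
print(np.linalg.matrix_rank(I - P) == m - np.linalg.matrix_rank(P))  # True
```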
Theorem 7.4.7.
Let \(\P\) be a projection. Then \(N(\I-\P)=\) col\((\P).\)
Proof.
This follows by applying Theorem 7.4.6 to the matrix \(\I-\P,\) which is itself a projection since \((\I-\P)^2=\I-2\P+\P^2=\I-2\P+\P=\I-\P.\) Theorem 7.4.6 then gives col\((\I-(\I-\P))=\) col\((\P)=N(\I-\P).\)
Remark 7.4.8.
Theorem 7.4.6 and Theorem 7.4.7 show that any projection matrix \(\P\) decomposes \(\R^m\) into two orthogonal subspaces col\((\P)\) and col\((\I-\P).\) For any \(\x\in\) col\((\P)\) and \(\y\in\) col\((\I-\P)\) we have \(\x\perp\y.\) This notion forms the basis for a process by which we take a linearly independent set \(\left\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\right\}\) and transform it into an orthonormal basis \(\left\{\vec{q_1},\vec{q_2},\ldots,\vec{q_n}\right\}\) of span\(\left\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\right\}.\) This process is called Gram-Schmidt orthogonalization.
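A minimal numpy sketch of this orthogonal decomposition, again assuming a projection of the form \(\P=\A(\A^T\A)^{-1}\A^T:\)

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example projection onto col(A).
A = rng.standard_normal((5, 2))
P = A @ np.linalg.inv(A.T @ A) @ A.T
I = np.eye(5)

# Any x in col(P) is orthogonal to any y in col(I - P).
x = P @ rng.standard_normal(5)        # x in col(P)
y = (I - P) @ rng.standard_normal(5)  # y in col(I - P)
print(np.isclose(x @ y, 0.0))         # True
```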
Remark 7.4.9.
Figure 7.4.10. Step 2 of the Gram-Schmidt process
Suppose \(\left\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\right\}\subset\R^m\) is a linearly independent set.
First we create an orthogonal (but not necessarily orthonormal) basis \(\left\{\vec{u_i}\right\}_{i=1}^n\) as follows.
We select the initial direction to be that of \(\vec{v_1}\) and set \(\vec{u_1}=\vec{v_1}\text{.}\)
We create \(\vec{u_2}\) by decomposing \(\vec{v_2}\) using Theorem 7.1.27:
\(\vec{u_2}=\vec{v_2}-\dfrac{\vec{u_1}\cdot\vec{v_2}}{\vec{u_1}\cdot\vec{u_1}}\vec{u_1}.\)
Taking the dot product with \(\vec{u_1}\) shows \(\vec{u_2}\perp\vec{u_1}\text{.}\)
We create \(\vec{u_3}\) by subtracting from \(\vec{v_3}\) its projections onto \(\vec{u_1}\) and \(\vec{u_2}:\)
\(\vec{u_3}=\vec{v_3}-\dfrac{\vec{u_1}\cdot\vec{v_3}}{\vec{u_1}\cdot\vec{u_1}}\vec{u_1}-\dfrac{\vec{u_2}\cdot\vec{v_3}}{\vec{u_2}\cdot\vec{u_2}}\vec{u_2}.\)
Taking dot products shows \(\vec{u_3}\perp\vec{u_1}\) and \(\vec{u_3}\perp\vec{u_2}\text{.}\)
We continue in this manner until the set of linearly independent vectors is exhausted. The \(j^{\text{th}}\) such vector is
\(\vec{u_j}=\vec{v_j}-\displaystyle\sum_{i=1}^{j-1}\dfrac{\vec{u_i}\cdot\vec{v_j}}{\vec{u_i}\cdot\vec{u_i}}\vec{u_i},\)
which is orthogonal to each of \(\vec{u_1},\ldots,\vec{u_{j-1}}.\)
Finally we normalize each of the \(\vec{u_j}\) to obtain the orthonormal vectors \(\vec{q_j}=\vec{u_j}/\left\|\vec{u_j}\right\|;\) a numerical sketch of the whole process follows this list.
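Below is a minimal numpy sketch of the process just described; the function name gram_schmidt and the example vectors are our own illustrative assumptions, and the classical (rather than the modified) variant is shown.

```python
import numpy as np

def gram_schmidt(V):
    """Columns of V are linearly independent vectors v_1, ..., v_n.
    Returns Q whose orthonormal columns q_j span the same subspace."""
    m, n = V.shape
    U = np.zeros((m, n))
    for j in range(n):
        # u_j = v_j minus its projections onto u_1, ..., u_{j-1}.
        u = V[:, j].copy()
        for i in range(j):
            u -= (U[:, i] @ V[:, j]) / (U[:, i] @ U[:, i]) * U[:, i]
        U[:, j] = u
    # Normalize each u_j to obtain q_j.
    return U / np.linalg.norm(U, axis=0)

V = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = gram_schmidt(V)
print(np.allclose(Q.T @ Q, np.eye(2)))  # orthonormal columns: True
```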
Theorem 7.4.11.
Suppose \(\left\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\right\}\subset\R^m\) is a linearly independent set. Then the process of Gram-Schmidt orthogonalization, described above in Remark 7.4.9, creates an orthonormal basis \(\left\{\vec{q_1},\vec{q_2},\ldots,\vec{q_n}\right\}\) for span\(\left\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\right\}.\) That is, the set \(\left\{\vec{q_j}\right\}_{j=1}^n\) with \(\vec{q_j}=\vec{u_j}/\left\|\vec{u_j}\right\|\)
is an orthonormal basis for span\(\left\{\vec{v_1},\vec{v_2},\ldots,\vec{v_n}\right\}.\)
A neat feature of the Gram-Schmidt process is that it provides another factorization of a matrix. How? We move the subtracted terms in (7.4.5) to the other side to get
\(\vec{v_j}=\left(\vec{q_1}\cdot\vec{v_j}\right)\vec{q_1}+\left(\vec{q_2}\cdot\vec{v_j}\right)\vec{q_2}+\cdots+\left(\vec{q_j}\cdot\vec{v_j}\right)\vec{q_j},\)
which expresses each vector \(\vec{v_j}\) as a linear combination of the \(\{\vec{q_i}\}_{i\le j}\) whose coefficients are the dot products of \(\vec{v_j}\) with the unit direction vectors in each of the directions determined up to that point.
Remember the \(LU\)-factorization? We may now construct the \(QR\)-factorization, in which we write \(\A\) as the product of an orthogonal matrix and an upper-triangular matrix. This factorization is made explicit in the next theorem.
Theorem 7.4.12.
Let \(\A=\begin{pmatrix}|& |& \cdots& |\\\vec{v_1}& \vec{v_2}& \cdots& \vec{v_n}\\|& |& \cdots& |\end{pmatrix}\in\R^{n\times n}\) be nonsingular. Then the \(\boldsymbol{Q}\boldsymbol{R}\)-factorization of \(\A\) is given by \(\A=\boldsymbol{Q}\boldsymbol{R}\) where
\(\boldsymbol{Q}=\begin{pmatrix}|& |& \cdots& |\\\vec{q_1}& \vec{q_2}& \cdots& \vec{q_n}\\|& |& \cdots& |\end{pmatrix}\)
is the orthogonal matrix of unit column vectors obtained from the column vectors \(\A_{\ast j}\) of \(\A\) by the Gram-Schmidt process described in Theorem 7.4.11 and \(\boldsymbol{R}\) is the upper-triangular matrix
\(\boldsymbol{R}=\begin{pmatrix}\vec{q_1}\cdot\vec{v_1}& \vec{q_1}\cdot\vec{v_2}& \cdots& \vec{q_1}\cdot\vec{v_n}\\0& \vec{q_2}\cdot\vec{v_2}& \cdots& \vec{q_2}\cdot\vec{v_n}\\\vdots& \vdots& \ddots& \vdots\\0& 0& \cdots& \vec{q_n}\cdot\vec{v_n}\end{pmatrix}.\)
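As a numerical sketch of this theorem (the matrix \(\A\) below is an assumed example), numpy's np.linalg.qr produces such a factorization, and \(\boldsymbol{R}=\boldsymbol{Q}^T\A\) recovers the dot-product entries above; note that library routines may flip the signs of corresponding columns of \(\boldsymbol{Q}\) and rows of \(\boldsymbol{R}.\)

```python
import numpy as np

# Assumed nonsingular example; its columns play the role of v_1, ..., v_n.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

Q, R = np.linalg.qr(A)  # Q orthogonal, R upper triangular

print(np.allclose(A, Q @ R))             # A = QR: True
print(np.allclose(Q.T @ Q, np.eye(2)))   # Q^T Q = I: True
print(np.allclose(np.tril(R, -1), 0.0))  # R is upper triangular: True

# The entries of R are the dot products R[i, j] = q_i . v_j, i.e. R = Q^T A.
print(np.allclose(R, Q.T @ A))           # True
```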