Section8.1Fundamentals of Eigenvalues and Eigenvectors
General square matrix-vector multiplication is expressible as a scaling followed by a rotation, though a proof of this fact is beyond the scope of this text. However, for any given square matrix \(A\) there are some vectors for which left-multiplication by \(A\) yields a rotation by \(0\) radians or by \(\pi\) radians. These vectors represent specific direction vectors inherent to \(A\) itself, and each spans a line in \(\R^n.\)
It turns out that when left-multiplied by \(A,\) all vectors lying on such a line are scaled by exactly the same factor. This scaling factor is inherent to the direction, which means that each matrix is associated with one or more directions along which left-multiplication by \(A\) amounts to nothing more than positive or negative scaling. Finding these direction vectors and their associated scaling factors is the topic of this section.
Subsection8.1.1Solving the Eigenproblem
The problem of solving \(\A\x=\b\) is one of two very fundamental problems in linear algebra. The other fundamental linear algebra problem, which we will call the eigenproblem, is the following. Given \(\A,\) find solutions \(\left(\lambda,\x\right)\) to
\begin{equation}
\A\x=\lambda\x.\tag{8.1.1}
\end{equation}
For the matrix \(\A\) in Example 8.1.2, \(\left(4,\left( \begin{array}{r} 1\\1 \end{array} \right)\right)\) is one such solution and another is \(\left(3,\left( \begin{array}{r} 1\\2 \end{array} \right)\right).\)
An analysis of the sizes of the LHS and RHS of Equation (8.1.1) yields that \(\A\) must be square; we’ll say \(\A\in\R^{n\times n}.\)
For any square \(\A\) and any \(\lambda\in\R,\) a trivial solution to \(\A\x=\lambda\x\) is \(\x=\vec{0}.\)
Definition8.1.1.
Let \(\A\in\R^{n\times n}\) and suppose for some \(\lambda\in\R\) and for some \(\x\in\R^n\setminus\left\{\vec{0}\right\}\) we have \(\A\x=\lambda\x.\) Then \(\lambda\) is called an eigenvalue of \(\A\) and \(\x\) is a corresponding eigenvector. Since eigenvalues and eigenvectors come in associated pairs, if \(\lambda\) and \(\x\) satisfy \(\A\x=\lambda\x\) we call \((\lambda,\x)\) an eigenpair for \(\A.\)
Example8.1.2.Eigenvalues and eigenvectors - verification only.
Consider the following matrix \(\A\in\R^{2\times2}\)
This says that multiplying a vector of the form \(\left(\begin{array}{r} a \\a \end{array} \right),\,a\ne0,\) by \(\A\) scales that vector by a factor of \(4.\) This is not true for every vector, though: note
Interesting! Note that nonzero vectors of the form \(\left(\begin{array}{r} a \\a \end{array} \right)\) are not the only vectors scaled by left-multiplication by \(\A:\) we have
Rewriting \(\A\x=\lambda\x\) as \((\A-\lambda\I)\x=\vec{0},\) and since we are not interested in the solution \(\x=\vec{0},\) by Theorem 3.10.3 we are only interested in values of \(\lambda\) for which \(\A-\lambda\I\) is singular. By Theorem 4.4.15 the matrix \(\A-\lambda \I\) is singular if and only if \(\det(\A-\lambda \I)=0.\) This motivates the following definition.
Definition8.1.5.
The characteristic polynomial of \(\A\in\R^{n\times n}\) is \(C_A(\lambda)=\det(\A-\lambda \I)\) and the characteristic equation of \(\A\) is \(C_A(\lambda)=0.\)
The roots of the characteristic polynomial (solutions of the characteristic equation) of a matrix \(\A \in \R^{n \times n}\) are precisely the eigenvalues of \(\A,\) and for each eigenvalue \(\lambda\) the corresponding eigenvectors are the nonzero vectors in the nullspace of \(\A-\lambda \I.\)
For a given \(\A,\) real eigenvalues (and hence real eigenvectors) may or may not exist, but complex eigenvalues always exist. This is related to the fact that quadratic equations may or may not have real roots but always have complex roots.
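For instance, a rotation of the plane by \(90^\circ\) moves every nonzero vector off of its own span, so it has no real eigenvalues; its eigenvalues turn out to be \(\pm i.\) Here is a minimal numerical check in Python, assuming NumPy is available (the rotation matrix below is our own illustrative choice, not one from the text):

```python
import numpy as np

# Rotation of the plane by 90 degrees: no nonzero real vector keeps its direction.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# The computed eigenvalues are the complex numbers +i and -i, not real numbers.
print(np.linalg.eigvals(R))
```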
Although \(\vec{0}\) is never an eigenvector, it does yield a solution to \((\A-\lambda \I)\x=\vec{0},\) so it is reasonable to adjoin \(\vec{0}\) to the set of all eigenvectors for a given eigenvalue. This creates a subspace of \(\R^n.\) So for convenience we will include \(\vec{0}\) from now on.
Definition8.1.6.
Given a matrix \(\A\in \R^{n \times n}\) and an eigenvalue \(\lambda\) for \(\A,\) the set \(S_{\lambda}\) consisting of all eigenvectors corresponding to \(\lambda,\) together with \(\vec{0},\) is called the eigenspace for \(\A\) and \(\lambda.\)
Theorem8.1.7.
Let \(\A\in \R^{n \times n}\) and let \(\lambda \in \R\) be an eigenvalue for \(\A.\) Then the eigenspace for \(\A\) and \(\lambda\) is a subspace of \(\R^{n}.\)
Proof.
Since the eigenspace for \(\A\) and \(\lambda\) consists of \(\vec{0}\) together with all eigenvectors corresponding to \(\lambda,\) it is precisely the nullspace of \(\A-\lambda \I,\) which we know is a subspace of \(\R^n\) by Theorem 5.1.24.
We summarize the process of solving the eigenproblem for \(\A\x=\lambda\x\) in the following algorithm.
Remark8.1.8.
(The Eigenproblem Algorithm)
Find all solutions \(\lambda_1,\ldots,\lambda_n\) (there may be duplicates) of the characteristic equation \(\det(\A-\lambda \I)=0.\)
For each \(\lambda_i,\) find the eigenspace \(S_{\lambda_i}=N(\A-\lambda_i\I).\)
The solution to the eigenproblem is the set \(\left\{\left(\lambda_i, S_{\lambda_i}\right)\right\}\) of (eigenvalue, eigenspace) pairs.
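This algorithm can also be carried out numerically. Below is a minimal sketch in Python, assuming NumPy and SciPy are available, applied to the matrix of Example 8.1.9 below. Here np.poly returns the coefficients of \(\det(\lambda\I-\A),\) which has the same roots as \(\det(\A-\lambda\I),\) and scipy.linalg.null_space returns an orthonormal basis for the nullspace.

```python
import numpy as np
from scipy.linalg import null_space

# The matrix from Example 8.1.9.
A = np.array([[ 13.0,  2.0],
              [-45.0, -6.0]])

# Step 1: the roots of the characteristic equation det(A - lambda*I) = 0.
eigenvalues = np.roots(np.poly(A))

# Step 2: for each eigenvalue, a basis for the eigenspace N(A - lambda*I).
for lam in eigenvalues:
    # rcond is loosened because the computed lam carries floating-point round-off.
    basis = null_space(A - lam * np.eye(A.shape[0]), rcond=1e-8)
    print(f"lambda = {lam:.4f}, eigenspace basis vector: {basis.ravel()}")
```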
Example8.1.9.
Find all eigenvalues and corresponding eigenvectors for the matrix \(\A=\left(\begin{array}{rr}13\amp 2\\-45\amp -6 \end{array} \right).\)
Solution: We first form the characteristic equation of \(\A:\) \(\det(\A-\lambda\I)=(13-\lambda)(-6-\lambda)-(2)(-45)=\lambda^2-7\lambda+12=(\lambda-3)(\lambda-4)=0,\) so the eigenvalues are \(\lambda_1=3\) and \(\lambda_2=4.\)
Hence for the eigenvalue \(\lambda_2=4,\) the corresponding eigenvectors are
span\(\left\{\left(\begin{array}{r}-2\\9 \end{array} \right)\right\} \setminus \{\vec{0}\}.\) Thus the solution to our eigenproblem for this matrix \(\A\) is
\(\left(3,S_3=\left\{\left. a\left(\begin{array}{r}-2\\10 \end{array} \right)\right|a\in\R\right\}\right)\) and \(\left(4,S_4=\left\{\left.a\left(\begin{array}{r}-2\\9 \end{array} \right)\right|a\in \R\right\}\right).\)
Example8.1.10.
Solve the eigenproblem for the matrix \(\A=\left(\begin{array}{rrr}5\amp 0\amp 6\\-15\amp -7\amp -3\\-15\amp -12\amp 8 \end{array} \right).\)
Solution: The characteristic equation (after checking the possible rational roots, finding one of them by substitution, using long or synthetic division to factor, and then applying the quadratic formula to find the other two roots) yields \(\lambda_1=-1,\,\lambda_2=2,\,\lambda_3=5.\)
Example8.1.11.
Solve the eigenproblem for the matrix \(\A=\left(\begin{array}{rrr}1\amp 0\amp 0\\0\amp 1\amp 0\\2\amp 4\amp 3 \end{array} \right).\)
Solution: Since \(\A-\lambda \I = \left( \begin{array}{rrr} 1-\lambda \amp 0 \amp 0 \\ 0 \amp 1-\lambda \amp 0\\2 \amp 4 \amp 3-\lambda \end{array} \right)\) is a lower triangular matrix, by Theorem 4.4.5 we can find the determinant by taking the product of the diagonal entries. Thus our characteristic equation is \((1-\lambda)^2(3-\lambda)=0,\) which yields \(\lambda_1=\lambda_2=1\) and \(\lambda_3=3.\)
We find a basis for \(N(\A-\I)\) to be \(B_{1}=\left\{ \left(\begin{array}{r} -4\\2\\0 \end{array} \right),\left(\begin{array}{r} -2\\0\\2 \end{array} \right) \right\}\)
In Example 8.1.11 you may have noticed that the eigenvalue \(1\) was a solution to the characteristic equation with multiplicity \(2\) and that the dimension of the corresponding eigenspace was \(2.\) It is not always true that the multiplicity of the eigenvalue as a zero of the characteristic equation is equal to the dimension of the eigenspace. For instance, if, in Example 8.1.11, row \(2\) were replaced by \((2,1,0),\) then the dimension of the eigenspace would only have been \(1\) (check this!). What is true is that the dimension of the eigenspace is always less than or equal to the multiplicity of the eigenvalue as a zero of the characteristic equation.
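As a quick check of the modification suggested above, let \(B\) denote the matrix of Example 8.1.11 with row \(2\) replaced by \((2,1,0).\) Then \(B-\I=\left(\begin{array}{rrr}0\amp 0\amp 0\\2\amp 0\amp 0\\2\amp 4\amp 2 \end{array} \right),\) and \((B-\I)\x=\vec{0}\) forces \(x_1=0\) and \(x_3=-2x_2,\) so \(N(B-\I)\) is spanned by the single vector \(\left(\begin{array}{r}0\\1\\-2 \end{array} \right).\) The eigenvalue \(1\) still has multiplicity \(2\) as a root of the characteristic equation, but its eigenspace now has dimension \(1.\)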
Theorem8.1.13.
(Cayley-Hamilton Theorem) Let \(\A\in\R^{n \times n}\) and suppose that \(C_\A(x)\) is the characteristic polynomial of \(\A.\) Then \(C_\A(\A)=0_{n\times n}.\)
Proof.
The proof is beyond the scope of this text. The interested reader may consult [1].
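Although the proof is omitted, the theorem is easy to verify numerically for a particular matrix. Here is a minimal sketch in Python, assuming NumPy is available and using the matrix of Example 8.1.20; note that np.poly returns the coefficients of \(\det(\lambda\I-\A),\) which differs from \(C_\A(\lambda)\) only by the factor \((-1)^n\) and therefore also evaluates to the zero matrix at \(\A.\)

```python
import numpy as np

A = np.array([[18.0, -12.0],
              [20.0, -13.0]])            # the matrix from Example 8.1.20
coeffs = np.poly(A)                       # [1., -5., 6.], i.e. lambda^2 - 5*lambda + 6
n = A.shape[0]

# Evaluate the characteristic polynomial at the matrix A itself: A^2 - 5A + 6I.
C_of_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))

print(np.allclose(C_of_A, np.zeros((n, n))))   # True, as the theorem predicts
```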
Subsection8.1.2Eigenvalues of Transposes
There is an interesting relationship between the characteristic equation of \(\A\) and that of \(\A^T.\)
Theorem8.1.14.
Let \(\A\in \R^{n \times n}.\) Then \(C_\A(\lambda) = C_{\A^T}(\lambda)\text{;}\) that is, a matrix and its transpose have the same characteristic polynomial and hence the same characteristic equation.
Proof.
Since \(\lambda \I\) is a diagonal matrix, it is symmetric so \((\lambda \I)^T = \lambda \I\) by Definition 2.10.35. So we get \(C_{\A^T}(\lambda)=\det\left(\A^T-\lambda \I\right)=\det\left(\A^T-(\lambda \I)^T\right)=\det\left((\A-\lambda \I)^T\right)=\det(\A-\lambda \I)=C_\A(\lambda),\) where we have also used the fact that a matrix and its transpose have the same determinant.
It follows immediately from this theorem that \(\A\) and its transpose have the same eigenvalues.
Theorem8.1.15.
Let \(\A \in \R^{n \times n}.\) Then \(\A\) and \(\A^T\) have the same eigenvalues.
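A quick numerical confirmation in Python, assuming NumPy is available and using the matrix from Example 8.1.9:

```python
import numpy as np

A = np.array([[ 13.0,  2.0],
              [-45.0, -6.0]])

print(np.sort(np.linalg.eigvals(A)))     # [3. 4.]
print(np.sort(np.linalg.eigvals(A.T)))   # [3. 4.]: the same eigenvalues as A
```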
Subsection8.1.3Eigenvectors and Linear Independence
Let us now suppose that \(\A \in \R^{n \times n}\) has distinct eigenvalues \(\lambda_1 \neq \lambda_2\) with corresponding eigenvectors \(\vec{x_1}\) and \(\vec{x_2}.\) We will show that \(\vec{x_1}\) and \(\vec{x_2}\) are linearly independent. Let \(a,b\in \R\) such that
\begin{equation}
a\vec{x_1} + b \vec{x_2}=\vec{0}.\tag{8.1.3}
\end{equation}
To show that \(\vec{x_1}\) and \(\vec{x_2}\) are linearly independent, we need to show that \(a=b=0\) is the only possible solution to (8.1.3). We apply the matrix \(\A\) to both sides of (8.1.3) to obtain \(a\lambda_1\vec{x_1}+b\lambda_2\vec{x_2}=\vec{0},\) and we multiply both sides of (8.1.3) by \(\lambda_1\) to obtain \(a\lambda_1\vec{x_1}+b\lambda_1\vec{x_2}=\vec{0}.\) Subtracting the second equation from the first yields \(b(\lambda_2-\lambda_1)\vec{x_2}=\vec{0}.\)
By definition \(\vec{x_2}\neq \vec{0}\) since \(\vec{0}\) is not an eigenvector. Further, since \(\lambda_1 \neq \lambda_2\) we have \(\lambda_2-\lambda_1 \neq 0.\) It follows that \(b=0.\) Substituting \(b=0\) into (8.1.3) we see that we must also have \(a=0.\) Thus \(\vec{x_1}\) and \(\vec{x_2}\) are linearly independent. This argument can be extended to any finite set of eigenvectors corresponding to distinct eigenvalues, proving the following theorem.
Theorem8.1.16.
Let \(\A\in\R^{n\times n}\) and suppose \(\A\) has distinct eigenvalues \(\lambda_1,\ldots,\lambda_n.\) Then the set \(\left\{\vec{x_i}\right\}_{i=1}^n\) of associated eigenvectors is linearly independent.
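As a numerical illustration (a sketch assuming NumPy is available), the matrix of Example 8.1.10 has three distinct eigenvalues, so the matrix whose columns are its eigenvectors has full rank:

```python
import numpy as np

A = np.array([[  5.0,   0.0,  6.0],
              [-15.0,  -7.0, -3.0],
              [-15.0, -12.0,  8.0]])     # distinct eigenvalues -1, 2, 5

_, X = np.linalg.eig(A)                  # columns of X are eigenvectors of A
print(np.linalg.matrix_rank(X))          # 3: the eigenvectors are linearly independent
```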
Subsection8.1.4Diagonalization
The eigenvalues and eigenvectors of a matrix sometimes provide us with another matrix factorization. Indeed, suppose the eigenvectors of \(\A\) form a linearly independent set (as is the case, for example, when \(\A\) has \(n\) distinct eigenvalues, by Theorem 8.1.16). Then for each eigenpair \(\left(\lambda_i,\vec{x_i}\right)\) we have \(\A\vec{x_i}=\lambda_i\vec{x_i},\) so we can form a square matrix \(\X\) by \(\X_{*i}=\vec{x_i}\) and form a diagonal matrix \(\Lambda\) by placing the \(\lambda_i\)’s in order along the diagonal to obtain
\begin{align}
\A \X\amp=\X\Lam\text{ or } \nonumber\tag{8.1.7}\\
\A\amp =\X\Lam \X^{-1}.\tag{8.1.8}
\end{align}
Sometimes we can obtain such an eigenvalue factorization even when \(\A\) does not have \(n\) distinct eigenvalues, as long as we can still obtain a linearly independent set of eigenvectors. This is illustrated in the following example.
Example8.1.17.
Let \(\A=\I_2\) so that the characteristic equation is \((\lambda-1)^2=0.\) This yields \(\lambda_1=\lambda_2=1,\) but \(\A-1\I=\begin{pmatrix}{0}\amp 0\\0\amp 0 \end{pmatrix},\) whose nullspace is spanned by \(\left\{\begin{pmatrix}{1}\\0 \end{pmatrix} ,\begin{pmatrix}{0}\\1 \end{pmatrix} \right\}.\) So even though the eigenvalues are not distinct, we still obtain \(2\) linearly independent eigenvectors, and the factorization above exists (with \(\X=\Lambda=\I_2\)).
Definition8.1.18.
The factorization (8.1.8) is called the eigenvalue decomposition or diagonalization of \(\A.\) If \(\A\) has such a decomposition (that is, when \(\X^{-1}\) exists), we say that \(\A\) is diagonalizable.
The following theorem follows from Definition 8.1.18 and the discussion above.
Theorem8.1.19.
\(\A \in \R^{n \times n}\) is diagonalizable if and only if it has \(n\) linearly independent eigenvectors.
Example8.1.20.
Show that \(\A=\left(\begin{array}{rr} 18 \amp -12 \\ 20 \amp -13 \end{array} \right)\) is diagonalizable by finding its eigenvalue decomposition.
Solution: \(C_\A(\lambda)=\det(\A-\lambda \I) = (18-\lambda)(-13-\lambda) - (20)(-12)=\lambda^2 - 5 \lambda + 6.\) \(C_\A(\lambda) = 0\) when \(\lambda = 2,3,\) so the eigenvalues of \(\A\) are \(2\) and \(3.\) It can be shown that \(\vec{x_1}=\left(\begin{array}{r} 3 \\ 4 \end{array} \right)\) and \(\vec{x_2}=\left(\begin{array}{r} 4 \\5 \end{array} \right)\) are eigenvectors for \(2\) and \(3,\) respectively.
Thus \(\X=\left(\begin{array}{cc} 3 \amp 4\\4 \amp 5 \end{array} \right),\,\X^{-1}=\left(\begin{array}{rr} -5 \amp 4\\ 4 \amp -3 \end{array} \right)\) and \(\Lambda=\left(\begin{array}{rr} 2 \amp 0 \\ 0 \amp 3 \end{array} \right)\) so that \(\X\Lambda\X^{-1}=\left(\begin{array}{cc} 3 \amp 4\\4 \amp 5 \end{array} \right)\left(\begin{array}{rr} 2 \amp 0 \\ 0 \amp 3 \end{array} \right)\left(\begin{array}{rr} -5 \amp 4\\ 4 \amp -3 \end{array} \right)=\left(\begin{array}{rr} 18 \amp -12 \\ 20 \amp -13 \end{array} \right)=\A.\)
Example8.1.21.
Show that \(\A=\begin{pmatrix}{0}\amp 1\\0\amp 0 \end{pmatrix}\) is not diagonalizable.
Proof: We have \(\det(\A-\lambda \I)=\lambda^2.\) This yields eigenvalues \(\lambda_1=\lambda_2=0\) with only one linearly independent eigenvector \(\vec{x_1}=\vec{x_2}=\begin{pmatrix}{1}\\0 \end{pmatrix}.\) Thus \(\X=\begin{pmatrix}{1}\amp 1\\0\amp 0 \end{pmatrix},\) which is not invertible, so by Definition 8.1.18, \(\A\) is not diagonalizable.
Remark8.1.22.
We can use these ideas to easily construct a matrix with given eigenvalues and (linearly independent) eigenvectors. For example, suppose we wish to construct a matrix \(\A \in \R^{n \times n}\) with eigenvalues \(\lambda_1,\lambda_2,...,\lambda_n\) and corresponding eigenvectors \(\vec{x_1},\vec{x_2},...,\vec{x_n}.\) Let \(\Lambda\in \R^{n \times n}\) be the matrix with the \(\lambda_i\) on the diagonal and \(0\) elsewhere, and let \(\X\in \R^{n \times n}\) be the matrix whose columns are the \(\vec{x_i},\) in the same order as the corresponding \(\lambda_i\)’s. Since the \(\vec{x_i}\) are linearly independent, \(\X\) is invertible. The matrix \(\A=\X\Lambda \X^{-1}\) has the desired properties.
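A minimal sketch of this construction in Python, assuming NumPy is available; the eigenvalues \(2,5\) and eigenvectors \((1,0),(1,1)\) below are our own illustrative choices:

```python
import numpy as np

Lam = np.diag([2.0, 5.0])                # desired eigenvalues on the diagonal
X = np.array([[1.0, 1.0],
              [0.0, 1.0]])               # columns are the desired eigenvectors

A = X @ Lam @ np.linalg.inv(X)           # A = X Lam X^{-1}

print(A)                                 # [[2. 3.] [0. 5.]]
print(np.linalg.eigvals(A))              # eigenvalues 2 and 5, as designed
```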
Subsection8.1.5Eigenvalues and Determinants
There is an important relationship between the eigenvalues of \(\A\) and its determinant, summarized by the following two theorems.
Theorem8.1.23.
\(\A\in\R^{n\times n}\) is singular if and only if \(\lambda=0\) is an eigenvalue of \(\A.\)
Proof.
First assume that \(\A\) is singular. Then by Theorem 4.4.15 we have that \(0=\det(\A)=\det(\A-0\I),\) so by Definition 8.1.1, \(\lambda=0\) is an eigenvalue of \(\A.\)
Conversely, suppose \(\lambda=0\) is an eigenvalue of \(\A.\) Then by Definition 8.1.1 with \(\lambda=0,\,\A\x=\vec{0}\) has a nontrivial solution, which means precisely that some nontrivial linear combination of the columns of \(\A\) equals \(\vec{0}.\) Thus some nontrivial linear combination of the rows of \(\A^T\) equals \(\vec{0}.\) Now by Item 6, \(\A^T\) is singular, so by Theorem 2.10.34, \(\A\) is singular. We have proven that \(\A\in\R^{n\times n}\) is singular if and only if \(0\) is an eigenvalue of \(\A,\) as desired.
Theorem8.1.24.
Let \(\A\in\R^{n \times n}\) have eigenvalues \(\lambda_1,\lambda_2,...,\lambda_n\) (listed with multiplicity). Then \(\det(\A)=\prod_{i=1}^n \lambda_i.\)
Proof.
Let \(C_\A(\lambda)\) be the characteristic polynomial of \(\A\) and suppose \(\A\) has eigenvalues \(\lambda_1,\lambda_2,...,\lambda_n\) (there might be duplicates). Since the eigenvalues of \(\A\) are the zeros of \(C_\A(\lambda)\) then \(C_\A(\lambda)\) can be factored as \(C_\A(\lambda)=(-1)^n\prod_{i=1}^n(\lambda-\lambda_i).\)
But \(C_\A(\lambda)=\det(\A-\lambda \I)\) so setting \(\lambda=0\) yields \(C_\A(0)=\det(\A).\) We conclude that \(\det(\A) = C_\A(0)=(-1)^n\prod_{i=1}^n(0-\lambda_i)=\prod_{i=1}^n \lambda_i,\) as desired.
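A quick numerical check of this fact (a sketch assuming NumPy is available), using the matrix of Example 8.1.10:

```python
import numpy as np

A = np.array([[  5.0,   0.0,  6.0],
              [-15.0,  -7.0, -3.0],
              [-15.0, -12.0,  8.0]])

print(np.prod(np.linalg.eigvals(A)))     # approximately -10, the product of the eigenvalues
print(np.linalg.det(A))                  # -10 (up to round-off), the determinant of A
```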