
Section 9.1 Fundamentals of Linear Transformations

Subsection 9.1.1 Definitions and Examples

Definition 9.1.1. Linear transformations.

Let \(V\) and \(W\) be vector spaces over the field \(\Bbb{F}\) and let \(T\) be a function from \(V\) to \(W\) with the properties 
  1. \(T(\x+\y)=T(\x)+T(\y)\) for all \(\x,\y\in V\)
  2. \(T(a\x)=aT(\x)\) for all \(\x\in V,\,a\in\Bbb{F}\text{.}\)
Then \(T\) is called a linear transformation from \(V\) to \(W\text{.}\)

Example 9.1.3.

As a particular case of Theorem 9.1.2, the function \(T:\R^3 \rightarrow \R^2\) defined by
\begin{equation*} T\left[ \left( \begin{array}{r} x\\y\\z \end{array} \right) \right] = \left(\begin{array}{rrr} 1 \amp 1\amp 0 \\ -1 \amp 1 \amp -2 \end{array} \right) \left( \begin{array}{r} x\\y\\z \end{array} \right) = \left(\begin{array}{c} x+y \\ -x+y-2z \end{array} \right) \end{equation*}
is a linear transformation.
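Readers who like to experiment can check linearity numerically. The following sketch (Python with NumPy; an illustration, not part of the text's development) verifies both properties of Definition 9.1.1 on random inputs:

```python
import numpy as np

# The matrix from Example 9.1.3.
A = np.array([[ 1.0, 1.0,  0.0],
              [-1.0, 1.0, -2.0]])

def T(v):
    # T(v) = Av, so both defining properties follow from matrix algebra.
    return A @ v

rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)
a = float(rng.standard_normal())

print(np.allclose(T(x + y), T(x) + T(y)))  # Property 1: True
print(np.allclose(T(a * x), a * T(x)))     # Property 2: True
```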

Example 9.1.4. Linear transformations on the set of polynomials with bounded degree.

Recall from Example 5.1.9 in Chapter 5 that the set \(\mathcal{P}_n :=\{f(x)= a_0 + a_1x+a_2x^2 + \cdots+ a_nx^n\mid a_i \in \R\}\) of polynomials of degree less than or equal to \(n\) with coefficients in \(\R\) under polynomial addition and scalar multiplication is a vector space. The function \(T:\mathcal{P}_n \rightarrow \mathcal{P}_n\) given by \(T(f(x))=f'(x)\) is a linear transformation since from rules of derivatives we have:
  1. For all \(f,g \in \mathcal{P}_n, T(f+g)=(f+g)' = f'+g',\) and
  2. For all \(f \in \mathcal{P}_n,\,\alpha \in \R,\) \(T(\alpha f)=(\alpha f)' = \alpha f'\text{.}\)
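Since differentiation is linear, it too can be encoded as a matrix once we identify \(f=a_0+a_1x+\cdots+a_nx^n\) with its coefficient vector \((a_0,\ldots,a_n)\text{.}\) A minimal sketch of this idea follows; the helper `diff_matrix` is our own illustration, not notation from the text:

```python
import numpy as np

def diff_matrix(n):
    # (n+1) x (n+1) matrix of d/dx on P_n in the basis 1, x, ..., x^n:
    # the term a_i x^i contributes i * a_i to the coefficient of x^(i-1).
    D = np.zeros((n + 1, n + 1))
    for i in range(1, n + 1):
        D[i - 1, i] = i
    return D

# f(x) = 2 + x - 4x^2 + 5x^3, so f'(x) = 1 - 8x + 15x^2.
f = np.array([2.0, 1.0, -4.0, 5.0])
print(diff_matrix(3) @ f)   # [ 1. -8. 15.  0.]
```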

Example 9.1.5. Non-linear transformation.

Determine whether or not the mapping \(T:\R^2 \rightarrow \R^2\) given by \(\displaystyle{T\left[ \left( \begin{array}{r} x\\y \end{array} \right) \right] = \left(\begin{array}{r} x^2 \\y^2 \end{array} \right)}\) is a linear transformation.
Solution: First let’s check Property 1: Let \(\left( \begin{array}{r} x_1\\y_1 \end{array} \right), \left( \begin{array}{r} x_2\\y_2 \end{array} \right) \in \R^2,\)
\begin{equation} T\left[ \left( \begin{array}{r} x_1\\y_1 \end{array} \right)+\left( \begin{array}{r} x_2\\y_2 \end{array} \right)\right] =T\left[ \left( \begin{array}{r} x_1+x_2\\y_1+y_2 \end{array} \right)\right]=\left( \begin{array}{r} (x_1+x_2)^2\\ (y_1+y_2)^2 \end{array} \right)\tag{9.1.2} \end{equation}
On the other hand,
\begin{equation} T\left[\left( \begin{array}{r} x_1\\y_1 \end{array} \right)\right]+T\left[\left( \begin{array}{r} x_2\\y_2 \end{array} \right)\right]= \left( \begin{array}{r} x^2_1\\y^2_1 \end{array} \right)+\left( \begin{array}{r} x^2_2\\y^2_2 \end{array} \right)=\left( \begin{array}{r} x^2_1 + x^2_2\\y^2_1+y^2_2 \end{array} \right)\tag{9.1.3} \end{equation}
It seems evident that the RHS of (9.1.2) is not equal to the RHS of (9.1.3). But let's be sure: all we need to do is find specific \(\left( \begin{array}{r} x_1\\y_1 \end{array} \right), \left( \begin{array}{r} x_2\\y_2 \end{array} \right) \in \R^2\) for which, say, \((x_1+x_2)^2\ne x_1^2+x_2^2\text{.}\) For this we may use \(x_1=1,x_2=-1\text{;}\) since \((x_1+x_2)^2=0\ne2=x_1^2+x_2^2\) we see that Property 1 does not hold and so this \(T\) is not a linear transformation.
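The counterexample is easy to replicate numerically. A quick sketch, assuming the same \(T\text{:}\)

```python
import numpy as np

def T(v):
    # The componentwise squaring map of Example 9.1.5.
    return v ** 2

u = np.array([1.0, 0.0])
w = np.array([-1.0, 0.0])
print(T(u + w))      # [0. 0.]
print(T(u) + T(w))   # [2. 0.] -- additivity fails, so T is not linear
```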

Example 9.1.6. Market share.

(This example is adapted from Linear Algebra by Michael O'Nan.)
Suppose that three cell phone providers, \(B,\) \(T,\) and \(V,\) completely control the market. Each year the three companies retain some of their own customers and entice others to switch to them, thus stealing customers from the other companies. Suppose that \(B\) retains \(\frac{1}{2}\) of its customers, but \(\frac{1}{4}\) switch to \(T\) and \(\frac{1}{4}\) switch to \(V\text{.}\) Company \(T\) also retains \(\frac{1}{2}\) of its customers and \(\frac{1}{3}\) switch to \(B\) and \(\frac{1}{6}\) switch to \(V\text{.}\) Finally \(V\) retains \(\frac{2}{3}\) of its customers but \(\frac{1}{6}\) switch to \(B\) and \(\frac{1}{6}\) switch to \(T\text{.}\) The matrix
\begin{equation*} \A=\left(\begin{array}{rrr} 1/2 \amp 1/3 \amp 1/6\\ 1/4 \amp 1/2 \amp 1/6 \\ 1/4 \amp 1/6 \amp 2/3 \end{array} \right) \end{equation*}
represents the transformation of the customer vector as customers move between providers from year to year. That is, if \(\left(\begin{array}{r} B\\T\\V \end{array} \right)\) is a vector giving the number of customers in a current year, then \(\A\left(\begin{array}{r} B\\T\\V \end{array} \right)=\left(\begin{array}{r} B'\\T'\\V' \end{array} \right)\) gives the redistribution of the market share after one year. Since the process is given by matrix multiplication, it is a linear transformation.
We might ask whether there is a distribution of customers that is stable. A stable distribution would satisfy \(\A\left(\begin{array}{r} B\\T\\V \end{array} \right)=\left(\begin{array}{r} B\\T\\V \end{array} \right)\text{,}\) which holds precisely when \(\left(\begin{array}{r} B\\T\\V \end{array} \right)\) is an eigenvector of \(\A\) corresponding to the eigenvalue \(1\text{.}\) We know from Chapter 8 that we can check that \(1\) is an eigenvalue for \(\A\) by finding the nullspace of \(\A-(1)I:\)
\begin{equation*} \A-(1)I=\left(\begin{array}{ccc} 1/2 -1 \amp 1/3 \amp 1/6\\ 1/4 \amp 1/2-1 \amp 1/6 \\ 1/4 \amp 1/6 \amp 2/3 -1 \end{array} \right)=\left(\begin{array}{rrr} -1/2 \amp 1/3 \amp 1/6\\ 1/4 \amp -1/2 \amp 1/6 \\ 1/4 \amp 1/6 \amp -1/3 \end{array} \right) \end{equation*}
Performing elimination we see
\begin{equation*} \left(\begin{array}{rrr} -1/2 \amp 1/3 \amp 1/6\\ 1/4 \amp -1/2 \amp 1/6 \\ 1/4 \amp 1/6 \amp -1/3 \end{array} \right)\rightarrow \left(\begin{array}{rrr} -1/2 \amp 1/3 \amp 1/6\\ 0 \amp -1/3 \amp 1/4 \\ 0 \amp 0 \amp 0 \end{array} \right) \end{equation*}
Thus \(U\x=\vec{0}\) has one free variable and \(N(\A)=N(U)\) is nontrivial; it follows that \(1\) is indeed an eigenvalue for \(\A\text{.}\) The nullspace for \(\A-I\text{,}\) or the eigenspace for \(1\text{,}\) is
\begin{equation*} \left\{ \left. a \left(\begin{array}{c}10 \\ 9 \\12 \end{array} \right) \right| a \in\R\right\}\text{.} \end{equation*}
In the context of our problem we require the components of our vector to sum to \(1,\) so we normalize by dividing each component by \(10+9+12=31\text{.}\) Thus the eigenvector we seek, exhibiting the relative market shares at equilibrium, is \(\left(\begin{array}{c}10/31 \\ 9/31 \\12/31 \end{array} \right)\text{.}\) That is, if \(B\) has \(\displaystyle{\frac{10}{31}}\) of the market, \(T\) has \(\displaystyle{\frac{9}{31}}\) of the market, and \(V\) has \(\displaystyle{\frac{12}{31}}\) of the market, then each company will maintain its respective market share, as expressed by \(\A\left(\begin{array}{c}10/31 \\ 9/31 \\12/31 \end{array} \right)=\left(\begin{array}{c}10/31 \\ 9/31 \\12/31 \end{array} \right)\text{.}\)
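We can confirm the equilibrium numerically. The sketch below (using NumPy's `eig`; an illustration, not part of the text) checks both the fixed-point property and the eigenvector computation:

```python
import numpy as np

A = np.array([[1/2, 1/3, 1/6],
              [1/4, 1/2, 1/6],
              [1/4, 1/6, 2/3]])

# The claimed equilibrium distribution is fixed by A.
v = np.array([10, 9, 12]) / 31
print(np.allclose(A @ v, v))   # True

# Alternatively, pull the eigenvector for eigenvalue 1 out of np.linalg.eig
# and rescale its components to sum to 1.
vals, vecs = np.linalg.eig(A)
u = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
print(u / u.sum())             # approx. [0.3226 0.2903 0.3871] = (10, 9, 12)/31
```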

Subsection 9.1.2 Properties of Linear Transformations

Proof.

  1. \(T\left(\vec{0}\right)=T\left(0\cdot \vec{0}\right)=0 \cdot T(\vec{0}) = \vec{0}\) by Definition 9.1.1 Property 2.
  2. It follows that
    \begin{align*} T\left(a_1\x_1+a_2\x_2\right)\amp =T\left(a_1\x_1\right)+T\left(a_2\x_2\right)\qquad\text{ (by }\knowl{./knowl/xref/ltdefinition.html}{\text{Def. 9.1.1}}, \text{ Prop. 1) } \\ \amp =a_1T\left(\x_1\right)+a_2T\left(\x_2\right)\qquad\text{ (by }\knowl{./knowl/xref/ltdefinition.html}{\text{Def. 9.1.1}}, \text{ Prop. 2) } \end{align*}
  3. This is proven by induction on \(n\text{.}\) For the base case of \(n=1\) we appeal to Property 2 of Definition 9.1.1. Now for our induction hypothesis suppose that Part 3 above holds for \(n-1\) summands and note that
    \begin{align*} T\left(\sum_{i=1}^{n} a_i \x_i\right)\amp =T\left(\sum_{i=1}^{n-1}a_i\x_i \right)+T\left(a_n\x_n\right)\qquad\text{ (by }\knowl{./knowl/xref/ltdefinition.html}{\text{Def. 9.1.1}}, \text{ Prop. 1) } \\ \amp =\sum_{i=1}^{n-1}a_iT\left(\x_i\right)+ a_nT\left(\x_n\right)\qquad\text{ (by }\knowl{./knowl/xref/ltdefinition.html}{\text{Def. 9.1.1}}, \text{ Prop. 2) } \\ \amp = \sum_{i=1}^n a_iT\left(\x_i\right). \end{align*}

Remark 9.1.8.

The astute reader will note the parallels between the properties in Theorem 9.1.7 and those of matrix-vector multiplication. This is no accident: it turns out that any linear transformation from \(\R^n\) to \(\R^m\) can be represented by a matrix \(\A \in \R^{m\times n}\text{.}\) To prove that, we use the canonical basis of \(\R^n\) defined in Theorem 5.4.31.

Example 9.1.9. Another non-linear transformation.

Consider \(T:\R^3\rightarrow\R^2\) given by \(T\left[\left(\begin{array}{r}x\\y\\z\end{array}\right)\right]=\left(\begin{array}{r} 2x+3y\\x-5y+1\end{array} \right)\text{.}\) Since \(T(\vec{0})=\left(\begin{array}{r}0\\1\end{array} \right)\ne\vec{0}\text{,}\) by Theorem 9.1.7 \(T\) is not linear.

Theorem 9.1.10.

Let \(T: \R^n \rightarrow \R^m\) be a linear transformation. Then there exists a matrix \(\A \in \R^{m\times n}\) so that \(T(\x)=\A\x\) for all \(\x\in \R^n\text{.}\)

Proof.

Let \(\{\wh{e}_1, \wh{e}_2,\ldots,\wh{e}_n\}\) be the canonical basis for \(\R^n\) and denote the canonical basis for \(\R^m\) by \(\{\wh{e}{\hspace{.04in}'}_{\hspace{-.02in}1}, \wh{e}{\hspace{.04in}'}_{\hspace{-.02in}2},\ldots, \wh{e}{\hspace{.04in}'}_{\hspace{-.02in}m}\}\text{.}\) Then \(T(\wh{e}_j) \in \R^m\) so \(T(\wh{e}_j)\) can be expressed as \(\displaystyle{T(\wh{e}_j)=\sum_{i=1}^m a_{ij}\wh{e}{\hspace{.04in}'}_{\hspace{-.02in}i}}\) for some \(a_{ij} \in \R\text{.}\) For example
\begin{equation*} T(\wh{e}_1)=T\left[\left(\begin{array}{c} 1\\0\\ \vdots \\ 0 \end{array} \right)\right]=\left(\begin{array}{c}a_{11}\\a_{21} \\ \vdots \\ a_{m1} \end{array} \right) = \sum_{i=1}^m a_{i1}\wh{e}{\hspace{.04in}'}_{\hspace{-.02in}i}\text{.} \end{equation*}
Now let \(\x \in\R^n\) and express \(\x\) as the linear combination of canonical basis elements: \(\displaystyle{\x=\sum_{j=1}^nx_j\wh{e}_j}\text{.}\) We have
\begin{align*} T(\x)\amp =T\left(\sum_{j=1}^nx_j\wh{e}_j\right)\\ \amp =\sum_{j=1}^nx_j T(\wh{e}_j) = \sum_{j=1}^n x_j \left(\sum_{i=1}^m a_{ij}\wh{e}{\hspace{.04in}'}_{\hspace{-.02in}i}\right)\qquad\text{(by Prop. 3 of }\knowl{./knowl/xref/propsoflts.html}{\text{Theorem 9.1.7}})\\ \amp =\sum_{j=1}^n\sum_{i=1}^ma_{ij}x_j\wh{e}{\hspace{.04in}'}_{\hspace{-.02in}i}=\sum_{i=1}^m\left(\sum_{j=1}^n a_{ij}x_j\right)\wh{e}{\hspace{.04in}'}_{\hspace{-.02in}i}\text{.} \end{align*}
This exhibits the \(i^{\text{ th } }\) component of \(T(\x)\) as \(\displaystyle{\sum_{j=1}^na_{ij}x_j}\) and we know that the \(i^{\text{ th } }\) component of \(\A\x\) is also \(\displaystyle{\sum_{j=1}^na_{ij}x_j}\text{.}\) We conclude that all corresponding components of \(T\left(\x\right)\) and \(\A\x\) are equal so for the matrix \(\A=(a_{ij})\) we have \(\A\x=T(\x)\) as desired.
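The proof gives a concrete recipe: apply \(T\) to each canonical basis vector and use the resulting images as the columns of \(\A\text{.}\) Here is a short sketch of that recipe (the helper `matrix_of` is ours, for illustration):

```python
import numpy as np

def matrix_of(T, n):
    # Column j is T(e_j), exactly as in the proof of Theorem 9.1.10.
    I = np.eye(n)
    return np.column_stack([T(I[:, j]) for j in range(n)])

# Recover the matrix of the transformation from Example 9.1.3.
T = lambda v: np.array([v[0] + v[1], -v[0] + v[1] - 2 * v[2]])
print(matrix_of(T, 3))
# [[ 1.  1.  0.]
#  [-1.  1. -2.]]
```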

Proof.

Let \(T\) be a linear transformation, let \(\boldsymbol{A_1}\) and \(\boldsymbol{A_2}\) be two matrices which represent \(T\) with respect to the canonical basis and let \(\x\in\R^n\text{.}\) Then since \(\boldsymbol{A_1}\x=\boldsymbol{A_2}\x\) we have
\begin{equation*} \vec{0}=\boldsymbol{A_1}\x-\boldsymbol{A_2}\x\,=\,(\boldsymbol{A_1}-\boldsymbol{A_2})\x \end{equation*}
and since \(\x\) is arbitrary we may take \(\x\) to be each canonical basis vector in turn, which shows that every column of \(\boldsymbol{A_1}-\boldsymbol{A_2}\) is \(\vec{0}\text{.}\) Hence \(\boldsymbol{A_1}=\boldsymbol{A_2}\) and we conclude that there is only one such \(\A\text{.}\)

Example 9.1.12.

Rotating vectors in the plane counterclockwise by \(\theta\) radians can be accomplished by matrix multiplication (see Example 2.7.19), so the rotation \(T_\theta\) is a linear operator with associated matrix
\begin{equation} \A=\left(\begin{array}{rr} \cos\theta \amp -\sin\theta\\ \sin\theta \amp \cos\theta \end{array} \right)\text{.}\tag{9.1.4} \end{equation}
Figure 9.1.13. Rotation of canonical basis vectors by \(\theta\)
For example a counterclockwise rotation by \(\pi/2\) radians is achieved via the matrix
\begin{equation*} \A=\left(\begin{array}{rr} 0 \amp -1\\ 1 \amp 0 \end{array} \right) \end{equation*}
obtained by substituting \(\pi/2\) for \(\theta\) in (9.1.4).
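A numerical illustration of this rotation (a sketch, assuming the formula in (9.1.4)):

```python
import numpy as np

def rotation(theta):
    # The matrix (9.1.4) for counterclockwise rotation by theta radians.
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

A = rotation(np.pi / 2)
print(np.round(A, 12))                         # [[ 0. -1.]  [ 1.  0.]]
print(np.round(A @ np.array([1.0, 0.0]), 12))  # e1 maps to (0, 1), a quarter turn
```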

Definition 9.1.14. Image and range.

If \(T(\x)=\y\) we call \(\y\) the image of \(\x\) under \(T\text{.}\) The set of all such images is the range of \(T\) and denoted ran\((T)\text{:}\)
\begin{equation*} \text{ ran } (T)=\biggl\{\y\in \R^m \left|\,\exists\,\x \in \R^n \ni T(\x)=\y\right.\biggr\}. \end{equation*}

Proof.

We have
\begin{align*} \text{ ran } (T)\amp =\biggl\{\y\in \R^m \mid \exists \x \in \R^n \ni T(\x)=\y\biggr\}\\ \amp =\biggl\{\y\in \R^m \mid \exists \x \in \R^n \ni \A\x=\y\biggr\}\\ \amp =\text{ col }(\A). \end{align*}

Definition 9.1.16. The kernel of \(\boldsymbol{T}\).

The set of all \(\x\in \R^n\) that \(T\) maps to \(\vec{0}\) is called the kernel of \(T\) and denoted ker\((T)\text{:}\)
\begin{equation*} \text{ ker } (T)=\biggl\{\x\in \R^n \left|\,T(\x)=\vec{0}\right.\biggr\}. \end{equation*}

Proof.

Suppose \(T:\R^n \rightarrow \R^m\) is represented by a matrix \(\A\in\R^{m\times n}\text{.}\) Then
\begin{align*} \text{ ker } (T)\amp =\left\{\x\in\R^n \mid T(\x)=\vec{0}\right\}\\ \amp =\left\{\x\in\R^n \mid \A\x=\vec{0}\right\}\\ \amp =N(\A)\text{.} \end{align*}

Definition 9.1.18. Injective\(\,/ \, \) 1-1 linear transformations.

A linear transformation \(T:\R^n\rightarrow\R^m\) is called injective if for all \(\x,\,\y\in\R^n\) with \(\x\ne\y\) one has \(T(\x)\ne T(\y)\text{.}\)

Proof.

The proof is a homework exercise.

Proof.

The proof is a homework exercise.

Definition 9.1.21. Surjective / onto.

A linear transformation \(T:\R^n\rightarrow\R^m\) is called surjective if for all \(\z\in\R^m,\) there exists \(\x\in\R^n\) for which \(T(\x)=\z\text{.}\)

Definition 9.1.22. Linear operators.

A linear transformation from \(\R^n\) to \(\R^n\) is called a linear operator on \(\R^n\text{.}\)

Definition 9.1.23. Invertibility of \(\boldsymbol{T}\).

Let \(T\) be a linear operator on \(\R^n\) and let \(\A\) be the associated matrix. If \(\A\) is invertible, then we say that \(T\) is invertible and define the inverse \(T^{-1}\) of \(T\) on \(\R^n\) by
\begin{equation*} T^{-1}(\x)=\A^{-1}\x. \end{equation*}

Proof.

\((\Rightarrow)\) Suppose \(T\) is an invertible linear operator on \(\R^n\) and let \(\A\) be the associated invertible matrix. Let \(\b\in\) ran \((T)\text{,}\) so that \(T(\x)=\b\) for some \(\x\) and hence \(\A\x=\b\) has a solution. Then by Item 4 in Theorem 3.10.25, that solution to \(T(\x)=\b\) is unique. This is precisely the condition for injectivity stated in Definition 9.1.18, so \(T\) is indeed injective.
\((\Leftarrow)\) Suppose \(T\) is injective on \(\R^n\) and let \(\A\) be the associated matrix. Then by Theorem 9.1.17 \(N(\A)=\{\vec{0}\},\) so by Theorem 3.10.3 \(\A\) is invertible and hence by Definition 9.1.23 \(T\) is invertible.

Proof.

\((\Rightarrow)\) Suppose \(T\) is an invertible linear operator on \(\R^n\) and let \(\A\) be the associated invertible matrix. Fix \(\b\in\R^n\text{.}\) Then \(\x=\A^{-1}\b\) satisfies \(T(\x)=\A\x=\A\A^{-1}\b=\b\text{,}\) so \(\b\in\) ran \((T)\text{.}\) Since \(\b\in\R^n\) was arbitrary, \(T\) is indeed surjective.
\((\Leftarrow)\) Suppose the linear operator \(T\) is surjective on \(\R^n\text{,}\) let \(\A\) be the associated matrix and fix \(\b\in\R^n\text{.}\) Since \(T\) is surjective we know there exists \(\x\) with \(T(\x)=\b\text{;}\) fix such an \(\x\text{.}\) Since \(T(\x)=\A\x\) we now know that \(\A\x=\b\) has a solution, so by Item 3 in Theorem 3.10.25 \(\,\A\) is invertible and hence by Definition 9.1.23 \(T\) is invertible.
Though a general linear transformation may be injective but not surjective or vice versa, we have proven that injectivity and surjectivity are equivalent conditions for linear operators. Neat!
Theorem 9.1.26 is nice because we know lots of things about matrices. We can also see from the proof of Theorem 9.1.10 that to determine the matrix representing a linear transformation, all we need to know is how the transformation acts on the canonical basis vectors.

Example 9.1.27. An invertible linear operator.

The linear operator in Example 9.1.12 is invertible since the matrix in Example 2.7.19 is invertible with inverse
\begin{align*} \A^{-1}\amp =\frac{1}{\cos^2\theta+\sin^2\theta}\left(\begin{array}{rr} \cos(\theta) \amp \sin(\theta)\\ -\sin(\theta) \amp \cos(\theta) \end{array} \right)\\ \\ \amp =\left(\begin{array}{rr} \cos(-\theta) \amp -\sin(-\theta)\\ \sin(-\theta) \amp \cos(-\theta) \end{array} \right) \end{align*}
by Theorem 2.10.23. It is comforting to see that the inverse of a linear operator which rotates vectors in \(\R^2\) counterclockwise by \(\theta\) radians is the linear operator which rotates vectors in \(\R^2\) counterclockwise by \(-\theta\) radians.
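The claim that the inverse of rotation by \(\theta\) is rotation by \(-\theta\) is easy to spot-check numerically (a sketch, for any sample angle):

```python
import numpy as np

def rotation(theta):
    # Rotation matrix (9.1.4).
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = 0.7  # any angle works
print(np.allclose(np.linalg.inv(rotation(theta)), rotation(-theta)))  # True
```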
In fact we can use any basis — not just the canonical basis — to construct the unique matrix representation \(\A\) of a given linear transformation \(T,\) with an appropriate adjustment. We show the proof for the \(2\times2\) case for illustrative purposes, but the result extends in the obvious way as stated in the subsequent theorem.

Proof.

Let \(T:\R^2\rightarrow\R^2\) be a linear transformation and let \(\A\) represent \(T\) with respect to the canonical basis. Since \(T(\x)=\A\x=\y\) and \(T(\vec{x'})=\A\vec{x'}=\vec{y'}\text{,}\) we have
\begin{equation*} \A\left(\begin{array}{rr} x_1 \amp x'_1 \\ x_2 \amp x'_2 \end{array} \right)=\left(\begin{array}{rr} y_1 \amp y'_1 \\ y_2 \amp y'_2 \end{array} \right) \end{equation*}
Since \(\x\) and \(\vec{x'}\) are linearly independent, the matrix \(\left(\begin{array}{rr} x_1 \amp x'_1 \\ x_2 \amp x'_2 \end{array} \right)\) has linearly independent columns, so by Theorem 5.4.8, \(\left(\begin{array}{rr} x_1 \amp x'_1 \\ x_2 \amp x'_2 \end{array} \right)\) is invertible. Hence
\begin{equation*} \A=\left(\begin{array}{rr} a \amp b \\ c \amp d \end{array} \right)=\left(\begin{array}{rr} y_1 \amp y'_1 \\ y_2 \amp y'_2 \end{array} \right)\left(\begin{array}{rr} x_1 \amp x'_1 \\ x_2 \amp x'_2 \end{array} \right)^{-1} \end{equation*}
as desired.
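As a numerical illustration of the formula \(\A=\left(\y\,|\,\vec{y'}\right)\left(\x\,|\,\vec{x'}\right)^{-1}\text{,}\) here is a sketch with hypothetical basis images (the data below are ours, not from the text):

```python
import numpy as np

# Hypothetical data: suppose T sends the basis vectors x = (1, 1) and
# x' = (1, -1) to y = (2, 0) and y' = (0, 4) respectively.
X = np.column_stack([(1.0, 1.0), (1.0, -1.0)])
Y = np.column_stack([(2.0, 0.0), (0.0, 4.0)])

A = Y @ np.linalg.inv(X)       # A = (y | y')(x | x')^{-1}, as in the proof
print(A)                       # [[ 1.  1.] [ 2. -2.]]
print(np.allclose(A @ X, Y))   # True: A sends each basis vector to its image
```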

Example 9.1.30.

Consider \(T:\R^2\rightarrow\R^2\) given by \(T\left[\left(\begin{array}{r}x\\y \end{array} \right)\right]=\left(\begin{array}{r} x+y\\x-y \end{array} \right)\text{.}\)
  1. Show \(T\) is a linear operator on \(\R^2\text{.}\)
  2. Find a matrix \(\A \in \R^{2 \times 2}\) so that \(T(\x)=\A\x\text{.}\)
  3. Find ran\((T)\) and write it in element form notation.
  4. Find ker\((T)\) and write it in element form notation.
  5. Determine whether \(T\) is injective.
  6. Determine whether \(T\) is surjective.
Solution:
  1. To show \(T\) is a linear transformation we must show that both properties in Definition 9.1.1 hold. Let \(\x_1=\left(\begin{array}{r} x_1 \\y_1 \end{array} \right), \x_2=\left(\begin{array}{r} x_2 \\y_2 \end{array} \right)\text{.}\) First we show Property 1: \(T(\x_1+\x_2)=T(\x_1)+T(\x_2)\text{:}\)
    \begin{align*} T\left[ \left(\begin{array}{c} x_1\\ y_1 \end{array}\right)+\left(\begin{array}{c} x_2\\ y_2 \end{array}\right)\right]\amp =T\left[\left(\begin{array}{c} x_1+x_2\\ y_1+y_2 \end{array}\right)\right]\\ \amp =\left(\begin{array}{c} (x_1+x_2)+(y_1+y_2)\\ (x_1+x_2)-(y_1+y_2) \end{array}\right)\\ \amp =\left(\begin{array}{c} x_1+y_1\\ x_1-y_1 \end{array}\right)+\left(\begin{array}{c} x_2+y_2\\ x_2-y_2 \end{array}\right)\\ \amp =T\left[ \left(\begin{array}{c} x_1\\ y_1 \end{array}\right)\right]+T\left[\left(\begin{array}{c} x_2\\ y_2 \end{array}\right)\right]\text{.} \end{align*}
    To show Property 2: Let \(a\in \R\text{.}\) Then
    \begin{align*} T\left[a\left(\begin{array}{r} x_1\\ y_1 \end{array}\right)\right]\amp =T\left[\left(\begin{array}{r} ax_1\\ ay_1 \end{array}\right)\right]=\left(\begin{array}{r} ax_1+ay_1\\ ax_1-ay_1 \end{array}\right)\\ \amp =\left(\begin{array}{r} a(x_1+y_1)\\ a(x_1-y_1) \end{array}\right)=a\left(\begin{array}{r} x_1+y_1\\ x_1-y_1 \end{array}\right)=aT\left[\left(\begin{array}{r} x_1\\ y_1 \end{array}\right)\right]. \end{align*}
  2. To find the matrix \(\A\) that represents \(T\) we can follow the proof of Theorem 9.1.10. The columns of \(\A\) are just \(T(\wh{e}_1)\) and \(T(\wh{e}_2)\) where \(\wh{e}_1,\wh{e}_2\) is the canonical basis for \(\R^2\text{:}\)
    \begin{align*} T\left[\left(\begin{array}{r}1\\ 0\end{array}\right)\right]\amp =\left(\begin{array}{r}1+0\\ 1-0\end{array}\right)\,=\,\left(\begin{array}{r}1\\ 1\end{array}\right)\\ T\left[\left(\begin{array}{r}0\\ 1\end{array}\right)\right]\amp =\left(\begin{array}{r}0+1\\ 0-1\end{array}\right)\,=\,\left(\begin{array}{r}1\\ -1\end{array}\right) \end{align*}
    Thus the corresponding matrix is
    \begin{equation*} \A=\left(\begin{array}{rr} 1 \amp 1 \\ 1 \amp -1 \end{array} \right). \end{equation*}
    We can check that this is the correct \(\A\text{:}\)
    \begin{equation*} \A\x=\left(\begin{array}{rr} 1 \amp 1 \\ 1 \amp -1 \end{array} \right)\left(\begin{array}{r}x\\y \end{array} \right)=\left(\begin{array}{r}x+y\\x-y \end{array} \right)=T(\x)\text{.} \end{equation*}
  3. By Theorem 9.1.15 we know that to find the range of \(T\) we can find the column space of \(\A\text{.}\) Since \(\det(\A)=(1)(-1)-(1)(1)=-2\ne0\) we know that the columns are linearly independent so
    \begin{equation*} \text{ ran } (T)=\text{ col } (\A)= \left\{\left. a\left(\begin{array}{r}1\\1 \end{array} \right)+b\left(\begin{array}{r}1\\-1 \end{array} \right)\,\right|\,\,a,b \in \R\right\}\,=\,\R^2. \end{equation*}
  4. By Theorem 9.1.17 we know that the kernel of \(T\) is the nullspace of \(\A\text{.}\) However since \(\det(\A)\ne0\) we know \(\A\) is nonsingular and so
    \begin{equation*} \text{ ker } (T)=N(\A)=\left\{ \vec{0}\right\}. \end{equation*}
  5. Since ker\((T)=\{\vec{0}\}\text{,}\) the operator \(T\) is injective by Theorem 9.1.19.
  6. Since the operator \(T\) is injective, it is surjective by Theorem 9.1.26.
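The conclusions above can be double-checked numerically (a sketch):

```python
import numpy as np

A = np.array([[1.0,  1.0],
              [1.0, -1.0]])

print(np.linalg.det(A))           # about -2.0: nonzero, so A is invertible
print(np.linalg.matrix_rank(A))   # 2: col(A) = R^2 and N(A) = {0}
# Hence T is both injective and surjective, matching parts 3-6 above.
```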

Example 9.1.31.

Consider \(T:\R^2\rightarrow\R^2\) given by \(T\left[\left(\begin{array}{r}x\\y \end{array} \right)\right]=\left(\begin{array}{c} 2x+5y\\0\end{array}\right)\text{.}\)
  1. Show \(T\) is a linear operator on \(\R^2\text{.}\)
  2. Find a matrix \(\A \in \R^{2 \times 2}\) so that \(T(\x)=\A\x\text{.}\)
  3. Find ran\((T)\) and write it in element form notation.
  4. Find ker\((T)\) and write it in element form notation.
  5. Determine whether \(T\) is injective.
  6. Determine whether \(T\) is surjective.
Solution:
  1. To show \(T\) is a linear operator we must show that both properties in Definition 9.1.1 hold. Let \(\x_1=\left(\begin{array}{r} x_1 \\y_1 \end{array} \right), \x_2=\left(\begin{array}{r} x_2 \\y_2 \end{array} \right)\text{.}\) First we show Property 1: \(T(\x_1+\x_2)=T(\x_1)+T(\x_2)\text{:}\)
    \begin{align*} T\left[ \left(\begin{array}{c} x_1\\ y_1 \end{array}\right)+\left(\begin{array}{c} x_2\\ y_2 \end{array}\right)\right]\amp =T\left[\left(\begin{array}{c} x_1+x_2\\ y_1+y_2 \end{array}\right)\right]\\ \amp =\left(\begin{array}{c} 2(x_1+x_2)+5(y_1+y_2)\\ 0\end{array}\right)\\ \amp =\left(\begin{array}{c} 2x_1+5y_1\\ 0\end{array}\right)+\left(\begin{array}{c} 2x_2+5y_2\\ 0\end{array}\right)\\ \amp =T\left[ \left(\begin{array}{c} x_1\\ y_1 \end{array}\right)\right]+T\left[\left(\begin{array}{c} x_2\\ y_2 \end{array}\right)\right]\text{.} \end{align*}
    To show Property 2: Let \(a\in\R\text{.}\) Then
    \begin{align*} T\left[a\left(\begin{array}{r} x_1\\ y_1 \end{array}\right)\right]\amp =T\left[\left(\begin{array}{r} ax_1\\ ay_1 \end{array}\right)\right]=\left(\begin{array}{c}2ax_1+5ay_1\\ 0\end{array}\right)\\ \amp =\left(\begin{array}{c} a(2x_1+5y_1)\\ a\cdot 0 \end{array}\right)=a\left(\begin{array}{c}2x_1+5y_1\\ 0\end{array}\right)=aT\left[\left(\begin{array}{r} x_1\\ y_1 \end{array}\right)\right]. \end{align*}
  2. The matrix \(\A\) that represents \(T\) is
    \begin{equation*} \A=\left(\begin{array}{rr} 2 \amp 5 \\ 0 \amp 0 \end{array} \right). \end{equation*}
  3. By Theorem 9.1.15 we know that to find the range of \(T\) we can find the column space of \(\A\text{.}\) Since \(\A\) is already in row-echelon form we note that the first column of \(\A\) is the only pivot column so it may serve as the lone basis element for col\((\A)\text{.}\) Thus
    \begin{equation*} \text{ ran } (T)=\text{ col } (\A)= \left\{\left. a\left(\begin{array}{r}1\\0 \end{array} \right)\,\right| a\in\R\right\}\,\ne\,\R^2. \end{equation*}
  4. By Theorem 9.1.17 we know that the kernel of \(T\) is the nullspace of \(\A\text{,}\) which in this case is nontrivial. We have
    \begin{equation*} \text{ ker } (T)=N(\A)=\left\{\left.\left(\begin{array}{r}x\\y\end{array}\right)\,\right|\,2x+5y=0\right\}. \end{equation*}
  5. Since ker\((T)\ne\{\vec{0}\}\text{,}\) the operator \(T\) is not injective by Theorem 9.1.19.
  6. The operator \(T\) is not surjective because, for example, \(\left(\begin{array}{r}0\\1\end{array}\right)\not\in\) ran\((T)\text{.}\)
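Again, a quick numerical cross-check (a sketch):

```python
import numpy as np

A = np.array([[2.0, 5.0],
              [0.0, 0.0]])

print(np.linalg.matrix_rank(A))    # 1: the range is a line, not all of R^2
print(A @ np.array([5.0, -2.0]))   # [0. 0.]: (5, -2) solves 2x + 5y = 0
```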

Example 9.1.32.

Consider \(T:\R^2\rightarrow\R^3\) given by \(T\left[\left(\begin{array}{r}x\\y\end{array} \right)\right]=\left(\begin{array}{c} x\\-x+3y\\y\end{array}\right)\text{.}\)
  1. Show \(T\) is a linear transformation.
  2. Find a matrix \(\A \in \R^{3 \times 2}\) so that \(T(\x)=\A\x\text{.}\)
  3. Find ran\((T)\) and write it in element form notation.
  4. Find ker\((T)\) and write it in element form notation.
  5. Determine whether \(T\) is injective.
  6. Determine whether \(T\) is surjective.
Solution:
  1. To show \(T\) is a linear transformation we must show that both properties in Definition 9.1.1 hold. Let \(\x_1=\left(\begin{array}{r} x_1 \\y_1 \end{array} \right), \x_2=\left(\begin{array}{r} x_2 \\y_2 \end{array} \right)\text{.}\) First we show Property 1: \(T(\x_1+\x_2)=T(\x_1)+T(\x_2)\text{:}\)
    \begin{align*} T\left[ \left(\begin{array}{c} x_1\\ y_1 \end{array}\right)+\left(\begin{array}{c} x_2\\ y_2 \end{array}\right)\right]\amp =T\left[\left(\begin{array}{c} x_1+x_2\\ y_1+y_2 \end{array}\right)\right]\\ \amp =\left(\begin{array}{c} x_1+x_2\\ -(x_1+x_2)+3(y_1+y_2)\\ y_1+y_2\end{array}\right)\\ \amp =\left(\begin{array}{c} x_1\\ -x_1+3y_1\\ y_1\end{array}\right)+\left(\begin{array}{c}x_2\\ -x_2+3y_2\\ y_2\end{array}\right)\\ \amp =T\left[ \left(\begin{array}{c} x_1\\ y_1 \end{array}\right)\right]+T\left[\left(\begin{array}{c} x_2\\ y_2 \end{array}\right)\right]\text{.} \end{align*}
    To show Property 2: Let \(a\in\R\text{.}\) Then
    \begin{align*} T\left[a\left(\begin{array}{r} x_1\\ y_1 \end{array}\right)\right]\amp =T\left[\left(\begin{array}{r} ax_1\\ ay_1 \end{array}\right)\right]=\left(\begin{array}{c} ax_1\\ -ax_1+3ay_1\\ ay_1\end{array}\right)=a\left(\begin{array}{c}x_1\\ -x_1+3y_1\\ y_1\end{array}\right)\\ \amp =aT\left[\left(\begin{array}{r} x_1\\ y_1 \end{array}\right)\right]. \end{align*}
  2. The matrix \(\A\) that represents \(T\) is
    \begin{equation*} \A=\left(\begin{array}{rr} 1 \amp 0 \\ -1 \amp 3 \\ 0 \amp 1 \end{array} \right). \end{equation*}
  3. By Theorem 9.1.15 we know that to find the range of \(T\) we can find the column space of \(\A\text{.}\) Putting \(\A\) in row-echelon form shows (try it!) that both columns of \(\A\) are associated with pivot columns of \(\U\) so both of the columns of \(\A\) may serve as basis elements for col\((\A)\text{.}\) Thus
    \begin{equation*} \text{ ran } (T)=\text{ col } (\A)= \left\{\left. a\left(\begin{array}{r}1\\-1\\0\end{array}\right)+b\left(\begin{array}{r}0\\3\\1\end{array} \right)\,\right|\,a,\,b\in\R\right\}\,\ne\,\R^3. \end{equation*}
  4. By Theorem 9.1.17 we know that the kernel of \(T\) is the nullspace of \(\A\text{.}\) Now because neither column of \(\A\) is a scalar multiple of the other, the columns of \(\A\) are linearly independent. We have
    \begin{equation*} \text{ ker } (T)=N(\A)=\left\{\vec{0}\right\}. \end{equation*}
  5. Since ker\((T)=\{\vec{0}\}\text{,}\) the transformation \(T\) is injective by Theorem 9.1.19.
  6. The transformation \(T\) is not surjective because, for example, \(\left(\begin{array}{r}1\\1\\3\end{array}\right)\not\in\) ran\((T)\text{.}\)
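A numerical cross-check of these conclusions (a sketch; the least-squares call is simply a convenient way to test membership in col\((\A)\)):

```python
import numpy as np

A = np.array([[ 1.0, 0.0],
              [-1.0, 3.0],
              [ 0.0, 1.0]])

print(np.linalg.matrix_rank(A))   # 2: columns independent, so ker(T) = {0}

# Rank 2 < 3, so col(A) is only a plane in R^3 and T cannot be surjective.
# For instance (1, 1, 3) is not in col(A): even the best least-squares
# approximation A @ x_hat misses it.
b = np.array([1.0, 1.0, 3.0])
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(A @ x_hat, b))  # False
```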

Example 9.1.33.

Consider \(T:\R^3\rightarrow\R^2\) given by \(T\left[\left(\begin{array}{r}x\\y\\z\end{array}\right)\right]=\left(\begin{array}{c}x-z\\x+y+2z\end{array}\right)\text{.}\)
  1. Show \(T\) is a linear transformation.
  2. Find a matrix \(\A \in \R^{2 \times 3}\) so that \(T(\x)=\A\x\text{.}\)
  3. Find ran\((T)\) and write it in element form notation.
  4. Find ker\((T)\) and write it in element form notation.
  5. Determine whether \(T\) is injective.
  6. Determine whether \(T\) is surjective.
Solution:
  1. To show \(T\) is a linear transformation we must show that both properties in Definition 9.1.1 hold. Let \(\x_1=\left(\begin{array}{r} x_1 \\y_1\\z_1\end{array}\right), \x_2=\left(\begin{array}{r}x_2\\y_2\\z_2\end{array}\right)\text{.}\) First we show Property 1: \(T(\x_1+\x_2)=T(\x_1)+T(\x_2)\text{:}\)
    \begin{align*} T\left[ \left(\begin{array}{c} x_1\\ y_1\\ z_1\end{array}\right)+\left(\begin{array}{c} x_2\\ y_2\\ z_2\end{array}\right)\right]\amp =T\left[\left(\begin{array}{c} x_1+x_2\\ y_1+y_2\\ z_1+z_2\end{array}\right)\right]\\ \amp =\left(\begin{array}{c}(x_1+x_2)-(z_1+z_2)\\ (x_1+x_2)+(y_1+y_2)+2(z_1+z_2)\end{array}\right)\\ \amp =\left(\begin{array}{c}x_1-z_1\\ x_1+y_1+2z_1\end{array}\right)+\left(\begin{array}{c}x_2-z_2\\ x_2+y_2+2z_2\end{array}\right)\\ \amp =T\left[ \left(\begin{array}{c} x_1\\ y_1\\ z_1\end{array}\right)\right]+T\left[\left(\begin{array}{c} x_2\\ y_2\\ z_2\end{array}\right)\right]\text{.} \end{align*}
    To show Property 2: Let \(a\in\R\text{.}\) Then
    \begin{align*} T\left[a\left(\begin{array}{r} x_1\\ y_1\\ z_1\end{array}\right)\right]\amp =T\left[\left(\begin{array}{r} ax_1\\ ay_1\\ az_1\end{array}\right)\right]=\left(\begin{array}{c}ax_1-az_1\\ ax_1+ay_1+2az_1\end{array}\right)\\ \amp =\left(\begin{array}{c} a(x_1-z_1)\\ a(x_1+y_1+2z_1) \end{array}\right)=a\left(\begin{array}{c}x_1-z_1\\ x_1+y_1+2z_1\end{array}\right)\\ \amp=aT\left[\left(\begin{array}{r} x_1\\ y_1\\ z_1\end{array}\right)\right]. \end{align*}
  2. The matrix \(\A\) that represents \(T\) is
    \begin{equation*} \A=\left(\begin{array}{rrr} 1 \amp 0 \amp -1 \\ 1 \amp 1 \amp 2 \end{array} \right). \end{equation*}
  3. By Theorem 9.1.15 we know that to find the range of \(T\) we can find the column space of \(\A\text{.}\) Putting \(\A\) in row-echelon form we note that the first and second columns of \(\A\) may serve as basis elements for col\((\A)\text{.}\) Thus
    \begin{equation*} \text{ ran } (T)=\text{ col } (\A)= \left\{\left. a\left(\begin{array}{r}1\\1 \end{array} \right)+b\left(\begin{array}{r}0\\1 \end{array} \right)\,\right|\,a,\,b\in\R\right\}\,=\,\R^2. \end{equation*}
  4. By Theorem 9.1.17 we know that the kernel of \(T\) is the nullspace of \(\A\text{,}\) which in this case is nontrivial since \(z\) is a free variable. We have \(x=z\) and \(y=-3z\) so a basis element for \(N(\A)\) is \(\left(\begin{array}{r}1\\-3\\1\end{array}\right)\text{.}\) Now
    \begin{equation*} \text{ ker } (T)=N(\A)=\left\{\left.a\left(\begin{array}{r}1\\-3\\1\end{array}\right)\,\right|\,a\in\R\right\}. \end{equation*}
  5. Since ker\((T)\ne\{\vec{0}\}\text{,}\) the transformation \(T\) is not injective by Theorem 9.1.19.
  6. The transformation \(T\) is surjective because for any \(\b=\left(\begin{array}{r}b_1\\b_2\end{array}\right)\text{,}\) the vector \(\left(\begin{array}{c}b_1\\b_2-b_1\\0\end{array}\right)\) is mapped to \(\b\) by \(T\text{.}\)
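Finally, a numerical cross-check of the kernel and the explicit preimage (a sketch):

```python
import numpy as np

A = np.array([[1.0, 0.0, -1.0],
              [1.0, 1.0,  2.0]])

# The kernel direction found above:
print(A @ np.array([1.0, -3.0, 1.0]))   # [0. 0.]

# Surjectivity via the explicit preimage (b1, b2 - b1, 0):
b = np.array([3.0, 7.0])
x = np.array([b[0], b[1] - b[0], 0.0])
print(A @ x)                            # [3. 7.]
```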