Basis
Intuitively, a basis is any set of vectors that can be used as a coordinate system for a vector space. You are certainly familiar with the standard basis for the xy-plane that is made up of two orthogonal axes: the x-axis and the y-axis. A vector v can be described as a coordinate pair ((v_x),(v_y)) with respect to these axes, or equivalently as v=(v_x)*i+(v_y)*j, where i≡(1,0) and j≡(0,1) are unit vectors that point along the x-axis and y-axis respectively. However, other coordinate systems are also possible.
A basis for an n-dimensional vector space S is any set of n linearly independent vectors that are part of S.
Any set of two linearly independent vectors {(e_1),(e_2)} can serve as a basis for ℝ2. We can write any vector v∈ℝ2 as a linear combination of these basis vectors v=(v_1)*(e_1)+(v_2)*(e_2).
Note the same vector v corresponds to different coordinate pairs depending on the basis used: v=((v_x),(v_y)) in the standard basis (B_s)≡{i,j}, and v=((v_1),(v_2)) in the basis (B_e)≡{(e_1),(e_2)}. Therefore, it is important to keep in mind the basis with respect to which the coefficients are taken, and if necessary specify the basis as a subscript, e.g., ((v_x),(v_y))(B_s) or ((v_1),(v_2))(B_e).
Converting a coordinate vector from the basis (B_e) to the basis (B_s) is performed as a multiplication by a change of basis matrix:
[v](B_s)=(B_s)[1](B_e)*[v](B_e)⇔[[(v_x)],[(v_y)]]=[[i⋅(e_1),i⋅(e_2)],[j⋅(e_1),j⋅(e_2)]]*[[(v_1)],[(v_2)]]
Note the change of basis matrix is actually an identity transformation. The vector v remains unchanged; it is simply expressed with respect to a new coordinate system. The change of basis from the (B_s)-basis to the (B_e)-basis is accomplished using the inverse matrix: (B_e)[1](B_s)=((B_s)[1](B_e))(-1).
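The change of basis formula can be sketched in a few lines of code. This is a minimal illustration, using a hypothetical basis (B_e)={(1,1),(-1,1)} chosen for the example; the entries of the change of basis matrix are the dot products from the formula above.

```python
# Sketch of a change of basis from B_e coordinates to standard B_s coordinates.
# The basis vectors e1, e2 here are illustrative, not from the text.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

i, j = (1, 0), (0, 1)          # standard basis B_s
e1, e2 = (1, 1), (-1, 1)       # some other pair of linearly independent vectors B_e

# Change of basis matrix: entry (row, col) is (standard basis vector) . (e_col)
M = [[dot(i, e1), dot(i, e2)],
     [dot(j, e1), dot(j, e2)]]

v_Be = (2, 1)                                  # coordinates of v with respect to B_e
v_Bs = tuple(dot(row, v_Be) for row in M)      # the same vector in B_s coordinates

print(v_Bs)  # (1, 3), i.e. v = 2*e1 + 1*e2 = (2,2) + (-1,1) = (1,3)
```

The same vector v is described by the pair (2,1) in the basis (B_e) and by the pair (1,3) in the standard basis.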
Matrix representations of linear transformations
Bases play an important role in the representation of linear transformations T:ℝn→ℝm. To fully describe the matrix that corresponds to some linear transformation T, it is sufficient to know the effects of T on the n vectors of the standard basis for the input space. For a linear transformation T:ℝ2→ℝ2, the matrix representation corresponds to
(M_T)=[[|,|],[T(i),T(j)],[|,|]]∈ℝ(2×2)
As a first example, consider the transformation (Π_x), which projects vectors onto the x-axis. For any vector v=((v_x),(v_y)), we have (Π_x)(v)=((v_x),0). The matrix representation of (Π_x) is
(M_(Π_x))=[[(Π_x)(([1],[0])),(Π_x)(([0],[1]))]]=[[1,0],[0,0]]
As a second example, let's find the matrix representation of (R_θ), the counterclockwise rotation by the angle θ:
(M_(R_θ))=[[(R_θ)(([1],[0])),(R_θ)(([0],[1]))]]=[[cos(θ),-sin(θ)],[sin(θ),cos(θ)]]
The first column of (M_(R_θ)) shows that (R_θ) maps the vector i≡1∠0 to the vector 1∠θ=(cos(θ),sin(θ))⊺. The second column shows that (R_θ) maps the vector j=1∠π/2 to the vector 1∠(π/2+θ)=(-sin(θ),cos(θ))⊺.
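The recipe "the columns of (M_T) are the images of the standard basis vectors" can be checked directly in code. This sketch builds the rotation matrix (M_(R_θ)) from the images of i and j, using θ=π/2 as an illustrative angle:

```python
# Build the matrix of a linear transformation from its action on i and j.
from math import cos, sin, pi, isclose

def R_theta(v, theta):
    # counterclockwise rotation of a 2-vector by the angle theta
    x, y = v
    return (x * cos(theta) - y * sin(theta),
            x * sin(theta) + y * cos(theta))

theta = pi / 2
col1 = R_theta((1, 0), theta)   # image of i -> first column of M
col2 = R_theta((0, 1), theta)   # image of j -> second column of M
M = [[col1[0], col2[0]],
     [col1[1], col2[1]]]

# Rotating i by 90 degrees gives j (up to floating-point error):
assert isclose(M[0][0], 0.0, abs_tol=1e-12) and isclose(M[1][0], 1.0)
```

The same construction works for any linear transformation: apply T to each standard basis vector and record the results as the columns of the matrix.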
Dimension and bases for vector spaces
The dimension of a vector space is defined as the number of vectors in a basis for that vector space. Consider the following vector space
S= span{(1,0,0),(0,1,0),(1,1,0)}.
Seeing that the space is described by three vectors, we might think that S is 3-dimensional. This is not the case, however, since the three vectors are not linearly independent (indeed, (1,1,0)=(1,0,0)+(0,1,0)), so they don't form a basis for S. Two vectors are sufficient to describe any vector in S; we can write
S= span{(1,0,0),(0,1,0)}, and we see these two vectors are linearly independent, so they form a basis and dim (S)=2.
There is a general procedure for finding a basis for a vector space. Suppose you are given a description of a vector space in terms of m vectors
V= span{(v_1),(v_2),⋯,(v_m)} and you are asked to find a basis for V and the dimension of V. To find a basis for V, you must find a set of linearly independent vectors that span V. We can use the Gauss-Jordan elimination procedure to accomplish this task. Write the vectors (v_i) as the rows of a matrix M. The vector space V corresponds to the row space of the matrix M. Next, use row operations to find the reduced row echelon form (RREF) of the matrix M. Since row operations do not change the row space of the matrix, the row space of the reduced row echelon form of M is the same as the row space of the original set of vectors. The nonzero rows in the RREF of the matrix form a basis for the vector space V, and the number of nonzero rows is the dimension of V.
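The procedure above can be sketched as code. This is a minimal Gauss-Jordan implementation (exact arithmetic via fractions, no pivoting strategy), applied to the spanning set for S from the previous example:

```python
# Find a basis for a row space: reduce the matrix of row vectors to RREF
# and keep the nonzero rows. A sketch, not a production routine.
from fractions import Fraction

def rref(rows):
    M = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(M[0])):
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        M[pivot_row] = [x / M[pivot_row][col] for x in M[pivot_row]]  # scale pivot to 1
        for r in range(len(M)):
            if r != pivot_row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return M

vectors = [(1, 0, 0), (0, 1, 0), (1, 1, 0)]   # the spanning set for S above
reduced = rref(vectors)
basis = [row for row in reduced if any(x != 0 for x in row)]
print(len(basis))  # 2, so dim(S) = 2
```

The third row reduces to all zeros, confirming that only two of the three spanning vectors are linearly independent.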
Row space, columns space, and rank of a matrix
Recall the fundamental vector spaces for matrices that we defined in Section II-E: the column space C(A), the null space N(A), and the row space R(A). A standard linear algebra exam question is to give you a certain matrix A and ask you to find the dimension and a basis for each of its fundamental spaces.
In the previous section we described a procedure based on Gauss-Jordan elimination which can be used to "distill" a set of linearly independent vectors which form a basis for the row space R(A). We will now illustrate this procedure with an example, and also show how to use the RREF of the matrix A to find bases for C(A) and N(A). Consider the following matrix and its reduced row echelon form:
A=[[1,3,3,3],[2,6,7,6],[3,9,9,10]] rref(A)=[[1,3,0,0],[0,0,1,0],[0,0,0,1]]
The reduced row echelon form of the matrix A contains three pivots. The locations of the pivots will play an important role in the following steps.
The vectors {(1,3,0,0),(0,0,1,0),(0,0,0,1)} form a basis for R(A).
To find a basis for the column space C(A) of the matrix A we need to find which of the columns of A are linearly independent. We can do this by identifying the columns which contain the leading ones in rref(A). The corresponding columns in the original matrix form a basis for the column space of A. Looking at rref(A) we see the first, third, and fourth columns of the matrix are linearly independent so the vectors {(1,2,3)⊤,(3,7,9)⊤,(3,6,10)⊤} form a basis for C(A).
Now let's find a basis for the null space, N(A)≡{x∈ℝ4|A*x=O}. The second column does not contain a pivot, therefore it corresponds to a free variable, which we will denote s. We are looking for a vector with three unknowns and one free variable ((x_1),s,(x_3),(x_4))⊤ that obeys the conditions:
[[1,3,0,0],[0,0,1,0],[0,0,0,1]]⋅[[(x_1)],[s],[(x_3)],[(x_4)]]=[[0],[0],[0]] ⇒ [[1*(x_1)+3*s],[1*(x_3)],[1*(x_4)]]=[[0],[0],[0]]
Let's express the unknowns (x_1), (x_3), and (x_4) in terms of the free variable s. We immediately see that (x_3)=0 and (x_4)=0, and we can write (x_1)=-3*s. Therefore, any vector of the form (-3*s,s,0,0), for any s∈ℝ, is in the null space of A. We write N(A)= span{(-3,1,0,0)⊤}.
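We can verify this null space computation numerically. The sketch below checks that every multiple of (-3,1,0,0) is sent to the zero vector by the matrix A from the example:

```python
# Check that vectors of the form s*(-3, 1, 0, 0) lie in N(A)
# for the matrix A from the example above.

A = [[1, 3, 3, 3],
     [2, 6, 7, 6],
     [3, 9, 9, 10]]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

for s in (1, 2, -5):
    x = [-3 * s, s, 0, 0]
    assert matvec(A, x) == [0, 0, 0]   # A*x = 0 for every choice of s
```

Each row of A has its second column equal to three times its first, which is exactly why the combination -3*(column 1) + (column 2) vanishes.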
Observe that dim (C(A))= dim (R(A))=3; this is known as the rank of the matrix A. Also, dim (R(A))+ dim (N(A))=3+1=4, which is the dimension of the input space of the linear transformation (T_A).
Invertible matrix theorem
There is an important distinction between matrices that are invertible and those that are not, as formalized by the following theorem.
For an n⨯n matrix A, the following statements are equivalent:
A is invertible
The RREF of A is the n⨯n identity matrix
The rank of the matrix A is n
The row space of A is ℝn
The column space of A is ℝn
The null space of A contains only the zero vector: N(A)={O}
The determinant of A is nonzero: det(A)≠0
For a given matrix A, the above statements are either all true or all false. An invertible matrix A corresponds to a linear transformation (T_A) which maps the n-dimensional input vector space ℝn to the n-dimensional output vector space ℝn such that there exists an inverse transformation (T_A)(-1) that can faithfully undo the effects of (T_A).
On the other hand, an n×n matrix B that is not invertible maps the input vector space ℝn to a subspace C(B)⊊ℝn and has a nontrivial null space. Once (T_B) sends a vector w∈N(B) to the zero vector, there is no (T_B)(-1) that can undo this operation.
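This dichotomy can be illustrated with two small example matrices (chosen here for illustration, not taken from the text): one with nonzero determinant, and one singular matrix that collapses a nonzero vector to zero.

```python
# det != 0 means invertible; det == 0 means some nonzero vector maps to zero.
# A and B below are hypothetical 2x2 examples.

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[2, 1], [1, 1]]    # det = 1  -> invertible
B = [[1, 2], [2, 4]]    # det = 0  -> not invertible

assert det2(A) != 0
assert det2(B) == 0

# The vector w = (2, -1) lies in N(B): B maps it to the zero vector,
# so no inverse transformation could recover w from the output.
w = (2, -1)
Bw = [sum(m * x for m, x in zip(row, w)) for row in B]
assert Bw == [0, 0]
```

Since B sends both w and the zero vector to (0,0), the transformation (T_B) is not one-to-one and cannot be inverted.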
Eigenvalues and eigenvectors
The set of eigenvectors of a matrix is a special set of input vectors for which the action of the matrix is described as a simple scaling. When a matrix is multiplied by one of its eigenvectors the output is the same eigenvector multiplied by a constant A*(e_λ)=λ*(e_λ). The constant λ is called an eigenvalue of A.
To find the eigenvalues of a matrix we start from the eigenvalue equation A*(e_λ)=λ*(e_λ), insert the identity 1, and rewrite it as a null-space problem:
A*(e_λ)=λ*(e_λ)⇒(A-λ*1)*(e_λ)=O
This equation will have a solution whenever |A-λ*1|=0. The eigenvalues of A∈ℝ(n×n), denoted {(λ_1),(λ_2),⋯,(λ_n)}, are the roots of the characteristic polynomial p(λ)=|A-λ*1|. The eigenvectors associated with the eigenvalue (λ_i) are the vectors in the null space of the matrix (A-(λ_i)*1).
Certain matrices can be written entirely in terms of their eigenvectors and their eigenvalues. Consider the matrix Λ that has the eigenvalues of the matrix A on the diagonal, and the matrix Q constructed from the eigenvectors of A as columns:
Λ=[[(λ_1),⋯,0],[⋮,⋱,⋮],[0,⋯,(λ_n)]], Q=[[|,,|],[(e_(λ_1)),⋯,(e_(λ_n))],[|,,|]], then A=Q*Λ*Q(-1)
Matrices that can be written this way are called diagonalizable.
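The eigenvalue equation and the decomposition A=Q*Λ*Q(-1) can be verified concretely. The sketch below uses a hypothetical 2×2 symmetric matrix whose eigenvalues and eigenvectors were worked out by hand from the characteristic polynomial:

```python
# For A = [[2, 1], [1, 2]], the characteristic polynomial (2-l)^2 - 1 = 0
# gives eigenvalues 3 and 1, with eigenvectors (1, 1) and (1, -1).

A = [[2, 1], [1, 2]]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Check the eigenvalue equation A*e = lambda*e for each pair:
for lam, e in [(3, (1, 1)), (1, (1, -1))]:
    assert matvec(A, e) == [lam * x for x in e]

# Reassemble A = Q * Lambda * Q^{-1}, with Q's columns the eigenvectors:
Q    = [[1, 1], [1, -1]]
Qinv = [[0.5, 0.5], [0.5, -0.5]]      # inverse of Q, computed by hand
Lam  = [[3, 0], [0, 1]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

assert matmul(matmul(Q, Lam), Qinv) == A
```

Multiplying Q*Λ*Q(-1) recovers the original matrix exactly, confirming that this A is diagonalizable.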