Math 110 - Fall 05 - Lecture notes # 11 - Sep 23 (Friday)

The next goal is to make explicit the connection between
matrices, familiar from Math 54, and linear transformations
T: V -> W between finite dimensional vector spaces.
They are not quite the same, because the matrix that represents
T depends on the bases you choose for V and W, and on
the order of the vectors in those bases:

Def: Let V be a finite dimensional vector space. An
ordered basis of V is a basis for V with an order:
{v_1,...,v_n}, where n=dim(V).

Ex: Let e_i = i-th standard basis vector (1 in i-th entry, 0 elsewhere)
Then the bases {e_1,e_2,e_3} and {e_2,e_3,e_1} are the same 
(order in a set does not matter), but the ordered bases
{e_1,e_2,e_3} and {e_2,e_3,e_1} are different.

Def: For V = F^n, {e_1,...,e_n} is the standard ordered basis.
For P_n(F), {1,x,x^2,...,x^n} is the standard ordered basis.

Given ordered bases for V and W, we can express vectors in V and
W as coordinate vectors, and linear transformations T:V->W as
matrices, with respect to these ordered bases:

Def: Let beta = {v_1,...,v_n} be an ordered basis for V. 
For any x in V, let x = sum_{i=1 to n} a_i*v_i be the unique
linear combination representing x. The coordinate vector
of x relative to beta, denoted [x]_beta, is 
   [x]_beta = [ a_1 ]
              [ a_2 ]
              [ ... ]
              [ a_n ]

ASK & WAIT: What is [v_i]_beta ?
ASK & WAIT: Let V = P_5(F), beta = {1,x,x^2,x^3,x^4,x^5}, 
 and v = 3-6x+x^3.  What is [v]_beta?
ASK & WAIT: If beta = {x^5, x^4, x^3, x^2, x, 1}?
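
Numerical aside: when the basis vectors of F^n are given numerically,
[x]_beta can be computed by solving a linear system, since
x = sum_i a_i*v_i says B*[x]_beta = x, where B has the v_i as columns.
A minimal sketch in Python with numpy (the basis below is made up
just for illustration):

   import numpy as np

   # a made-up ordered basis of R^3, stored as the columns of B
   B = np.array([[1., 0., 1.],
                 [1., 1., 0.],
                 [0., 1., 1.]])
   x = np.array([2., 3., 1.])

   a = np.linalg.solve(B, x)   # [x]_beta, since B @ a = x
   print(a)
   print(B @ a)                # reconstructs x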

Lemma: The mapping Beta: V -> F^n that maps x to [x]_beta is linear.
Proof: if x = sum_i a_i*v_i and y = sum_i b_i*v_i, then
         [x]_beta = [a_1;...;a_n], [y]_beta=[b_1;...;b_n] and
       [x+y]_beta = [sum_i (a_i+b_i)*v_i]_beta   ... by def of x+y
                  = [a_1+b_1 ; ... ; a_n+b_n ]   ... by def of []_beta
                  = [a_1 ; ... ; a_n] + [b_1; ... ; b_n]
                  = [x]_beta + [y]_beta          ... by def of []_beta
        Similarly, [c*x]_beta = c*[x]_beta
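
Numerical aside: the Lemma is easy to check on examples. A minimal
sketch in Python with numpy, using a made-up basis of R^2 (columns of B),
so that the coordinate map is x -> solve(B, x):

   import numpy as np

   B = np.array([[1., 1.],
                 [0., 1.]])                 # made-up basis of R^2, as columns
   coords = lambda x: np.linalg.solve(B, x) # x -> [x]_beta

   x = np.array([2., 3.])
   y = np.array([-1., 4.])
   print(np.allclose(coords(x + y), coords(x) + coords(y)))   # True
   print(np.allclose(coords(5*x), 5*coords(x)))               # True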
 
We need this representation of vectors in V as 
coordinate vectors (columns of scalars) in order to apply T: V -> W
as multiplication by a matrix. We will also need to
represent vectors in W the same way.

Let beta = {v_1,...,v_n} and gamma = {w_1,...,w_m}
be ordered bases of V and W, resp. Let T: V -> W be linear.
Then there are unique scalars a_{ij} such that
    T(v_j) = sum_{i=1 to m} a_{ij}*w_i
These scalars will be the entries of the matrix representing T:

Def: Let T: V -> W be linear, V and W finite dimensional.
Using the above notation, the m x n matrix A with
entries a_{ij} is called the matrix representation of T in the ordered
bases beta and gamma. We write A = [T]_beta^gamma.
If V = W and beta = gamma, we write simply A = [T]_beta

Note that column j of A is [a_{1j};...;a_{mj}] = [T(v_j)]_gamma
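
Numerical aside: this gives a direct recipe for building A column by
column. A minimal sketch in Python with numpy, for a made-up
T: R^3 -> R^2 and the standard bases:

   import numpy as np

   # made-up example: T(x1,x2,x3) = (x1 + x2, 3*x3), standard bases
   def T(x):
       return np.array([x[0] + x[1], 3.0*x[2]])

   n, m = 3, 2
   A = np.zeros((m, n))
   for j in range(n):
       e_j = np.zeros(n); e_j[j] = 1.0
       A[:, j] = T(e_j)        # column j of A is [T(e_j)]_gamma
   print(A)                    # [[1. 1. 0.]
                               #  [0. 0. 3.]]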

To see why we call A the matrix representation of T, let us use
it to compute y = T(x).
Suppose x = sum_{j=1 to n} x_j*v_j, so [x]_beta = [x_1;...;x_n]
is the coordinate vector for x. We claim the coordinate
vector for y is obtained simply by multiplying by A:
  [y]_gamma = A * [x]_beta
To confirm this we compute:
     y = T(x) = T(sum_{j=1 to n} x_j*v_j)  ... by def of x
              = sum_{j=1 to n} x_j*T(v_j)  ... since T is linear
              = sum_{j=1 to n} x_j*(sum_{i=1 to m} a_{ij}*w_i)
                                           ... by def of T(v_j)
              = sum_{j=1 to n} sum_{i=1 to m} a_{ij}*x_j*w_i
                                           ... move x_j into sum
              = sum_{i=1 to m} sum_{j=1 to n} a_{ij}*x_j*w_i
                                           ... reverse order of sums
              = sum_{i=1 to m} w_i * (sum_{j=1 to n} a_{ij}*x_j)
                                           ... pull w_i out of inner sum
so 
[y]_gamma = [ sum_{j=1 to n} a_{1j}*x_j ] = A * [ x_1 ] = A*[x]_beta
            [ sum_{j=1 to n} a_{2j}*x_j ]       [ x_2 ]   as desired
            [            ...            ]       [ ... ]
            [ sum_{j=1 to n} a_{mj}*x_j ]       [ x_n ]

Ex: T:R^2 -> R^4, T((x,y)) = (x-y, 3*x+2*y, -2*x, 7*y)
    beta = standard basis for R^2, gamma = standard basis for R^4,
    so  T((1,0)) = (1;3;-2;0) and T((0,1)) = (-1;2;0;7), so
    A = [ 1 -1 ] (for brevity in these notes, we will sometimes use
        [ 3  2 ]  "Matlab notation": A = [ 1 -1 ; 3 2 ; -2 0 ; 0 7 ] )
        [-2  0 ]
        [ 0  7 ]

ASK & WAIT: What if beta = {e2,e1} and gamma = {e3, e4, e1, e2}?

Ex (continued): Suppose x = 3*e1 - e2; what is T(x)? 
   what is [T(x)]_gamma, using standard bases?
   T(x) = T(3,-1) = (4,7,-6,-7)
   [T(x)]_gamma = A * [3;-1] = [4;7;-6;-7]
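
Numerical aside: the same check, done in Python with numpy using the
matrix A computed above:

   import numpy as np

   A = np.array([[ 1., -1.],
                 [ 3.,  2.],
                 [-2.,  0.],
                 [ 0.,  7.]])
   x_beta = np.array([3., -1.])   # [x]_beta for x = 3*e1 - e2
   print(A @ x_beta)              # [ 4.  7. -6. -7.] = [T(x)]_gamma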

Ex: T: P_3(R) -> P_2(R), T(f(x)) = f'(x),
beta = {1, 1+x, x^2, x^3 }, gamma = {2 , x , x^2}
Then T(1) = 0, T(1+x) = 1 = (1/2)*2; T(x^2) = 2*x ; T(x^3) = 3*x^2
So [T]_beta^gamma = [ 0 1/2 0 0 ]
                    [ 0  0  2 0 ]
                    [ 0  0  0 3 ]
ASK & WAIT: What is [T]_beta^gamma if beta = { 1, x, x^2, x^3 }? If gamma={1,x,x^2}?
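
Numerical aside: a quick check of this example in Python with numpy.
Take f(x) = 3 - 6x + x^3; in beta = {1, 1+x, x^2, x^3} its coordinate
vector is [9; -6; 0; 1] (since f = 9*1 - 6*(1+x) + 0*x^2 + 1*x^3), and
multiplying by the matrix above should give the coordinates of
f'(x) = -6 + 3x^2 in gamma = {2, x, x^2}:

   import numpy as np

   # matrix of d/dx: P_3(R) -> P_2(R) in beta = {1,1+x,x^2,x^3}, gamma = {2,x,x^2}
   A = np.array([[0., 0.5, 0., 0.],
                 [0., 0.,  2., 0.],
                 [0., 0.,  0., 3.]])
   f_beta = np.array([9., -6., 0., 1.])
   print(A @ f_beta)   # [-3.  0.  3.], i.e. f'(x) = (-3)*2 + 0*x + 3*x^2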

Having identified matrices with linear transformations between
two finite dimensional spaces with ordered bases, and recalling
that m x n matrices form a vector space, we will not be surprised
that the set of all linear transformations between any two vector
spaces is also a vector space:

Def: Let T and U be linear transformations from V -> W.
Then we define the new function T+U: V -> W by (T+U)(v) = T(v)+U(v)
and the new function c*T: V -> W by (c*T)(v) = c*T(v)

Thm: Using this notation, we have that
  (1) For all scalars c, c*T+U is a linear transformation
  (2) The set of all linear transformations from V -> W,
      is itself a vector space, using
      the above definitions of addition and multiplication by scalars
Proof:
  (1) (c*T+U)(sum_i a_i*v_i) 
           = (c*T)(sum_i a_i*v_i) + U(sum_i a_i*v_i) ... by def of c*T+U
           = c*(T(sum_i a_i*v_i)) + U(sum_i a_i*v_i) ... by def of c*T
           = c*(sum_i a_i*T(v_i)) + sum_i a_i*U(v_i) ... since T,U linear
            = sum_i a_i*c*T(v_i) + sum_i a_i*U(v_i)   ... scalars commute
            = sum_i a_i*(c*T(v_i)+U(v_i))             ... collect terms
           = sum_i a_i*(c*T+U)(v_i)                  ... by def of c*T+U
  (2) We let T_0, defined by T_0(v) = 0_W for all v, be the 
      "zero vector" in L(V,W). It is easy to see that all the
      axioms of a vector space are satisfied. (homework!)

Def: L(V,W) is the vector space of all linear transformations
     from V -> W. If V=W, we write L(V) for short.

Given ordered bases for finite dimensional V and W, we get a 
matrix [T]_beta^gamma for every T in L(V,W). It is natural
to expect that the operations of adding vectors in L(V,W)
(adding linear transformations) should be the same as adding
their matrices, and that multiplying a vector in L(V,W) by a
scalar should be the same as multiplying its matrix by a scalar:

Thm: Let V and W be finite dimensional vector spaces with
ordered bases beta  and gamma, resp. Let T and U be in L(V,W).
Then
   (1) [T+U]_beta^gamma =   [T]_beta^gamma + [U]_beta^gamma
   (2) [c*T]_beta^gamma = c*[T]_beta^gamma 
In other words, the function []_beta^gamma: L(V,W) -> M_{m x n}(F),
where n = dim(V) and m = dim(W), is a linear transformation.

Proof: (1) We compute column j of matrices on both sides
           and confirm they are the same. Let beta = {v_1,...,v_n}
           and gamma = {w_1,...,w_m}. Then
       (T+U)(v_j) = T(v_j) + U(v_j) so
      [(T+U)(v_j)]_gamma = [T(v_j)]_gamma + [U(v_j)]_gamma
        by the above Lemma, which shows the mapping x -> [x]_gamma
        is linear.
        (2) Similarly (c*T)(v_j) = c*T(v_j) so
            [(c*T)(v_j)]_gamma = [c*T(v_j)]_gamma
                               = c*[T(v_j)]_gamma by the Lemma
            so the j-th columns of both matrices are the same
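
Numerical aside: a quick check of the theorem in Python with numpy,
using two made-up maps T, U: R^2 -> R^2 and the standard bases:

   import numpy as np

   T = lambda x: np.array([x[0] + 2*x[1], -x[1]])   # made-up linear maps
   U = lambda x: np.array([3*x[0], x[0] + x[1]])

   def matrix_of(L, n=2, m=2):
       A = np.zeros((m, n))
       for j in range(n):
           e = np.zeros(n); e[j] = 1.0
           A[:, j] = L(e)          # column j is [L(e_j)]_gamma
       return A

   lhs = matrix_of(lambda x: T(x) + U(x))   # [T+U]_beta^gamma
   rhs = matrix_of(T) + matrix_of(U)        # [T]_beta^gamma + [U]_beta^gamma
   print(np.allclose(lhs, rhs))             # True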

To summarize, what we have so far is this: 
Given two ordered bases beta for V and gamma for W, we have
a one-to-one correspondence between 
all linear transformations in L(V,W) and all matrices in M_{m x n}(F).
Furthermore, this 1-to-1 correspondence "preserves" operations
in each set:
  adding linear transformations <=> adding matrices
        [ T + U <=> [T]_beta^gamma + [U]_beta^gamma ]
  multiplying linear transformations by scalars 
          <=> multiplying matrices by scalars
        [  c*T  <=> c*[T]_beta^gamma ]
  applying linear transformation to a vector
          <=> multiplying matrix times a vector
        [ y = T(x)  <=>  [y]_gamma = [T]_beta^gamma * [x]_beta  ]

There are several other important properties of and operations
on linear transformations, and it is natural to ask what they
mean for matrices as well (we write A = [T]_beta^gamma for short)
  null space N(T)    <=> set of vectors x such that A*x = 0
  range space R(T)   <=> all vectors of the form A*x
                     <=> all linear combinations of columns of A
  T being one-to-one <=> Ax = 0 only if x=0
  T being onto       <=> span(columns of A) = F^m
  T being invertible <=> A square and Ax=0 only if x=0
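
Numerical aside: the one-to-one and onto tests can be phrased in terms
of rank. A minimal check in Python with numpy, on the 4 x 2 matrix A of
the earlier example T: R^2 -> R^4:

   import numpy as np

   A = np.array([[ 1., -1.],
                 [ 3.,  2.],
                 [-2.,  0.],
                 [ 0.,  7.]])
   r = np.linalg.matrix_rank(A)
   print(r == A.shape[1])   # True:  A*x = 0 only if x = 0, so T is one-to-one
   print(r == A.shape[0])   # False: the columns do not span F^4, so T is not onto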

What about composition? If T: V -> W and U: W -> Z, then
consider UT: V -> Z, defined by UT(v) = U(T(v))

Thm: If T is in L(V,W) and U is in L(W,Z), then UT is in L(V,Z),
i.e. UT is linear.
Proof: UT( sum_i a_i*v_i ) = U(T( sum_i a_i*v_i ))  ... by def of UT
                           = U( sum_i a_i*T(v_i))   ... since T is linear
                           = sum_i a_i*U(T(v_i))    ... since U is linear
                           = sum_i a_i*UT(v_i)      ... by def of UT

Thm: 
  (1) If T in L(V,W) and U1 and U2 are in L(W,Z), then
      (U1+U2)T = U1T + U2T  (distributivity of composition over addition)
  (2) If T in L(V,W), U in L(W,Z), and A in L(Z,Q), then
      A(UT) = (AU)T is in L(V,Q)   (associativity of composition)
  (3) If T in L(V,W), I_V the identity in L(V,V) and I_W the identity in L(W,W), 
      then T I_V = I_W T = T   (composition with identity)
  (4) If T in L(V,W) and U in L(W,Z), and c a scalar, then
      c(TU) = (cT)U = T(cU)  (commutativity and associativity of 
      composition with scalar multiplication)
Proof: homework!

Now that we understand the properties of composition of linear transformations,
we can ask what operation on matrices it corresponds to:

Suppose T: V -> W and U: W -> Z are linear so UT: V -> Z
Let beta  = {v_1,...,v_n} be an ordered basis for V
    gamma = {w_1,...,w_m} be an ordered basis for W
    delta = {z_1,...,z_p} be an ordered basis for Z
with A = [U]_gamma^delta the p x m matrix for U,
     B = [T]_beta^gamma the m x n matrix for T,
     C = [UT]_beta^delta the p x n matrix for UT
Our question is what is the relationship of C to A and B?
We compute C as follows:
   column j of C = [UT(v_j)]_delta        ... by def of C
                 = [U(T(v_j))]_delta      ... by def of UT
                 = [U(sum_{k=1 to m} B_kj*w_k)]_delta  ... by def of T(v_j)
                 = [sum_{k=1 to m} B_kj*U(w_k)]_delta  ... by linearity of U
                 = [sum_{k=1 to m} B_kj* sum_{i=1 to p} A_ik * z_i]_delta  
                         ... by def of U(w_k)
                 = [sum_{k=1 to m} sum_{i=1 to p} A_ik * B_kj * z_i]_delta  
                         ... move B_kj inside summation
                 = [sum_{i=1 to p} sum_{k=1 to m} A_ik * B_kj * z_i]_delta  
                         ... reverse order of summation
                 = [sum_{i=1 to p} z_i * (sum_{k=1 to m} A_ik * B_kj)]_delta  
                         ... pull z_i out of inner summation
                 = [sum_{k=1 to m} A_1k * B_kj]   ... by def of []_delta
                   [sum_{k=1 to m} A_2k * B_kj]
                   [            ...           ]
                    [sum_{k=1 to m} A_pk * B_kj]
said another way, C_ij = sum_{k=1 to m} A_ik * B_kj

Def: Given the p x m matrix A, and the m x n matrix B we define their 
matrix-matrix product (or just product for short) to be the p x n matrix C
with C_ij = sum_{k=1 to m} A_ik * B_kj
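
Numerical aside: a minimal sketch in Python with numpy implementing this
definition directly, and checking it against numpy's built-in product:

   import numpy as np

   def matmul(A, B):
       p, m = A.shape
       m2, n = B.shape
       assert m == m2                    # inner dimensions must agree
       C = np.zeros((p, n))
       for i in range(p):
           for j in range(n):
               for k in range(m):        # C_ij = sum_{k=1 to m} A_ik * B_kj
                   C[i, j] += A[i, k] * B[k, j]
       return C

   A = np.random.rand(4, 3)
   B = np.random.rand(3, 2)
   print(np.allclose(matmul(A, B), A @ B))   # True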

Thm: Using the above notation, 
         [UT]_beta^delta = [U]_gamma^delta * [T]_beta^gamma

In other words, another correspondence between linear transformations
and matrices is:
    composition of linear transformations <=> matrix-matrix multiplication
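
Numerical aside: a quick check of this correspondence in Python with
numpy, using made-up maps T: R^2 -> R^3 and U: R^3 -> R^2 with the
standard bases everywhere:

   import numpy as np

   T = lambda v: np.array([v[0], v[0] + v[1], 2*v[1]])   # made-up T: R^2 -> R^3
   U = lambda w: np.array([w[0] - w[2], 3*w[1]])         # made-up U: R^3 -> R^2

   def matrix_of(L, n, m):
       A = np.zeros((m, n))
       for j in range(n):
           e = np.zeros(n); e[j] = 1.0
           A[:, j] = L(e)
       return A

   B = matrix_of(T, 2, 3)                    # [T]_beta^gamma
   A = matrix_of(U, 3, 2)                    # [U]_gamma^delta
   C = matrix_of(lambda v: U(T(v)), 2, 2)    # [UT]_beta^delta
   print(np.allclose(C, A @ B))              # True: [UT] = [U]*[T]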