Math 110 - Fall 05 - Lecture notes # 18 - Oct 10 (Monday)

Quick review of main ideas of last week, then sec 2.6 (not sec 2.7).

The goal of sec 2.4 is to study inverses of linear transformations.

Def: Let T: V -> W be linear. A function U: W -> V is called
an inverse of T if 
(1) UT: V -> V is the identity function UT = I_V on V,
    i.e. I_V(v) = v for all v in V 
(2) TU: W -> W is the identity function TU = I_W on W,
    i.e. I_W(w) = w for all w in W
If T has an inverse, we call T invertible, and write U = T^{-1}

Main properties of inverses of linear transformations:
   The inverse of a linear transformation is a linear transformation
   The inverse is unique.
   If T: V -> W and S: W -> Z are invertible,
      then ST: V -> Z is invertible and (ST)^{-1} = T^{-1} S^{-1}
   (T^{-1})^{-1} = T
   Suppose T: V -> W with V and W finite dimensional. 
      Then T is invertible if and only if rank(T) = dim(W) = dim(V)

Def: Let A be an n by n matrix. Then A is invertible if there is another
     n by n matrix B with AB = BA = I. B is called the inverse of A,
     and written A^{-1}

Main properties of inverses of matrices (analogous to above)
     The inverse is unique.
ASK & WAIT: Why?
     If A and B are n by n and invertible, 
         then AB is invertible too and (AB)^{-1} = B^{-1} A^{-1}
ASK & WAIT: Why?
     (A^{-1})^{-1} = A
ASK & WAIT: Why?
    If A and B are n by n, then AB = I if and only if BA = I (homework!)
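
Here is a quick numerical sanity check of these properties (not in the text;
a NumPy sketch with made-up matrices A and B):

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 1.0]])
    B = np.array([[1.0, 3.0], [0.0, 1.0]])

    # (AB)^{-1} = B^{-1} A^{-1}
    print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))  # True
    # (A^{-1})^{-1} = A
    print(np.allclose(np.linalg.inv(np.linalg.inv(A)), A))                         # True
    # with B = A^{-1}: AB = I and BA = I both hold
    print(np.allclose(A @ np.linalg.inv(A), np.linalg.inv(A) @ A))                 # True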

Natural question (see chap 3): Given A, how do you compute A^{-1}?
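
Preview of the answer, as a minimal sketch (assuming A is square and invertible;
this is Gauss-Jordan elimination on the augmented matrix [A | I], not the text's
own presentation, which comes in chapter 3):

    import numpy as np

    def inverse_gauss_jordan(A):
        # row-reduce [A | I] to [I | A^{-1}], with partial pivoting
        A = np.array(A, dtype=float)
        n = A.shape[0]
        M = np.hstack([A, np.eye(n)])            # the augmented matrix [A | I]
        for col in range(n):
            piv = col + np.argmax(np.abs(M[col:, col]))   # largest pivot in this column
            if np.isclose(M[piv, col], 0.0):
                raise ValueError("matrix is not invertible")
            M[[col, piv]] = M[[piv, col]]        # swap rows
            M[col] /= M[col, col]                # scale so the pivot is 1
            for row in range(n):
                if row != col:
                    M[row] -= M[row, col] * M[col]   # zero out the rest of the column
        return M[:, n:]                          # right half is A^{-1}

    A = np.array([[1.0, -1.0], [0.0, 2.0]])
    print(inverse_gauss_jordan(A))               # [[1. 0.5] [0. 0.5]]
    print(np.linalg.inv(A))                      # same answer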

Connections between inverses of linear transformations and their matrices:
  Let T: V -> W with beta an ordered basis of V and gamma an ordered basis of W.
  Let A = [T]_beta^gamma be the matrix representation of T,
  relative to beta and gamma. Then A is invertible if and only if T is invertible,
  with [T^{-1}]_gamma^beta = A^{-1}

Proof: Use Thm 2.11 to write
    I = [I_V]_beta = [T^{-1} T]_beta = [T^{-1}]_gamma^beta * [T]_beta^gamma
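
A concrete NumPy check of this connection (the map and the bases below are made up
for illustration): take T(v) = M*v on R^2, put the beta and gamma basis vectors in
the columns of B and C, so [T]_beta^gamma = C^{-1}*M*B, and compare [T^{-1}]_gamma^beta
with ([T]_beta^gamma)^{-1}.

    import numpy as np

    M = np.array([[1.0, -1.0], [0.0, 2.0]])     # T(v) = M*v, invertible
    B = np.array([[1.0, -1.0], [2.0, 3.0]])     # columns = the beta basis of V = R^2
    C = np.array([[1.0, 1.0], [0.0, 1.0]])      # columns = the gamma basis of W = R^2

    T_bg    = np.linalg.solve(C, M @ B)                   # [T]_beta^gamma: column j = [T(b_j)]_gamma
    Tinv_gb = np.linalg.solve(B, np.linalg.inv(M) @ C)    # [T^{-1}]_gamma^beta: column j = [T^{-1}(c_j)]_beta
    print(np.allclose(Tinv_gb, np.linalg.inv(T_bg)))      # True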

Def: Two vector spaces V and W are called isomorphic if there is an invertible
     linear transformation T: V -> W.

The goal of sec 2.5 is to introduce "change of basis" or "similarity".

Let T: V -> V be a linear transformation, with n = dim(V).
Let beta and gamma be two ordered bases for V. 
Then A = [T]_beta and B = [T]_gamma are (in general) different
n by n matrices. But both represent the same linear transformation T, 
so they are closely related; we call this relationship a similarity.

To see what the relationship is, we start with
   T = I_V T = T I_V
and apply Theorem 2.11, to get
   [I_V T]_beta^gamma = [I_V]_beta^gamma * [T]_beta^beta = Q * [T]_beta
 = [T I_V]_beta^gamma = [T]_gamma^gamma * [I_V]_beta^gamma = [T]_gamma * Q
or
   Q*[T]_beta = [T]_gamma*Q
Assuming Q is invertible for a moment, we get
   [T]_gamma = Q * [T]_beta * Q^{-1}
Q is called a "change of basis matrix from beta to gamma" and is invertible because
   I = [I_V]_beta^beta = [I_V I_V]_beta^beta
     = [I_V]_gamma^beta * [I_V]_beta^gamma
     = [I_V]_gamma^beta * Q
and similarly Q * [I_V]_gamma^beta = I, so Q^{-1} = [I_V]_gamma^beta
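
A small NumPy illustration of this relationship (the matrices are made up): take
beta = the standard basis of R^2 and put the gamma basis vectors in the columns of G,
so that Q = [I_V]_beta^gamma = G^{-1}; then [T]_gamma computed column by column agrees
with Q*[T]_beta*Q^{-1}.

    import numpy as np

    A = np.array([[1.0, -1.0], [0.0, 2.0]])     # [T]_beta, beta = standard basis of R^2
    G = np.array([[1.0, -1.0], [2.0, 3.0]])     # columns = the gamma basis vectors
    Q = np.linalg.inv(G)                        # Q = [I_V]_beta^gamma: column j = [e_j]_gamma

    T_gamma = Q @ A @ np.linalg.inv(Q)          # the formula [T]_gamma = Q*[T]_beta*Q^{-1}
    # direct computation: column j of [T]_gamma is [T(g_j)]_gamma = G^{-1}*(A*g_j)
    direct = np.column_stack([np.linalg.solve(G, A @ G[:, j]) for j in range(2)])
    print(np.allclose(T_gamma, direct))         # True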

Def: We call n by n matrices A and B similar if there is an invertible
     n by n matrix Q such that A = Q*B*Q^{-1}

Note: A = Q*B*Q^{-1} if and only if Q^{-1}*A*Q = B
ASK & WAIT: Why?
Lemma: If A and B are similar, and B and C are similar, then A and C are similar
Proof: A and B similar => A = Q*B*Q^{-1} for some Q
       B and C similar => B = R*C*R^{-1} for some R
so A = Q*(R*C*R^{-1})*Q^{-1} = (Q*R)*C*(R^{-1}*Q^{-1})
                             = (Q*R)*C*(QR)^{-1}

Look ahead: Similar matrices A and B represent the same linear transformation
in different bases, so they have a lot of common properties:
   rank(A) = rank(B) and nullity(A) = nullity(B)
   A is invertible if and only if B is invertible
   A and B have the same trace = sum of diagonal entries (homework)
   A and B have the same eigenvalues
       (algorithms for eigenvalues of A try to find a similar B whose
        eigenvalues are "easy" to find, eg diagonal B)
   A and B have eigenvectors related by multiplying by Q
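
A quick NumPy experiment illustrating the look-ahead (random 4 by 4 matrices, so Q is
invertible with probability 1; this only checks the claims numerically, it proves nothing):

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    Q = rng.standard_normal((4, 4))             # almost surely invertible
    A = Q @ B @ np.linalg.inv(Q)                # A is similar to B

    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))      # True
    print(np.isclose(np.trace(A), np.trace(B)))                      # True
    print(np.allclose(np.sort(np.linalg.eigvals(A)),
                      np.sort(np.linalg.eigvals(B))))                # True (same spectrum, up to roundoff)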

Dual spaces (sec 2.6):

Let V be a vector space over F. Recall that F is itself a 1-dimensional
vector space over itself. 

Def: A linear transformation from V to F is called a linear functional.
The set of all linear functionals L(V,F) is called the dual space of V,
written V*.

Notation: we write linear functionals using lower case letters, eg f(v).

Ex: Let V = F^n, v = [v_1,...,v_n]. Then for any i,
    f_i(v) = v_i is a linear functional, called the i-th coordinate function.

Ex: Let V, v be as above. Let y = [y_1;...;y_n] be a fixed n-tuple in F^n.
    Then y*(v) = sum_{i=1 to n} y_i*v_i is a linear functional

Ex: Let V be finite dimensional, beta = {x_1,...,x_n} an ordered basis, 
    recall [v]_beta = [v_1,...,v_n] are the coordinates of v relative to beta.
    Then f_i(v) = v_i is a linear functional, the i-th coordinate function
      relative to beta.

ASK&WAIT: What is f_i(x_j)?

Ex: Let V, beta be as above. Then any linear functional f can be written as
    f(v) = f( sum_{i=1 to n} x_i*v_i )
         = sum_{i=1 to n} f(x_i)*v_i
         = sum_{i=1 to n} f(x_i)*f_i(v)
    or f = sum_{i=1 to n} f(x_i)*f_i
    i.e. any f in L(V,F) can be written as a linear combination of 
       coordinate functions. The f_i are also linearly independent:
       applying sum_{i=1 to n} c_i*f_i = 0 to x_j gives c_j = 0. In other words,
    beta* = {f_1,...,f_n} is an ordered basis for V*.

Def: In above notation, beta* = {f_1,...,f_n} is called the dual basis of beta.

Ex: Let V = F^n with the standard ordered basis. 
    Then all linear functionals are of the form
    f(v) = sum_{i=1 to n} y_i*v_i for some n-tuple y=[y_1,...,y_n]
    We will denote this linear functional as y*(v)

Ex: Since f_i: V -> F, we can compute its matrix representation relative to beta:
    [f_i]_beta^1 = [0,...,0,1,0,...,0], with a 1 in the i-th location 
    (here, 1 is the basis for F itself)

Ex: Let V = F^n, beta = standard ordered basis, and
    y*(v) = sum_{i=1 to n} y_i*v_i as above. Then
    [y*]_beta^1 = [y_1,...,y_n] = [y_1;...;y_n]^t
    This is a 1 by n matrix, also called a row-vector
    If we think of v in F^n as a column vector [v_1;...;v_n], then
    we can write 
        y*(v) = sum_{i=1 to n} y_i*v_i = y^t * v.

Ex: Let V = R^2, beta = {[1;2],[-1;3]}. We compute beta* as follows:
    f_1([1;2])  = 1 means  f_1(e_1) + 2*f_1(e_2) = 1
    f_1([-1;3]) = 0 means -f_1(e_1) + 3*f_1(e_2) = 0
    solving these 2 equations in 2 unknowns for f_1(e_1) and f_1(e_2) yields
      f_1(e_2) = 1/5 and f_1(e_1) = 3/5
    Similarly f_2(e_1) + 2*f_2(e_2) = 0 and -f_2(e_1) + 3*f_2(e_2) = 1 imply
      f_2(e_2) = 1/5 and f_2(e_1) = -2/5
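
The same computation in NumPy (not in the text): put the basis vectors in the columns
of a matrix B; then row i of B^{-1} gives the coefficients of f_i, since B^{-1}*B = I
says exactly that f_i(x_j) = 1 when i = j and 0 otherwise.

    import numpy as np

    B = np.array([[1.0, -1.0],
                  [2.0,  3.0]])          # columns are the basis vectors [1;2] and [-1;3]
    D = np.linalg.inv(B)                 # row i of D holds (f_i(e_1), f_i(e_2))
    print(D)                             # [[ 0.6  0.2]     i.e. f_1 = ( 3/5, 1/5)
                                         #  [-0.4  0.2]]         f_2 = (-2/5, 1/5)
    print(np.allclose(D @ B, np.eye(2))) # True: f_i(x_j) = 1 if i=j, else 0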

Ex: Let V = continuous functions on [0, 2*pi]. Pick some g(t) in V.
    Then f(v) = (1/(2*pi)) * integral_{0 to 2*pi} g(t)*v(t) dt is a 
    linear functional (why?). When g(t) = sin(nt) or cos(nt), f(v)
    is called the nth Fourier coefficient of v.
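
A numerical illustration in NumPy (the test function v and the value n = 3 are made up):
approximate the integral by a Riemann sum on a fine grid.

    import numpy as np

    n = 3
    t = np.linspace(0.0, 2*np.pi, 200000, endpoint=False)   # grid on [0, 2*pi)
    dt = t[1] - t[0]
    v = np.cos(3*t) + 0.5*np.sin(t)          # a test function v(t)
    g = np.cos(n*t)                          # g(t) = cos(n*t)
    f_of_v = (1.0/(2*np.pi)) * np.sum(g * v) * dt   # approximates (1/(2*pi)) * integral of g*v
    print(f_of_v)                            # approximately 0.5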

Recall our definition of a transpose of a matrix: if A is m by n then
B = A^t  is n by m and defined by B_ij = A_ji. We relate transposes and
dual spaces as follows. 

First note that if v and y are in F^n, we may think of v and y as n by 1 
matrices, or column vectors. Then y^t is a 1 by n matrix, and
(*)   y^t * v = sum_{i=1 to n} y_i * v_i = y*(v)
is the linear functional y*(v) defined above.

Lemma: (A * B)^t = B^t * A^t for any two matrices where the product is defined
  Proof: See section 2.3

Now, in the language of matrices, suppose A is m by n,
y is m by 1, and x is n by 1. Then A*x is m by 1, and so
   y* (A * x) = y^t * (A * x)   ... by  (*)
              = (y^t * A) * x   ... by associativity
              = (A^t * y)^t * x   ... by Lemma
              = (A^t * y)* (x)   ... by (*)
In other words, the mapping from y to A^t*y takes a linear functional on
F^m and converts it into a linear functional on F^n.
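
A one-line NumPy check of this identity with random data (the sizes m = 3, n = 5 are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    m, n = 3, 5
    A = rng.standard_normal((m, n))
    x = rng.standard_normal(n)
    y = rng.standard_normal(m)
    print(np.isclose(y @ (A @ x), (A.T @ y) @ x))   # True: y*(A*x) = (A^t*y)*(x)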

We say the same thing more abstractly:
Suppose T: V -> W and g: W -> F, i.e. g is in the dual space W* of W. Then
gT: V -> F, i.e. gT is in the dual space V* of V. In other words the
mapping from g to gT maps W* to V*. We denote this mapping
by T*: W* -> V*, with T*(g) = gT. 

Ex: Suppose T: R^2 -> R^2 is matrix-vector multiplication by [1 -1;0 2],
    and suppose g: R^2 -> R is g((x,y)) = x - y.
    Then gT((x,y)) = g((x-y, 2*y)) = (x-y) - (2*y) = x-3*y
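
Checking this example in NumPy (g is represented by the row vector [1, -1]):

    import numpy as np

    A = np.array([[1.0, -1.0], [0.0, 2.0]])   # the matrix of T
    g = np.array([1.0, -1.0])                 # g((x,y)) = x - y, as a row vector
    print(g @ A)                              # [ 1. -3.]  i.e. (gT)((x,y)) = x - 3*y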

Thm: Let  beta = {x_1,...,x_n} be an ordered basis for V and
     let gamma = {y_1,...,y_m} be an ordered basis for W.
     Let beta* and gamma* be their dual bases, ie
         beta* = {x*_1,...,x*_n} with x*_i(x_j) = 1 if i=j and 0 otherwise
        gamma* = {y*_1,...,y*_m} with y*_i(y_j) = 1 if i=j and 0 otherwise
     Given T: V -> W, define T*: W* -> V* as above.
     Then [T*]_gamma*^beta* = ([T]_beta^gamma)^t
     i.e. the matrix representing T* is the transpose of the 
     matrix representing T.
Proof: We note that [x*_i]_beta^1 is the 1 by n matrix [ 0,...,0,1,0,...,0]
       with a 1 in the i-th entry,  and
       similarly for the 1 by m matrix [y*_i]_gamma^1
    
       Then    T*(y*_j) = y*_j T    ... by def of T*
            => T*(y*_j)(x_i)  = y*_j T(x_i)    ... applying both to x_i
       But y*_j T(x_i) = y*_j(sum_{k=1 to m} ([T]_beta^gamma)_ki * y_k))  
                            ... by def of T relative to beta, gamma
                       = sum_{k=1 to m} ([T]_beta^gamma)_ki * y*_j(y_k)   
                            ... since y*_j linear
                        = ([T]_beta^gamma)_ji    
                            ... since y*_j(y_k) = 1 if j=k, 0 otherwise
    Also T*(y*_j)(x_i) = (sum_{k=1 to n} ([T*]_gamma*^beta*)_kj * x*_k)(x_i)
                            ... by def of T* relative to gamma*, beta*
                       = sum_{k=1 to n} ([T*]_gamma*^beta*)_kj * x*_k(x_i)
                            ... by linearity
                        = ([T*]_gamma*^beta*)_ij
                             ... since x*_k(x_i) = 1 if k=i, 0 otherwise
     Equating the two expressions for T*(y*_j)(x_i) gives
        ([T*]_gamma*^beta*)_ij = ([T]_beta^gamma)_ji for all i and j,
     i.e. [T*]_gamma*^beta* = ([T]_beta^gamma)^t, as claimed.
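
To make the theorem concrete, here is a NumPy sketch (standard bases of F^n and F^m,
with a random A playing the role of [T]_beta^gamma) that builds [T*]_gamma*^beta*
column by column from the definition T*(y*_j) = y*_j T and compares it with A^t:

    import numpy as np

    rng = np.random.default_rng(2)
    m, n = 3, 4
    A = rng.standard_normal((m, n))            # A = [T]_beta^gamma for T(x) = A*x

    cols = []
    for j in range(m):
        yjT = np.eye(m)[j, :] @ A              # the functional y*_j T, as a row vector on F^n
        # its beta*-coordinates are its values on the basis vectors x_i = e_i
        cols.append(np.array([yjT @ np.eye(n)[:, i] for i in range(n)]))
    Tstar = np.column_stack(cols)              # [T*]_gamma*^beta*

    print(np.allclose(Tstar, A.T))             # True: [T*]_gamma*^beta* = ([T]_beta^gamma)^t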