Math 110 - Fall 05 - Lecture notes # 8 - Sep 16 (Friday)

Homework due Thursday, Sep 22: Sec 2.1: 1 (justify your answers), 3, 4, 5,
6, 7, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 25, 35, 38

Start reading chapter 2.

Just as Chapter 1 generalized ideas about vectors familiar from Math 54,
Chapter 2 will generalize ideas about matrices:

   Ma54                        => Ma110
   real numbers                => fields
   vectors of real numbers     => vector spaces
     [ Ex: (x1,x2)             => also P_n(F), Func(R,R), etc ]
   lines and planes through 0  => subspaces
     [ Ex: a*(x1,x2)           => span({s1,s2,...,sn}) ]
   lines are 1D, planes 2D ... => dimension of any vector space
   matrices                    => linear transformations
     [ Ex: [ 1 2 ]             => also differentiation, integration, etc ]
     [     [ 3 4 ]                                                       ]
   multiplying matrix*vector   => applying a linear transformation to a vector
     [ Ex: [ 1 2 ]*[x1] = [   x1+2*x2 ]  => also T(f(x)) = f'(x) ]
     [     [ 3 4 ] [x2]   [ 3*x1+4*x2 ]                          ]
   multiplying matrix*matrix   => composing linear transformations
   forming inv(Q)*M*Q          => changing the basis of a linear transformation

Def: Let V and W be vector spaces over F. A function T: V -> W is a
linear transformation from V to W (or just linear) if for all x, y in V
and c in F
   (a) T(x+y) = T(x) + T(y)
   (b) T(c*x) = c*T(x)
(Note: (a) => (b) when F = Q, but in general we need both.)

Lemma: Let T: V -> W.
   T linear implies T(0_V) = 0_W
   T linear if and only if T(c*x+y) = c*T(x) + T(y)
   T linear implies T(x-y) = T(x) - T(y)
   T linear if and only if T(sum_i c_i*v_i) = sum_i c_i*T(v_i)

Ex: V = R^2, W = R, T((x,y)) = x+2*y
ASK & WAIT: Why is T linear?

Ex: V = W = R^2, T((x;y)) = (x+y; -3*x+2*y)
ASK & WAIT: Why is T linear?
This is also written as the matrix-vector multiplication
   T([ x ]) = [ 1 1 ] * [ x ] = [      x+y ]
    ([ y ])   [-3 2 ]   [ y ]   [ -3*x+2*y ]

Indeed, T: F^n -> F^m, where T is multiplying an n-vector by an m x n
matrix to get an m-vector, is a linear transformation. In Section 2.2 we
will systematically write *all* linear transformations from F^n to F^m
as multiplying an n-vector by an m x n matrix to get an m-vector. But
these are not the only linear transformations.

Ex: V = {differentiable functions from [0,1] to R}
    W = {functions from [0,1] to R}
    T(f(x)) = f'(x) + integral_0^x t^2*f(t) dt
ASK & WAIT: Why is T linear? Is T: V -> V?

Ex: V = W = R^2, T_theta((x,y)) = the vector (x,y) rotated clockwise by theta
ASK & WAIT: Why is T linear?

Def: If V = W, T(x) = x is called the identity transformation, written
I_V or I.
Def: T(x) = 0_W is called the zero transformation.

Def: Let T: V -> W be linear. Then the null space of T, N(T), is the set
of all vectors v in V such that T(v) = 0_W. The range space (or just
range) of T, R(T), is the set of all vectors w = T(v) for v in V.

Thm 1: N(T) and R(T) are subspaces of V and W, respectively.
Proof: Consider N(T): clearly 0_V is in N(T), since T(0_V) = 0_W. Since
T is linear, v1 and v2 in N(T) implies
   T(a*v1 + b*v2) = T(a*v1) + T(b*v2) = a*T(v1) + b*T(v2)
                  = a*0_W + b*0_W = 0_W,
so N(T) is closed under + and *, and so is a subspace.
Next consider R(T): clearly 0_W is in R(T), since T(0_V) = 0_W. If w1
and w2 are in R(T), then there are v1 and v2 so that w1 = T(v1) and
w2 = T(v2), and so
   a*w1 + b*w2 = a*T(v1) + b*T(v2) = T(a*v1) + T(b*v2) = T(a*v1 + b*v2)
is in R(T) too, so R(T) is closed under + and *, and so is a subspace.

Ex: V = R^2, W = R, T((x;y)) = x+2*y
   N(T) = {(x;y): x+2*y = 0} = {c*(2;-1) for all real c}
   R(T) = {w: w = x+2*y for some x, y} = R
ASK & WAIT: Why is R(T) = R?
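Aside (not in the text): a minimal numerical sanity check of properties
(a) and (b) for the matrix example T((x;y)) = (x+y; -3*x+2*y) above,
written in Python with NumPy; the test vectors and the scalar c are
arbitrary choices.

   import numpy as np

   # The map T((x;y)) = (x+y; -3*x+2*y) as multiplication by a 2 x 2 matrix.
   A = np.array([[ 1.0, 1.0],
                 [-3.0, 2.0]])

   def T(v):
       return A @ v

   rng = np.random.default_rng(0)
   x = rng.standard_normal(2)
   y = rng.standard_normal(2)
   c = 2.5

   assert np.allclose(T(x + y), T(x) + T(y))   # property (a)
   assert np.allclose(T(c * x), c * T(x))      # property (b)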
Ex: V = W = R^2, T((x;y)) = (x+y; -3*x+2*y)
   N(T) = {(x;y): x+y = 0 and -3*x+2*y = 0}
      solving these 2 equations in 2 unknowns => x = y = 0,
      so N(T) = {0_V}
   R(T) = {(w;z): w = x+y and z = -3*x+2*y for some (x;y)}
      solving these 2 equations in 2 unknowns
      => y = (1/5)*z + (3/5)*w and x = (-1/5)*z + (2/5)*w,
      so any (w;z) is in the range

Thm 2: If S = {v_1,...,v_n} is a basis for V, then
R(T) = span({T(v_1),...,T(v_n)}).
Proof: If S is a basis, every v in V is of the form sum_i a_i*v_i, so
every w in R(T) is of the form
   T(sum_i a_i*v_i) = sum_i T(a_i*v_i) = sum_i a_i*T(v_i),
so {T(v_1),...,T(v_n)} is a spanning set.
ASK & WAIT: Is {T(v_1),...,T(v_n)} necessarily a basis?

Ex: V = R^2, W = R^3, T((x;y)) = (x+y; -3*x+2*y; -x+4*y)
To compute N(T), note x+y = 0 and -3*x+2*y = 0 implies x = y = 0 from
before, so N(T) = {0_V} as before.
R(T) = span(T((1;0)), T((0;1))) = span((1;-3;-1), (1;2;4))

Def: Let T: V -> W be a linear transformation with null space N(T) and
range R(T). If N(T) is finite dimensional, we call its dimension the
nullity of T, written "nullity(T)". If R(T) is finite dimensional, we
call its dimension the rank of T, written "rank(T)".

Thm 3 (Dimension Theorem): Let T: V -> W be a linear transformation. If
V is finite dimensional, then dim(V) = rank(T) + nullity(T).

Ex: Just as in Chapter 1, where we asked what our theorems about vector
spaces meant for R^3 or R^n, in Chapter 2 we can ask what our theorems
about linear transformations mean for matrices, especially simple
matrices like diagonal ones. Consider V = R^n, W = R^m, where T is
multiplication by an m x n matrix. Suppose T is diagonal with 0s and 1s
on the diagonal, i.e. only some T_{ii} can be nonzero, such as
   T = [ 1 0 0 0 ]
       [ 0 1 0 0 ]
       [ 0 0 0 0 ]
We will show that
   rank(T)    = # nonzero columns of T
   nullity(T) = # zero columns of T
so rank(T) + nullity(T) = # nonzero columns of T + # zero columns of T
= # columns of T = n = dim(V), as claimed by the Dimension Theorem.
For simplicity we suppose T_11 = T_22 = ... = T_rr = 1 and the rest are
0, so T has r nonzero columns and n-r zero columns. Then it is easy to
see that
   T(x) = T([ x_1     ])   [ x_1 ]
           ([ ...     ])   [ ... ]
           ([ x_r     ]) = [ x_r ]
           ([ x_{r+1} ])   [  0  ]
           ([ ...     ])   [ ... ]  ... there are m-r zeros
           ([ x_n     ])   [  0  ]
where there is one zero in the result vector for every zero row of T.
So we can see that R(T) is the space spanned by all vectors of the form
on the right above, which has dimension r = # nonzero columns of T, and
N(T) is the set of all vectors of the form (0;...;0;x_{r+1};...;x_n),
i.e. a space of dimension n-r = # zero columns of T, as desired.

Proof of Dimension Theorem: Since N(T) is a subspace of the finite
dimensional space V, it is also finite dimensional, and has a basis
{v_1,...,v_k}. This basis can be extended to a basis of V (Replacement
Theorem), call it {v_1,...,v_n}. We claim {T(v_{k+1}),...,T(v_n)} is a
basis of R(T). Assuming this for a moment, since it contains n-k
vectors, we get
   dim(V) = n = k + (n-k) = dim(N(T)) + dim(R(T))
as desired. To see {T(v_{k+1}),...,T(v_n)} is a basis, we have to show
it spans R(T) and is independent. But since {v_1,...,v_n} is a basis of V,
   R(T) = span(T(v_1),...,T(v_k),T(v_{k+1}),...,T(v_n))
        = span(   0_W ,...,   0_W ,T(v_{k+1}),...,T(v_n))
        = span(T(v_{k+1}),...,T(v_n))
so it spans R(T). To see it is independent, suppose it is not, and seek
a contradiction: write
   0_W = sum_{i=k+1 to n} a_i*T(v_i)   where not all a_i = 0
       = T( sum_{i=k+1 to n} a_i*v_i ) since T is linear
so sum_{i=k+1 to n} a_i*v_i is a vector in N(T), i.e.
   sum_{i=k+1 to n} a_i*v_i = sum_{j=1 to k} b_j*v_j
where at least one coefficient (a_i or b_j) is nonzero. But this
contradicts the independence of the basis {v_1,...,v_n}.
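Aside (not in the text): a minimal numerical check of the Dimension
Theorem for the 3 x 4 diagonal example above, in Python with NumPy; the
tolerance 1e-12 is an arbitrary choice.

   import numpy as np

   # The 3 x 4 diagonal example with T_11 = T_22 = 1, so r = 2.
   T = np.array([[1., 0., 0., 0.],
                 [0., 1., 0., 0.],
                 [0., 0., 0., 0.]])
   m, n = T.shape

   # From the SVD, rank(T) = # nonzero singular values, and the last
   # n - rank rows of Vh form a basis of N(T).
   U, s, Vh = np.linalg.svd(T)
   rank = int(np.sum(s > 1e-12))
   null_basis = Vh[rank:]

   assert rank == 2                          # = # nonzero columns of T
   assert len(null_basis) == 2               # nullity = # zero columns of T
   assert np.allclose(T @ null_basis.T, 0)   # each basis vector is in N(T)
   print(rank + len(null_basis) == n)        # True: rank + nullity = dim(V)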
Natural questions to ask about any function T, not just linear ones, are
 (1) Is T one-to-one, i.e. does T(x) = T(y) imply x = y?
 (2) Is T onto, i.e. for all w in W, is there a v in V such that
     w = T(v)?
These are important ideas because T is one-to-one and onto if and only
if T is invertible, i.e. an inverse function inv(T): W -> V exists.
These are easy questions to answer for linear T, given the rank and
nullity:

Thm: Let T: V -> W be linear, where V and W are finite dimensional. Then
 (1) T is one-to-one if and only if nullity(T) = 0, i.e. N(T) = {0_V}
 (2) T is onto if and only if rank(T) = dim(W)
In particular, since rank(T) + nullity(T) = dim(V), T is invertible if
and only if nullity(T) = 0 and dim(V) = dim(W).

Ex: Consider T: F^n -> F^m to be multiplication by an m x n matrix,
where T is diagonal with all T_ii = 1. We ask whether T is one-to-one
and/or onto. There are 3 cases, depending on m and n:
 (1) m < n. Then T*(x_1;...;x_n) = (x_1;...;x_m). So R(T) = F^m and T
     is onto. But T*(0;...;0;x_{m+1};...;x_n) [m leading zeros] = 0_W,
     so nullity(T) = n-m > 0, and T is not one-to-one. We will see that
     no T can be one-to-one if m < n.
 (2) m = n. T is onto for the same reason as above, and
     T*(x_1;...;x_n) = (x_1;...;x_n) = 0_W if and only if x = 0_V, so
     nullity(T) = 0 and T is one-to-one. T is the identity function.
 (3) m > n. Then T*(x_1;...;x_n) = (x_1;...;x_n;0;...;0) [m-n trailing
     zeros]. So rank(T) = n < m = dim(W) and T is not onto. But
     T(v) = 0_W only if v = 0_V, so nullity(T) = 0 and T is one-to-one.
     We will see that no T can be onto if m > n.

Proof: (1) Suppose N(T) = {0_V}. Then
   T(x) = T(y) => T(x-y) = 0_W   ... since T is linear
               => x-y = 0_V      ... since x-y is in N(T) = {0_V}
so T is one-to-one. Conversely, T one-to-one => T(v) = 0_W only if
v = 0_V => N(T) = {0_V}.
(2) T onto => R(T) = W => rank(T) = dim(W). Conversely, if
rank(T) = dim(R(T)) = dim(W), then since R(T) is a subspace of W, we
must have R(T) = W, i.e. T is onto.

Corollary: Suppose T: V -> W is linear and dim(V) = dim(W) is finite.
Then the following are equivalent:
 (1) T is one-to-one
 (2) T is onto
 (3) T is invertible
 (4) rank(T) = dim(V)
Proof: To prove them "equivalent", we need to show that if any one of
them is true, then all of them are true. To do this we will prove that
(1) <=> (2), ((1) and (2)) <=> (3), and (4) <=> (1).
(1) <=> (2) because (1) => nullity(T) = 0
=> rank(T) = dim(V) - nullity(T) = dim(V) = dim(W) => (2); now note
that all the implications work in the opposite direction too.
((1) and (2)) <=> (3) by the definition of invertibility.
(4) <=> (1) because (1) <=> nullity(T) = 0
<=> rank(T) = dim(V) - nullity(T) = dim(V).
We need the dimensions to be finite for this to be true, as you will
see on the homework.

Ex: T: P_2(R) -> R^3 is defined by
T(a2*x^2 + a1*x + a0) = (a2+a1; a1+a0; a0). Now
dim(P_2(R)) = dim(R^3) = 3, so we can apply the Corollary. Now
(a2+a1; a1+a0; a0) = (0;0;0) implies a0 = 0 => 0 = a1+a0 = a1
=> 0 = a2+a1 = a2, so T is one-to-one, and hence is onto and invertible.
ASK & WAIT: What is its inverse?
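Aside (not in the text): the last example, checked numerically in Python
with NumPy; writing T in coordinates (a2, a1, a0) as a 3 x 3 matrix is
our own bookkeeping choice.

   import numpy as np

   # In coordinates (a2, a1, a0), the map
   # T(a2*x^2 + a1*x + a0) = (a2+a1; a1+a0; a0) is multiplication by M.
   M = np.array([[1., 1., 0.],
                 [0., 1., 1.],
                 [0., 0., 1.]])

   print(np.linalg.matrix_rank(M))   # 3 = dim(V), so T is invertible
   print(np.linalg.inv(M))           # inv(T) in the same coordinates:
   # [[ 1. -1.  1.]    i.e. a2 = w1 - w2 + w3
   #  [ 0.  1. -1.]         a1 = w2 - w3
   #  [ 0.  0.  1.]]        a0 = w3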
The next theorem says that a linear transformation T is uniquely
determined if we know what it does to a basis:

Thm: Let V, W be vector spaces over F, and let {v_1,...,v_n} be a basis
for V. Given any subset {w_1,...,w_n} of n vectors from W, there is
exactly one linear transformation T: V -> W such that T(v_i) = w_i.
Proof: Since {v_1,...,v_n} is a basis, for any v in V there is a unique
linear combination v = sum_{i=1 to n} a_i*v_i. We can then define
T(v) = sum_{i=1 to n} a_i*w_i. We need to prove 3 things about T:
 (1) T is linear:
     T(vb + va) = T(vb) + T(va) because
        T(sum_i b_i*v_i + sum_i a_i*v_i)
           = T( sum_i (b_i+a_i)*v_i )
           = sum_i (b_i+a_i)*w_i
           = sum_i b_i*w_i + sum_i a_i*w_i
           = T( sum_i b_i*v_i ) + T( sum_i a_i*v_i )
     T(c*vb) = c*T(vb) because
        T(c * sum_i b_i*v_i)
           = T( sum_i (c*b_i)*v_i )
           = sum_i (c*b_i)*w_i
           = c * sum_i b_i*w_i
           = c * T( sum_i b_i*v_i )
 (2) T(v_k) = w_k because
        T(v_k) = T(sum_i a_i*v_i) where a_k = 1 and the rest are zero
               = sum_i a_i*w_i    where a_k = 1 and the rest are zero
               = w_k
 (3) T is unique, because if U: V -> W also satisfies U(v_i) = w_i,
     then for all v = sum_i a_i*v_i we get
        U(v) = U( sum_i a_i*v_i )
             = sum_i U(a_i*v_i)    by linearity of U
             = sum_i a_i*U(v_i)    by linearity of U
             = sum_i a_i*w_i       by def of U(v_i)
             = sum_i a_i*T(v_i)    by def of T(v_i)
             = sum_i T(a_i*v_i)    by linearity of T
             = T( sum_i a_i*v_i )  by linearity of T
             = T( v )
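Aside (not in the text): the construction in the proof, carried out
numerically in Python with NumPy for one made-up basis of R^2 and
made-up targets; the matrix formula A = W * inv(V) is just the
coordinate version of "define T by T(v) = sum_i a_i*w_i".

   import numpy as np

   # Columns of V are the basis vectors v_1, v_2; columns of W are the
   # chosen targets w_1, w_2 (arbitrary example data).
   V = np.array([[1., 1.],
                 [0., 1.]])
   W = np.array([[2., 0.],
                 [1., 3.]])

   # If v = V @ a has coordinates a in the basis, the proof defines
   # T(v) = W @ a, i.e. T is multiplication by A = W @ inv(V); this is
   # the unique matrix with A @ v_i = w_i.
   A = W @ np.linalg.inv(V)

   assert np.allclose(A @ V[:, 0], W[:, 0])   # T(v_1) = w_1
   assert np.allclose(A @ V[:, 1], W[:, 1])   # T(v_2) = w_2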