Math 128a - Spring 2002 - Notes on Numerical Linear Algebra

Linear Algebra

Read sections 4.0 through 4.4 and 4.8. This is another area in which excellent software is available in Matlab and on netlib, which you should again use unless your problem is very difficult or special.

We will begin with a tour of the basic kinds of linear algebra problems people solve, and then concentrate on the first kind, solving linear systems of equations. Let's remember some definitions first.

Matrix-vector multiplication: Suppose A is m-by-n and x is n-by-1. Then y = A*x is defined as the m-by-1 vector

   y(i) = sum (from j=1 to n) A(i,j)*x(j)

1) Solving linear systems. We use the notation A*x=b where
   A is an n-by-n matrix of given numbers.
   b is an n-by-1 vector (a "column vector") of given numbers; b is often called the
     "right-hand side", since we traditionally write A*x=b and not b = A*x.
   x is an n-by-1 vector of unknowns to solve for.
Written out longhand, A*x=b means

   sum (from j=1 to n) A(i,j)*x(j) = b(i)   for i = 1 to n

This means we have n equations and n unknowns.

The next definition we need is closely related, namely matrix-matrix multiplication: If A is m-by-n and B is n-by-k, then C = A*B is defined as the m-by-k matrix with entries

   C(i,j) = sum (from l=1 to n) A(i,l)*B(l,j)

Where does this definition come from?

Proposition: This is the only definition that makes multiplication associative:

   A*(B*x) = (A*B)*x   for all x

Note: we need associativity to do most algebraic manipulations.

There is a theorem from Math 54 we recall below (but do not prove) about when we expect a solution to A*x=b. We need the following notation:
   I   = "identity matrix", with 1s on the diagonal and 0s elsewhere
   A^T = "transpose of A", the matrix with entries (A^T)(i,j) = A(j,i)

Theorem: Suppose A is n-by-n. The following are equivalent:
   1) The matrix A has an "inverse" inv(A) with the properties A*inv(A) = inv(A)*A = I.
   2) The system A*x = b has a unique solution for any b.
   3) The determinant of A, Det(A), is nonzero.
   4) A*x = 0 implies that x=0. (Another way to say this is that "A has full rank", or
      "the nullspace (or kernel) of A is the zero vector".)
If any one of these is true, then we say A is "invertible" or "nonsingular", and the following properties also hold:
   1) the solution of A*x = b is x = inv(A)*b
   2) the (i,j) entry of inv(A) is (-1)^(i+j)*Det(A')/Det(A), where A' is the (n-1)-by-(n-1)
      matrix gotten by dropping the j-th row and i-th column of A. Combined with x = inv(A)*b,
      this gives Cramer's Rule.
We will NOT solve A*x=b in practice by Cramer's Rule, which is numerically unstable and expensive, but rather by other algorithms, such as Gaussian elimination. This is very easy to express in Matlab:

   x = A\b
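Here is a short Matlab sketch of the two ideas above: matrix-vector multiplication written out as the loops in the definition, and a linear solve with backslash. The 3-by-3 matrix and the vectors are made up for illustration.

   % Matrix-vector multiplication by the definition y(i) = sum_j A(i,j)*x(j),
   % and a linear solve with backslash (made-up 3-by-3 example).
   A = [2 1 0; 1 3 1; 0 1 4];     % an arbitrary nonsingular matrix
   x = [1; -1; 2];                % an arbitrary vector
   y = zeros(3,1);
   for i = 1:3
      for j = 1:3
         y(i) = y(i) + A(i,j)*x(j);
      end
   end
   norm(y - A*x)                  % 0 (up to roundoff): the loops agree with A*x
   b = A*x;
   xhat = A\b;                    % solve the linear system A*x = b
   norm(xhat - x)                 % roundoff-small, since this A is well-behaved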
2) The second common kind of linear algebra problem involves trying to solve A*x=b when A is m-by-n with m>n, b is m-by-1 and x is n-by-1, i.e. there are more equations than unknowns. In this case we cannot expect an exact solution in general, and instead settle for making the residual vector r = A*x-b as small as possible in some sense.

Ex: A = [1;2], b = [1;1]. In this case x is 1-by-1 and r = [1*x-1; 2*x-1].

The standard way to choose x is to make the length of r as small as possible, i.e. sqrt(sum (from i=1 to m) r(i)^2). This is usually written as follows:

Def: the 2-norm of an m-by-1 vector r is ||r||_2 = sqrt(sum (from i=1 to m) r(i)^2)

Ex: With A and b as above, ||r||_2 = sqrt( (x-1)^2 + (2*x-1)^2 ). We can minimize this by writing it as ||r||_2 = sqrt(5*x^2 - 6*x + 2), and differentiating to find the minimum at x = .6, which gives ||r||_2 = sqrt(.2) ~ .447.

Def: Picking x to minimize ||r||_2 = ||A*x-b||_2 is called solving a least squares problem.

The Matlab expression x = A\b solves this too (if A has more rows than columns, Matlab assumes you want to solve a least squares problem and uses the right algorithm).

Ex: Polynomial approximation. Suppose we want to find a linear polynomial (i.e. a straight line) that approximates another function f, and that we know b_i = f(a_i) at n points a_1,...,a_n. If n=2, we just do interpolation and we are done. But if n>2, what do we do? We can try to find the coefficients x1 and x2 defining a straight line b = x1 + x2*a in the (a,b) plane such that the line passes as "near" to all the points (a_i,b_i), i=1,...,n as possible. One way to do this is to minimize the root-sum-of-squares ||r||_2 of the "errors"

   r(i) = ( x1 + x2*a(i) ) - b(i)

or

   r = A*x - b = [ 1 a(1) ] * [ x1 ] - [ b(1) ]
                 [ 1 a(2) ]   [ x2 ]   [ b(2) ]
                    ...                  ...
                 [ 1 a(n) ]            [ b(n) ]

We can write this as minimizing ||r||_2 = ||A*x-b||_2, and so as a least squares problem.

Ex: Here is a Matlab example of (linear) polynomial approximation:

   n=10; a=(1:n)'; b = 2*a+1 + a.*(2*rand(n,1));
   plot(a,b,'b',a,b,'x')
   A = [ones(n,1),a];
   format short e
   % same matlab notation for least squares and solving a linear system
   x = A\b;
   plot(a,b,'b',a,b,'x',a,A*x,'r')

Finding the best quadratic fit means finding x = [x1;x2;x3] to minimize ||r||_2 = ||A*x-b||_2 where

   r = A*x - b = [ 1 a(1) a(1)^2 ] * [ x1 ] - [ b(1) ]
                 [ 1 a(2) a(2)^2 ]   [ x2 ]   [ b(2) ]
                 [ 1 a(3) a(3)^2 ]   [ x3 ]   [ b(3) ]
                    ...                         ...
                 [ 1 a(n) a(n)^2 ]            [ b(n) ]

Ex: Here is a Matlab example of (quadratic) polynomial approximation:

   b2 = ((a-5).^2 + 2*(a-3) + 1 ) .* (1+rand(n,1));
   A2 = [ones(n,1),a,a.^2];
   x2 = A2\b2;
   plot(a,b2,'b',a,b2,'x',a,A2*x2,'r')

There is a nice theorem that reduces the solution of this problem to solving a system of equations, the so-called "normal equations":

Theorem: Suppose A is m-by-n with m >= n, and that A*x=0 implies x=0 (in other words, A has "full rank"). Then for any m-by-1 vector b, the least squares problem of picking x to minimize ||r||_2 = ||A*x-b||_2 has a unique solution x, which is the unique solution of the n-by-n linear system of equations

   ( A^T * A ) * x = A^T * b

These are called the normal equations.

The usual methods for solving least squares problems are mathematically equivalent to the normal equations but generally do not form A^T*A explicitly, because it can be less accurate to do so (losing up to twice as many digits due to roundoff). Matlab's x = A\b uses a good algorithm.

What is the "right thing" for Matlab to do with x = A\b if A has fewer rows than columns? What does this problem mean?
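Here is a small Matlab sketch of the normal equations in action, using data generated the same way as in the linear-fit example above; the variable names x_backslash and x_normal are mine.

   % Least squares two ways: backslash vs. the normal equations (A'*A)*x = A'*b.
   n = 10; a = (1:n)'; b = 2*a + 1 + a.*(2*rand(n,1));
   A = [ones(n,1), a];
   x_backslash = A\b;              % Matlab's least squares algorithm
   x_normal = (A'*A)\(A'*b);       % solving the normal equations directly
   norm(x_backslash - x_normal)    % tiny here; the two can differ more when A is
                                   % ill-conditioned and A'*A is formed explicitly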
3) The third common kind of linear algebra problem involves finding eigenvalues and eigenvectors of a square matrix:

Def: If A*x = lambda*x where A is n-by-n, x is n-by-1 and nonzero, and lambda is a scalar, then x is called an eigenvector and lambda an eigenvalue. We call (lambda,x) an eigenvalue/eigenvector pair of A.

Theorem: Every n-by-n matrix has n eigenvalues, counted with multiplicity. They are the zeros of the characteristic polynomial p(lambda) = det(A - lambda*I). Each distinct eigenvalue has (at least) one eigenvector.

Ex: Consider a linear differential equation x'(t) = A*x(t), where A is n-by-n and constant, and x(t) is n-by-1. A common question in applications is "will x(t) stay bounded for all t>0, no matter what x(0) is?", because it says whether the system the ODE describes "blows up" or is "stable". Now suppose that A*x_j = lambda(j)*x_j, so that (lambda(j),x_j) is the j-th eigenvalue/eigenvector pair of A. Then we claim that x(t) = x_j * exp(lambda(j)*t) is a solution of the ODE. We can verify this as follows:

   x'(t) = x_j * lambda(j)*exp(lambda(j)*t) = (A*x_j) * exp(lambda(j)*t) = A*x(t)

It is easy to see whether this x(t) is bounded for all t. Write lambda(j) = rl(j) + i*il(j), where i = sqrt(-1), i.e. rl(j) is the real part of lambda(j) and il(j) is the imaginary part. Then

   || x(t) ||_2 = |exp(lambda(j)*t)| * ||x_j||_2
                = |exp(rl(j)*t + i*il(j)*t)| * ||x_j||_2
                = exp(rl(j)*t) * ||x_j||_2

which is bounded for all t>0 if and only if rl(j) <= 0. Extending this technique lets us prove:

Theorem: Suppose all the eigenvalues lambda(j) = rl(j) + i*il(j) of A are distinct. Then x(t) is bounded for all t and for all x(0) if and only if rl(j) <= 0 for all j.

More concretely, if the ODE describes some mechanical system, then the eigenvalues of A are the frequencies at which the system "likes to vibrate" or "resonates", and the eigenvectors describe how the system vibrates (the "shape" of the vibration). In the case of earthquake analysis, a civil engineer might write down an ODE describing a building, compute its eigenvalues, and see if those eigenvalues match the frequencies at which the ground tends to vibrate during earthquakes (this information is obtained from seismographic recordings of old earthquakes). If any frequencies (eigenvalues) at which the building vibrates are close to the frequencies at which the earth vibrates, the building will resonate with the earth, and be more likely to fall down.

Finding eigenvalues and eigenvectors in Matlab is easy:
   lambda = eig(A) returns a column vector containing the eigenvalues.
   [X,D] = eig(A) returns the diagonal matrix D of eigenvalues and a matrix X whose columns
   are the corresponding eigenvectors.
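Here is a minimal Matlab sketch of eig on a made-up matrix, checking the defining property A*X = X*D column by column and the stability condition from the theorem above.

   % eig on a made-up matrix: verify A*X = X*D and test whether every
   % eigenvalue has real part <= 0 (so the ODE solutions stay bounded).
   A = [-1  2  0
        -2 -1  0
         0  0 -3];                 % eigenvalues are -1+2i, -1-2i, and -3
   [X, D] = eig(A);
   norm(A*X - X*D)                 % roundoff-small: each column of X is an eigenvector
   max(real(diag(D)))              % <= 0 here, so each x_j*exp(lambda(j)*t) is bounded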
Given this overview of linear algebra problems, we return to solving linear systems of equations A*x=b. The most basic method, which we will concentrate on, is Gaussian elimination with partial pivoting (GEPP). If A is n-by-n, solving A*x=b costs (2/3)*n^3 operations. But there are a lot of other algorithms, because depending on what properties A has, one can go a lot faster.

Ex: People frequently come to my office asking for advice on solving A*x=b. Here is a typical conversation:

Student: I need to solve A*x=b for my research.
JD: That will cost you (2/3)n^3 operations, using Gaussian elimination. A good implementation
    can be found in the LAPACK library, at www.netlib.org/lapack.
Student: But my matrix is very big (n is large), and I can't wait that long.
JD: Then tell me more about your matrix.
Student: It is symmetric, and all its eigenvalues are positive.
JD: That is good news. We call such matrices symmetric positive definite (spd), and there is
    a variation of Gaussian elimination called Cholesky that only costs half as much,
    (1/3)n^3. It can also be found in LAPACK.
Student: But that is still too expensive.
JD: So tell me more.
Student: A lot of the matrix entries are zero. In fact all the nonzeros lie near the diagonal
    of the matrix, in columns within sqrt(n) to the left or right of the main diagonal.
JD: That is good news. We call such matrices banded, because all the nonzeros appear in a
    band around the diagonal. The cost of Cholesky can be reduced (by not storing or doing
    arithmetic on all those zeros) from (1/3)n^3 to O(n^2), a savings of a factor of n.
    So when n~1000, you can hope to go O(1000) times faster. It is also in LAPACK.
Student: I am greedy, that is still too slow.
JD: So tell me more.
Student: There are actually very few nonzero entries, at most 4 per row, and I know a little
    about the eigenvalues of the matrix: the ratio of the largest to smallest is about n.
JD: That is good news. The ratio of the largest to smallest eigenvalue of an spd matrix is
    called its condition number, which roughly says how hard it is to solve accurately.
    A condition number of n is not large, so you can use an algorithm called Conjugate
    Gradients (CG). CG is similar to Newton since it gives you a sequence of approximate
    solutions converging to the true solution. It will cost you O(n^(3/2)), or O(sqrt(n))
    faster than before. CG is very simple, but you will have to write some of it yourself,
    because it depends on how you store the matrix.
    Look at www.netlib.org/templates/Templates.html
Student: I am still not satisfied.
JD: So tell me more.
Student: I am really trying to find the temperature distribution in the middle of a metal
    square, given the temperature at the boundary.
JD: Oh, why didn't you say so! You are trying to solve an equation called the Poisson
    equation. There is an algorithm called Multigrid that, like CG, gives you a sequence of
    solutions converging to the true solution. It is so fast that it costs only O(n). It's
    harder to point to a definitive implementation:
    www.cs.berkeley.edu/~demmel/ma221/Matlab/MG_README.html has some sample Matlab code, and
    www.netlib.org/pltmg has a much more general version of multigrid, for problems besides
    the Poisson equation.
Student: I am still not satisfied.
JD: Then you are out of luck, because with n unknowns in your linear system, any algorithm
    has to cost at least O(n), just to write out the solution! So O(n) is optimal.
    Perhaps you need to pick a different problem to solve...

The lesson is that the more you can exploit the structure in your problem, the faster you can go. Some structures (like being banded, or the Poisson equation) are so common and well understood that good software exists for them. With a little luck your problem will fit into one of these well-understood categories too, which matters if your problem is so large or has to be solved so many times that a special method is important. These same comments apply to least squares problems and finding eigenvalues. In particular, LAPACK contains good software for least squares and eigenvalue problems where the matrix is general, symmetric, or symmetric and banded. An on-line book giving guidance on choosing a numerical method for A*x=b may be found at www.netlib.org/templates/Templates.html. A similar book for eigenvalue problems is at www.cs.ucdavis.edu/~bai/ET/contents.html. The graduate class Math 221 discusses these topics and other linear algebra topics in much more detail.

Now we begin with Gaussian elimination. We begin with the simplest kind of linear system to solve: a triangular one.

Def: A lower triangular matrix has nonzeros only on the main diagonal and below. An upper triangular matrix has nonzeros only on the main diagonal and above.

Fact: A triangular matrix is nonsingular if and only if its diagonal entries are nonzero.
Proof 1: the determinant of a triangular matrix is the product of its diagonal entries.
Proof 2: the following algorithm obviously computes a unique solution of L*x=b for any b if and only if no L(i,i)=0.

Forward Substitution: This is an algorithm for solving L*x=b where L is lower triangular:

   x(1) = b(1)/L(1,1)
   for i = 2 to n
      x(i) = ( b(i) - sum (from j=1 to i-1) L(i,j)*x(j) ) / L(i,i)
   end

It is easy to count operations; we get roughly n^2 operations. A very similar algorithm called "back substitution" works for solving U*x=b where U is upper triangular.
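Here is a direct Matlab transcription of the forward substitution algorithm above; the function name forward_sub is mine, not a standard one.

   % forward_sub.m: solve L*x = b by forward substitution, where L is lower
   % triangular with nonzero diagonal entries.
   function x = forward_sub(L, b)
      n = length(b);
      x = zeros(n,1);
      x(1) = b(1)/L(1,1);
      for i = 2:n
         % L(i,1:i-1)*x(1:i-1) is the sum over j=1 to i-1 of L(i,j)*x(j)
         x(i) = ( b(i) - L(i,1:i-1)*x(1:i-1) ) / L(i,i);
      end
   end

For example, with L = [2 0 0; 1 3 0; -1 2 4] and b = [2; 5; 9], forward_sub(L,b) agrees with L\b.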
The goal of Gaussian elimination is to replace A*x=b for general A with the solution of two triangular systems, which we just showed how to solve.

Theorem: Suppose that A is n-by-n and nonsingular. Assume also that the n-1 matrices A(1:i,1:i) for i=1 to n-1 are also nonsingular (later, when we "pivot", we will remove this assumption). Then A can be factored into the product A=L*U, where U is upper triangular and L is lower triangular with ones on L's diagonal. L and U can be computed (via Gaussian elimination) in (2/3)*n^3 + O(n^2) flops.

This theorem is the basis of our algorithm: since A=L*U, inv(A) = inv(U)*inv(L) (proof: A*inv(U)*inv(L) = L*U*inv(U)*inv(L) = L*I*inv(L) = I), so x = inv(A)*b = inv(U)*(inv(L)*b):

Gaussian elimination (without pivoting) for solving A*x=b:
   Step 1: Factorize A = L*U     Cost = (2/3)*n^3
   Step 2: Solve L*y = b         Cost = n^2
   Step 3: Solve U*x = y         Cost = n^2

We will prove this theorem twice. The first proof is simplest, but the second proof matches the way we actually implement Gaussian elimination. Here is a property of matrix multiplication that we will need. It is also simple to prove:

Proposition: "Block matrix multiplication"
   Let A = [ A_11 A_12 ]   where A is m-by-n, and A_ij is m_i-by-n_j
           [ A_21 A_22 ]
   Let B = [ B_11 B_12 ]   where B is n-by-k, and B_ij is n_i-by-k_j
           [ B_21 B_22 ]
   Then
      A*B = [ A_11*B_11 + A_12*B_21   A_11*B_12 + A_12*B_22 ]
            [ A_21*B_11 + A_22*B_21   A_21*B_12 + A_22*B_22 ]
Proof: homework (very simple)

Here is the first proof that we can factor A = L*U. We do induction on the dimension n of A. The base case is n=1, so everything is a scalar, and it is trivial to write A = 1*A. For the induction step, we assume the theorem is true for n-1 and prove it for n. Write

   A = [ A11 A12 ]
       [ A21 A22 ]

where A11 is n-1 by n-1, A12 is n-1 by 1, A21 is 1 by n-1 and A22 is 1 by 1. By induction we can write A11 = L1 * U1 (why?). We claim that the following factorization is the one we want:

   A = [ A11 A12 ] = [ L1           0 ] * [ U1   inv(L1)*A12 ]
       [ A21 A22 ]   [ A21*inv(U1)  1 ]   [ 0    x           ]

where x = A22 - A21*inv(A11)*A12. All the submatrices have corresponding dimensions: inv(L1)*A12 is n-1 by 1, A21*inv(U1) is 1 by n-1, and x is 1 by 1. Multiplying out (and using inv(U1)*inv(L1) = inv(L1*U1) = inv(A11)) we get

   [ L1           0 ] * [ U1   inv(L1)*A12 ]
   [ A21*inv(U1)  1 ]   [ 0    x           ]

      = [ L1*U1            L1*inv(L1)*A12              ]
        [ A21*inv(U1)*U1   A21*inv(U1)*inv(L1)*A12 + x ]

      = [ A11   A12                  ]
        [ A21   A21*inv(A11)*A12 + x ]

      = [ A11 A12 ]
        [ A21 A22 ]

      = A

as desired. We could translate this factorization directly into an algorithm, which would in fact be a variation of Gaussian elimination, but instead we use the traditional version of Gaussian elimination to reprove the theorem.
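The block matrix multiplication proposition is the "homework (very simple)" above; here is a quick numerical sanity check of it in Matlab, with randomly chosen blocks whose sizes are arbitrary.

   % Numerical check of block matrix multiplication: multiply block-by-block
   % and compare with the ordinary product (block sizes chosen arbitrarily).
   A11 = rand(2,2); A12 = rand(2,3); A21 = rand(4,2); A22 = rand(4,3);
   B11 = rand(2,5); B12 = rand(2,1); B21 = rand(3,5); B22 = rand(3,1);
   A = [A11 A12; A21 A22];
   B = [B11 B12; B21 B22];
   C = [A11*B11 + A12*B21, A11*B12 + A12*B22;
        A21*B11 + A22*B21, A21*B12 + A22*B22];
   norm(A*B - C)              % roundoff-small: the block formula gives A*B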
We start with an example.

Ex: Consider

   A = [  6  -2   2 ]
       [ 12  -8   6 ]
       [  3 -13   9 ]

We will subtract multiples of each row from subsequent rows to zero out the entries below the diagonal, and so make A upper triangular.

Step 1: subtract multiples of row 1 from rows 2 and 3 to zero out A(2:3,1):

   A_1 = [ 6  -2  2 ]
         [ 0  -4  2 ]
         [ 0 -12  8 ]

The multipliers were 2 for row 2 and .5 for row 3; store them as follows:

   L_1 = [ 0  0  0 ]
         [ 2  0  0 ]
         [.5  0  0 ]

Step 2: subtract multiples of row 2 from row 3 to zero out A(3,2):

   A_2 = [ 6  -2  2 ]
         [ 0  -4  2 ]
         [ 0   0  2 ]

The multiplier was 3 for row 3; store it as follows:

   L_2 = [ 0  0  0 ]
         [ 2  0  0 ]
         [.5  3  0 ]

Note that A_2 is upper triangular; write U = A_2. Note that L_2 is lower triangular; write

   L = I + L_2 = [ 1  0  0 ]
                 [ 2  1  0 ]
                 [.5  3  1 ]

You can confirm that L*U = A, so L and U are the desired triangular factors. In fact, one can confirm that (I + L_i)*A_i = A for all i. This is in fact the induction hypothesis we will use in the proof.

Now we can write an algorithm for this:

Gaussian elimination without pivoting:

   L = zeros(n,n)                  ... initialize L to the zero matrix
   for i = 1 to n-1                ... zero out A(i+1:n,i), the part of column i below the diagonal
      for j = i+1 to n             ... compute multiplier L(j,i)
         L(j,i) = A(j,i)/A(i,i)    ... note that we assume A(i,i) is nonzero
      end
      for j = i+1 to n             ... update row j to zero out its entry in column i
                                   ... by subtracting L(j,i) times row i
         for k = i to n            ... update each entry of row j
            A(j,k) = A(j,k) - L(j,i)*A(i,k)
         end
      end
   end

We claim that this algorithm straightforwardly implements the algorithm illustrated above.
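Here is the same algorithm as runnable Matlab (a direct transcription of the pseudocode, using the same variable names), applied to the 3-by-3 example above; it reproduces the L and U displayed above, and L*U recovers the original matrix.

   % Gaussian elimination without pivoting (the pseudocode above, verbatim),
   % run on the 3-by-3 example.
   A = [  6  -2   2
         12  -8   6
          3 -13   9 ];
   A0 = A;                               % keep a copy to check L*U = A at the end
   n = 3;
   L = zeros(n,n);
   for i = 1:n-1
      for j = i+1:n
         L(j,i) = A(j,i)/A(i,i);         % multiplier (assumes A(i,i) is nonzero)
      end
      for j = i+1:n
         for k = i:n
            A(j,k) = A(j,k) - L(j,i)*A(i,k);
         end
      end
   end
   L = L + eye(n);                       % put the 1s on L's diagonal
   U = A;                                % A has been overwritten by the upper triangular U
   norm(L*U - A0)                        % 0: L and U match the factors computed by hand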
Now we prove the Theorem. First, we compute the number of flops in this algorithm, by changing each loop to a summation, to add up the number of flops we do:

   sum (from i=1 to n-1) [ sum (from j=i+1 to n) 1  +  sum (from j=i+1 to n) sum (from k=i to n) 2 ]
      = sum (from i=1 to n-1) [ (n-i) + 2*(n-i)*(n-i+1) ]
      = sum (from i=1 to n-1) [ 2*(n-i)^2 + 3*(n-i) ]
      = sum (from i=1 to n-1) [ 2*i^2 + 3*i ]            (replacing n-i by i)
      = (2/3)*n^3 + O(n^2)

using the fact that sum (from i=1 to n-1) i^2 = n^3/3 + O(n^2). This is what we claimed in the Theorem.

Next we prove that L*U actually equals A, assuming that we never divide by a zero A(i,i). After step i, we have computed both L_i and A_i, where L_i is nonzero only below the diagonal in columns 1 to i, which is precisely where A_i is zero. We will use induction to prove that A = (I + L_i) * A_i for all i. The base case is i=0, when L_0 = 0 and A_0 = A; clearly (I+L_0)*A_0 = A. Now we assume it is true for i-1, and show that it is true for i. The i-1 case can be written

   (*)   A = (I+L_{i-1}) * A_{i-1} = [ Lhat_1   0 ] * [ Ahat_11  Ahat_12 ]
                                     [ Lhat_2   I ]   [ 0        Ahat_22 ]

where Lhat_1 is (i-1)-by-(i-1), Lhat_2 is (n-i+1)-by-(i-1), the I is (n-i+1)-by-(n-i+1), Ahat_11 is (i-1)-by-(i-1), Ahat_12 is (i-1)-by-(n-i+1), and Ahat_22 is (n-i+1)-by-(n-i+1).

Def: Ahat_22 is called a "Schur complement". This is a historical term for an intermediate result of Gaussian elimination.

Now just consider Ahat_22, which we write as

   Ahat_22 = [ a11  a12 ]   for short
             [ a21  a22 ]

where a11 is 1-by-1, a12 is 1 by n-i, a21 is n-i by 1, and a22 is n-i by n-i. At step i the algorithm works by
   1) computing the i-th column of L_i as a21/a11, and
   2) computing the trailing (n-i)-by-(n-i) block of A_i as a22' = a22 - (a21/a11)*a12
      (convince yourself that the (j,k) entry of a22' is a22(j,k) - (a21(j)/a11)*a12(k),
      by using the definition of matrix multiplication).
The important identity that a22' satisfies is

   Ahat_22 = [ a11  a12 ] = [ 1        0 ] * [ a11  a12  ] = Ltilde * Utilde
             [ a21  a22 ]   [ a21/a11  I ]   [ 0    a22' ]

This is called a "block matrix equation". We can multiply it out to see that it is true by using the Block Matrix Multiplication Proposition that we proved earlier. We can substitute this into (*) to get

   A = (I+L_{i-1}) * A_{i-1}

     = [ Lhat_1   0 ] * [ Ahat_11  Ahat_12 ]             ... this is just (*)
       [ Lhat_2   I ]   [ 0        Ahat_22 ]

     = [ Lhat_1   0 ] * [ Ahat_11  Ahat_12       ]       ... this is the substitution
       [ Lhat_2   I ]   [ 0        Ltilde*Utilde ]

     = [ Lhat_1   0      ] * [ Ahat_11  Ahat_12 ]        ... = (I+L_i)*A_i
       [ Lhat_2   Ltilde ]   [ 0        Utilde  ]

The next-to-last identity is easy to confirm using the same Block Matrix Multiplication Proposition, and is precisely how L_i and A_i are defined in terms of L_{i-1} and A_{i-1}. This completes the induction step.

Finally, we have to show that the a11's, which we need to divide by, will all be nonzero precisely when all the A(1:i,1:i) are nonsingular. Using the Proposition to multiply out (*), we see that Lhat_1 * Ahat_11 = A(1:i-1,1:i-1). We need the following Lemmas from Linear Algebra:

Lemma 1: det(X*Y) = det(X)*det(Y) for any two n-by-n matrices X and Y.
Lemma 2: If T is triangular, det(T) = prod (from i=1 to n) T(i,i).

Thus

   det(A(1:i-1,1:i-1)) = det(Lhat_1)*det(Ahat_11)                            by Lemma 1
                       = prod (from j=1 to i-1) Lhat_1(j,j)
                         * prod (from j=1 to i-1) Ahat_11(j,j)               by Lemma 2
                       = prod (from j=1 to i-1) Ahat_11(j,j)                 since Lhat_1 has 1s on its diagonal

Thus det(A(1:i-1,1:i-1)) is the product of the first i-1 a11's encountered during the factorization; since A(1:i-1,1:i-1) is nonsingular by assumption, this determinant is nonzero, so all of those a11's must be nonzero. This completes the proof of the theorem.

Let us go back to the algorithm, and optimize it a bit into its final form. We start with

Gaussian elimination without pivoting (version 1):

   for i = 1 to n-1
      for j = i+1 to n
         L(j,i) = A(j,i)/A(i,i)
      end
      for j = i+1 to n
         for k = i to n
            A(j,k) = A(j,k) - L(j,i)*A(i,k)
         end
      end
   end

Next, note that in the innermost loop, when k=i, we are wasting time by computing A(j,i)=0, so let's change it to the following, where we have to remember that only the matrix entries on and above the main diagonal are part of U:

Gaussian elimination without pivoting (version 2):

   for i = 1 to n-1
      for j = i+1 to n
         L(j,i) = A(j,i)/A(i,i)
      end
      for j = i+1 to n
         for k = i+1 to n
            A(j,k) = A(j,k) - L(j,i)*A(i,k)
         end
      end
   end

Finally, note that we can save half the space by keeping L(j,i) in the "unused" locations A(j,i).
Thus, the algorithm overwrites the input A with L (actually L-I) and U, so no extra space is needed:

Gaussian elimination without pivoting (version 3):

   for i = 1 to n-1
      for j = i+1 to n
         A(j,i) = A(j,i)/A(i,i)
      end
      for j = i+1 to n
         for k = i+1 to n
            A(j,k) = A(j,k) - A(j,i)*A(i,k)
         end
      end
   end

This can be written extremely compactly in Matlab:

Gaussian elimination without pivoting (version 4):

   for i = 1 to n-1
      A(i+1:n,i) = A(i+1:n,i)/A(i,i)
      A(i+1:n,i+1:n) = A(i+1:n,i+1:n) - A(i+1:n,i)*A(i,i+1:n)
   end

or as

   for i = 1 to n-1
      R = (i+1:n)                  % rows i+1 through n
      A(R,i) = A(R,i)/A(i,i)
      A(R,R) = A(R,R) - A(R,i)*A(i,R)
   end
   L = tril(A,-1) + eye(n)         % strict lower triangle of A, plus I
   U = triu(A,0)                   % upper triangle of A

We already proved that as long as each A(1:i,1:i) is nonsingular (for i=1 to n), then each denominator A(i,i) in the algorithm will be nonzero, and the algorithm will compute A = L*U. Now we will show that needing this assumption makes the algorithm unreliable, and how to fix it by pivoting. The following 2-by-2 example illustrates the problem:

   A = [ 0 1 ] ,   b = [ b1 ]
       [ 1 0 ]         [ b2 ]

Solving A*x=b is simple: x1=b2 and x2=b1. Yet Gaussian elimination (without pivoting) immediately tries to divide a21=1 by a11=0 and fails. But if we think about A*x=b as a set of linear equations, it doesn't matter what order we write the equations down in; we should get the same answer. In this case, let us write them down in the opposite order, i.e.

   1*x1 + 0*x2 = b2        instead of        0*x1 + 1*x2 = b1
   0*x1 + 1*x2 = b1                          1*x1 + 0*x2 = b2

This leads to the linear system A'*x=b' where

   A' = [ 1 0 ]   and   b' = [ b2 ]
        [ 0 1 ]              [ b1 ]

Now Gaussian elimination can be applied to A' without difficulty. This process of reordering the rows of A is called "pivoting".

Let us explore this 2-by-2 example further to see what happens when a11 is nonzero (so that the algorithm formally works) but very tiny. We will see that we get a poor answer because of roundoff error, which will motivate us to pivot more carefully. For simplicity, let us assume we have 3-decimal-digit floating point arithmetic, i.e. results are rounded to 3 significant digits:

   A = [ 1e-4 1 ]   b = [ 1 ]
       [ 1    1 ]       [ 2 ]

The correct answer to 3 decimal places is easily confirmed to be

   x = [ 1 ]
       [ 1 ]

The result of LU decomposition on A is as follows, where fl(a op b) is the floating point result of a op b:

   L = [ 1          0 ] = [ 1   0 ]           ... no roundoff error yet
       [ fl(1/1e-4) 1 ]   [ 1e4 1 ]

and

   U = [ 1e-4  1           ] = [ 1e-4  1    ]  ... roundoff error in the 4th place
       [ 0     fl(1-1e4*1) ]   [ 0     -1e4 ]

The only rounding error committed is replacing 1-1e4*1 by -1e4, an error in the fourth decimal place. Let's see how close L*U is to A:

   L*U = [ 1e-4 1 ]
         [ 1    0 ]

The (2,2) entry is completely wrong. If we continue to try to solve A*x=b, the subsequent solves of L*y=b and U*x=y yield

   y = [ 1           ] = [ 1    ]              ... rounding error in the 4th place
       [ fl(2-1e4*1) ]   [ -1e4 ]

and

   x = [ x1 ] = [ fl((1-x2*1)/1e-4) ] = [ 0 ]  ... no rounding error committed
       [ x2 ]   [ fl(-1e4/(-1e4))   ]   [ 1 ]      in either component

We see that the computed answer is completely wrong, even though there were only two floating point errors, each in the 4th decimal place, and even though the matrix A is "well-conditioned", i.e. changing the entries of A or b by eps<<1 only changes the true solution by O(eps). The problem can be traced to the first rounding error, U(2,2) = fl(1-1e4*1) = -1e4. Note that if we were to change A(2,2) from 1 to any other number z in the range from -5 to 5, we would get the same value of U(2,2) = fl(z-1e4*1) = -1e4, and so the final computed value of x would be the same, independent of z. In other words, the algorithm "forgets" the value of A(2,2), and so can't possibly get the right answer. This phenomenon is called numerical instability, and must be eliminated to yield a reliable algorithm.
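The same breakdown can be reproduced in ordinary double precision by shrinking the pivot from 1e-4 to 1e-20, so that fl(1 - 1e20) = -1e20 and the value of A(2,2) is again "forgotten". Here is a hedged Matlab sketch of that experiment.

   % The tiny-pivot example in double precision: with pivot 1e-20, the same
   % two rounding errors occur and the no-pivoting answer is completely wrong.
   A = [1e-20 1; 1 1];   b = [1; 2];                % true solution is very close to [1; 1]
   L = [1 0; A(2,1)/A(1,1) 1];                      % L(2,1) = 1e20
   U = [A(1,1) A(1,2); 0 A(2,2) - L(2,1)*A(1,2)];   % U(2,2) rounds to -1e20
   y = [b(1); b(2) - L(2,1)*b(1)];                  % forward substitution; y(2) rounds to -1e20
   x_nopivot = zeros(2,1);
   x_nopivot(2) = y(2)/U(2,2);
   x_nopivot(1) = (y(1) - U(1,2)*x_nopivot(2))/U(1,1);
   x_nopivot                                        % comes out as [0; 1], completely wrong
   A\b                                              % backslash pivots and returns approximately [1; 1]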
The standard solution to this problem is called partial pivoting, which means reordering the rows of A so that A(i,i) is large at each step of the algorithm. In particular, at the beginning of step i of the algorithm, row i is swapped with row k>i if |A(k,i)| is the largest entry among |A(i:n,i)|. This yields the following algorithm, which we write in pseudo-Matlab:

Gaussian elimination with partial pivoting:

   for i = 1 to n-1
      Let j = the row index (between i and n) of the largest entry among |A(i:n,i)|
      if A(j,i) == 0                  ... A(i:n,i) is exactly zero
         signal "matrix singular" and return
      elseif j != i
         swap rows i and j of A
      end
      R = (i+1:n)
      A(R,i) = A(R,i)/A(i,i)          ... all quotients at most 1 in magnitude
      A(R,R) = A(R,R) - A(R,i)*A(i,R)
   end

The search for the largest entry in each column guarantees that the denominator A(i,i) in the formula L(R,i) = A(R,i)/A(i,i) is at least as large in magnitude as the numerators, so the quotients are all at most 1. This keeps the big entries of L (and subsequently U) that destroyed accuracy in our 2-by-2 example from occurring.

Gaussian elimination without pivoting factors A as A = L*U. Pivoting, which means reordering the rows of A and then factoring, can be written as P*A = L*U, where P is a "permutation matrix", or identity matrix with its rows in the desired permuted order. For example, reversing the order of the rows of a 2-by-2 matrix means multiplying

   P*A = [ 0 1 ] * [ A11 A12 ] = [ A21 A22 ]
         [ 1 0 ]   [ A21 A22 ]   [ A11 A12 ]

Given P*A = L*U, solving A*x=b is done using the identity

   x = inv(A) * b = inv( inv(P)*L*U ) * b = inv(U) * ( inv(L) * ( P * b ) )

   (1) Compute y = P*b   (permute the entries of b, no arithmetic needed)
   (2) Compute z = inv(L)*y by forward substitution
   (3) Compute x = inv(U)*z by back substitution

Partial pivoting is the standard version of Gaussian elimination used in practice.

There is another variation of pivoting called complete pivoting. It is rarely used because it is more expensive than partial pivoting (doing O(n^3) more operations) and only rarely much more accurate. The difference is that instead of searching just column i for the largest entry, and swapping rows to put that largest entry in A(i,i), the entire trailing submatrix A(i:n,i:n) is searched for the largest entry, and rows and columns are swapped to put that entry in A(i,i). This guarantees not just that |L(i,j)| <= 1, but that |U(i,j)| <= |U(i,i)|. You will see in Programming Assignment 3 that there are examples where this makes the answer much more accurate than partial pivoting.
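In Matlab, the factorization P*A = L*U is available through the built-in lu, and the three-step solve above looks like this (the 3-by-3 matrix is made up):

   % Solving A*x = b via P*A = L*U from Matlab's lu (which does partial pivoting).
   A = [0 1 2; 1 1 1; 2 3 1];   b = [3; 3; 4];
   [L, U, P] = lu(A);           % P*A = L*U, with |L(i,j)| <= 1
   y = P*b;                     % step (1): permute the entries of b
   z = L\y;                     % step (2): forward substitution
   x = U\z;                     % step (3): back substitution
   norm(A*x - b)                % roundoff-small
   norm(x - A\b)                % agrees with backslash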
Now we turn to error analysis, and try to explain how much accuracy to expect in the solution of A*x=b, accounting for both roundoff during the algorithm and noise in A and b. To measure error, we need some measures of the sizes of vectors and matrices. These measures are called norms, and we define just the simplest one:

Def: ||x||_infinity = max (from i=1 to n) |x(i)| is the vector infinity norm, or just infinity norm, of the vector x. (Notation: from now on, I drop the subscript infinity.)

Shortly, we will define an analogous measure ||A||_infinity (or just ||A||) of the size of a matrix. Our goal is to understand (and partially prove) the following

Theorem: Suppose we solve A*x=b using Gaussian elimination in floating point arithmetic with machine epsilon eps. Then we can bound the error as follows:

   || x - xTRUE || / || x || <= O(eps)*PG(A)*k(A)

Here O(eps) is a multiple of machine precision, usually about n*eps. PG(A) is the "pivot growth factor", and k(A) = ||A||*||inv(A)|| is called the condition number of A. PG(A) can be defined as ||L||*||U||/||A||, and is always at least about 1. PG(A) measures how well the pivoting strategy succeeds in keeping L*U close to A, i.e. in keeping the algorithm numerically stable. As you will see on your programming assignment,
   PG(A) ~ 1 for complete pivoting,
   PG(A) ~ 1 usually for partial pivoting, with a few known examples where it can get as
     large as 2^(n-1), and
   PG(A) can be very large with no pivoting.
In practice, both PG(A) and k(A) are easy to compute after getting L and U, costing just an extra O(n^2).
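Here is a hedged Matlab sketch of that computation: the pivot growth factor and the condition number in the infinity norm, computed from the factors returned by lu on a made-up random matrix.

   % Pivot growth PG(A) = ||L||*||U||/||A|| and condition number k(A) = ||A||*||inv(A)||,
   % in the infinity norm, after partial-pivoting LU (random test matrix).
   n = 100;   A = randn(n);
   [L, U, P] = lu(A);                              % P*A = L*U
   PG = norm(L,inf)*norm(U,inf)/norm(A,inf)        % usually ~1 with partial pivoting
   k  = norm(A,inf)*norm(inv(A),inf)               % same value as cond(A,inf)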
Now we study the background of this theorem more carefully. As in our study of polynomial evaluation, we have two steps:

1) Perturbation theory: This means bounding the difference between the solution of A*x=b and the solution of the perturbed problem (A+E)*xhat = b+f. Here E is a "small matrix" added to A, and f is a "small vector" added to b. We want to bound the difference between x and xhat in terms of bounds on f, E and properties of A.

2) Roundoff analysis: This means bounding how big roundoff can make E and f.

To measure error, we need some measures of the sizes of vectors and matrices. These measures are called norms, and we use just the simplest ones, starting with the infinity norm ||x|| = max (from i=1 to n) |x(i)| defined above.

Lemma 1 (Triangle inequality): ||x+y|| <= ||x|| + ||y||

Def: ||A||_infinity = max (from i=1 to n) sum (from j=1 to m) |A(i,j)| is the matrix infinity norm, or just infinity norm, of the n-by-m matrix A. (Notation: from now on, I drop the subscript infinity on the matrix norm too.)
Note: When A is n-by-1, the two definitions coincide (a good thing!).

Lemma 2 (Triangle inequality): ||A+B|| <= ||A|| + ||B||

The following is the most important property of these norms:

Lemma 3 (Consistency): ||A*x|| <= ||A|| * ||x||
Note that ||A*x|| and ||x|| are vector norms, and ||A|| is a matrix norm.
Proof: Let y = A*x. Then

   |y(i)| = | sum (from j=1 to m) A(i,j)*x(j) |
         <= sum (from j=1 to m) |A(i,j)|*|x(j)|                         by the triangle inequality
         <= max (from j=1 to m) |x(j)| * sum (from j=1 to m) |A(i,j)|
          = ||x|| * sum (from j=1 to m) |A(i,j)|

and so

   ||y|| = max (from i=1 to n) |y(i)|
        <= ||x|| * max (from i=1 to n) sum (from j=1 to m) |A(i,j)|
         = ||x|| * ||A||

Now we can compare the solutions of A*x=b and (A+E)*xhat = b+f. Write xhat = x+e. We want to bound ||e|| = ||xhat-x|| in terms of ||E|| and ||f||. We will use the same approach as in analyzing (to first order) how much the root of a polynomial changed when we changed its coefficients:

   1) multiply out (A+E)*(x+e) = b+f, to get A*x + A*e + E*x + E*e = b+f
   2) subtract A*x=b, to get A*e + E*x + E*e = f
   3) ignore the "higher order term" E*e, which is much smaller than the other terms,
      to get A*e + E*x = f
   4) solve for e, to get e = inv(A)*(f - E*x)

Now we go one step further, bounding the norm of e:

   ||e|| = ||inv(A)*(f-E*x)||
        <= ||inv(A)|| * ||f - E*x||             by Lemma 3 (consistency)
        <= ||inv(A)|| * (||f|| + ||E*x||)       by Lemma 1 (triangle inequality)
        <= ||inv(A)|| * (||f|| + ||E||*||x||)   by Lemma 3 again

It is traditional to use the following normalization. We suppose that

   ||f|| <= eps * ||b||   and   ||E|| <= eps * ||A||

i.e. eps measures the relative changes in b and A made by f and E, respectively. For example, if the entries of b and A are only known to about 5 decimal places, then eps = 10^(-5). Then

   ||e|| <= ||inv(A)|| * (eps*||b|| + eps*||A||*||x||)

Since b = A*x, Lemma 3 implies ||b|| <= ||A||*||x||, so we finally get

   ||e|| <= ||inv(A)|| * (eps*||A||*||x|| + eps*||A||*||x||) = 2*eps*||inv(A)||*||A||*||x||

or, measuring the relative change in x,

   ||e||/||x|| <= 2 * eps * ||inv(A)||*||A||

Def: k(A) = ||inv(A)||*||A|| is called the condition number of A.

A more careful analysis that does not ignore the higher order terms yields

   ||e||/||x|| <= 2 * eps * k(A) / ( 1 - eps*k(A) )

which is nearly the same when eps*k(A) << 1, which is required for an accurate solution anyway. This completes the perturbation theory for solving A*x=b.

Here is a summary of the roundoff analysis of solving A*x=b:

Theorem: Assume we solve A*x=b using Gaussian elimination with any chosen kind of pivoting (including none at all), in floating point arithmetic with machine precision macheps. Assume that neither overflow nor underflow occurs. Let L and U be the computed triangular factors of A (after any permutations of rows or columns from pivoting), and let xhat be the computed solution. Then xhat is the exact solution of (A+E)*xhat = b where

   ||E||/||A|| <= O(macheps)*||L||*||U||/||A|| = O(macheps)*PG(A)

Thus, if the pivot growth PG(A) = ||L||*||U||/||A|| is not too much larger than 1, then ||E||/||A|| = O(macheps), and so Gaussian elimination is very accurate, in the sense that it is equivalent to changing the entries of A by small amounts on the order of macheps*||A|| (roundoff error in the entries of A), and then solving the perturbed problem (A+E)*xhat = b exactly.

When can we expect PG(A) = ||L||*||U||/||A|| ~ 1?
   1) It may easily fail if no pivoting is done (see the 2-by-2 example above).
   2) It is nearly always ~1 with partial pivoting, although PG(A) can be as large as
      2^(n-1), as you'll see in your last program.
   3) It is ~1 as far as anyone knows for complete pivoting, although determining the
      largest possible value of PG(A) is a famous open problem in numerical analysis.
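As a final check, here is a small Matlab experiment comparing the observed relative change in x with the bound 2*eps*k(A) from the perturbation theory above; the test matrix and the perturbations are random, and the variable eps0 plays the role of eps (the relative size of E and f).

   % Perturbation theory check: perturb A and b by relative size eps0 and compare
   % the observed relative change in x with the bound 2*eps0*k(A).
   n = 50;   A = randn(n);   x = randn(n,1);   b = A*x;
   eps0 = 1e-8;
   R = randn(n);     E = eps0*norm(A,inf)*R/norm(R,inf);     % ||E|| = eps0*||A||
   r = randn(n,1);   f = eps0*norm(b,inf)*r/norm(r,inf);     % ||f|| = eps0*||b||
   xhat = (A + E)\(b + f);
   norm(xhat - x, inf)/norm(x, inf)             % observed relative change in x
   2*eps0*norm(A,inf)*norm(inv(A),inf)          % the bound 2*eps0*k(A)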