Math 128a - Spring 2002 - Lecture 4 - Jan 31 (Thursday)

Goals: Review and illustrate the general approach to understanding error bounds:

1) Suppose that we want to compute f(y); try to show that we actually get
      alg(y) = f(y + dy), where dy is "small"
   Notation: dy is called the "backward error"
   If we can do this (not for all algorithms!) then alg(y) is called "backward stable"

2) Bound the absolute error:
      absolute error = | alg(y) - f(y) | = | f(y + dy) - f(y) | ~ | f'(y)*dy |
   Similarly,
      relative error = | alg(y) - f(y) | / | f(y) |
                     ~ | f'(y)*dy | / | f(y) |
                     = [ |f'(y)|*|y| / |f(y)| ] * [ |dy|/|y| ]
                     = condition number * "relative backward error"

This is clear when y is a scalar; in the most interesting cases y is more complicated,
say a vector y = [y_1, y_2, ..., y_m], so we use the multivariable version of
Taylor's theorem:
      f(y+dy) ~ f(y) + sum_{i=1 to m} (partial f(y) / partial y_i) * dy_i
              = f(y) + grad f * dy
where grad f = gradient of f = vector of partial derivatives of f with respect to the y_i, so
      |f(y+dy) - f(y)| <= sum_{i=1 to m} |partial f(y) / partial y_i| * |dy_i|

Example 1: The last algorithm for f(y) = exp(y) in the homework is backward stable,
   in the sense that alg(y) = exp(y + dy) where |dy|/|y| is small (near 2^(-53) ~ 1e-16).
   f'(y) = exp(y) => condition number = |y|
   Note that we must have y < ln OV ~ 710 (else exp(y) overflows)
   and ln UN ~ -710 < y (else exp(y) underflows), so |y| < 710.
   So the relative error in the result is limited to about 1e3 * 1e-16 = 1e-13
   for arguments of very large magnitude (positive or negative), and is better
   for small arguments.

Example 2: f(y) = y, alg(y) = (y + 1e300) - 1e300 is not backward stable. Why?

Example 3: y = [c_0, c_1, ..., c_n, x], f(y) = p(x) = sum_{i=0 to n} c_i*x^i
   alg(y) = Horner's rule to evaluate p(x)
   Last time we showed that alg(y) is backward stable, because
      alg(y) = sum_{i=0 to n} (c_i + delta_c_i)*x^i where |delta_c_i| <= 2*n*epsilon*|c_i|
   Now grad f = [1, x, x^2, ..., x^n, p'(x)] and dy = [delta_c_0, ..., delta_c_n, 0], so
      |f(y+dy) - f(y)| <= 2*n*epsilon * sum_{i=0 to n} |c_i|*|x|^i

Example 4: y = [c_0, c_1, ..., c_n], f(y) = root of the polynomial p(x) = sum_{i=0 to n} c_i*x^i
   alg(y) = result of bisection
   Fact: if alg(y) reports that the polynomial has a root between xlower and xupper,
   then that is true for some polynomial with coefficients c_i + delta_c_i, with the
   delta_c_i small as described above (proof on homework!)
   How do we compute and bound f(y+dy) - f(y) ~ grad f * dy?
   How do we differentiate f? We return to this soon, after 2 more examples.

Example 5: y = [A, b], a matrix A and a vector b
   f(y) = inv(A)*b = solution x of the system of linear equations A*x = b
   alg(y) = Gaussian elimination (later in the semester)
   Properly implemented, this is backward stable (care necessary!)
   How to differentiate? Later in the semester ...

Example 6: y = [ function g(), t_0, z_0, t_f ]
   f(y) = value at t_f of the solution of the differential equation dz/dt = g(z)
          with initial value z(t_0) = z_0, i.e. z(t_f)
   In other words, solve the differential equation dz/dt = g(z) starting at
   z(t_0) = z_0 and return z(t_f).
   alg(y) = subject later in the semester; it will be backward stable

Back to Example 4: roots of polynomials. How to differentiate f? Implicitly:

First way: the equation sum_{i=0 to n} c_i*x^i = 0 defines the root x implicitly in
   terms of the coefficients c_i, so differentiate with respect to c_k, getting
      sum_{i=0 to n} c_i*i*x^(i-1) * (dx/dc_k) + x^k = 0
   or
      dx/dc_k = -x^k / p'(x)
   so
      root(c_0 + delta_c_0, ..., c_n + delta_c_n)
         ~ root(c_0, ..., c_n) + sum_{k=0 to n} -delta_c_k*x^k/p'(x)
         = x - delta_p(x)/p'(x)
   where delta_p(x) = sum_{k=0 to n} delta_c_k*x^k
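Here is a quick Matlab check of this first-order formula (a sketch only; the test
polynomial with roots 1, 2, 3, the perturbation size 1e-8, and the variable names are
my choices, not from the lecture):

   % Check dx ~ -delta_p(x)/p'(x) for p with roots 1, 2, 3, at the root x = 3
   c  = poly([1 2 3]);                       % coefficients of p, highest power first
   dc = 1e-8 * c .* (2*rand(size(c)) - 1);   % small relative perturbation of each c_i
   x  = 3;                                   % the root we track
   predicted = -polyval(dc, x) / polyval(polyder(c), x)   % -delta_p(x)/p'(x)
   rp = roots(c + dc);                       % roots of the perturbed polynomial
   [mn, k] = min(abs(rp - x));               % pick the perturbed root closest to 3
   actual = rp(k) - x                        % actual change in that root

The two printed numbers should agree to several digits, since the perturbation is tiny.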
Second way (equivalent, perhaps easier notation): write p(x) = 0 and
(p + delta_p)(x + dx) = 0, expand, ignore "second order terms", and solve for dx,
where p(x) = sum_i c_i*x^i:
      0 = (p + delta_p)(x+dx)
        = sum_i (c_i + delta_c_i)*(x+dx)^i
        = sum_i c_i*(x+dx)^i + sum_i delta_c_i*(x+dx)^i
        ~ sum_i c_i*(x^i + i*x^(i-1)*dx) + sum_i delta_c_i*(x^i + i*x^(i-1)*dx)
        = sum_i c_i*x^i + sum_i i*c_i*x^(i-1)*dx
            + sum_i delta_c_i*x^i + sum_i i*delta_c_i*x^(i-1)*dx
        = 0 + p'(x)*dx + delta_p(x) + O(delta^2)   ... we ignore the last term
Thus dx = -delta_p(x)/p'(x) to first order.

Matlab example to illustrate this: ebnd = polyperturb(r, e, m) takes the roots r of a
polynomial and forms its coefficients; then, for i = 1 to m, it perturbs the coefficients
by relative amount e, computes the roots, and plots them. This illustrates pictorially
how sensitive the roots are to perturbations.
      ebnd = polyperturb([1 2 3],1e-4, 500)
      ebnd = polyperturb([1 2 3],1e-3, 500)
      ebnd = polyperturbANS([1 2 3],1e-3, 500)
      ebnd = polyperturb([1 2 3],1e-2, 500)
      ebnd = polyperturbANS([1 2 3],1e-2, 500)
      ebnd = polyperturb(1:10,1e-8, 500)
      ebnd = polyperturbANS(1:10,1e-8, 500)
      ebnd = polyperturb(1:10,1e-7, 500)
      ebnd = polyperturb(1:10,1e-4, 500)
      ebnd = polyperturbANS([1 2 2 3 3 3],1e-6, 300)
      ebnd = polyperturbANS([1 2 2 3 3 3],1e-4, 300)
      ebnd = polyperturbANS([1 2 2 3 3 3],1e-3, 300)
      ebnd = polyperturb(2*ones(1,5),1e-12, 300)
Why does it look like this?

Def: z is an m-fold root of p if p(x) = (x-z)^m * g(x) with g(z) != 0
   (m=1 => "simple root", m=2 => "double root", m=3 => "triple root", etc.)
Fact: if z is at least a double root (m >= 2) then p(z) = p'(z) = 0
   Proof: differentiate p(x) and plug in z.
So in particular, since the error bound has p'(z) in the denominator, the bound is infinite.

Why do the roots of a perturbed polynomial with multiple roots show up as they do?
Here is a plausibility argument (not a proof!); we need to know a little about complex
numbers. Suppose p(x) = (x-z)^m, so all roots are at z. What are the roots of p(x) - a = 0?
Solve (x-z)^m - a = 0 for x to get x = z + a^(1/m).
   Suppose m=2, a =  1e-4; then a^(1/m) = +- .01
   Suppose m=2, a = -1e-4; then a^(1/m) = +- .01 * sqrt(-1)
      ebnd = polyperturb([2 2],1e-4,300)
   Suppose m=3, a =  1e-6; then a^(1/m) = .01 * (complex cube roots of 1)
      (3 of them:  1, -.5 + sqrt(3)/2*sqrt(-1), -.5 - sqrt(3)/2*sqrt(-1))
   Suppose m=3, a = -1e-6; then a^(1/m) = .01 * (complex cube roots of -1)
      (3 of them: -1, +.5 + sqrt(3)/2*sqrt(-1), +.5 - sqrt(3)/2*sqrt(-1))
      ebnd = polyperturb([2 2 2],1e-6,300)

Your homework assignment will be to draw error bounds on top of these pictures of how
much the zeros actually change (implement polyperturbANS).

Other methods for finding zeros of functions (faster than bisection!)

Drawbacks of bisection:
   1) need to start with an interval where f(x) changes sign
   2) hard to see how to generalize to the case of more than 1 variable
   3) slow convergence: to shrink the width of the interval to tol we take
         log_2(start_width/tol) = O( log(1/tol) ) = O( # correct digits )
      steps to converge.

Def: A sequence x(n) -> x converges linearly if |x(n+1) - x| <= c * |x(n) - x|
     for some 0 < c < 1.
Thus, bisection converges linearly with c = 1/2.
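Here is a minimal bisection sketch (my own example, f(x) = x^2 - 2 on [1,2], not the
course code) showing the interval width being halved at every step, i.e. linear
convergence with c = 1/2:

   f = @(x) x.^2 - 2;            % root is sqrt(2) ~ 1.4142
   a = 1; b = 2;                 % f(a) < 0 < f(b), so f changes sign on [a,b]
   width = zeros(1,30);
   for n = 1:30
       m = (a + b)/2;
       if sign(f(a)) == sign(f(m))
           a = m;                % root is in [m, b]
       else
           b = m;                % root is in [a, m]
       end
       width(n) = b - a;         % width is exactly (b-a)/2^n
   end
   semilogy(1:30, width)         % a straight line on a log scale: linear convergence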
We seek methods that
   1) do not require an interval where f changes sign,
   2) converge in O( log( # correct digits ) ) steps, i.e. exponentially faster than
      linear convergence, and
   3) generalize to functions of more than 1 variable.
Drawback: they are not guaranteed to converge this fast (or at all) unless
   1) you start sufficiently close to the zero, and
   2) the function is sufficiently "nice" near the zero (details below).

Approach:
   1) given an approximate root x(n), replace f(x) by a simple function f*(x)
      that approximates it near x(n)
   2) find a root x(n+1) of f*(x) (should be easy if f* is simple enough)
   3) repeat until x(n) stops changing (or f(x) is too inaccurate)

Note: It is common in numerical analysis to solve a problem by approximating the
function involved by a simpler one, and solving the problem for the approximation
instead. Polynomials are frequently used as these simple functions, and straight lines
(as below) are the simplest polynomials. We will spend lots more time on polynomial
approximation later.

So how do we choose f*(x)? Taylor's theorem says that close enough to x(n),
      f(x) = f(x(n)) + f'(x(n))*(x - x(n)) + O((x - x(n))^2)
           ~ f(x(n)) + f'(x(n))*(x - x(n)) = f*(x)
The graph of f*(x) is a straight line, so it is easy to see what x(n+1) should be,
by solving for x(n+1) in
      f*(x(n+1)) = 0 = f(x(n)) + f'(x(n))*(x(n+1) - x(n))
or
      x(n+1) = x(n) - f(x(n))/f'(x(n))
which is Newton's method.

Algorithm:
   input x(0)
   n = 0
   repeat
      x(n+1) = x(n) - f(x(n))/f'(x(n))
      n = n+1
   until ( |x(n) - x(n-1)| < tol
           or error bound for f(x(n)) > |f(x(n))|
           or error bound for f'(x(n)) > |f'(x(n))|
           or n > Max_n or ... )

Graphical illustration of Newton's method. Run the Matlab code and illustrate convergence:
   [xvals,fvals,dfvals]=newton('funcN',3,1e-14)
   [xvals,fvals,dfvals]
   [xvals,fvals,dfvals]=newton('funcN',4,1e-14)
   [xvals,fvals,dfvals]
Notice that the error approximately squares at each step. This is called quadratic
convergence:
      |x(n+1) - x| <= C * |x(n) - x|^2 for some C
Ex: if C = 1 and |x(n) - x| = .1, then we get 1e-1 => 1e-2 => 1e-4 => 1e-8 => 1e-16 (done)
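The course's newton('funcN',...) routine is not reproduced here; the following is a
minimal sketch of the same iteration on my example f(x) = x^2 - 2 (not the funcN from
lecture), printing the error so you can watch it square at each step:

   f  = @(x) x.^2 - 2;          % example function (mine, not funcN from lecture)
   df = @(x) 2*x;               % its derivative
   z  = sqrt(2);                % known zero, used only to display the error
   x  = 3;                      % starting guess
   for n = 1:6
       x = x - f(x)/df(x);      % Newton step: x(n+1) = x(n) - f(x(n))/f'(x(n))
       fprintf('n = %d   error = %.2e\n', n, abs(x - z));
   end
   % The error is roughly squared at every step (quadratic convergence)
   % until it reaches the rounding level ~1e-16.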
Rough convergence analysis (to see what assumptions we need for a formal proof):
Suppose z is the exact zero, and use the next term of the Taylor expansion:
            0 = f(x(n)) + f'(x(n))*(x(n+1) - x(n))
   f(z) = 0   = f(x(n)) + f'(x(n))*(z - x(n)) + f''(x(n))/2*(z - x(n))^2 + O((z - x(n))^3)
Subtract to get
   x(n+1) - z = f''(x(n))/(2*f'(x(n)))*(z - x(n))^2 + O((z - x(n))^3)
So it seems that if we are close enough to z that
   the Taylor expansions are accurate, and
   f''(x(n))/f'(x(n)) is not too large,
then we should get quadratic convergence with C ~ f''(z)/(2*f'(z)).

Theorem (Kantorovich): If
   1) |x(1) - z| <= r, where z is a zero of f(),
   2) |f'(x)| >= m > 0 when |z - x| <= r,
   3) f''(x) exists and |f''(x)| <= M when |z - x| <= r, and
   4) r <= m/M  (relates the size of the interval to the magnitudes of f' and f''),
then the sequence computed by Newton's method
   1) stays in the interval [z-r, z+r],
   2) converges to z at least as fast as bisection, and
   3) satisfies |x(n+1) - z| <= (M/(2*m)) * |x(n) - z|^2,
      i.e. converges quadratically with C = M/(2*m).

Proof: Taylor's theorem says
   f(z) = 0 = f(x(n)) + f'(x(n))*(z - x(n)) + f''(eta)/2*(z - x(n))^2
where eta lies between z and x(n). Subtract from
   0 = f(x(n)) + f'(x(n))*(x(n+1) - x(n))
to get
   x(n+1) - z = f''(eta)/(2*f'(x(n)))*(x(n) - z)^2
so
   |x(n+1) - z| <= M/(2*m) * |x(n) - z|^2
If |x(n) - z| <= r, then the RHS is <= M/(2*m)*r*|x(n) - z| <= |x(n) - z|/2, so the
entire sequence of x(n)'s stays in the interval [z-r, z+r], and this analysis applies
to each step. In particular, convergence is always at least as fast as bisection, and
convergence is quadratic with C = M/(2*m).

This is called a "local convergence" theorem, because it says that you converge once
you are close enough to the zero. In contrast, our analysis of bisection is a "global
convergence" theorem, because it doesn't matter how wide the initial interval is, as
long as the function changes sign on it. Global convergence theorems, i.e. ones that
apply to wide intervals, are rare and hard to prove, because most methods can have very
complicated behavior (dynamics) on large intervals (example below), unless the function
is very "well behaved".

What can happen when the conditions in the theorem are violated?
   1) If f' gets too small: in the extreme case, if some f'(x(n)) = 0, the whole
      sequence blows up. (picture)
   2) If f'' gets too large: this means f' can change a lot in a small interval, so that
      a straight-line approximation to f can change drastically. For example, f'(x(n))
      could become 0.
   3) If we start far away from a root, anything can happen.

Theorem: Let p(x) = product_{i=1 to n} (x - r(i)) be a polynomial with real distinct
   roots r(1) < r(2) < ... < r(n). Define the intervals
      I(0) = (-infinity, r(1))
      I(1) = (r(1), r(2))
      ...
      I(j) = (r(j), r(j+1))
      ...
      I(n) = (r(n), infinity)
   Pick any infinite sequence of integers ( n(1), n(2), n(3), ... ) in the range 1 to
   n-1 such that n(i) != n(i+1), for example (1, 2, 1, 5, 3, 4, 6, 2, n-1, ... ).
   Then there is a starting point x(1) for Newton in interval I(n(1)) such that
      1) Newton never converges, and
      2) the iterate x(i) lies in interval I(n(i)) for every i.
   Proof: uses the idea of the Cantor set from Math 104: try it!

See Fig 3.8, p. 127 of the 3rd edition of the text (or the cover of the 2nd edition).
This represents Newton's method applied to x^5 + 1 = 0 (or x^4 + 4 = 0) using complex
arithmetic. Each point in the complex plane (or "pixel") is colored to indicate to which
of the 5 (or 4) roots Newton converges when started there. The picture is "fractal", and
very complicated, no matter how closely you look.

There are some simple cases where we can establish that Newton converges in a large
interval:

Definition: A function f is convex on an interval I if f''(x) > 0 on I.

Lemma: If f is convex on I, and if x1 < x2 are both in I, then the line segment
   connecting (x1, f(x1)) and (x2, f(x2)) lies on or above the graph of f.
   Proof: try it.

Theorem: Suppose
   1) f has 2 continuous derivatives on the real axis,
   2) f'(x) > 0 on the real axis (f strictly increasing),
   3) f''(x) > 0 on the real axis (f convex), and
   4) f(z) = 0 for some real value z.
   (draw a picture of a typical such f)
Then Newton converges to z from any real starting point x(1).

Proof: From the proof of Kantorovich's theorem we still have
   x(n+1) - z = f''(eta)/(2*f'(x(n)))*(x(n) - z)^2
The RHS is always positive, so x(i) > z for i > 1. Since f(x(i))/f'(x(i)) > 0 for any
x(i) > z, we have x(i+1) = x(i) - f(x(i))/f'(x(i)) < x(i), so the x(i) form a decreasing
sequence bounded below by z. So they have a limit, call it z*. But z* satisfies
z* = z* - f(z*)/f'(z*), so f(z*) = 0 as desired. Since f is strictly increasing, the
zero is unique and z = z*.
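A sketch illustrating this theorem (my example, f(x) = x^3 + 2x - 5, which is strictly
increasing everywhere and convex for x >= 0, with a single real root near 1.33): from
any large positive starting point the Newton iterates decrease monotonically to the zero.

   f  = @(x) x.^3 + 2*x - 5;    % increasing everywhere, convex for x >= 0
   df = @(x) 3*x.^2 + 2;
   x  = 10;                     % any positive start to the right of the root
   for n = 1:10
       x = x - f(x)/df(x);      % Newton step
       fprintf('x(%d) = %.15g\n', n, x);   % iterates decrease monotonically to the root
   end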
Example: Suppose f(x) is a polynomial with all coefficients >= 0, except the constant
coefficient, which is < 0. Then the above theorem applies (for x >= 0) to show that
Newton converges from any positive starting point. This is the case in computing rates
of return on investments (and is done in financial calculators, e.g. the HP12C).

Examples:
   Newton to compute the square root (used in some computers for sqrt):
      solve f(x) = x^2 - a = 0; f'(x) = 2*x, so Newton is
         x - f(x)/f'(x) = x - (x^2 - a)/(2*x) = .5*(x + a/x)
   Newton to compute the reciprocal (used in some computers for division):
      Try solving f(x) = x*a - 1 = 0; we get x(n+1) = 1/a --
         one-step convergence, but we need to be able to compute a reciprocal!
      Try solving f(x) = a - 1/x; we get x(n+1) = x(n)*(2 - x(n)*a),
         which only requires addition and multiplication, so this is ok.
   Note: neither method alone is good enough to get the result correctly rounded as
   IEEE requires, but they can be fixed up.

Next algorithm: the secant method.
Advantage over Newton: we don't need to evaluate f'(x), which may be expensive or
impossible (we may only be given a "black box" subroutine for f(x) that we can't
differentiate). If the most expensive operation is evaluating f and f', then secant
is faster.

Idea: if we can't evaluate f'(x(n)), what is a good approximation? Use the slope of the
line between the last two points on the curve:
      f'(x(n)) ~ [ f(x(n)) - f(x(n-1)) ] / [ x(n) - x(n-1) ]
so
      x(n+1) = x(n) - f(x(n))*[x(n) - x(n-1)]/[f(x(n)) - f(x(n-1))]

Algorithm:
   pick two starting values x(1) and x(2)
   compute f(1) = f(x(1)) and f(2) = f(x(2))
   n = 2
   while ( |f(n)| > tol or ... )
      x(n+1) = x(n) - f(n)*[x(n) - x(n-1)]/[f(n) - f(n-1)]
      f(n+1) = f(x(n+1))
      n = n+1
   end

Convergence proof: we will show that the order of this method is
p = (1 + sqrt(5))/2 ~ 1.62, i.e. |x(n) - z| = O(|x(n-1) - z|^p).

Let e(n+1) = x(n+1) - z. Then
   e(n+1) = e(n) - f(x(n))*[x(n) - x(n-1)]/[f(x(n)) - f(x(n-1))]
          = e(n) - f(x(n))*[e(n) - e(n-1)]/[f(x(n)) - f(x(n-1))]
          ~ e(n) - f'(z)*e(n)*(e(n) - e(n-1))
              / [ f'(z)*e(n) + f''(z)*e(n)^2/2 - f'(z)*e(n-1) - f''(z)*e(n-1)^2/2 ]
          = e(n)*[ 1 - f'(z)*(e(n) - e(n-1))
              / [ f'(z)*(e(n) - e(n-1)) + f''(z)/2*(e(n)^2 - e(n-1)^2) ] ]
          = e(n)*[ 1 - f'(z)/[ f'(z) + f''(z)/2*(e(n) + e(n-1)) ] ]
          = e(n)*[ 1 - 1/( 1 + (f''(z)/2/f'(z))*(e(n) + e(n-1)) ) ]
          ~ e(n)*[ 1 - 1/( 1 + (f''(z)/2/f'(z))*e(n-1) ) ]
               since we assume |e(n)| << |e(n-1)| near convergence
          ~ e(n)*[ 1 - ( 1 - (f''(z)/2/f'(z))*e(n-1) ) ]
          = (f''(z)/2/f'(z)) * e(n)*e(n-1)

Now let us analyze the convergence of a sequence of the form e(n+1) = C*e(n)*e(n-1).
We can "guess" that e(n+1) = c*e(n)^a, plug in and get
   e(n+1) = c*e(n)^a = c*(c*e(n-1)^a)^a = c^(1+a)*e(n-1)^(a^2)
and also
   e(n+1) = C*e(n)*e(n-1) = C*c*e(n-1)^(a+1)
so
   c^a/C = e(n-1)^(1 + a - a^2)
Since the LHS is independent of n, the RHS must be too, so 1 + a - a^2 = 0,
or a = (1 + sqrt(5))/2 ~ 1.62.
(The other root of the quadratic is ~ -.62, corresponding to e(n) -> infinity,
which is not relevant.)

A related approach, with a connection to Fibonacci numbers:
Take logarithms (base 2, say) to get le(n+1) = log(C) + le(n) + le(n-1), where
le(n) = log(e(n)). Write le'(n) = le(n) + log(C) to get
   le'(n+1) = le'(n) + le'(n-1)
the Fibonacci recurrence. How do we solve the Fibonacci recurrence? Write le'(n) = p^n,
plug in to get
   p^(n+1) = p^n + p^(n-1)
and divide by p^(n-1) to get p^2 = p + 1, or p(+-) = (1 +- sqrt(5))/2 ~ 1.62 and -.62.
So the general solution is
   le'(n) = c1 * (p+)^n + c2 * (p-)^n
where we choose c1 and c2 to agree with le'(0) and le'(1). Since |p-| < 1, the second
term goes to zero, and we get
   le'(n) ~ c1 * 1.62^n   and   e(n) ~ 2^(c1 * 1.62^n)/C ~ c^(1.62^n)/C, where c = 2^c1.
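A sketch of the secant iteration above, on my example f(x) = x^2 - 2 (not course code);
the printed errors shrink with order roughly 1.62:

   f = @(x) x.^2 - 2;
   z = sqrt(2);                         % known zero, used only to display the error
   x = [1 2];                           % two starting values x(1), x(2)
   fx = f(x);
   for n = 2:8
       x(n+1)  = x(n) - fx(n)*(x(n) - x(n-1))/(fx(n) - fx(n-1));   % secant step
       fx(n+1) = f(x(n+1));
       fprintf('n = %d   error = %.2e\n', n+1, abs(x(n+1) - z));
   end
   % A fixed number of steps is used here instead of the |f(n)| > tol test,
   % just to keep the sketch short.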
This seems to converge more slowly than Newton, so why might secant be faster? Suppose
each evaluation of f and f' is equally (and very) expensive. Then in the time it takes
to do one Newton iteration (one evaluation of f and one of f') we can do two secant
steps. This means that in that time the error e(n) from secant decreases to
   e(n+2) = O(e(n)^(p^2)) = O(e(n)^((3 + sqrt(5))/2)) = O(e(n)^2.62)
while in the same time Newton's error decreases to e(n+1) = O(e(n)^2).

What is best to use to find a zero of a function of a single variable? The best
available methods are hybrids, which use sign changes if they are found (see
'help fzero' in Matlab), or are specialized to certain functional forms such as
polynomials (e.g. Laguerre's method, which is cubically convergent, provides guaranteed
error bounds (modulo roundoff), and converges monotonically if all the roots are real).
The Matlab roots function finds the roots of a polynomial by constructing a matrix whose
eigenvalues are the roots of the polynomial, and calling the eigenvalue routine. This
may sound like reducing to a more difficult problem (and a more expensive one, since
finding the eigenvalues costs O(n^3), in contrast to roughly O(n^2) for the methods we
have been talking about). But the eigenvalue routine is so reliable, and the concern
about performance low enough, that this is what Matlab chooses to do.

Last topic on finding zeros of a function of a single variable:
Several of these methods, notably Newton, are of the form x(n+1) = F(x(n)), where in
the case of Newton F(x) = x - f(x)/f'(x). Let us ask under what general conditions an
iteration x(n+1) = F(x(n)) converges. If F() is continuous and x(n) -> x, then x = F(x),
and x is called a "fixed point" of F; x(n+1) = F(x(n)) is called "fixed point iteration".
There are many theorems in mathematics about when a function F() has a fixed point, and
about whether fixed point iteration will find it. Here is a basic definition and theorem:

Definition: F is called a contraction if |F(x) - F(y)| <= c*|x - y| where 0 <= c < 1.
In words, applying F makes the points x and y get closer together, by at least a
factor c < 1.
   Ex: F(x) = x/2 + 10. Then |F(x) - F(y)| = |x - y|/2.

Theorem (Contraction Mapping Theorem): Let C be a closed interval [a,b], and suppose F
maps C to C. Then if F is a contraction on C, it has a unique fixed point x in C, and
fixed point iteration x(n+1) = F(x(n)) converges to it. Furthermore, if
|F(z) - F(y)| <= c*|z - y|, then
   |x - x(n)| <= |x(n+1) - x(n)| / (1 - c)
is an error bound.

Proof: |x(n+1) - x(n)| = |F(x(n)) - F(x(n-1))| <= c*|x(n) - x(n-1)|, so by induction
   |x(n+1) - x(n)| <= c^n * |x(1) - x(0)|
which goes to 0, since 0 <= c < 1. Now
   x(n) - x(0) = sum_{i=0 to n-1} [ x(i+1) - x(i) ]
and we can ask whether this telescoping sum converges as n -> infinity; it will if it
converges "absolutely", i.e. if the following converges:
   sum_{i=0 to n-1} |x(i+1) - x(i)|
But this is dominated by the convergent geometric series
   sum_{i=0 to n-1} c^i * |x(1) - x(0)| = |x(1) - x(0)| * (1 - c^n)/(1 - c)
so it converges. Thus the x(n) converge to a limit x, which is a fixed point since F is
continuous, and |x - x(0)| <= |x(1) - x(0)|/(1 - c). Applying the same argument with
x(n) as the starting point gives the error bound |x - x(n)| <= |x(n+1) - x(n)|/(1 - c).

Ex: F(x) = x/2 + 10 maps [0,40] to [10,30] and so satisfies the conditions of the
theorem. The iteration converges to the fixed point x = F(x) = x/2 + 10, i.e. x = 20.
(picture)

Ex: On your homework, you will show that if |F'| < 1, then F is a contraction.

Let us look at Newton's method:
   F(x) = x - f(x)/f'(x)  =>  F'(x) = f(x)*f''(x) / (f'(x)*f'(x))
If z is a simple zero (f'(z) is nonzero), then F'(x) is small when x is near z, so F is
a contraction there, and we expect Newton to converge.
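A closing sketch of fixed point iteration for the contraction F(x) = x/2 + 10 from the
example above (c = 1/2, fixed point 20), printing the error bound from the theorem next
to the true error:

   F = @(x) x/2 + 10;               % contraction with c = 1/2, fixed point x = 20
   c = 1/2;
   x = 0;                           % start anywhere in [0, 40]
   for n = 1:15
       xnext = F(x);
       % theorem: |fixed point - x| <= |xnext - x|/(1 - c)
       fprintf('n = %2d   x = %.10f   bound = %.3e   true error = %.3e\n', ...
               n, x, abs(xnext - x)/(1 - c), abs(20 - x));
       x = xnext;
   end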