Math 128a - Spring 2002 - Lecture Notes on Numerical Differentiation

Read Section 7.1 in the textbook.

Recall our motivation for studying polynomial approximation: given a problem with a "difficult" function f(x), replace f(x) by a simpler function, like a polynomial, and solve that problem instead. Now we will use this idea when the problem is differentiation, and then integration (also called "quadrature"). In this chapter, as in the last chapter, we will get error bounds that depend on higher derivatives of f(x), because if the function is not smooth, knowing its values at some points x{i} tells us nothing about f(x) at other points, and hence nothing about its derivative (or integral). (This observation will motivate our ultimate algorithm for integration, "adaptive quadrature", which uses more points in regions where it thinks the function is not smooth.)

Numerical Differentiation:

Let's start with the definition of the derivative:

    f'(x) = lim_{h->0} [f(x+h)-f(x)]/h

This motivates the following approximation: choose a "small" h>0 and let

    D = [f(x+h)-f(x)]/h

be our approximate derivative. What is the error? Here are two (similar) approaches, both of which assume f(x) is smooth enough to have a second derivative.

1) Use a Taylor expansion:

    f(x+h) = f(x) + h*f'(x) + h^2*f''(z)/2    for some z in [x,x+h]

Solving for f'(x) yields

    f'(x) = [f(x+h)-f(x)]/h - h*f''(z)/2 = D - h*f''(z)/2

So we see that the error -h*f''(z)/2 is proportional to h times the second derivative of f. This error is called the "truncation error", because it arises from truncating the Taylor expansion in deriving the formula for D.

2) Use polynomial interpolation: take the data points (x,f(x)) and (x+h,f(x+h)), interpolate to get a polynomial p(t), and approximate f'(x) by p'(x). The polynomial is a straight line:

    p(t) = f(x) + (t-x)*[f(x+h)-f(x)]/h

with constant derivative p' = [f(x+h)-f(x)]/h = D. Our error term for polynomial interpolation tells us

    f(t) = p(t) + f''(z(t))/2 * (t-x)*(t-(x+h))

where z(t) is some point between x and max(t,x+h), assuming t >= x. Thus

    f'(t) = p'(t) + d/dt[f''(z(t))/2] * (t-x)*(t-(x+h)) + f''(z(t))/2 * (2*t-2*x-h)

At t = x the middle term vanishes because of the factor (t-x), leaving

    f'(x) = p'(x) - f''(z(x))/2 * h

the same as before.

Either way, we see that the error (the truncation error) is proportional to h, i.e. "first order in h". This means that if we want, say, to lower the error by 10x, we need to make h 10x smaller.
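Here is a minimal sketch (not from the original notes) that checks this first-order behavior and the truncation error formula on an arbitrarily chosen example, f(x) = exp(x) at x = 0, where f'(0) = f''(0) = 1:

    % Forward difference D = (f(x+h)-f(x))/h checked against the
    % predicted truncation error; f = exp and x = 0 are chosen arbitrarily.
    f = @exp; x = 0; fp = 1; fpp = 1;    % f'(0) = 1, f''(0) = 1
    h = 10.^(-(1:6)');
    D = (f(x+h) - f(x))./h;
    err   = D - fp;                      % actual error, D - f'(x)
    model = h*fpp/2;                     % predicted truncation error h*f''(x)/2
    [h, err, err./model]                 % last column should approach 1 as h -> 0

The last column approaching 1 confirms that for small h the error behaves like h*f''(z)/2.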
What prevents us from making h as small as we like? The answer is roundoff, in particular "subtractive cancellation".

Ex: Consider computing the derivative of sin(x) this way:

    format short e
    h = 10.^(-(0:15)');
    D = (sin(1+h)-sin(1))./h;
    [h, D, D-cos(1)]
    loglog(h, abs(D-cos(1)), '+')

Notice that as h decreases by factors of 10 from .1 to 1e-7, the error in D also decreases by factors of 10. But after that, the error in D increases, up to nearly .0148/cos(1) ~ 3% (the actual value is computer dependent!).

Let's try to understand this example. Suppose roundoff makes an error eps1 ~ 1e-16*f(x) in the computed value of f(x):

    fl(f(x)) = f(x) + eps1

and similarly an error eps2 in f(x+h):

    fl(f(x+h)) = f(x+h) + eps2

Then the computed value of D will be

    fl(D) = [f(x+h)+eps2 - (f(x)+eps1)]/h * (1+eps3)

where eps3 ~ 1e-16 accounts for the error in the subtraction and the division by h. Then

    fl(D) = [f(x+h)-f(x)]/h                 ... the exact value of D
          + [f(x+h)-f(x)]/h * eps3          ... harmlessly small
          + (eps2-eps1)/h * (1+eps3)

The second term is harmless because it is small compared to the first term, the exact value of D. The numerator of the third term is essentially an unpredictable number with absolute value bounded by around 1e-16*|f(x)|. So we are dividing a small but unpredictable number by another small number h. If h gets small enough, the third term will be a large unpredictable number, i.e. a large error in the computed value of D. In particular, once the truncation error, roughly |f''(x)|*h/2, falls below 1e-16*|f(x)|/h, we expect the error to increase proportionally to 1/h.

Ex: Again consider computing the derivative of sin(x) this way:

    format short e
    h = 10.^(-(0:15)');
    D = (sin(1+h)-sin(1))./h;
    [h, sin(1)*h/2, 1e-16*sin(1)./h, D, D-cos(1)]
    loglog(h, sin(1)*h/2, 'r+', ...
           h, 1e-16*sin(1)./h, 'bo', ...
           h, abs(D-cos(1)), 'kx')
    title('truncation error = red, roundoff = blue')

Notice that as h decreases by factors of 10 from .1 down to 1e-7, the error in D also decreases by factors of 10, as predicted. But after that, the truncation error sin(1)*h/2 in column 2 falls below the roundoff error 1e-16*sin(1)/h in column 3, and the error increases, roughly proportionally to 1/h.

We can compute the optimal value of h under the assumptions above: set

    truncation error(h) = roundoff error(h)

or

    |f''(x)|*h/2 = 1e-16*|f(x)|/h

or

    h = sqrt(2e-16*|f(x)/f''(x)|) = 1e-8 * sqrt(2*|f(x)/f''(x)|)

In particular, the optimal h is proportional to the square root of machine epsilon, 1e-16.

How do we do better, i.e. get an acceptably small error with larger values of h? (Later we will see another reason to want larger h: the cost of our integration algorithms will depend on the number of points, and the number of points is proportional to 1/h.) One way to do better is to find a different approximation than D, whose truncation error is proportional to h^2 instead of h. That way we can expect, for example, that instead of needing, say, h=1e-4 to get an acceptable error, we'd only need h = sqrt(1e-4) = 1e-2. We give three methods.

1) Polynomial Interpolation: Take the values of f at x{0},...,x{n}, compute the interpolating polynomial p(x), and differentiate it. The error bound is gotten by differentiating the expression for f(t)-p(t):

    f(t) - p(t) = f^(n+1)(z(t))/(n+1)! * prod (i=0 to n) (t-x{i})

Assuming that we want to approximate f'(x{j}) for some j, the terms containing the factor (x{j}-x{j}) = 0 drop out (just as at t = x above), and we get

    f'(x{j}) - p'(x{j}) = f^(n+1)(z)/(n+1)! * prod (i=0 to n, i not equal to j) (x{j}-x{i})

This is particularly useful when the x{i} are irregularly spaced; the following methods are better when the points are evenly spaced, and they are easier to analyze.
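As an illustration of method 1 (this sketch is not part of the original notes; the function and the sample points are chosen arbitrarily), one can build the interpolating polynomial with MATLAB's polyfit and differentiate it with polyder:

    % Approximate f'(x0) by differentiating the interpolating polynomial
    % through a few irregularly spaced points (illustration only).
    f  = @sin;
    xi = [0.7  0.95  1.0  1.18];              % n+1 = 4 irregularly spaced points
    p  = polyfit(xi, f(xi), length(xi)-1);    % degree-3 fit through 4 points = interpolation
    dp = polyder(p);                          % coefficients of p'
    x0 = 1;                                   % one of the interpolation points x{j}
    approx = polyval(dp, x0)
    approx - cos(x0)                          % error, bounded by the formula above

For large n or wildly spaced points polyfit can be badly conditioned; the point here is only to illustrate the error formula.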
2) Centered Difference Formula: We use

    D1 = [f(x+h)-f(x-h)]/(2*h)

which is the slope of the straight line connecting (x-h,f(x-h)) to (x+h,f(x+h)). We use Taylor's theorem to analyze the error, assuming f(x) is sufficiently smooth:

    f(x+h) = f(x) + h*f'(x) + h^2/2*f''(x) + h^3/6*f'''(z+)
    f(x-h) = f(x) - h*f'(x) + h^2/2*f''(x) - h^3/6*f'''(z-)

where z+ and z- are two (generally different) points between x-h and x+h. Subtract to get

    f(x+h)-f(x-h) = 0 + 2*h*f'(x) + 0 + h^3/6*[f'''(z+)+f'''(z-)]

or

    D1 = [f(x+h)-f(x-h)]/(2*h) = f'(x) + h^2/6*[f'''(z+)+f'''(z-)]/2

Now the truncation error is proportional to h^2, as desired. We can simplify the formula for the truncation error by using the intermediate value theorem to replace the average [f'''(z+)+f'''(z-)]/2 by f'''(z) for some z between z+ and z-:

    D1 = f'(x) + h^2/6*f'''(z)

Ex: Returning to differentiating sin(x) at x=1:

    format short e
    h = 10.^(-(0:15)');
    D1 = (sin(1+h)-sin(1-h))./(2*h);
    [h, D1, D1-cos(1)]
    loglog(h, sin(1)*h.^2/6, 'r+', ...
           h, 1e-16*sin(1)./h, 'bo', ...
           h, abs(D1-cos(1)), 'kx')
    title('truncation error = red, roundoff = blue')

In the last column, note how the error decreases by 100x at each step, bottoming out when truncation error = roundoff error, or approximately when

    |f'''(x)|*h^2/6 = macheps*|f(x)|/h

or

    h^3 = 6*macheps*|f(x)/f'''(x)|

or

    h = (6*macheps*|f(x)/f'''(x)|)^(1/3)

which is proportional to macheps^(1/3) ~ 1e-5. The error at this minimum point is proportional to macheps^(2/3) ~ 1e-10, which is what we observe.

3) Richardson Extrapolation: The centered difference formula gave us a truncation error proportional to h^2. Richardson extrapolation is a technique for getting a sequence of approximations with truncation errors proportional to h^4, h^6, etc. To do this, we need to evaluate f(x) at more than 2 points. Let us return to the centered difference formula and take more terms in the Taylor expansions (assuming f is smooth enough):

    f(x+h) = sum (j=0 to infinity) f^(j)(x)*h^j/j!
    f(x-h) = sum (j=0 to infinity) f^(j)(x)*(-h)^j/j!

Subtract, and note that all the terms with j even cancel:

    f(x+h)-f(x-h) = sum (j=1 to infinity, j odd) f^(j)(x)*2*h^j/j!

Divide by 2*h to get D1(h):

    D1(h) = f'(x) + sum (j=3 to infinity, j odd) f^(j)(x)*h^(j-1)/j!

We'd like to get rid of the h^2 term (j=3) to get a better approximation. So suppose we also compute

    D1(h/2) = [f(x+h/2)-f(x-h/2)]/h

which requires two more evaluations of f(). Then the same expansion tells us

    D1(h/2) = f'(x) + sum (j=3 to infinity, j odd) f^(j)(x)*h^(j-1)/(2^(j-1)*j!)

Writing out some more terms:

    D1(h)   = f'(x) + f^(3)(x)*h^2/3!     + f^(5)(x)*h^4/5!      + ...
    D1(h/2) = f'(x) + f^(3)(x)*h^2/(4*3!) + f^(5)(x)*h^4/(16*5!) + ...

Can we somehow combine these to get a better approximation? Yes: the idea is to take a linear combination a*D1(h) + b*D1(h/2), where we choose a and b so that

    a*D1(h) + b*D1(h/2) = f'(x) + O(h^4)

i.e. so that the O(h^2) term cancels. This means a and b should satisfy

    a*D1(h) + b*D1(h/2) = (a+b)*f'(x) + a*(f^(3)(x)*h^2/3!) + b*(f^(3)(x)*h^2/(4*3!)) + O(h^4)
                        = f'(x) + 0*h^2 + O(h^4)

This puts two conditions on a and b:

    a + b = 1,    so that (a+b)*f'(x) = f'(x)
    a + b/4 = 0,  so that the h^2 terms cancel (f^(3)(x)*h^2/3! is a common factor, which we omit)

We can solve these two simultaneous equations in two unknowns to get a = -1/3 and b = 4/3. In other words,

    D2(h) = -1/3*D1(h) + 4/3*D1(h/2)
          = f'(x) + sum (j=5 to infinity, j odd) f^(j)(x)*h^(j-1)/j! * (-1+2^(3-j))/3

The truncation error is now proportional to h^4 instead of h^2; indeed, its leading term is -f^(5)(x)*h^4/(4*5!). Note that we can also write D2 as

    D2(h) = D1(h/2) + (D1(h/2)-D1(h))/3

Ex: Returning again to differentiating sin(x) at x=1:

    D1h  = (sin(1+h)-sin(1-h))./(2*h);
    D1h2 = (sin(1+h/2)-sin(1-h/2))./h;
    D2h  = D1h2 + (D1h2-D1h)/3;
    [h, D1h, D1h-cos(1), D2h, D2h-cos(1)]
    loglog(h, sin(1)*h.^4/4/gamma(6), 'r+', ...
           h, 1e-16*sin(1)./h, 'bo', ...
           h, abs(D2h-cos(1)), 'kx')
    title('truncation error = red, roundoff = blue')

In the last column, note how the error decreases by 10^4 = 10000x at each step, bottoming out when truncation error = roundoff error, or approximately when

    O(h^4) = O(macheps/h)

i.e. when h = O(macheps^(1/5)) ~ 1e-3, as observed.
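As a quick sanity check (not in the original notes), one can also confirm the orders of D1 and D2 directly from the ratio of errors at h and h/2, before roundoff takes over: a second-order formula should give ratios near 4 and a fourth-order formula ratios near 16. A minimal sketch, again using sin at x=1:

    % Estimate the orders of D1 and D2 from error ratios error(h)/error(h/2).
    D1 = @(h) (sin(1+h)-sin(1-h))./(2*h);
    D2 = @(h) D1(h/2) + (D1(h/2)-D1(h))/3;
    h  = 1./2.^(1:6)';
    e1 = abs(D1(h)-cos(1));
    e2 = abs(D2(h)-cos(1));
    [e1(1:end-1)./e1(2:end), e2(1:end-1)./e2(2:end)]
    % first column -> 4 (error ~ h^2), second column -> 16 (error ~ h^4)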
Now observe that D2(h) has the same functional form as D1(h):

    D2(h) = f'(x) + an expansion in h with terms h^4, h^6, h^8, ...

This means we can do Richardson extrapolation again, taking a linear combination

    D3(h) = a'*D2(h) + b'*D2(h/2)

to cancel the h^4 term. Analogously to before, we get

    a' + b' = 1
    a' + b'/2^4 = 0

or

    b' = 2^4/(2^4-1) = 16/15 and a' = -1/(2^4-1) = -1/15

Ex: Returning again to differentiating sin(x) at x=1:

    D1h  = (sin(1+h)-sin(1-h))./(2*h);
    D1h2 = (sin(1+h/2)-sin(1-h/2))./h;
    D2h  = D1h2 + (D1h2-D1h)/3;
    D1h4 = (sin(1+h/4)-sin(1-h/4))./(h/2);
    D2h2 = D1h4 + (D1h4-D1h2)/3;
    D3h  = D2h2 + (D2h2-D2h)/15;
    [h, D2h, D2h-cos(1), D3h, D3h-cos(1)]
    loglog(h, sin(1)*h.^6, 'r+', ...
           h, 1e-16*sin(1)./h, 'bo', ...
           h, abs(D3h-cos(1)), 'kx')
    title('truncation error = red, roundoff = blue')

In the last column, note how the error decreases by 10^6 = 1e6x at the first step, and then bottoms out due to roundoff.

This procedure can be repeated recursively, although the previous example shows that roundoff means it is not worth repeating too often. (Also, it requires that f(x) be quite smooth, i.e. have lots of derivatives.) Given D{j}(h) and D{j}(h/2), with truncation errors O(h^(2*j)),

    D{j+1}(h) = D{j}(h/2) + (D{j}(h/2) - D{j}(h))/(4^j - 1)

will have truncation error O(h^(2*j+2)).
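To illustrate the recursion, here is a minimal MATLAB sketch (not from the original notes; the function name richardson_diff and its interface are made up for this example) that builds the triangular table of extrapolated centered differences and returns the most extrapolated value:

    % richardson_diff.m -- illustrative sketch of repeated Richardson extrapolation
    % applied to the centered difference formula.
    function d = richardson_diff(f, x, h, levels)
      % Column 1: centered differences D1(h), D1(h/2), ..., D1(h/2^(levels-1)).
      T = zeros(levels, levels);
      for i = 1:levels
        hi = h / 2^(i-1);
        T(i,1) = (f(x+hi) - f(x-hi)) / (2*hi);
      end
      % Column j+1: D{j+1}(h/2^(i-1)) = D{j}(h/2^i) + (D{j}(h/2^i) - D{j}(h/2^(i-1)))/(4^j - 1).
      for j = 1:levels-1
        for i = 1:levels-j
          T(i,j+1) = T(i+1,j) + (T(i+1,j) - T(i,j)) / (4^j - 1);
        end
      end
      d = T(1, levels);   % most extrapolated estimate of f'(x)
    end
    % Example use:  richardson_diff(@sin, 1, 0.1, 4) - cos(1)   % should be near roundoff level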