Math 128a - Spring 2002 - Lecture Notes on Numerical Differentiation
Read Section 7.1 in the textbook.
Recall our motivation for studying polynomial approximation:
Given a problem with a "difficult" function f(x), replace
f(x) by a simpler function, like a polynomial, and solve that
problem instead. Now we will use this idea when the problem is
differentiation, and then integration (also called "quadrature").
In this chapter, as in the last chapter, we will get error bounds
that depend on higher derivatives of f(x), because if the function
is not smooth, knowing its values at some points x{i} will not
tell us anything about f(x) at other points, and so nothing about
its derivative (or integral). (This observation will
motivate our ultimate algorithm for integration,
"adaptive quadrature", which uses more points in regions where
it thinks the function is not smooth.)
Numerical Differentiation:
Let's start with the definition of derivative:
f'(x) = lim h->0 (f(x+h)-f(x))/h
This motivates the following approximation: choose a "small" h>0
and let
D = [f(x+h)-f(x)]/h
be our approximate derivative. What is the error? Here are two
(similar) approaches, both of which assume f(x) is smooth enough to
have a second derivative:
1) Use a Taylor expansion: f(x+h) = f(x) + h*f'(x) + h^2*f''(z)/2
for some z in [x,x+h]. Solving for f'(x) yields
f'(x) = [f(x+h)-f(x)]/h - h*f''(z)/2
= D - h*f''(z)/2
So we see that the error -h*f''(z)/2 is proportional to h
times the second derivative of f. This error is called the
"truncation error", because it arises from truncating the
Taylor expansion in deriving the formula for D.
2) Use polynomial interpolation: This means we take our data
points (x,f(x)), (x+h,f(x+h)), interpolate to get a
polynomial p(t), and approximate f'(x) by p'(x).
The polynomial is a straight line:
p(t) = f(x) + (t-x)* [ f(x+h)-f(x) ]/h
with a constant derivative p' = [ f(x+h)-f(x) ]/h = D.
Our error term for polynomial interpolation tells us
f(t) = p(t) + f''(z(t))/2 * (t-x)*(t-(x+h))
where z(t) is some point between x and max(t,x+h),
assuming t >= x.
Thus
f'(t) = p'(t) + d/dt (f''(z(t))/2) * (t-x)*(t-(x+h))
+ f''(z(t))/2 * (2*t - 2*x - h)
and
f'(x) = p'(x) - f''(z(x))/2 * h
the same as before.
Either way, we see the error (the truncation error) is proportional to h,
i.e. "first order in h". This means that if we want, say, to lower
the error by 10x, we need to have h 10x smaller. What prevents us
from making h as small as we like? The answer is roundoff, in
particular "subtractive cancellation".
Ex: Consider computing the derivative of sin(x) this way;
format short e,
h=10.^(-(0:15)'); D = [sin(1+h)-sin(1)]./h;[h,D,D-cos(1)]
loglog(h,abs(D-cos(1)),'+'),
Notice that as h decreases by factors of 10 from .1 to 1e-7,
the error in D also decreases by factors of 10.
But after that, the error in D increases up to nearly
.0148/cos(1) ~ 3% (actual value computer dependent!)
Let's try to understand this example. Suppose roundoff
makes an error eps1 ~ 1e-16*f(x) in the computed value of f(x):
fl(f(x)) = f(x)+eps1
and similarly an error eps2 in f(x+h):
fl(f(x+h)) = f(x+h)+eps2
Then the computed value of D will be
fl(D) = [f(x+h)+eps2 - (f(x)+eps1)]/h *(1+eps3)
where eps3~1e-16 accounts for the error in subtraction and division
by h. Then
fl(D) = [f(x+h)-f(x)]/h +
[f(x+h)-f(x)]/h * eps3 + ... harmlessly small
+ [eps2 - eps1]/h * (1+eps3)
The second term is harmless because it is small compared to the
first term, the exact value of D.
The numerator of the third term is essentially an unpredictable
number with absolute value bounded by around 1e-16*|f(x)|. So we
are dividing a small but unpredictable number by another small
number h. If h gets small enough, the third term will be a large
unpredictable number, i.e. a large error in the computed value
of D. In particular, once the truncation error,
roughly |f''(x)|*h/2, falls below 1e-16*|f(x)|/h,
we expect the error to increase proportionally to 1/h.
Ex: Again consider computing the derivative of sin(x) this way;
format short e, h=10.^(-(0:15)'); D = [sin(1+h)-sin(1)]./h;
[h,sin(1)*h/2,1e-16*sin(1)./h,D,D-cos(1)]
loglog(h,sin(1)*h/2,'r+', ...
h,1e-16*sin(1)./h,'bo', ...
h,abs(D-cos(1)),'kx')
title('truncation error = red, roundoff = blue')
Notice that as h decreases by factors of 10 from .1 down to 1e-7,
the error in D also decreases by a factor of 10, as predicted.
But after that, the truncation error sin(1)*h/2 in column 2
falls below the roundoff error 1e-16*sin(1)/h in column 3, and
the error increases, roughly proportionally to 1/h.
We can compute the optimal value of h, under the assumptions above:
truncation error(h) = roundoff error(h)
or
|f''(x)|*h/2 = 1e-16*|f(x)|/h
or
h = sqrt(2e-16*|f(x)/f''(x)|) = 1e-8 *sqrt(|2*f(x)/f''(x)|)
in particular, it is proportional to the square root of
machine epsilon, 1e-16.
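Ex (a quick sanity check of this formula, not from the text; hopt and Dopt
are just illustrative names, and for f(x)=sin(x) at x=1 we have
|f(x)|=|f''(x)|=sin(1), with macheps taken as 1e-16 as above):
hopt = sqrt(2e-16*sin(1)/sin(1))   % about 1.4e-8
Dopt = (sin(1+hopt)-sin(1))/hopt;
[hopt, Dopt-cos(1)]                % error roughly 1e-8, machine dependent
This is consistent with the tables above, where the error bottoms out
between h = 1e-7 and h = 1e-8.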
How do we do better, i.e. get an acceptably small error with larger
values of h?
(Later we will see another reason to prefer larger h: the cost of
our integration algorithms will depend on the number of points,
and the number of points is proportional to 1/h).
One way to do better is to find a different approximation than D,
whose truncation error is proportional to h^2 instead of h. That way
we can expect, for example, that instead of needing, say, h=1e-4
to get an acceptable error, we'd only need h = sqrt(1e-4) = 1e-2.
We give three methods:
1) Polynomial Interpolation: Here we take the values of f(x) at
x{0},...,x{n}, compute the interpolation polynomial p(x), and
differentiate it. The error bound is gotten by differentiating
the expression for f(t)-p(t):
f(t) - p(t) = f^{(n+1)}(z(t))/(n+1)! * prod (i=0 to n) (t-x{i})
Assuming that we want to approximate f'(x{j}) for some j, we get
f'(x{j}) - p'(x{j}) = f^{(n+1)}(z)/(n+1)! *
prod (i=0 to n except j) (x{j}-x{i})
This is particularly useful when the x{i} are
irregularly spaced; the following methods are better when they
are evenly spaced, and easier to analyze.
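For example, here is a minimal sketch of this approach in Matlab (the
three unevenly spaced nodes and the use of polyfit/polyder/polyval are
just one way to build and differentiate p; for many or clustered nodes
polyfit can be ill-conditioned, and the Newton or Lagrange form of p
may be preferable):
xi = [0.9, 1.0, 1.25]; yi = sin(xi);   % data at unevenly spaced points
c  = polyfit(xi, yi, 2);               % degree-2 interpolating polynomial p
dc = polyder(c);                       % coefficients of p'
[polyval(dc,1), cos(1)]                % p'(1) vs. the true f'(1)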
2) Centered Difference Formula:
We use
D1 = [f(x+h)-f(x-h)]/(2*h)
which is the slope of the straight line connecting (x-h,f(x-h)) to
(x+h,f(x+h)). We use Taylor's theorem to analyze the error,
assuming f(x) is sufficiently smooth:
f(x+h) = f(x) + h*f'(x) + h^2/2*f''(x) + h^3/6*f'''(z+)
f(x-h) = f(x) - h*f'(x) + h^2/2*f''(x) - h^3/6*f'''(z-)
where z+ and z- are two different points between x-h and x+h.
Subtract to get
f(x+h)-f(x-h) = 0 + 2*h*f'(x) + 0
+ h^3/6*[f'''(z+)+f'''(z-)]
or
D1 = [f(x+h)-f(x-h)]/(2*h) = f'(x) + h^2/6*[f'''(z+)+f'''(z-)]/2
Now the truncation error is proportional to h^2 as desired. We
can simplify the formula for truncation error by using the
intermediate value theorem to replace it by
D1 = f'(x) + h^2/6*f'''(z)
for some z between z+ and z-.
Ex: Returning to differentiating sin(x) at x=1:
format short e, h=10.^(-(0:15)');
D1= (sin(1+h)-sin(1-h))./(2*h);[h,D1,D1-cos(1)]
loglog(h,cos(1)*h.^2/6,'r+', ...
h,1e-16*sin(1)./h,'bo', ...
h,abs(D1-cos(1)),'kx')
title('truncation error = red, roundoff = blue')
In the last column, note how the error decreases by 100x at
each step, bottoming out when truncation error = roundoff error,
or approximately when
|f'''(x)*h^2/6| = |macheps*f(x)/h|
or
h^3 = macheps * 6 * |f(x)/f'''(x)|
or
h = ( macheps * 6 * |f(x)/f'''(x)| )^(1/3)
which is proportional to macheps^(1/3) ~ 1e-5. The error at this
minimum point is macheps^(2/3) ~ 1e-10, which is what we observe.
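Ex (again just a check with illustrative variable names, not from the
text; for f(x)=sin(x) at x=1 we have |f(x)|=sin(1) and |f'''(x)|=cos(1),
and we take macheps=1e-16):
hopt  = (6e-16*sin(1)/cos(1))^(1/3)      % about 1e-5
D1opt = (sin(1+hopt)-sin(1-hopt))/(2*hopt);
[hopt, D1opt-cos(1)]                     % error should be near macheps^(2/3) ~ 1e-10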
3) Richardson Extrapolation
The centered difference formula gave us a truncation error
proportional to h^2. Richardson extrapolation is a technique to
get a sequence of approximations with truncation errors proportional
to h^4, h^6, etc. To do this, we need to evaluate f(x) at more
than 2 points.
Let us return to the centered difference formula, and take more
terms in the Taylor expansions (assuming f is smooth enough:)
f(x+h) = sum (from j=0 to infinity) f^(j)(x)*h^j/j!
f(x-h) = sum (from j=0 to infinity) f^(j)(x)*(-h)^j/j!
Subtract, and note that all the terms where j is even cancel:
f(x+h)-f(x-h) = sum (from j=1 to infinity, j odd) f^(j)(x)*2*h^j/j!
Divide by 2h to get D1(h):
D1(h) = f'(x) + sum (from j=3 to infinity, j odd) f^(j)(x)*h^(j-1)/j!
We'd like to get rid of the h^2 term (j=3) to get a better
approximation. So suppose we also compute
D1(h/2) = [f(x+h/2)-f(x-h/2)]/h
which requires two more evaluations of f().
Then the same expansion tells us
D1(h/2) = f'(x) +
sum (from j=3 to infinity, j odd) f^(j)(x)*h^(j-1)/(2^(j-1)*j!)
Writing out some more terms:
D1(h) = f'(x) + f^(3)(x)*h^2/3! + f^(5)(x)*h^4/5! + ...
D1(h/2) = f'(x) + f^(3)(x)*h^2/4/3! + f^(5)(x)*h^4/16/5! + ...
Can we somehow combine these to get a better approximation?
Yes: the idea is to take a linear combination
a*D1(h)+b*D1(h/2)
where we choose a and b so
a*D1(h) + b*D1(h/2) = f'(x) + O(h^4)
i.e. to cancel the O(h^2) term. This means a and b should make
a*D1(h) + b*D1(h/2) = (a+b)*f'(x) +
a*(f^(3)(x)*h^2/3!) +
b*(f^(3)(x)*h^2/4/3!) +
O(h^4)
= f'(x) + 0*h^2 + O(h^4)
This puts two conditions on a and b:
a + b = 1,
so (a+b)*f'(x) = f'(x) and
a + b/4 = 0
so the h^2 terms cancel (f^(3)(x)*h^2/3! is a common factor,
which we omit). We can solve these two simultaneous equations
in 2 unknowns to get
a = -1/3, b = 4/3
In other words
D2(h) = -1/3*D1(h) + 4/3*D1(h/2)
= f'(x) +
sum (from j=5 to infinity, j odd) f^(j)(x)*h^(j-1)/j!*(-1+2^(3-j))/3
Now D2(h) has truncation error proportional to h^4 instead of h^2.
Indeed, the leading term in the truncation error is now
-f^(5)(x)*h^4/(4*5!)
Note that we can also write D2 as
D2(h) = D1(h/2) + (D1(h/2)-D1(h))/3
Ex: Returning again to differentiating sin(x) at x=1:
D1h = (sin(1+h)-sin(1-h))./(2*h);
D1h2 = (sin(1+h/2)-sin(1-h/2))./h;
D2h = D1h2 + (D1h2-D1h)/3;
[h,D1h,D1h-cos(1),D2h,D2h-cos(1)]
loglog(h,cos(1)*h.^4/4/gamma(6),'r+', ...
h,1e-16*sin(1)./h,'bo', ...
h,abs(D2h-cos(1)),'kx')
title('truncation error = red, roundoff = blue')
In the last column, note how the error decreases like h^4, i.e. by 10000x at
each step, bottoming out when truncation error = roundoff error, or
approximately when O(h^4) = O(macheps/h) or
h = O(macheps^(1/5)) = 1e-3 as observed.
Now observe that D2(h) has the same functional form as D1(h):
D2(h) = f'(x) + Taylor expansion in h, with terms h^4, h^6, h^8,...
This means we can do Richardson extrapolation again, taking a linear
combination
D3(h) = a'*D2(h) + b'*D2(h/2)
to cancel the h^4 term. Analogous to before, we get
a' + b' = 1
a' + b'/2^4 = 0
or b' = 2^4/(2^4-1) = 16/15 and a' = -1/(2^4-1) = -1/15
Ex: Returning again to differentiating sin(x) at x=1:
D1h = (sin(1+h)-sin(1-h))./(2*h);
D1h2 = (sin(1+h/2)-sin(1-h/2))./h;
D2h = D1h2 + (D1h2-D1h)/3;
D1h4 = (sin(1+h/4)-sin(1-h/4))./(h/2);
D2h2 = D1h4 + (D1h4-D1h2)/3;
D3h = D2h2 + (D2h2-D2h)/15;
[h,D2h,D2h-cos(1),D3h,D3h-cos(1)]
loglog(h,sin(1)*h.^6,'r+', ...
h,1e-16*sin(1)./h,'bo', ...
h,abs(D3h-cos(1)),'kx')
title('truncation error = red, roundoff = blue')
In the last column, note how the error decreases like h^6, i.e. by 1e6x,
at the first step, and then bottoms out due to roundoff.
This procedure can be repeated recursively, although the
previous example shows that roundoff means it is not worth
repeating too often. (also, we require that f(x) be quite smooth,
i.e. have lots of derivatives):
Given D{j}(h) and D{j}(h/2), with truncation errors = O(h^(2*j)),
then
D{j+1}(h) = D{j}(h/2) + (D{j}(h/2) - D{j}(h))/(4^j-1)
will have truncation error O(h^(2*j+2)).
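Here is a minimal sketch of this recursion in Matlab (the function name
richardson_diff and the table layout are just illustrative, not from the
text; D(j,k) holds D{k} computed with step h/2^(j-k), so the diagonal
entry D(m,m) is the most extrapolated value):
function D = richardson_diff(f, x, h, m)
  % Richardson extrapolation table for f'(x), starting from the
  % centered difference D{1} with steps h, h/2, ..., h/2^(m-1).
  D = zeros(m);
  for j = 1:m
    hj = h/2^(j-1);
    D(j,1) = (f(x+hj) - f(x-hj))/(2*hj);   % D{1}(hj)
    for k = 1:j-1
      % D{k+1}(2*hj) = D{k}(hj) + (D{k}(hj) - D{k}(2*hj))/(4^k - 1)
      D(j,k+1) = D(j,k) + (D(j,k) - D(j-1,k))/(4^k - 1);
    end
  end
end
For example, richardson_diff(@sin, 1, .1, 3) should reproduce (up to
roundoff) the values D1h, D2h, D3h computed above for a scalar h = .1
as its diagonal entries D(1,1), D(2,2), D(3,3).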