Math 128a - Spring 2002 - Lecture 3 - Jan 29 (Tuesday)
The first homework assignment has been posted on the web page.
It is due Feb 7.
Reading in text: Sections 2.3, 3.2, 3.3
Goals:
Want to understand how to bound error in evaluating polynomial
Want to modify bisection (or any zero finder) to use error bound to
give bound to user on location of zero
Example 11: x=(1:.001:3), y=func(x), plot(x,y), axis([1 3 -1e-8 1e-8])
round off makes answer "junk" but with structure
x=2+(0:2000)*2*eps;y=func(x);plot(y,'.')
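Since func itself is not listed in these notes, here is one plausible stand-in,
assuming (based on the plot range [1,3] and the later polyplot(2*ones(13,1),...)
demo) that it evaluates the expanded form of (x-2)^13; the actual course func
may differ:
    function y = func(x)
    % Hypothetical stand-in for the course's func: evaluate (x-2)^13 from its
    % expanded coefficients, so the roundoff "junk" near the root is visible.
    coef = poly(2*ones(13,1));   % coefficients of (x-2)^13, highest power first
    y = polyval(coef, x);        % Horner's rule evaluation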
Use Fact from last time to bound error:
UN = smallest normalized floating point number
= "Underflow threshold"
= 2^(-1022)
~ 10^(-308)
OV = largest normalized floating point number
= "Overflow threshold"
~ 2^(1024)
~ 10^(308)
Let op be one of +,-,*,/.
Suppose UN <= |fl(a op b)| <= OV.
(i.e. fl(a op b) does not overflow or underflow).
Then
(*) fl(a op b) = (a op b)*(1+ delta)
where |delta| <= epsilon
We note that IEEE also guarantees that
(**) fl(sqrt(a)) = sqrt(a)*(1+delta), |delta| <= eps
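In Matlab these constants are built in as realmin (UN), realmax (OV), and eps
(the epsilon in (*) and (**)), so you can check them directly:
    % IEEE double precision constants as seen from Matlab
    UN = realmin      % smallest normalized number, 2^(-1022) ~ 2.2e-308
    OV = realmax      % largest finite number, just under 2^1024 ~ 1.8e308
    epsilon = eps     % machine epsilon, 2^(-52) ~ 2.2e-16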
Example 12: The program
p = 1, for i=1:n, p=p*x(i), end
computes the product x(1)*x(2)* ... *x(n)
What is the difference between the true p and the computed p?
Approach: write down a "program" that includes all the
rounding error explicitly, and evaluate it exactly:
p* = 1, for i=1:n, p* = (p* * x(i))*(1 + delta(i))
where delta(i) is the roundoff error in the
i-th multiplication. Evaluating, we get
p* = x(1)*...*x(n) * (1+delta(1))*...*(1+delta(n))
= p * (1+delta(1))*...*(1+delta(n))
= p * (1+r) where 1+r = (1+delta(1))*...*(1+delta(n))
Ideally, 1+r is as close to 1 as possible. How close is it?
Simple bounds:
(1-epsilon)^n <= 1+r = (1+delta(1))*...*(1+delta(n)) <= (1+epsilon)^n
so
(1-epsilon)^n - 1 <= r <= (1+epsilon)^n - 1
How to approximate (1+-epsilon)^n?
Simplest way: Binomial theorem or Taylor expansion:
(1+-epsilon)^n = 1 +- n*epsilon + O((n*epsilon)^2)
so if |n*epsilon|<<1, get that
-n*epsilon + O((n*epsilon)^2 ) <= r <= n*epsilon + O((n*epsilon)^2 )
For example, n<=100, epsilon ~ 1e-16, means p* approximates p to at least
14 decimal places
Approximation ok if n*epsilon << 1 (see below)
To be rigorous, and not ignore O(epsilon^2) term, one can prove
Lemma: |r| <= n*epsilon / (1 - n*epsilon ) if n*epsilon < 1
Proof: homework
This is "worst case analysis", only happens if all delta(i) ~ epsilon,
or all delta(i) ~ -epsilon. This is unlikely, but possible, so often
answer is more accurate than worst case prediction (examples later)
From now on, we will write |r| <= n*epsilon, ignoring second order terms
In case of p*, if for example, epsilon = 1e-16 and n = 100, then
n*epsilon <= 1e-14 => p* good to 14 decimal places (details on homework)
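The bound is easy to check numerically (an illustration, not the homework
proof): generate random delta(i) with |delta(i)| <= eps and compare the
resulting r to the Lemma's bound.
    % Simulate n rounding factors (1+delta(i)), |delta(i)| <= eps, and compare
    % the accumulated relative error r to the bound n*eps/(1-n*eps)
    n = 100;
    delta = eps*(2*rand(n,1) - 1);   % random deltas, |delta(i)| <= eps
    r = prod(1 + delta) - 1;         % 1+r = (1+delta(1))*...*(1+delta(n))
    bound = n*eps/(1 - n*eps);
    [abs(r)  bound]                  % abs(r) is (usually much) smaller than bound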
Example 13: The program
s = 0; for i=1:n, s=s+x(i), end
computes the sum x(1)+x(2)+ ... +x(n)
What is the difference between the true s and the computed s*?
Approach: as before, write down a program that includes all
the rounding error explicitly, and evaluate it exactly:
s* = 0, for i=1:n, s* = (s* + x(i))*(1 + delta(i))
where delta(i) is the roundoff error in the i-th addition
This is trickier algebra, so the idea is to modify the program to
have a unique identifier for each intermediate value:
s*(0) = 0, for i=1:n, s*(i) = (s*(i-1) + x(i))*(1 + delta(i))
Multiply out a few terms to see pattern (skip induction proof)
s*(1) = x(1) ... no roundoff when adding 0
s*(2) = (x(1)+x(2))*(1+delta(2))
s*(3) = ((x(1)+x(2))*(1+delta(2)) + x(3))*(1+delta(3))
= x(1)*(1+delta(2))*(1+delta(3))
+ x(2)*(1+delta(2))*(1+delta(3))
+ x(3)*(1+delta(3))
s*(4) = x(1)*(1+delta(2))*(1+delta(3))*(1+delta(4))
+ x(2)*(1+delta(2))*(1+delta(3))*(1+delta(4))
+ x(3)*(1+delta(3))*(1+delta(4))
+ x(4)*(1+delta(4))
...
s*(k) = x(1)*(1+delta(2))*...*(1+delta(k))
+ x(2)*(1+delta(2))*...*(1+delta(k))
+ x(3)*(1+delta(3))*...*(1+delta(k))
+ x(4)*(1+delta(4))*...*(1+delta(k))
+ ...
+ x(k)*(1+delta(k))
so s*(n) = sum from i=1 to n of x*(i) where
x*(i) = x(i)*(1+r(i)) where |r(i)| <= (n-1)*epsilon
In words, s*(n) is the exact sum of a perturbed set of
summands x*(i), each of which is a small relative
perturbation of x(i)
What does this mean?
First, suppose all the x(i) are positive; then claim
that s*(n) = s(n)*(1+delta), where |delta| <= (n-1)*epsilon
Proof:
s*(n) - s(n) = sum x(i)*(1+r(i)) - sum x(i)
= sum x(i)*r(i)
<= sum x(i)*(n-1)*epsilon
= s(n)*(n-1)*epsilon,  so  s*(n) <= s(n)*(1+(n-1)*epsilon)
and similarly s*(n) >= s(n)*(1-(n-1)*epsilon)
What if not all x(i) > 0? then can only show that
| s*(n) - s(n) | = | sum_i x(i)*r(i) | <= (n-1)*epsilon* sum_i |x(i)|
so error bound same as if all x(i)>0, and so the relative error is
| s*(n) - s(n) | / | s(n) |
<= (n-1)*epsilon * [ sum |x(i)| / | sum x(i) | ]
Thus, if |s(n)| = | sum x(i) | << sum |x(i)|, i.e. if there
is a lot of "cancellation" in the sum, then there may be a large
relative error:
Ex: fl((x+1e16) - 1e16) = 0, not x, for any 0 < |x| <= 1
(the spacing of floating point numbers near 1e16 is 2, about eps*1e16),
so the relative error is 100%.
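The same thing in Matlab (the true sum below is exactly 0.5, but the computed
sum is 0):
    % Cancellation demo: exact sum is 0.5, computed sum is 0 (100% relative error)
    x = [1e16  0.5  -1e16];
    s = 0;  for i=1:3,  s = s + x(i);  end
    s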
What about polynomial root finding, which involves summing and multiplying?
As we approach the root, we are in precisely the situation where the
value of the polynomial (a sum) is supposed to be near zero, i.e. we
expect large relative errors. The analysis is similar to summing:
Example 14: Horner's rule for polynomial evaluation (used in polyval)
... compute p = sum from i=0 to n of c(i)*x^i
p = c(n); for i=n-1:-1:0, p = x*p + c(i)
Using same approach as for rounding error analysis in Example 13, we get
p*(n) = c(n);
for i=n-1:-1:0,
p*(i) = (x*p*(i+1)*(1+eps(i)) + c(i))*(1+delta(i))
end
with |eps(i)| <= epsilon, |delta(i)| <= epsilon
p*(n) = c(n) ... no roundoff
p*(n-1) = (x * p*(n) * (1+eps(n-1)) + c(n-1)) * (1+delta(n-1))
= [x * c(n)] * (1+eps(n-1))*(1+delta(n-1))
+ c(n-1) * (1+delta(n-1))
p*(n-2) = ( (x * p*(n-1)) * (1+eps(n-2)) + c(n-2)) * (1+delta(n-2))
= [x^2 * c(n)] *
(1+eps(n-1))*(1+eps(n-2))*(1+delta(n-1))*(1+delta(n-2))
+[x * c(n-1)] *
(1+eps(n-2))*(1+delta(n-1))*(1+delta(n-2))
+ c(n-2) *
(1+delta(n-2))
p*(n-k) = [x^k * c(n)] *
(1+eps(n-1))*...*(1+eps(n-k)) *
(1+delta(n-1))*...*(1+delta(n-k))
+[x^(k-1) * c(n-1)] *
(1+eps(n-2))*...*(1+eps(n-k)) *
(1+delta(n-1))*...*(1+delta(n-k))
+[x^(k-2) * c(n-2)] *
(1+eps(n-3))*...*(1+eps(n-k)) *
(1+delta(n-2))*...*(1+delta(n-k))
...
+ [c(n-k)] * (1+delta(n-k))
Note that the maximum number of terms of the form (1+tiny) multiplied
together is 2*n. Thus p*(0) = sum from i=0 to n of c*(i)*x^i where
c*(i) = c(i)*(1+r(i)) where |r(i)| <= 2*n*epsilon
That is, the computed value p* of the polynomial is
the exact value of another polynomial evaluated at the same point x,
but with different coefficients c*(i), each of which
differs from the original c(i) by a small relative change r(i).
Furthermore, we can write the error bound as
| p* - p | = | sum from i=0 to n c*(i)*x^i - c(i)*x^i |
= | sum from i=0 to n [c*(i) - c(i)]*x^i |
= | sum from i=0 to n [r(i)*c(i)]*x^i |
<= sum from i=0 to n |r(i)*c(i)*x^i|
because r(i) can be positive or negative,
<= 2*n*epsilon * sum from i=0 to n |c(i)|*|x|^i
= 2*n*epsilon * value of a polynomial with
coefficients |c(i)| at |x|,
i.e. all positive numbers, without cancellation
In matlab, if p = polyval(coef,x), then
| p* - p | <= 2*n*epsilon * polyval(abs(coef),abs(x))
In practice, factor 2*n is pessimistic, and often omitted.
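Here is one way a polyvalbnd-style routine could implement exactly this bound;
the course's actual polyvalbnd may differ in details (for example, whether the
pessimistic 2*n factor is kept):
    function [val, errorbnd] = polyvalbnd(coef, x)
    % Evaluate a polynomial and a bound on the roundoff error in its value.
    % coef: coefficients as used by polyval (highest power first); x: point(s).
    n = length(coef) - 1;                            % degree of the polynomial
    val = polyval(coef, x);                          % Horner's rule (with roundoff)
    errorbnd = 2*n*eps*polyval(abs(coef), abs(x));   % bound on |computed - true|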
Example: [val,errorbnd]=polyvalbnd(coef,x)
[x,val,errorbnd,errorbnd./val] for x=(1:.2:3)'
when relative error bound = errorbnd./val > 1, can't trust
sign in zero finder
Example: [xout,HornerVal,TrueVal,AbsErrorBound,TrueAbsError]=
polyplot(roots,x)
[...]=polyplot(2*ones(13,1),1:.001:3);
[...]=polyplot(1:20,-5:.001:25);
Using this error bound in a zero finder, like bisection:
1) compute both f(xmid) and an error bound on f(xmid)
2) stop when error bound > |f(xmid)|,
because then sign of f(xmid) in doubt, might as well be zero
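A sketch of bisection modified this way, using the polyvalbnd-style bound
above (bisectbnd is just an illustrative name, not an official routine; it
assumes f(xlo) and f(xhi) have opposite signs):
    function [xlo, xhi] = bisectbnd(coef, xlo, xhi)
    % Bisection on the polynomial with coefficients coef, returning a final
    % bracketing interval; stops when roundoff makes the sign of f(xmid)
    % meaningless, i.e. when the error bound exceeds |f(xmid)|.
    flo = polyval(coef, xlo);
    while xhi - xlo > eps*max(abs(xlo), abs(xhi))
        xmid = (xlo + xhi)/2;
        [fmid, errbnd] = polyvalbnd(coef, xmid);
        if errbnd >= abs(fmid)
            break                      % sign of f(xmid) in doubt: might be zero
        end
        if sign(fmid) == sign(flo)
            xlo = xmid;  flo = fmid;   % zero lies in [xmid, xhi]
        else
            xhi = xmid;                % zero lies in [xlo, xmid]
        end
    end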
End of floating point arithmetic details and error analysis, for now:
Big picture to remember:
value of polynomial computed with roundoff is the exact
value of a slightly perturbed polynomial; this is an
example of a general phenomenon called "backward stability"
that is widely used to analyze the impact of roundoff
on an algorithm
(Warning: this is the first of several uses of the word
"stability" in this course. It has different meanings
in different contexts, so be aware!)
Here is generally how we will analyze algorithms:
Let y=f(x) be the true mathematical function we'd like to evaluate
Ex: y = sin(x), or
y = value of polynomial evaluated at a point, in which case
x is a vector of values (polynomial coefficients and
argument of the polynomial)
y = root of a polynomial, in which case x is as above
y = solution of a system of equations, in which case
y is a vector, and x includes all the coefficients
of the system of equations
Let alg(x) be the result of the algorithm in the machine, including
all round off errors and any other approximations we make
Then we want to bound the error = f(x) - alg(x)
We do this in two steps, by analyzing
1) backward stability
2) conditioning
Both terms have different meanings in different mathematical contexts,
so beware
For simplicity at first, suppose x and y are scalars
1) "Backward Stability" :
Suppose we can show that alg(x) = f(x+ dx)
where |dx| << |x|, then we say that the algorithm is "stable"
because we get the exact answer for a slightly wrong problem
x + dx
(Terminology: dx also called the "backward error")
Examples: Evaluating sin(x), other built-in functions is stable.
Evaluating a sum is stable because we get the exact sum
of a slightly different set of summands (see last time)
Evaluating a polynomial is stable, because we get the
exact value of a slightly different polynomial
f(x)=x, alg(x) = (x+1e100)-1e100 is unstable, because
we get 0 for |x|<1e84, and
alg(x) = 0 != x+dx = f(x+dx) for any |dx|<<|x| when x != 0
2) "Conditioning"
By Taylor's theorem,
f(x + dx) = f(x) + dx*f'(x) + O(dx^2)
Thus if f'(x) is large, we expect f(x+dx) - f(x) to be large,
else small.
Since we care about relative errors, we often write this in
the equivalent way as
[ f(x+dx) - f(x) ] / f(x) = ( dx / x ) * ( x * f'(x) / f(x) )
or
relative change in f = relative change in x *
"condition number"
So putting it all together, we get
[ alg(x) - f(x) ] / f(x) ~ ( dx / x ) * ( x * f'(x) / f(x) )
or, taking absolute values
| [ alg(x) - f(x) ] / f(x) | ~ | dx / x | * | x * f'(x) / f(x) |
relative error in f = relative change in x *
"condition number"
Thus we expect a small error if
the algorithm is stable (dx/x is small) and
the condition number is not too large
Having a small backward error dx, i.e. stability, depends on choosing a good algorithm.
The condition number only depends on f(x), not the algorithm.
Ex: consider f(x) = sin(x), then condition number = k(x) =
x*cos(x)/sin(x) = x*cot(x)
Consequence 1: k(x) is large when sin(x) is near 0 (but x is not; as x -> 0, k(x) -> 1)
Consequence 2: k(x) is large when x is large
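Both consequences are easy to see by plotting |k(x)| (an illustrative snippet,
with arbitrary plotting range):
    % Condition number of evaluating sin(x):  k(x) = x*cos(x)/sin(x)
    x = linspace(0.1, 50, 10000);
    k = x.*cos(x)./sin(x);
    semilogy(x, abs(k))            % spikes near multiples of pi, growth ~ |x|
    xlabel('x'), ylabel('|k(x)|')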
What about f(x,y)? or more variables?
Recall Taylor's Thm f:R^2 -> R
f(x + deltax, y + deltay)
= f(x,y) + df/dx(x,y) deltax + df/dy(x,y) deltay
+ second order terms
Proof: Use chain rule (ignoring second order terms)
d f(x(t),y(t))/ dt = df/dx dx/dt + df/dy dy/dt so
f(x(t+deltat),y(t+deltat))
= f(x,y) + df/dx dx/dt deltat + df/dy dy/dt deltat so
f(x(t) + dx/dt deltat, y(t) + dy/dt deltat)
= f(x,y) + df/dx dx/dt deltat + df/dy dy/dt deltat
write dx/dt deltat = deltax, dy/dt deltat = deltay to get result
More generally, f(x1 + deltax1, ... , xn + deltaxn) = f(x1,...,xn) +
sum from i=1 to n of df(x1,...,xn)/dxi * deltaxi
Def: vector delf = (df/dx1,...,df/dxn) called gradient of f
So rewrite Taylor's theorem as
f(x1 + deltax1 ,..., xn + deltaxn) = f(x1,...,xn) + delf * deltax
where deltax = (deltax1 , ... , deltaxn)
where the second term is the dot product of two vectors
Thus [ f(x1+deltax1,...,xn+deltaxn) - f(x1,...,xn) ] / f(x1,...,xn)
= delf * deltax / f(x1,...,xn)
Example: Evaluating a polynomial p(z) = sum from i=0 to d of c(i)*z^i
What are the parameters?
If only z changes, then n=1, x=z, delf = p'(z)
But from the rounding error analysis, only the coefficients change,
so n=d+1, x=[c(0),...,c(d)],
f(c(0),c(1),c(2),...,c(d)) = sum from i=0 to d of c(i)*z^i
delf = [ 1, z, z^2, ... , z^d]
deltax = [ deltac(0), deltac(1), ... , deltac(d)]
Thus [ f(x1+deltax1,...,xn+deltaxn) - f(x1,...,xn) ] / f(x1,...,xn)
= sum from i=0 to d of deltac(i)*z^i / sum from i=0 to d of c(i)*z^i
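A quick numerical check of this formula (the polynomial, point, and
perturbation size below are arbitrary illustrative choices):
    % Compare the actual relative change in p(z) with the first-order prediction
    % delf * deltax / p(z), using polyval's highest-power-first coefficient order
    c = poly([1 2 3]);                        % coefficients of (z-1)(z-2)(z-3)
    z = 2.5;
    deltac = 1e-10*c.*(2*rand(size(c)) - 1);  % small relative coefficient changes
    actual    = (polyval(c+deltac, z) - polyval(c, z)) / polyval(c, z);
    predicted = polyval(deltac, z) / polyval(c, z);
    [actual  predicted]                       % agree to several digits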
Example: Finding zeros of a polynomial
In other words f(c(0),...,c(n)) = root of polynomial with coeffs c(i)
How to differentiate it? Implicitly:
write p(x) = 0 and (p + deltap)(x+dx) = 0, solve for dx
where p(x) = sum c(i)*x^i
0 = (p + deltap)(x+dx) = sum (c(i)+deltac(i))*(x+dx)^i
= sum c(i)*(x+dx)^i + sum deltac(i)*(x+dx)^i
~ sum c(i)*(x^i + i*x^(i-1)*dx)
+ sum deltac(i)*(x^i + i*x^(i-1)*dx)
= sum c(i)*x^i + sum i*c(i)*x^(i-1) * dx
+ sum deltac(i)*x^i + sum i*deltac(i)*x^(i-1) * dx
= 0 + p'(x) * dx
+ deltap(x) + O(delta^2) ... we ignore last term
Thus,
dx = -deltap(x)/p'(x) to first order
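A quick sanity check of this formula (again with arbitrary illustrative
choices): perturb the coefficients of a polynomial with roots 1, 2, 3 and
compare the actual movement of the root near 2 with -deltap(2)/p'(2).
    % Check dx ~ -deltap(x)/p'(x) at the root x = 2 of (z-1)(z-2)(z-3)
    c = poly([1 2 3]);                        % coefficients, highest power first
    deltac = 1e-8*c.*(2*rand(size(c)) - 1);   % small relative perturbation
    r = roots(c + deltac);                    % roots of the perturbed polynomial
    [dist, j] = min(abs(r - 2));              % pick the perturbed root nearest 2
    actual_dx    = r(j) - 2;
    predicted_dx = -polyval(deltac, 2) / polyval(polyder(c), 2);
    [actual_dx  predicted_dx]                 % agree to several digits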
Matlab example to illustrate this:
ebnd = polyperturb(r,e,m)
takes the roots r of a polynomial,
forms the coefficients
for i=1 to m, perturbs coefficients by relative amount e,
computes the roots
plots them
illustrates pictorially how sensitive the roots are to
perturbations.
ebnd = polyperturb([1 2 3],1e-4, 300)
ebnd = polyperturb([1 2 3],1e-3, 300)
ebnd = polyperturb([1 2 3],2e-3, 300)
ebnd = polyperturbANS([1 2 3],2e-3, 300)
ebnd = polyperturb([1 2 3],2e-2, 300)
ebnd = polyperturb(1:10,1e-4, 300)
ebnd = polyperturbANS([1 2 2 3 3 3],1e-6, 300)
ebnd = polyperturbANS([1 2 2 3 3 3],1e-4, 300)
ebnd = polyperturbANS([1 2 2 3 3 3],1e-3, 300)
ebnd = polyperturb(2*ones(5,1),1e-12, 300)
why does it look like this?
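polyperturb and polyperturbANS are course-supplied; for experimenting on your
own, here is a sketch in the same spirit. The name mypolyperturb, the uniform
perturbation model, and the returned ebnd (a per-root first-order bound
e*sum|c(i)|*|r|^i / |p'(r)|, based on the dx formula above) are my choices and
may not match the course code.
    function ebnd = mypolyperturb(r, e, m)
    % Sketch in the spirit of polyperturb: form the polynomial with roots r,
    % perturb each coefficient m times by a relative amount of size at most e,
    % and plot all the perturbed roots in the complex plane.
    c = poly(r);
    hold off
    for i = 1:m
        deltac = e*c.*(2*rand(size(c)) - 1);   % relative perturbation, size <= e
        rr = roots(c + deltac);
        plot(real(rr), imag(rr), '.'),  hold on
    end
    plot(real(r), imag(r), 'o')                % the original roots
    hold off
    % Guess at ebnd: first-order bound |dx| <= e*sum|c(i)|*|r|^i / |p'(r)|
    % (infinite for multiple roots, where p'(r) = 0)
    ebnd = e*polyval(abs(c), abs(r(:))) ./ abs(polyval(polyder(c), r(:)));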