Math 128a - Spring 2002 - Lecture 1 - Jan 22 (Tuesday)
Name, class, Text, URL on board
Handout intro.ps
Read secs 1.1, 1.2 of book (calculus review)
Read secs 2.1, 2.2 (computer arithmetic)
Read sec 3.1 (bisection)
I expect you to read this ahead of time!
Why study this stuff? It is an essential tool in science and engineering:
Traditional version of "scientific (and engineering) method":
1) make mathematical model of physical system that predicts behavior
2) make laboratory measurements
(in engineering you build something to measure)
3) if predictions accurate enough, success! else fix model
But this is not good enough for a lot of current problems:
1) How will the climate change over the next 50 years,
depending on the amount of greenhouse gases emitted?
equations too complicated to make predictions by hand;
experiments too slow, and only one is possible!
2) how proteins fold and interact with one another
(how a new drug attacks a disease)
equations too complicated to make predictions by hand;
too hard to measure what happens at molecular level
3) Design a computer chip with 10^8 transistors quickly,
so that you know it works the first time and your $10^9
factory does not waste its time and resources
chip far too complicated to keep track of by hand;
cost of failure (chip doesn't work) is failure of the company
CAD = Computer Aided Design
(about 10 faculty in EECS at Berkeley,
see bsac.berkeley.edu/cadtools/sugar for one project)
(power point slides on SUGAR)
So in practice the way numerical analysis fits in this process is
as follows:
scientists (geophysicists, biochemists, electrical engineers)
formulate mathematical model of system under study
numerical analysts devise algorithms for solving equations
computer scientists build tools (eg math libraries, compilers)
to make algorithms run on a particular computer
(for examples, see www.nersc.gov or www.npaci.edu)
Many algorithms we need have already been built into tools:
elementary school: calculators for arithmetic
calculus: Maple and Mathematica
numerical analysis: interactive systems like Matlab,
subroutine libraries like LAPACK, LSODE, etc.
So why do we still need to teach these subjects? How has teaching them changed?
(and still changing...)
Why study arithmetic? It is the first mathematical concept of all; without
it, it is hard to think or teach about anything else
Some believe we no longer need to spend so much time on drill
and can do other things instead (controversial!)
Why study calculus? Most of science and engineering uses differential
equations to describe how the world works
(F=ma in physics, ME, CE; Kirchhoff's laws in EE)
so just to talk about these things
requires understanding their properties
So just formulating a problem that you can type into
Mathematica requires knowing calculus
What about learning how to take integrals and derivatives
by hand, when Mathematica can do them faster, with fewer
errors? We need similar skills for more advanced problems;
Mathematica/Maple can't solve all problems!
eg Try Maple, as accessed via Matlab:
syms x
int(1/(x^5 - (x^4+x^3+x^2+x)/16 - 1/2), x, -1, 1)
true answer: the integral does not exist, because the integrand has a
pole at x ~= .9342 inside [-1,1]
How could Maple have easily avoided this error?
Abel's Theorem - no closed form formula exists for roots
of polynomials of degrees >= 5 - means that
this is an "impossible" (read "nonautomatable")
problem
Most real-world problems do not have closed-form answers,
so we need to approximate them, often numerically
(see badintegral.m on Matlab page on class web page)
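One way to see where the pole is without Maple is to find the real
roots of the denominator numerically. A minimal sketch (my code, not
from the notes; the coefficient vector comes from multiplying the
denominator through by 16):
    % denominator x^5 - (x^4+x^3+x^2+x)/16 - 1/2, scaled by 16:
    r = roots([16 -1 -1 -1 -1 -8]);
    r(abs(imag(r)) < 1e-12)    % the one real root, approximately 0.9342,
                               % lies inside the integration interval [-1,1]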
Why study numerical analysis? Essentially all the major algorithms we will
study (and many more) are built into Matlab and similar
systems.
The nature of problems we will solve is that they
are so hard that no analytical solution exists,
and we have no choice but to approximate them.
For example, we could try to integrate a function
numerically instead of analytically, or find its
roots numerically.
Approximations mean understanding accuracy is challenging:
In Matlab, try c = poly(2*ones(n,1)); sort(roots(c)) for
n = 5, 10, 15
In Matlab, try c = poly(1:n); sort(roots(c)) for
n = 5, 10, 20, 40
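A sketch wrapping these experiments in loops (my wrapper; the calls
themselves are the ones above):
    for n = [5 10 15]
        c = poly(2*ones(n,1));   % coefficients of (x-2)^n
        sort(roots(c))           % computed roots scatter around 2 as n grows
    end
    for n = [5 10 20 40]
        c = poly(1:n);           % polynomial with roots 1, 2, ..., n
        sort(roots(c))           % roots become visibly wrong by n = 20
    end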
We always face the following tradeoff:
for any problem there are several possible
approximations, and typically the cost of an
approximation rises as its accuracy increases, so
we have to know how to pick an approximation
that is accurate enough, but not so accurate
that it is too expensive.
So we will spend our time studying
1) approximation algorithms for various problems that we cannot
solve exactly
2) analysis of these algorithms (accuracy and complexity = cost)
Final reason to study calculus, numerical analysis, or any mathematics:
because it is fun!
Topics: Extended example: solving f(x)=0 with bisection
computer arithmetic, first approximation we must all live with
more on solving nonlinear equations
interpolation and approximation
numerical integration (quadrature)
numerical solution of ODEs
solving linear systems of equations
available mathematical software
Start with Bisection to solve f(x)=0, as implemented in Matlab (see bisect.m)
Basic idea of bisecting an interval in which f(x) changes sign
Stopping criterion: when interval narrow enough
How matlab code works
bisect: pass in function name as string
bisect1: always call func
bisect_anim: pass in function name as string, animate
inputs: xlower, xupper, tol, and a function f(x) such that f(xlower) and
f(xupper) have different signs, so that f(x) has a zero in [xlower, xupper]
outputs: updated xlower, xupper such that f still has different signs at the
two endpoints and width = xupper - xlower < tol
fupper = f(xupper);
flower = f(xlower);
width = xupper - xlower;
iters = 0;
while (width > tol)
    xmid = (xlower + xupper)/2;
    fmid = f(xmid);
    if (fmid*flower <= 0)
        % f changes sign on [xlower, xmid]
        xupper = xmid;
        fupper = fmid;
    else
        % f changes sign on [xmid, xupper]
        xlower = xmid;
        flower = fmid;
    end
    width = xupper - xlower;
    iters = iters + 1;
end
Example 1: f(x) = cos(x) on [1,2], tol = 1e-15, works fine
[xout,fn,iters]=bisect('cos',[1,2],1e-15,0)
what should the answer be? (the only root of cos in [1,2] is pi/2 ~= 1.5708)
animate algorithm via:
[xout,fn,iters]=bisect_anim('cos',[1,2],1e-15)
What is complexity?
Count the number of function evaluations (calls to cos in this case),
since that is probably the most expensive operation and everything
else is proportional to it: if n = number of iterations (one call per
iteration, plus the two initial calls), then
width after n steps = original width / 2^n
so we need
original width / 2^n <= tol
or
n >= log_2 (original width / tol)
or n = ceiling( log_2 (original width / tol) ) = O( log (original width / tol) )
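As a sanity check (my addition), for Example 1 below the interval [1,2]
has original width 1 and tol = 1e-15, so the predicted count is
    iters_pred = ceil(log2((2-1)/1e-15))   % = 50 bisection steps, i.e.
                                           % 50 calls plus the 2 initial ones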
Review what f(n) = O(g(n)) means (section 1.2 of text)
Example 2: f(x) = cos(x) on [1,2], tol = 1e-16, apparent infinite loop
[xout,fn,iters]=bisect('cos',[1,2],1e-16,0)
Example 3: f(x) = cos(x) on [1,3.1], tol = 1e-16, debug on
[xout,fn]=bisect('cos',[1,3.1],1e-16,1)
we see that xlower and xupper stagnate, width fixed at ~2.2e-16
Example 4: f(x) = cos(x) on [1,3.1], tol = 1e-16, animate
[xout,fn]=bisect_anim('cos',[1,3.1],1e-16)
see that the floating point numbers are discrete, and we can't expect
to get tol < 2.220446...e-16 = 2^(-52)
Floating point numbers:
scientific notation
use of binary
IEEE standard format for double precision numbers
History of the IEEE standard,
its association with Berkeley, Prof. W. Kahan as chief designer
1 bit sign,
11 bit exponent e (0 <= e <= 2^11-1 = 2047),
52 bit fraction f (0 <= f < 1)
Interpretation depends on e, f:
0 < e < 2047 ("usual" case) means that
x = +- 2^exp * (1+f) where exp = e-1023
the finite fraction f means that there is a minimum spacing between
adjacent floating point numbers, so bisection
can't continue forever
Note the distance between adjacent floating point numbers
in the range [1,2) is 2^(-52), which Matlab calls eps
More generally, if nextafter(a) denotes the next floating point number after a,
eps*|a| >= |a - nextafter(a)| >= eps*|a|/2
(this will be key to fixing the bisection algorithm below)
These are called "normalized" because the leading bit of (1+f) is one
(this makes the floating point representation unique)
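A quick check at the Matlab prompt (nextafter is not built into Matlab,
so this just exhibits the spacing directly):
    eps                 % 2.2204e-16 = 2^(-52), spacing of doubles in [1,2)
    (1 + eps) - 1       % = eps: 1+eps is the next double after 1
    (1 + eps/4) - 1     % = 0:   1+eps/4 rounds back down to 1
    2 + eps == 2        % true:  spacing in [2,4) is 2*eps, so 2+eps rounds to 2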
Largest finite number = 2^(1023)*(1+(1-2^(-52))) ~ 1e308
= "Overflow threshold"
(because something special must happen
if a computed result would be bigger)
Tiniest normalized positive number = 2^(-1022)*(1+0) ~ 1e-308
= "Underflow threshold"
(because something special must happen
if a computed result would be smaller and nonzero)
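Matlab exposes both thresholds directly:
    realmax                              % 1.7977e+308, the overflow threshold
    realmin                              % 2.2251e-308, the underflow threshold
    2^1023 * (2 - 2^(-52)) == realmax    % true
    2^(-1022) == realmin                 % true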
e = 2047, f = 0 means x = +- infinity
obeys reasonable rules (1/0 = inf, 7 + inf = inf, 37/inf = 0, etc.)
e = 2047, f != 0 means x = NaN = Not a Number
(arises from 0/0, inf/inf, inf - inf, sqrt(-1), NaN*5, etc.)
0 = e and f = 0 means x = +-0
(why -0? because want 1/(1/+-inf) = +-inf, not lose sign,
and other mathematical identities to hold)
0 = e and f != 0 means x is "subnormal"
x = +- 2^(-1022) * (0 + f)
These are called "subnormal" because the leading bit of (0+f) is zero.
Tiniest positive number = 2^(-1022)*2^(-52) = 2^(-1074) ~ 4.9e-324 (subnormal)
(illustrate with Matlab "hex format")
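For example (the hex digits below are the standard IEEE bit patterns;
Matlab's exact display of NaN may vary):
    format hex
    1              % 3ff0000000000000: sign 0, e = 1023, f = 0
    -2             % c000000000000000: sign 1, e = 1024, f = 0
    realmin/2^52   % 0000000000000001: the tiniest subnormal
    Inf            % 7ff0000000000000: e = 2047, f = 0
    NaN            % fff8000000000000: e = 2047, f ~= 0 (one of many NaN patterns)
    format short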
Picture of floating point number line
Exceptions:
Overflow - 1e200^2 = 1e400, too big, get +Inf
Underflow - 1e-308 / 100, get subnormal number
1e-308^2, get zero
Divide-by-zero - 1/+-0, get +-Inf
Invalid - inf-inf, 0/0, sqrt(-1), inf/inf, get NaN
Inexact - a rounding error occurred (most of the time!),
get "rounded" result - more on this soon
For each exception, always get a "sticky bit", a flag that says
an exception occurred since bit last reset
May continue computing (default) or trap
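Each of these is easy to trigger at the Matlab prompt (Matlab follows
the default continue-computing policy; it does not expose the sticky
flags directly):
    1e200^2        % overflow:       Inf
    1e-308/100     % underflow:      1.0000e-310, a subnormal
    1e-308^2       % underflow all the way to 0
    1/0            % divide-by-zero: Inf
    0/0            % invalid:        NaN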
Unavoidability of Rounding - almost any operation (sum, product,...)
on a pair of floating point numbers yields a nonfloating point number
as the exact answer, so we have to round it off to store it back in the
computer as a floating point number
Fortunately, the default on nearly all computers is to round it to
the nearest floating point number (the natural thing to do)
Tie-breaking for half-way case: pick answer with 0 bottom bit.
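Round-to-nearest-even can be seen directly (a small check I have added;
recall that 1 + eps/2 is exactly halfway between the adjacent doubles
1 and 1+eps):
    (1 + eps/2) == 1                 % true: the tie rounds to 1 (bottom bit 0)
    (1 + eps) + eps/2 == 1 + 2*eps   % true: this tie rounds UP to the
                                     %   neighbor with bottom bit 0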
More on analyzing roundoff error later; we know enough now to
analyze infinite loop:
Explanation of Examples 3, 4: if xlower and xupper are adjacent floating point
numbers, then xmid rounds to xlower or xupper, and no progress is made
(why? homework)
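A two-line demonstration (the proof is the homework; this just exhibits
the behavior):
    xlower = 1;  xupper = 1 + eps;   % adjacent floating point numbers
    xmid = (xlower + xupper)/2;
    xmid == xlower                   % true: the midpoint rounds back to xlower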
Note: consider 3-digit decimal arithmetic: fl((.999+.997)/2) = fl(fl(1.996)/2)
= fl(2.00/2) = 1.00, so the computed (xlower+xupper)/2 does not even
have to lie in the interval [xlower, xupper]! This can't happen
in binary (homework)
Possible fixes for the bisection algorithm:
stop if xmid = xlower or xmid = xupper
keep tol > eps*|root|, but then we would need to know the root (the answer!)
change to a relative stopping criterion (sketched below):
width < reltol*(largest number in interval)
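A sketch of bisection with the relative criterion (my rewrite of the
loop above; f, the interval, and reltol are chosen just for illustration):
    f = @cos;  xlower = 1;  xupper = 3.1;  reltol = 1e-15;
    flower = f(xlower);  width = xupper - xlower;
    while ( width > reltol * max(abs(xlower), abs(xupper)) )
        xmid = (xlower + xupper)/2;  fmid = f(xmid);
        if (fmid*flower <= 0)
            xupper = xmid;                   % sign change in [xlower, xmid]
        else
            xlower = xmid;  flower = fmid;   % sign change in [xmid, xupper]
        end
        width = xupper - xlower;
    end
    xlower, xupper                           % brackets pi/2 to relative accuracy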
Example 5: f(x) = x, in [-1 1], reltol = 1e-15
yields interval not containing 0, and negative values at both
endpoints; why?
Fix for bisection algorithm:
replace test fmid*flower <= 0 by sign(fmid)*sign(flower) <= 0
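The underflow behind Example 5, and why the fixed test works (the
magnitudes below are mine, for illustration):
    flower = -1e-170;  fmid = -1e-170;   % the SAME sign: no root between them
    fmid*flower                          % 1e-340 underflows to +0, so the old
                                         %   test <= 0 wrongly reports a sign change
    sign(fmid)*sign(flower)              % = 1: the new test gets it right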
Now it goes into an infinite loop; fix by having absolute lower bound on width
Example 6: f(x) = (x-2)^13, in [1.1,3.2], tol = 1e-15
[xout,fn,iters]=bisect('func',[1.1, 3.2], 1e-15, 0)
Example 7: f(x) = (x-2)^13 evaluated using Horner's rule (polyval in Matlab)
tol =1e-10
[xout,fn,iters]=bisect('func',[rand(1), 3.2], 1e-10, 0)
observe that answer changes a lot depending on rand(1)
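The helper func.m called in Examples 6-9 is not shown in these notes; a
plausible reconstruction, matching the description "(x-2)^13 evaluated
using Horner's rule (polyval)", would be:
    function y = func(x)
    % func.m (a guess): f(x) = (x-2)^13, expanded into its 14 monomial
    % coefficients, then evaluated by Horner's rule via polyval
    c = poly(2*ones(13,1));    % coefficients of (x-2)^13
    y = polyval(c, x);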
Example 8: f(x) = (x-2)^13 evaluated using Horner's rule, tol = 1e-10
[xout,fn,iters]=bisect_anim('func',[rand(1), 3.2], 1.e-15)
Example 9: x = 1:.001:3; y = func(x); plot(x,y), axis([1 3 -1e-8 1e-8])
round off makes answer "junk" but with structure
x=2+(0:2000)*2*eps;y=func(x);plot(y,'.')
Next time: understand why it looks like this, how to fix algorithm.