Math 128a - Spring 2002 - Lecture 1 - Jan 22 (Tuesday)

Name, class, Text, URL on board
Handout intro.ps
Read secs 1.1, 1.2 of book (calculus review)
Read secs 2.1, 2.2 (computer arithmetic)
Read sec 3.1 (bisection)
I expect you to read this ahead of time!

Why study this stuff? It is an essential tool in science and engineering.

Traditional version of the "scientific (and engineering) method":
  1) make a mathematical model of a physical system that predicts its behavior
  2) make laboratory measurements (in engineering you build something to measure)
  3) if the predictions are accurate enough, success! else fix the model

But this is not good enough for a lot of current problems:
  1) How will the climate change over the next 50 years, depending on the
     amount of greenhouse gases emitted?
       The equations are too complicated to make predictions by hand;
       the experiment is too slow, and only one is possible!
  2) How do proteins fold and interact with one another
     (how does a new drug attack a disease)?
       The equations are too complicated to make predictions by hand;
       it is too hard to measure what happens at the molecular level.
  3) Design a computer chip with 10^8 transistors quickly, so that you know
     it works the first time and your $10^9 factory is not wasting its time
     and resources.
       The chip is far too complicated to keep track of by hand;
       the cost of failure (a chip that doesn't work) is company failure.
       CAD = Computer Aided Design (about 10 faculty in EECS at Berkeley;
       see bsac.berkeley.edu/cadtools/sugar for one project)
       (PowerPoint slides on SUGAR)

So in practice the way numerical analysis fits into this process is as follows:
  scientists (geophysicists, biochemists, electrical engineers) formulate a
    mathematical model of the system under study
  numerical analysts devise algorithms for solving the equations
  computer scientists build tools (eg math libraries, compilers) to make the
    algorithms run on a particular computer
    (for examples, see www.nersc.gov or www.npaci.edu)

Many algorithms we need have already been built into tools:
  elementary school: calculators for arithmetic
  calculus: Maple and Mathematica
  numerical analysis: interactive systems like Matlab,
    subroutine libraries like LAPACK, LSODE, etc.

So why do we still need to teach these subjects? How has teaching them
changed? (and it is still changing...)

Why study arithmetic?
  It is the first mathematical concept of all; it is hard to think or teach
  about anything else without it. Some believe we no longer need to spend so
  much time on drill and can do other things instead (controversial!).

Why study calculus?
  Most of science and engineering uses differential equations to describe
  how the world works (F=ma in physics, ME, CE; Kirchhoff's laws in EE), so
  just to talk about these things requires understanding their properties.
  So just formulating a problem that you can type into Mathematica requires
  knowing calculus.

  What about learning how to take integrals and derivatives by hand, when
  Mathematica can do them faster, with fewer errors?
    You need similar skills for more advanced problems.
    Mathematica/Maple can't solve all problems! Eg try Maple, as accessed
    via Matlab:
      syms x
      int(1/(x^5 - (x^4+x^3+x^2+x)/16 - 1/2),-1,1)
    The true answer is "does not exist", because of a pole at x ~ .9342.
    How could Maple have easily avoided this error?
    Abel's Theorem - no closed form formula exists for the roots of
    polynomials of degree >= 5 - means that this is an "impossible"
    (read "nonautomatable") problem.
    Most real-world problems do not have closed-form answers, so we need to
    approximate them, often numerically
    (see badintegral.m on the Matlab page on the class web page).

Why study numerical analysis?
  Essentially all the major algorithms we will study (and many more) are
  built into Matlab and similar systems. The nature of the problems we will
  solve is that they are so hard that no analytical solution exists, and we
  have no choice but to approximate them. For example, we could try to
  integrate a function numerically instead of analytically, or find its
  roots numerically.
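A quick way to see why that integral fails to exist: the denominator changes sign inside [-1,1], so the integrand has a pole there. The sketch below (Python rather than the Matlab of the notes) locates the pole using the same bisection idea this lecture introduces:

```python
def p(x):
    # Denominator of the integrand: x^5 - (x^4+x^3+x^2+x)/16 - 1/2
    return x**5 - (x**4 + x**3 + x**2 + x) / 16 - 0.5

# p(0) = -0.5 < 0 and p(1) = 0.25 > 0, so p has a root inside [-1,1]:
# the integrand 1/p has a pole there, and the integral does not exist.
lo, hi = 0.0, 1.0
while hi - lo > 1e-12:
    mid = (lo + hi) / 2
    if p(mid) * p(lo) <= 0:
        hi = mid      # sign change in [lo, mid]
    else:
        lo = mid      # sign change in [mid, hi]
print(lo)   # pole near x = 0.9342
```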
Approximations mean understanding accuracy is challenging:
  In Matlab, try  c = poly(2*ones(n,1)); sort(roots(c))  for n = 5, 10, 15
  In Matlab, try  c = poly((1:n)); sort(roots(c))  for n = 5, 10, 20, 40

We always face the following tradeoff: for any problem there are several
possible approximations, and the more accurate the approximation, the more
expensive it is. Typically the cost of an approximation rises as its
accuracy increases, so we have to know how to pick an approximation that is
accurate enough, but not so accurate that it is too expensive.

So we will spend our time studying
  1) approximation algorithms for various problems that we cannot solve exactly
  2) analysis of these algorithms (accuracy and complexity = cost)

Final reason to study calculus, numerical analysis, or any mathematics:
because it is fun!

Topics:
  Extended example: solving f(x)=0 with bisection
  computer arithmetic, the first approximation we must all live with
  more on solving nonlinear equations
  interpolation and approximation
  numerical integration (quadrature)
  numerical solution of ODEs
  solving linear systems of equations
  available mathematical software

Start with bisection to solve f(x)=0, as implemented in Matlab (see bisect.m):
  Basic idea: bisect an interval in which f(x) changes sign.
  Stopping criterion: stop when the interval is narrow enough.

How the Matlab code works:
  bisect: pass in the function name as a string
  bisect1: always call func
  bisect_anim: pass in the function name as a string, animate

  inputs:  xlower, xupper, tol, and a function f(x) such that f(xlower) and
           f(xupper) have different signs, so that f(x) has a zero in
           [xlower, xupper]
  outputs: updated xlower, xupper such that f(x) still has different signs
           at the endpoints and width = xupper - xlower < tol

  fupper = f(xupper);
  flower = f(xlower);
  width = xupper - xlower;
  iters = 0;
  while (width > tol)
      xmid = (xlower + xupper)/2;
      fmid = f(xmid);
      if (fmid*flower <= 0)
          % f changes sign on [xlower, xmid]
          xupper = xmid;
          fupper = fmid;
      else
          % f changes sign on [xmid, xupper]
          xlower = xmid;
          flower = fmid;
      end
      width = xupper - xlower;
      iters = iters + 1;
  end

Example 1: f(x) = cos(x) on [1,2], tol = 1e-15, works fine
  [xout,fn,iters]=bisect('cos',[1,2],1e-15,0)
  What should the answer be?
  Animate the algorithm via:
  [xout,fn,iters]=bisect_anim('cos',[1,2],1e-15)

What is the complexity? Count the number of function evaluations (calls to
cos in this case), since that is probably the most expensive operation and
everything else is proportional to it:
  if i = number of iterations, then the width after i steps
    = original width / 2^i
  so we need  original width / 2^i <= tol
  or  i >= log_2(original width / tol)
  or  i = ceiling(log_2(original width / tol)) = O(log(original width / tol))
Review what f(n) = O(g(n)) means (section 1.2 of the text).

Example 2: f(x) = cos(x) on [1,2], tol = 1e-16, apparent infinite loop
  [xout,fn,iters]=bisect('cos',[1,2],1e-16,0)

Example 3: f(x) = cos(x) on [1,3.1], tol = 1e-16, debug on
  [xout,fn]=bisect('cos',[1,3.1],1e-16,1)
  We see that xlower and xupper stagnate, with the width fixed at ~2.2e-16.

Example 4: f(x) = cos(x) on [1,3.1], tol = 1e-16, animate
  [xout,fn]=bisect_anim('cos',[1,3.1],1e-16)
  We see that the numbers are discrete, and we can't expect to get
  tol < 2.220446..e-16 = 2^-52.

Floating point numbers:
  scientific notation; use of binary
  IEEE standard format for double precision numbers
  History of the IEEE standard, association with Berkeley,
    Prof. W. Kahan as chief designer
  1 bit sign, 11 bit exponent e (0 <= e <= 2^11 - 1 = 2047),
    52 bit fraction f (0 <= f < 1)
  Interpretation depends on e, f:

  0 < e < 2047 ("usual" case) means that x = +- 2^exp * (1+f),
    where exp = e - 1023.
    These are called "normalized" because the leading bit of (1+f) is one.
    This means there is a minimum spacing between adjacent floating point
    numbers, so bisection can't continue forever.
    Note the distance between adjacent nonzero floating point numbers in the
    range [1,2) is 2^(-52), which Matlab calls eps.
    More generally, eps*|a| >= |a - nextafter(a)| >= eps*|a|/2
    (this will be key to fixing the bisection algorithm below).
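These spacing facts are easy to check directly; a small sketch in Python (Python floats are IEEE doubles; math.nextafter needs Python >= 3.9):

```python
import math
import sys

# Spacing of floating point numbers just above 1 is eps = 2^-52
eps = sys.float_info.epsilon
print(eps == 2.0**-52)                        # True
print(math.nextafter(1.0, 2.0) - 1.0 == eps)  # True

# More generally, eps*|a| >= |a - nextafter(a)| >= eps*|a|/2
for a in [1.0, 1.5, 3.0, 1e10, 2.0**-300]:
    gap = math.nextafter(a, math.inf) - a
    assert eps * a >= gap >= eps * a / 2

# So bisection cannot shrink an interval below the local spacing:
# the midpoint of two adjacent floats rounds back to one of them.
a = 1.0
b = math.nextafter(a, 2.0)
mid = (a + b) / 2
print(mid == a or mid == b)   # True
```

This last experiment is exactly the stagnation seen in Examples 3 and 4 above.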
    (The leading-one convention makes the floating point representation
    unique.)

  Largest finite number = 2^(1023)*(1+(1-2^(-52))) ~ 1e308
    = "overflow threshold" (because something special must happen if a
    computed result would be bigger)
  Tiniest normalized positive number = 2^(-1022)*(1+0) ~ 1e-308
    = "underflow threshold" (because something special must happen if a
    computed result would be smaller and nonzero)

  e = 2047, f = 0 means x = +- infinity
    obeys reasonable rules (1/0 = inf, 7 + inf = inf, 37/inf = 0, etc.)
  e = 2047, f != 0 means x = NaN = Not a Number
    (arises from 0/0, inf/inf, inf - inf, sqrt(-1), NaN*5, etc.)
  e = 0 and f = 0 means x = +-0
    (why -0? because we want 1/(1/+-inf) = +-inf, not lose the sign, and
    other mathematical identities to hold)
  e = 0 and f != 0 means x is "subnormal": x = +- 2^(-1022) * (0+f)
    These are called "subnormal" because the leading bit of (0+f) is zero.
    Tiniest positive number = 2^(-1022)*2^(-52) = 2^(-1074) ~ 4.9e-324
    (subnormal)
  (illustrate with Matlab "hex format")
  Picture of the floating point number line

Exceptions:
  Overflow:       1e200^2 = 1e400, too big, get +Inf
  Underflow:      1e-308 / 100, get a subnormal number;
                  1e-308^2, get zero
  Divide-by-zero: 1/+-0, get +-Inf
  Invalid:        inf - inf, 0/0, sqrt(-1), inf/inf, get NaN
  Inexact:        a rounding error occurred (most of the time!), get the
                  "rounded" result - more on this soon
  For each exception, you always get a "sticky bit", a flag that says an
  exception has occurred since the bit was last reset.
  May continue computing (the default) or trap.

Unavoidability of rounding: almost any operation (sum, product, ...) on a
pair of floating point numbers yields a non-floating-point number as the
exact answer, so we have to round it off to store it back in the computer as
a floating point number. Fortunately, the default on nearly all computers is
to round to the nearest floating point number (the natural thing to do).
Tie-breaking for the half-way case: pick the answer whose bottom bit is 0.
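The thresholds and exception behaviors above can be observed from any language that uses IEEE doubles. A Python sketch (note Python departs from the IEEE defaults in a few places: 1/0.0, 0.0/0.0, and sqrt(-1) raise exceptions instead of returning Inf or NaN):

```python
import math
import sys

# Overflow threshold ~ 1e308; exceeding it gives +Inf
print(sys.float_info.max)    # 1.7976931348623157e+308
print(1e200 * 1e200)         # inf: overflow

# Underflow threshold: tiniest normalized positive number is 2^-1022 ~ 1e-308
print(sys.float_info.min == math.ldexp(1.0, -1022))   # True

# Tiniest subnormal number is 2^-1074 ~ 4.9e-324; halving it gives zero
tiny = math.ldexp(1.0, -1074)
print(tiny)        # 5e-324
print(tiny / 2)    # 0.0: underflow to zero

# Invalid operations give NaN
print(math.isnan(math.inf - math.inf))   # True

# -0 exists and keeps its sign
print(math.copysign(1.0, -0.0))          # -1.0
```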
More on analyzing roundoff error later; we know enough now to analyze the
infinite loop.

Explanation of Examples 3, 4: if xlower and xupper are adjacent floating
point numbers, then xmid rounds to xlower or xupper, and no progress is made
(why? homework).

Note: consider 3-digit decimal arithmetic: fl((.999+.997)/2) = 1.00, so
(xlower+xupper)/2 does not even have to be in the interval [xlower, xupper]!
This can't happen in binary (homework).

Possible fixes for the bisection algorithm:
  stop if xmid = xlower or xmid = xupper
  keep tol > eps*root, but then we would need to know root = answer
  change to a relative stopping criterion:
    width < reltol*(largest number in interval)

Example 5: f(x) = x on [-1, 1], reltol = 1e-15
  yields an interval not containing 0, with negative values at both
  endpoints; why? (hint: the product fmid*flower can underflow to 0)
Fix for the bisection algorithm: replace the test fmid*flower <= 0 by
  sign(fmid)*sign(flower) <= 0
Now it goes into an infinite loop; fix by having an absolute lower bound on
the width.

Example 6: f(x) = (x-2)^13 on [1.1, 3.2], tol = 1e-15
  [xout,fn,iters]=bisect('func',[1.1, 3.2], 1e-15, 0)

Example 7: f(x) = (x-2)^13 evaluated using Horner's rule (polyval in
  Matlab), tol = 1e-10
  [xout,fn,iters]=bisect('func',[rand(1), 3.2], 1e-10, 0)
  Observe that the answer changes a lot depending on rand(1).

Example 8: f(x) = (x-2)^13 evaluated using Horner's rule, tol = 1e-15,
  animate
  [xout,fn,iters]=bisect_anim('func',[rand(1), 3.2], 1.e-15)

Example 9: x=(1:.001:3); y=func(x); plot(x,y), axis([1 3 -1e-8 1e-8])
  Roundoff makes the answer "junk", but with structure:
  x=2+(0:2000)*2*eps; y=func(x); plot(y,'.')

Next time: understand why it looks like this, and how to fix the algorithm.
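A preview of what Example 9 shows, sketched in Python (this expands (x-2)^13 and evaluates it by Horner's rule, playing the role of Matlab's polyval; the helper names are illustrative, not from the class code):

```python
import math

# Coefficients of (x-2)^13 expanded, highest degree first:
# by the binomial theorem, c_k = C(13,k) * (-2)^k
coeffs = [math.comb(13, k) * (-2)**k for k in range(14)]

def horner(c, x):
    """Evaluate the polynomial with coefficients c (highest degree first)
    at x by Horner's rule, like Matlab's polyval."""
    b = 0.0
    for ck in c:
        b = b * x + ck
    return b

# Near x = 2 the true value (x-2)^13 is astronomically small (about 1e-52
# at x = 2.0001), but Horner's rule on the expanded form returns roundoff
# "junk" on the order of eps * sum_k |c_k| |x|^k, many orders of magnitude
# larger than the true values.
xs = [2 + k * 1e-4 for k in range(-50, 51)]
worst = max(abs(horner(coeffs, x) - (x - 2)**13) for x in xs)
print(worst)
```

This is why bisection on the Horner-evaluated form (Examples 7 and 8) returns a different "root" on every run: near x = 2 the computed function values are dominated by this roundoff noise, whose sign is essentially random.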