Math 55 - Spring 2004 - Lecture notes # 5 - Feb 3 (Tuesday)

Read: Sections 2.1-2.3
Note: we will not cover binary search, sorting, or greedy algorithms,
      which are covered elsewhere (CS61B)

Homework: Due Feb 11 in section.

1) Show that if the first 14 positive integers are placed around a circle
   in any order, there exist 5 integers in consecutive locations around
   the circle whose sum is 38 or greater.
   Hint: Use the result of question 1.5-71.

2) A real number is called "algebraic" if it is the root of some polynomial
   with integer coefficients and degree at least 1. Let A be the set of all
   algebraic numbers. So A includes sqrt(2), cuberoot((5-sqrt(2))/sqrt(3)),
   any other such expression with roots and integers, and many other real
   numbers besides. This exercise will show that A is countable.
   2.1) Show that if r is a root of a polynomial with rational coefficients,
        it is also a root of a polynomial with integer coefficients. So we
        won't miss any real numbers by restricting ourselves to polynomials
        with integer coefficients.
   2.2) Show that the set P_d of polynomials of degree d >= 1 with integer
        coefficients is countable. (A polynomial has degree d if it can be
        written as a_d*x^d + a_{d-1}*x^{d-1} + ... + a_1*x + a_0 with a_d
        nonzero.)
   2.3) Show that the set A_d of all real roots of all polynomials in P_d
        is countable.
   2.4) Show that the set A of all algebraic numbers is countable.
   2.5) Conclude that there are a great many more real numbers that are not
        roots of polynomials with integer coefficients than real numbers
        that are roots of such polynomials.

3) Simplify O(f(n)) where f(n) is given below. Your expression should be
   both as simple and as accurate as possible (it should not overestimate
   f(n) by more than a constant factor). All logarithms are base pi.
   f(n) = [ 9^(2^(n^.3)) - 2^(9^(n^.3)) ]
        * [ .99^(n^3) + log (log (log (log n ))) ]
        * [ (log (log n))^(log (log (log n))) + (log (log (log n)))^(log (log n)) ]
        * [ 3^n * n^4 - 4^n * n^3 ]
        * [ 89*n^4 + 1234 * n * (log n)^18 ]

4) Show that log_(1.75) 3.5 must be irrational.
   Hint: proof by contradiction

5) sec 2.2: 20, 36, 60, 62

6) sec 2.3: 8

7) Modify the algorithm in the last question (2.3-8) to compute the
   derivative of the given polynomial. How many additions and
   multiplications does your algorithm take (ignoring additions to
   increment the loop variable)?

Goals for today:
   Expressing algorithms
   How do we measure and compare running times?
   Big-O notation
   Introduction to Complexity Theory

Algorithms:

ASK&WAIT: What does this program do?

   prog1(integer n, integer array a(1),...,a(n))
      M = a(1)
      for i = 2 to n
         M = max(M, a(i))
      end for
      return M

ASK&WAIT: What does this program do?

   prog2(integer n, integer array a(1),...,a(n))
      for i = 1 to n
         M = a(i)
         for j = 1 to n except i
            if a(j) > M goto next
         end for
         return M
      next: end for

ASK&WAIT: Which program do you think is faster? How much faster?

How do we determine how long prog1 takes to run?

Approach 1: run it and measure the run time in seconds
ASK&WAIT: What are the pros and cons of this approach?

Approach 2: for each operation performed by the program, find out how many
seconds it takes, and add them all up:
   n-1: max
   n-1: increment i
   n-1: test to see if loop is done
   start-up cost of M = a(1), etc.
ASK&WAIT: What are the pros and cons of this approach?

Approach 3: say that it takes time proportional to n, i.e. about k*n for
some constant k, for n large enough that the start-up cost is small
ASK&WAIT: What are the pros and cons of this approach?

ASK&WAIT: How long does prog2 take to run, i.e. proportional to what
function of n? If it depends on the values of a(1),...,a(n), what is the
worst case? Is prog2 or prog1 faster, in the worst case?
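For reference, here is a quick Python sketch of prog1 and prog2 (just an
illustration of the pseudocode above, using Python's 0-based lists; the
comments record the operation counts discussed in class):

   # prog1: one pass over the array, about n-1 comparisons,
   # so time proportional to n
   def prog1(a):
       M = a[0]
       for x in a[1:]:
           M = max(M, x)
       return M

   # prog2: for each candidate a[i], scan the other entries to see whether
   # any of them is larger; if the array is increasing, the scan for a[i]
   # does about i comparisons, so the total is about n^2/2 comparisons
   def prog2(a):
       n = len(a)
       for i in range(n):
           M = a[i]
           if all(a[j] <= M for j in range(n) if j != i):
               return M

   print(prog1([3, 1, 4, 1, 5]))   # 5
   print(prog2([3, 1, 4, 1, 5]))   # 5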
Big-O notation

Motivation: given a complicated function f(n), which may represent how long
a program runs on a problem of size n (or how much memory it takes), we want
to quickly approximate it by a much simpler function g(n) that roughly
bounds how fast f(n) increases as a function of n.

Ex: Consider f(n) = (pi+7)/3*n^2 + n + 1 + sin(n). When n is large, n^2 is
so much larger than n + 1 + sin(n) that we would like to ignore these terms.
Furthermore, (pi+7)/3 is just a constant, and for many purposes we don't
care exactly what the constant is. So we'd like some notation to say that
f(n) grows like some constant times n^2: we will write "f(n) is O(n^2)".

We introduce notation to mean "proportional to n", or to any other g(n):

DEF: Let f and g map integers (or reals) to reals. We say that f(x) is
O(g(x)) (read "Big-O of g") if there are constants C>0 and k>=0 such that
|f(x)| <= C*|g(x)| whenever x>k.

Intuitively, if f(x) grows as x grows, then g(x) grows at least as fast.

EG: f(x) = 100*x and g(x) = x^2; then for x>100, g(x) > f(x), and
    f(x) = O(g(x)).

EG: if n>=1 then
       f(n) = (pi+7)/3*n^2 + n + 1 + sin(n)
           <= (pi+7)/3*(n^2 + n^2 + n^2 + n^2)
            = 4*(pi+7)/3*n^2
    so f(n) = O(n^2), with k=1 and C = 4*(pi+7)/3 in the definition.
    (Draw pictures to illustrate.)

Remark: Sometimes we write f(x) = O(g(x)), but this is misleading notation,
because f1(x) = O(g(x)) and f2(x) = O(g(x)) does not mean f1(x) = f2(x);
for example x = O(x) and 2*x = O(x).

EG: f(n) = run-time of prog1 for input of size n = O(n)

ASK&WAIT: what is the (worst case) running time of

   input: x, array a(1),...,a(n)
   found = false
   for i = 1 to n
      if x = a(i)
         found = true
         exit loop
      end if
   end for

In some of the most important applications of O(), we never have an exact
formula for f(n), such as when f(n) is the exact running time of a program.
In such cases all we can hope for is a simpler function g(n) such that f(n)
is O(g(n)). But to teach you how to simplify f(n) to get g(n), we will use
exact expressions f(n) as examples.

Goals of O() are
1) simplicity: O(n^2) is simpler than (pi+7)/3*n^2 + n + 1 + sin(n);
   O(n) is simpler than the actual run time of prog1
2) reasonable "accuracy":
   EG: Consider f(n) = (pi+7)/3*n^2 + n + 1 + sin(n).
       f(n) is both O(n^2) and O(n^3).
       ASK&WAIT: why?
       ASK&WAIT: Which is a "better" answer, to say f(n) is O(n^2) or
                 O(n^3), since both are true?
   EX: Suppose we have two programs for the same problem and want to pick
       the fastest. Suppose prog1 runs in exactly time 10*n and prog2 runs
       in time n^2, so prog1 is clearly faster when n>10. But if we are
       "sloppy" and say that both run in time O(n^2), then we can't
       compare them.

So we would like rules that make it easy to find a simple and accurate g(x)
with f(x) = O(g(x)) for complicated f(x), without ever needing explicit
values of C and k in the definition of Big-O.

Rule 1: if c is a constant, c*f(x) is O(f(x))
ASK&WAIT: what are C and k in the definition of O() that prove this?
EX: given any a,b > 0, we have log_a x = O(log_b x)
ASK&WAIT: why?

Rule 2: x^a = O(x^b) if 0 < a < b
ASK&WAIT: what are C and k in the definition of O() that prove this?

Rule 3: If f1(x) = O(g1(x)) and f2(x) = O(g2(x)), then f1(x)+f2(x) = O(h(x))
        where h(x) = max(|g1(x)|, |g2(x)|)
proof: |f1(x)| <= C1*|g1(x)| for x>k1 and |f2(x)| <= C2*|g2(x)| for x>k2
       means
       |f1(x)| + |f2(x)| <= C1*|g1(x)| + C2*|g2(x)|   for x > max(k1,k2)
                         <= C1*h(x) + C2*h(x)         for x > max(k1,k2)
                          = (C1+C2)*h(x)              for x > max(k1,k2)
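To make the definition concrete, here is a small Python spot-check (an
illustration, not a proof) that the constants C = 4*(pi+7)/3 and k = 1 from
the worked example above satisfy the Big-O inequality for a few values of n:

   from math import pi, sin

   def f(n):
       # the running example: f(n) = (pi+7)/3 * n^2 + n + 1 + sin(n)
       return (pi + 7) / 3 * n**2 + n + 1 + sin(n)

   C = 4 * (pi + 7) / 3    # the constants found in the example above
   k = 1

   # spot-check the inequality |f(n)| <= C*n^2 for several n > k
   for n in [2, 10, 1000, 10**6]:
       assert abs(f(n)) <= C * n**2
       print(n, f(n) / n**2)   # the ratio stays below C, approaching (pi+7)/3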
EX: let f(x) = a_k*x^k + a_{k-1}*x^{k-1} + ... + a_1*x + a_0 be a polynomial
    of degree k, i.e. a_k is nonzero. By Rule 1 each term a_j*x^j is O(x^j),
    so by Rule 2 each term is O(x^k), so by Rule 3 f(x) is O(x^k).
    In other words, for polynomials only the term with the largest exponent
    matters.

Rule 4: If f1(x) = O(g1(x)) and f2(x) = O(g2(x)), then f1(x)*f2(x) = O(h(x))
        where h(x) = g1(x)*g2(x)
ASK&WAIT: what are C and k in the definition of O() that prove this?
EG: f(x) = (x+1)*log(x-1) = f1(x)*f2(x) = O(x)*O(log x) = O(x*log x)

Rule 5: if f(x) = O(g(x)) and a>0, then (f(x))^a = O((g(x))^a)
ASK&WAIT: what are C and k in the definition of O() that prove this?

Rule 6: (log_c x)^a = O(x^b) for any a>0, b>0, c>0;
        in fact, the limit as x increases of (log_c x)^a / x^b is zero
Proof: By Rules 1 and 5 we can assume c is any convenient constant,
   say e = 2.71828...
   If we can show log x = O(x^(b/a)) for any a,b > 0, then taking the a-th
   power yields (log x)^a = O(x^b) (Rule 5).
   So it is enough to show log x = O(x^d) for any d > 0.
   First we show that as x increases, f(x) = log x / x^d decreases, once x
   is large enough: differentiate f(x) and show that for x large enough,
   f'(x) < 0:
      f'(x) = (x^d * (1/x) - d*x^(d-1)*log x) / x^(2*d)
            = x^(-d-1)*(1 - log x^d)
            < 0 if x > e^(1/d)
   Now we show that f(x) actually goes to zero as x increases. Since f(x)
   is decreasing, it is enough to show that there is an increasing sequence
   of values x(1), x(2), x(3), ... such that f(x(i)) -> 0.
   Let x(i) = e^(i/d). Then
      f(x(i+1)) = log e^((i+1)/d) / e^(i+1)
                = log e^((i/d)*(i+1)/i) / (e * e^i)
                = ((i+1)/i)/e * log e^(i/d) / e^i
                = ((i+1)/i)/e * f(x(i))
                <= (2/e) * f(x(i))   because (i+1)/i takes the values
                                     2 > 3/2 > 4/3 > ...
                < .74 * f(x(i))
                < .74^2 * f(x(i-1))
                ...
                < .74^i * f(x(1)) = .74^i * (1/(d*e)),
   which goes to zero as i increases.

ASK&WAIT: simplify O(log x + sqrt(x))
ASK&WAIT: simplify O((log n)^1000 + n^.001)

Rule 7: x^a = O(b^x) for any a>0, b>1
Proof: Just take logarithms (base 2, say) and apply Rule 6:
   log(x^a) = a*log x and log(b^x) = x*log b; we know that for large enough
   x, x*log b is larger than a*log x, so 2^(x*log b) = b^x is larger than
   2^(a*log x) = x^a.

EG: There is a standard list of functions that appears frequently when
computing running times of programs; you should recognize them and know
which grows faster than which:
   O(1)
   O(log n)   = time to find an element in a sorted list (binary search)
   O(n)       = time to find the maximum
   O(n*log n) = time to sort n numbers, using a good algorithm
   O(n^2)     = time to sort n numbers, using a dumb algorithm
   O(n!)      = O(1*2*3*...*n)

ASK&WAIT: Simplify O((n+1)*log(sqrt(4n-2)) + log((n!)^2))
ASK&WAIT: Simplify O(n^(log n) + 1.1^(sqrt n))
ASK&WAIT: simplify O((log n)^(log n) - n^(log log n))
ASK&WAIT: simplify O((2^n + n^3*log n)*(n^4 + (log n)^2) + 1.5^n*n^5)

Some more definitions related to Big-O:

DEF: We say f(x) is Big-Omega(g(x)) if g(x) is O(f(x)).
Motivation: use g(x) = O(f(x)) to say a constant*f(x) is an upper bound
            on g(x);
            use f(x) = Big-Omega(g(x)) to say a constant*g(x) is a lower
            bound on f(x).

DEF: We say f(x) and g(x) are of the "same order" (or f(x) = BIG_THETA(g(x)))
if f(x) = O(g(x)) and f(x) = Big-Omega(g(x)), i.e. for x > k there are
constants C1>0 and C2>0 such that C1*g(x) <= f(x) <= C2*g(x).

EX: 2*n^2 + n + 1 = O(n^3) but BIG_THETA(n^2), because
    n^2 <= 2*n^2 + n + 1 <= 4*n^2 for n>=1.

ASK&WAIT: If prog1 runs in time O(n) and prog2 runs in time O(n^2), then is
          prog1 faster than prog2 for large enough n?
ASK&WAIT: If prog1 runs in time BIG_THETA(n) and prog2 runs in time
          BIG_THETA(n^2), then is prog1 faster than prog2 for large enough n?
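To get a feel for how quickly the standard functions above separate, here is
a short Python table for a few (arbitrary) values of n; 2^n is also included
as an example of exponential growth, which comes up in the next topic:

   import math

   # tabulate the standard growth rates for a few sample values of n
   print("%6s %10s %12s %12s %22s %22s"
         % ("n", "log n", "n log n", "n^2", "2^n", "n!"))
   for n in [5, 10, 20]:
       print("%6d %10.2f %12.1f %12d %22d %22d"
             % (n, math.log2(n), n * math.log2(n), n**2, 2**n,
                math.factorial(n)))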
"Complexity Theory" is the study of which how fast certain problems can be solved, expressed in terms like "Given an input of size n, this algorithm will run in time O(f(n))" EX: Given a list L of n numbers, and one other number s, linear search will decide if s is in L in time O(n) EX: Given a sorted list L of n numbers, and one other number s, binary search will decide if s in in L in time O(log n) EX: Given a list of n numbers, we can sort it using insertion sort (same algorithm you'd use to sort by hand) in time O(n^2) ASK&WAIT: Do you know any other faster sorting algorithms? EX: Given a highway map labelled with distances between n towns, finding the shortest way to drive between every pair of towns (also called "all pairs shortest paths"), costs O(n^3) using the Floyd-Warshall algorithm (see CS170) DEF All these algorithms and many others are called "polynomial time algorithms" because they cost O(n^a) for some constant a. Here is another problem which cannot be solved in polynomial time as far as anyone knows: given any compound logical proposition such as q = p1 and p2 or not p3 .... using n proposition p1, p2, ... , pn, can q ever be True for any value of the p1,..,pn? This problem is called the "satisfiability" problem or SAT for short: can any values of p1,..,pn satisfy q, i.e. make it true? EX: if q = p1 and p2 and not p3 then setting p1 = True, p2 = True and p3 = False makes q = True EX: if q = (p1 or p2) and (not p1 or not p2) and (not p1 or p2) and (p1 or not p2) then no matter what values p1 and p2 have, q is False Here is an obvious algorithm to solve this problem: evaluate q for all possible values of p1, p2,..., pn if q is ever True then the answer is yes (q can be true) else no ASK&WAIT: What is the cost of this algorithm, at least? What may be surprising is that no significantly better algorithm is known, i.e. no algorithm that runs in polynomial time, O(n^a) for some constant a. They all run in "exponential time" (maybe faster than 2^n, but still exponential) One of the most famous open problem in mathematics (and computer science) is the question of whether any polynomial time algorithm for SAT can exist. The problem is also sometimes asked as "does P = NP or not?". Here P is the set of problem you can solve in polynomial time, and NP is a larger class including SAT. CS 170 and especially CS 172 talk about this question in more detail.