Math 55 - Fall 2007 - Lecture notes #10 - Sep 19 (Wednesday)

Goals for today:  (Finish Big-O)
                  Expressing algorithms in pseudocode
                  Using Big-O to measure, compare running times
                  Introduction to Complexity Theory

Algorithms:

ASK&WAIT: What does this program do?

   prog1(integer n, integer array a(1),...,a(n))
      M = a(1)
      for i = 2 to n
         M = max(M,a(i))
      end for
      return M

ASK&WAIT: What does this program do?

   prog2(integer n, integer array a(1),...,a(n))
      for i = 1 to n
         M = a(i)
         for j = 1 to n except i
            if a(j)>M goto next
         end for
         return M
   next: end for

ASK&WAIT: Which program do you think is faster? How much faster?

How do we determine how long prog1 takes to run?

Approach 1: run it and measure the run time in seconds.
ASK&WAIT: What are the pros and cons of this approach?

Approach 2: it takes time proportional to n, i.e. about k*n for some constant k, or O(n).
ASK&WAIT: What are the pros and cons of this approach?

How long does prog1 take to run, in a Big-O sense?
Label each line of code with its cost, then add up:

      M = a(1)                   O(1)
      for i = 2 to n             multiply cost of inside of loop by n
         M = max(M,a(i))         O(1)
      end for                    cost of loop = n*O(1) = O(n)
      return M                   O(1)
                                 total cost = O(1) + O(n) + O(1) = O(n)

How long does prog2 take to run, in a Big-O sense?

      for i = 1 to n             multiply inside cost by n
         M = a(i)                O(1)
         for j = 1 to n except i    multiply inside cost by n
            if a(j)>M goto next     O(1)
         end for                 cost of inner loop = n*O(1) = O(n)
         return M                O(1)
   next: end for                 cost = n*O(n) = O(n^2)

But the actual cost depends on the values of a(1),...,a(n), since the program may branch out of the inner loop early, and so finish faster.
ASK&WAIT: What is the best (fastest) case? What is the worst (slowest) case?
ASK&WAIT: Is prog2 or prog1 faster, in the worst case?
ASK&WAIT: Generally, if prog1 runs in time O(n) and prog2 runs in time O(n^2), then is prog1 necessarily always faster than prog2 for large enough n?
ASK&WAIT: If prog1 runs in time Theta(n) and prog2 runs in time Theta(n^2), then is prog1 faster than prog2 for large enough n?
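
To experiment with the two programs above, here is a minimal Python sketch. It is a translation made for illustration, not part of the original notes: it uses 0-based lists instead of a(1),...,a(n), assumes the list is non-empty, and adds a comparison counter (an assumption, used here as a rough stand-in for running time). Python's for/else plays the role of "goto next".

```python
def prog1(a):
    """Single scan; returns (result, number of comparisons)."""
    m = a[0]
    comparisons = 0
    for x in a[1:]:
        comparisons += 1
        if x > m:          # same effect as M = max(M, a(i))
            m = x
    return m, comparisons


def prog2(a):
    """Nested loops; returns (result, number of comparisons)."""
    comparisons = 0
    for i, m in enumerate(a):
        for j, x in enumerate(a):
            if j == i:     # "for j = 1 to n except i"
                continue
            comparisons += 1
            if x > m:      # "goto next": a[i] cannot be the answer
                break
        else:
            # inner loop finished without branching out
            return m, comparisons
    raise ValueError("prog2 needs a non-empty list")


if __name__ == "__main__":
    for n in (10, 100, 1000):
        increasing = list(range(n))
        print(n, prog1(increasing)[1], prog2(increasing)[1])
```

On an increasing list, prog1 makes n-1 comparisons while prog2 makes (n-1)(n+2)/2, so the gap grows roughly like n/2. If the largest element comes first, prog2 also stops after n-1 comparisons.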
"Complexity Theory" is the study of how fast certain problems can be solved, expressed in terms like
   "Given an input of size n, this algorithm will run in time O(f(n))"

EG: There is a standard list of functions that appears frequently when computing running times of programs. You should recognize them, and know which is bigger than which:

   O(1)       = time for any fixed number of operations
   O(log n)   = time to find an element s in a sorted list of n numbers, using binary search
   O(n)       = time to find an element s in an unsorted list of n numbers, using linear search
   O(n*log n) = time to sort n numbers, using a good algorithm
   O(n^2)     = time to sort n numbers, using a dumb algorithm

EX: Given a highway map labelled with distances between n towns, finding the shortest way to drive between every pair of towns (also called "all pairs shortest paths") costs O(n^3) using the Floyd-Warshall algorithm (see CS170).

DEF: All these algorithms and many others are called "polynomial time algorithms" because they cost O(n^a) for some constant a.

Here is another problem which cannot be solved in polynomial time as far as anyone knows: given any compound logical proposition such as
   q = p1 and p2 or not p3 ....
combining n propositions p1, p2, ..., pn, can q ever be True for any values of p1,...,pn?
This problem is called the "satisfiability" problem, or SAT for short: can any values of p1,...,pn satisfy q, i.e. make it True?

EX: if q = p1 and p2 and not p3
    then setting p1 = True, p2 = True and p3 = False makes q = True

EX: if q = (p1 or p2) and (not p1 or not p2) and (not p1 or p2) and (p1 or not p2)
    then no matter what values p1 and p2 have, q is False

Here is an obvious algorithm to solve this problem:
   evaluate q for all possible values of p1, p2,..., pn
   if q is ever True then the answer is yes (q can be true), else no

ASK&WAIT: What is the cost of this algorithm, at least?

What may be surprising is that no significantly better algorithm is known, i.e.
no algorithm that runs in polynomial time, O(n^a) for some constant a. All known algorithms run in "exponential time" in the worst case (maybe faster than 2^n, but still exponential).

One of the most famous open problems in mathematics (and computer science) is the question of whether any polynomial time algorithm for SAT can exist. The problem is also sometimes asked as "does P = NP or not?". Here P is the set of problems you can solve in polynomial time, and NP is a (possibly larger) class that contains P and includes SAT. CS 170 and especially CS 172 talk about this question in more detail.
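
The "obvious algorithm" for SAT described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not code from the course: the proposition q is represented as an ordinary Python function of n Boolean arguments, and the names brute_force_sat, q1 and q2 are hypothetical. The two example propositions are the ones given in the EX lines above.

```python
from itertools import product


def brute_force_sat(q, n):
    """Try every assignment of True/False to the n inputs of q.

    Returns (satisfying assignment or None, number of evaluations of q).
    """
    evaluations = 0
    for values in product([False, True], repeat=n):
        evaluations += 1
        if q(*values):
            return values, evaluations
    return None, evaluations


def q1(p1, p2, p3):
    # q = p1 and p2 and not p3
    return p1 and p2 and not p3


def q2(p1, p2):
    # q = (p1 or p2) and (not p1 or not p2) and (not p1 or p2) and (p1 or not p2)
    return ((p1 or p2) and (not p1 or not p2)
            and (not p1 or p2) and (p1 or not p2))


if __name__ == "__main__":
    print(brute_force_sat(q1, 3))   # finds p1=True, p2=True, p3=False
    print(brute_force_sat(q2, 2))   # no assignment works
```

When q cannot be satisfied, the loop must evaluate q on all 2^n assignments before answering no, so each additional proposition doubles the work.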