CS 228 meeting                                          -*- Outline -*-

* algorithm efficiency and big-O notation (HR 12.1-3)

** context

----------------------------------
GOODNESS OF PROGRAMS

How to measure?
  - correct behavior
  - cost to develop and maintain
  - resource usage
----------------------------------

It's not all captured by our specifications, which usually only
talk about behavior, not costs and resources.

----------------------------------
What resources?
  - space
  - time
----------------------------------

we'll focus on time efficiency for examples, but both are important

Q: the term often used is "efficiency"; can you explain that?

----------------------------
def: the study of resource usage by algorithms is called
     *complexity theory*
----------------------------

** big-O notation (12.1)

*** motivation

----------------------------
THREE SORTING ALGORITHMS

  input               time in seconds
   size       selection     bubble      quick
       10        0.005       0.001      0.003
      100        0.05        0.01       0.006
     1000        0.6         1.1        0.01
    10000       50.        100.         1.3
   100000     5000.      10000.        16.6
  1000000                             199.
----------------------------

computations for selection are n^2/2, for bubble: n^2,
for quick: 10*(n log n); each operation takes 1 microsecond

10000 seconds is almost 3 hours

Q: what happens if we buy a computer that is twice as fast?

quicksort is still faster

if we buy a computer 1000 times faster, with bubble sort we can
handle only about 31 times as many elements in the same time
(table 12.2); with quicksort we can handle about 140 times as many

Q: what happens if we code bubble sort to be 3 times as fast?

it's faster than selection sort, but still worse than quicksort

big-O notation captures these kinds of differences,
which are important for large inputs

show graphs (like table 12.1)

note that 2^n (exponential) algorithms are infeasible

recall the difference between linear and exponential
for Fibonacci numbers

*** function dominance

recall that 0.5*n^2 and n^2 are pretty similar in the big picture;
so is n^2 + 1000*n in the long run

---------------------------
COST FUNCTIONS

def: a *cost function* gives the resources used by an algorithm

example:
    actualTime(n) = n^2 + 5*n + 100 milliseconds
-----------------------
n is usually the size of the input (number of inputs, or length),
e.g., for sorting, multiplying numbers, ...
-----------------------
ABSTRACTIONS

  - ignore time units
  - ignore all but the most important term
---------------------------

time units vary with computers

the leading term dominates the others for very large inputs

----------------------------
FUNCTION DOMINANCE

def: g *asymptotically dominates* f iff there are positive
     constants c and x0 such that

         c * g(x) >= f(x),  for all x >= x0

e.g., n^2 asymptotically dominates n*log(n)
      n^2 asymptotically dominates n^2 + 5*n + 100
----------------------------

draw graph picture of the definition

*** estimating functions

---------------------------------
ESTIMATING FUNCTIONS

def: an *estimating function* is an approximation to a cost function

desired characteristics:
  - the estimate asymptotically dominates the actual cost
  - the estimate is simple
  - the estimate is close to the actual cost

example:
    actualTime(n) = n^2 + 5*n + 100
    estimate(n)   = n^2
---------------------------------

Q: why is this better than n^2 + 5*n? than n^3?

---------------------------------
BIG-O NOTATION

def: if f and g are nonnegative functions, then
     the *order of f is g* iff g asymptotically dominates f

notation:  f = O(g)
---------------------------------

note the notation doesn't say that O(n^2) is n^2 + 5*n + 100;
read it as "n^2 + 5*n + 100 is in the set of functions
asymptotically dominated by n^2"
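
a concrete check of this reading (my own sketch, not from HR; the
function names and the witnesses c = 2 and x0 = 13 are my choices):
2*n^2 - (n^2 + 5*n + 100) = n^2 - 5*n - 100 >= 0 once n >= 13, so
c = 2 and x0 = 13 witness the dominance, and the program below
verifies it numerically up to a large limit

------------------------------------
#include <iostream>

// cost function from the example above: f(n) = n^2 + 5n + 100
double actualTime(double n) { return n*n + 5*n + 100; }

// candidate dominating function g(n) = n^2, scaled by the witness c = 2
double bound(double n) { return 2 * n*n; }

int main()
{
    // check c * g(n) >= f(n) for all n >= x0 = 13 (up to a large limit)
    for (int n = 13; n <= 1000000; n++) {
        if (bound(n) < actualTime(n)) {
            std::cout << "dominance fails at n = " << n << "\n";
            return 1;
        }
    }
    std::cout << "2*n^2 >= n^2 + 5*n + 100 for all 13 <= n <= 1000000\n";
    return 0;
}
------------------------------------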
--------------------------------
examples:

    n^2 + 5*n + 100       = O(n^2)

    0.0001*n^5 + 9999*n   =

    13 * n! + 29 * n^100  =
--------------------------------

note that we're talking about relative cost, not absolute cost

*** categories of running time

----------------------------------
IMPORTANT RELATIONSHIPS AND CATEGORIES OF GROWTH

    O(1)    <   O(log n)   <   O(n)   <   O(n log n)
  constant    logarithmic     linear

    O(n log n)  <   O(n^2)    <   O(n^3)
                   quadratic      cubic
                   (both polynomial)

    O(n^3)  <  O(2^n)  <  O(3^n)  <  O(n!)  <  O(n^n)
               (all exponential)
----------------------------------

if we buy a computer 1000 times faster, we can only do about 10 more
elements using a 2^n algorithm,
so the exponential algorithms are really infeasible

*** big-O arithmetic

compare p.546 (remember polynomials in Scheme?)

------------------------------------
BIG-O ARITHMETIC

Thm: Let f and g be functions, and let k be a constant. Then
  1. O(k * f) = O(f)
  2. O(f * g) = O(f) * O(g), and O(f / g) = O(f) / O(g)
  3. f asymptotically dominates g iff O(f) >= O(g)
  4. O(f + g) = max(O(f), O(g))

Examples:
    O(3 * n^2) = O(n^2)

    O(17*n^2 * 5*n) = O(17*n^2) * O(5*n)
                    = O(n^2) * O(n)
                    = O(n^3)

    O(n^3) > O(n^2)

    O(13*n^3 + 5*n^2) =
------------------------------------

Note: the claim in the text on p.546, that f asymptotically
dominates g implies O(f) > O(g), is untrue!
counterexample: n^2 asymptotically dominates n^2,
but O(n^2) = O(n^2)

** time efficiency of control structures (12.2)

problem: how to derive a big-O estimate for the running time
of a program?

-------------------------------------
ESTIMATE THE RUNNING TIME

void swap(double & i, double & j)
{
    double temp = j;
    j = i;
    i = temp;
}

void initialize(double arr[], int size)
{
    for (int i = 0; i < size; i++) {
        arr[i] = 0.0;
    }
}

void bubblesort(double arr[], int size)
{
    for (int i = 0; i < size; i++) {
        for (int j = i+1; j < size; j++) {
            if (arr[i] > arr[j]) {
                swap(arr[i], arr[j]);
            }
        }
    }
}
-------------------------------------

summary

------------------------------
TIME COST OF CONTROL STRUCTURES

CONTROL STRUCTURE        TIME COST ESTIMATE

i + j                    O(1)

i = 3;                   O(1)

Stmt1; Stmt2             max(O(Stmt1), O(Stmt2))

if (cond) {              max(O(cond), O(Stmt1), O(Stmt2))
    Stmt1
} else {
    Stmt2
}

i = 1;                   O(n) * O(Stmt1)
while (i <= n) {
    Stmt1;
    i++;
}
------------------------------

------------------------------------
FOR YOU TO DO

estimate the running times of:

#include <iostream>
using namespace std;

int main()
{
    char ch, maxCh;
    cin >> maxCh;
    cin >> ch;
    while (cin) {
        if (maxCh < ch) {
            maxCh = ch;
        }
        cin >> ch;
    }
}

#include <iostream>
#include <climits>
using namespace std;

int main()
{
    int i = 1;
    while (i < INT_MAX / 2) {
        cout << i << "\n";
        i *= 2;
    }
}
------------------------------------

the first is linear in the length of the input;
the second runs in constant time

** cautions (12.9)

--------------------------------
SOME CAUTIONARY NOTES

  - big-O analysis can't capture small differences in cost
  - big-O analysis doesn't say much about performance
    on small data sets
  - typically 90% of the time is spent in 10% of the code
    (the inner loops)
--------------------------------

other things are also important: correctness, clarity,
the cost to write the code

moral: don't optimize unless it's worth it
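
to drive these cautions home (again my own sketch, not from HR):
a rough timing harness that compares the bubblesort transparency
code with the library's std::sort at several input sizes; on small
arrays the O(n^2) sort can hold its own, but by n = 10000 the
O(n log n) sort should win decisively

------------------------------------
#include <algorithm>   // std::sort, std::swap
#include <chrono>
#include <cstdlib>     // std::rand
#include <iostream>
#include <vector>

// the O(n^2) sort from the transparency, adapted to a vector
void bubblesort(std::vector<double>& arr)
{
    for (std::size_t i = 0; i < arr.size(); i++) {
        for (std::size_t j = i + 1; j < arr.size(); j++) {
            if (arr[i] > arr[j]) {
                std::swap(arr[i], arr[j]);
            }
        }
    }
}

int main()
{
    for (int n : {10, 100, 1000, 10000}) {
        std::vector<double> a(n);
        for (double& x : a) { x = std::rand(); }
        std::vector<double> b = a;   // same data for both sorts

        auto t0 = std::chrono::steady_clock::now();
        bubblesort(a);
        auto t1 = std::chrono::steady_clock::now();
        std::sort(b.begin(), b.end());   // O(n log n) on average
        auto t2 = std::chrono::steady_clock::now();

        std::chrono::duration<double> slow = t1 - t0, fast = t2 - t1;
        std::cout << "n = " << n
                  << "   bubblesort: " << slow.count() << " s"
                  << "   std::sort: "  << fast.count() << " s\n";
    }
    return 0;
}
------------------------------------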