CS 228 meeting                                          -*- Outline -*-

* algorithm efficiency and big-O notation (HR 12.1-3)

** context

----------------------------------
GOODNESS OF PROGRAMS

How to measure?
  - correct behavior
  - cost to develop and maintain
  - resource usage
----------------------------------

It's not all captured by our specifications, which usually only
talk about behavior, not costs and resources.

----------------------------------
What resources?
  - space
  - time
----------------------------------

we'll focus on time efficiency for examples, but both are important

Q: the term often used is "efficiency"; can you explain that?

----------------------------
def: the study of resource usage by algorithms is called
     *complexity theory*
----------------------------

** big-O notation (12.1)

*** motivation

----------------------------
THREE SORTING ALGORITHMS

  input               time in seconds
   size       selection     bubble      quick
       10        0.005       0.001      0.003
      100        0.05        0.01       0.006
     1000        0.6         1.1        0.01
    10000       50.        100.         1.3
   100000     5000.      10000.        16.6
  1000000                             199.
----------------------------

computations for selection are n^2/2, for bubble: n^2,
for quick: 10*(n log n); each operation takes 1 microsecond

10000 seconds is almost 3 hours

Q: what happens if we buy a computer that is twice as fast?

quicksort is still faster

if we buy a computer 1000 times faster, with bubble sort we can
handle only about 31 times as many elements in the same time
(table 12.2); with quicksort we can handle about 140 times as many

Q: what happens if we code bubble sort to be 3 times as fast?

it's faster than selection sort, but still worse than quicksort

big-O notation captures these kinds of differences,
which are important for large inputs

show graphs (like table 12.1)

note that 2^n (exponential) algorithms are infeasible

recall the difference between linear and exponential
for Fibonacci numbers

*** function dominance

recall that 0.5*n^2 and n^2 are pretty similar in the big picture;
so is n^2 + 1000*n in the long run

---------------------------
COST FUNCTIONS

def: a *cost function* gives the resources used by an algorithm

example:
    actualTime(n) = n^2 + 5*n + 100 milliseconds
-----------------------
n is usually the size of the input (number of inputs, or length),
e.g., for sorting, multiplying numbers, ...
-----------------------
ABSTRACTIONS

  - ignore time units
  - ignore all but the most important term
---------------------------

time units vary with computers

the leading term dominates the others for very large inputs

----------------------------
FUNCTION DOMINANCE

def: g *asymptotically dominates* f iff there are positive
     constants c and x0 such that

         c * g(x) >= f(x),  for all x >= x0

e.g., n^2 asymptotically dominates n*log(n)
      n^2 asymptotically dominates n^2 + 5*n + 100
----------------------------

draw graph picture of the definition

*** estimating functions

---------------------------------
ESTIMATING FUNCTIONS

def: an *estimating function* is an approximation to a cost function

desired characteristics:
  - the estimate asymptotically dominates the actual cost
  - the estimate is simple
  - the estimate is close to the actual cost

example:
    actualTime(n) = n^2 + 5*n + 100
    estimate(n)   = n^2
---------------------------------

Q: why is this better than n^2 + 5*n? than n^3?

---------------------------------
BIG-O NOTATION

def: if f and g are nonnegative functions, then
     the *order of f is g* iff g asymptotically dominates f

notation:  f = O(g)
---------------------------------

note the notation doesn't say that O(n^2) is n^2 + 5*n + 100;
read it as "n^2 + 5*n + 100 is in the set of functions
asymptotically dominated by n^2"
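
a concrete check of this reading (my own sketch, not from HR; the
function names and the witnesses c = 2 and x0 = 13 are my choices):
2*n^2 - (n^2 + 5*n + 100) = n^2 - 5*n - 100 >= 0 once n >= 13, so
c = 2 and x0 = 13 witness the dominance, and the program below
verifies it numerically up to a large limit

------------------------------------
#include <iostream>

// cost function from the example above: f(n) = n^2 + 5n + 100
double actualTime(double n) { return n*n + 5*n + 100; }

// candidate dominating function g(n) = n^2, scaled by the witness c = 2
double bound(double n) { return 2 * n*n; }

int main()
{
    // check c * g(n) >= f(n) for all n >= x0 = 13 (up to a large limit)
    for (int n = 13; n <= 1000000; n++) {
        if (bound(n) < actualTime(n)) {
            std::cout << "dominance fails at n = " << n << "\n";
            return 1;
        }
    }
    std::cout << "2*n^2 >= n^2 + 5*n + 100 for all 13 <= n <= 1000000\n";
    return 0;
}
------------------------------------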
--------------------------------
examples:

    n^2 + 5*n + 100       = O(n^2)

    0.0001*n^5 + 9999*n   =

    13 * n! + 29 * n^100  =
--------------------------------

note that we're talking about relative cost, not absolute cost

*** categories of running time

----------------------------------
IMPORTANT RELATIONSHIPS AND CATEGORIES OF GROWTH

    O(1)    <   O(log n)   <   O(n)   <   O(n log n)
  constant    logarithmic     linear

    O(n log n)  <   O(n^2)    <   O(n^3)
                   quadratic      cubic
                   (both polynomial)

    O(n^3)  <  O(2^n)  <  O(3^n)  <  O(n!)  <  O(n^n)
               (all exponential)
----------------------------------

if we buy a computer 1000 times faster, we can only do about 10 more
elements using a 2^n algorithm,
so the exponential algorithms are really infeasible

*** big-O arithmetic

compare p.546 (remember polynomials in Scheme?)

------------------------------------
BIG-O ARITHMETIC

Thm: Let f and g be functions, and let k be a constant. Then
  1. O(k * f) = O(f)
  2. O(f * g) = O(f) * O(g), and O(f / g) = O(f) / O(g)
  3. f asymptotically dominates g iff O(f) >= O(g)
  4. O(f + g) = max(O(f), O(g))

Examples:
    O(3 * n^2) = O(n^2)

    O(17*n^2 * 5*n) = O(17*n^2) * O(5*n)
                    = O(n^2) * O(n)
                    = O(n^3)

    O(n^3) > O(n^2)

    O(13*n^3 + 5*n^2) =
------------------------------------

Note: the claim in the text on p.546, that f asymptotically
dominates g implies O(f) > O(g), is untrue!
counterexample: n^2 asymptotically dominates n^2,
but O(n^2) = O(n^2)

** time efficiency of control structures (12.2)

problem: how to derive a big-O estimate for the running time
of a program?

-------------------------------------
ESTIMATE THE RUNNING TIME

void swap(double & i, double & j)
{
    double temp = j;
    j = i;
    i = temp;
}

void initialize(double arr[], int size)
{
    for (int i = 0; i < size; i++) {
        arr[i] = 0.0;
    }
}

void bubblesort(double arr[], int size)
{
    for (int i = 0; i < size; i++) {
        for (int j = i+1; j < size; j++) {
            if (arr[i] > arr[j]) {
                swap(arr[i], arr[j]);
            }
        }
    }
}
-------------------------------------

summary

------------------------------
TIME COST OF CONTROL STRUCTURES

CONTROL STRUCTURE        TIME COST ESTIMATE

i + j                    O(1)

i = 3;                   O(1)

Stmt1; Stmt2             max(O(Stmt1), O(Stmt2))

if (cond) {              max(O(cond), O(Stmt1), O(Stmt2))
    Stmt1
} else {
    Stmt2
}

i = 1;                   O(n) * O(Stmt1)
while (i <= n) {
    Stmt1;
    i++;
}
------------------------------

------------------------------------
FOR YOU TO DO

estimate the running times of:

#include <iostream>
using namespace std;

int main()
{
    char ch, maxCh;
    cin >> maxCh;
    cin >> ch;
    while (cin) {
        if (maxCh < ch) {
            maxCh = ch;
        }
        cin >> ch;
    }
}

#include <iostream>
#include <climits>
using namespace std;

int main()
{
    int i = 1;
    while (i < INT_MAX / 2) {
        cout << i << "\n";
        i *= 2;
    }
}
------------------------------------

the first is linear in the length of the input;
the second runs in constant time

** cautions (12.9)

--------------------------------
SOME CAUTIONARY NOTES

  - big-O analysis can't capture small differences in cost
  - big-O analysis doesn't say much about performance
    on small data sets
  - typically 90% of the time is spent in 10% of the code
    (the inner loops)
--------------------------------

other things are also important: correctness, clarity,
the cost to write the code

moral: don't optimize unless it's worth it
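
to drive these cautions home (again my own sketch, not from HR):
a rough timing harness that compares the bubblesort transparency
code with the library's std::sort at several input sizes; on small
arrays the O(n^2) sort can hold its own, but by n = 10000 the
O(n log n) sort should win decisively

------------------------------------
#include <algorithm>   // std::sort, std::swap
#include <chrono>
#include <cstdlib>     // std::rand
#include <iostream>
#include <vector>

// the O(n^2) sort from the transparency, adapted to a vector
void bubblesort(std::vector<double>& arr)
{
    for (std::size_t i = 0; i < arr.size(); i++) {
        for (std::size_t j = i + 1; j < arr.size(); j++) {
            if (arr[i] > arr[j]) {
                std::swap(arr[i], arr[j]);
            }
        }
    }
}

int main()
{
    for (int n : {10, 100, 1000, 10000}) {
        std::vector<double> a(n);
        for (double& x : a) { x = std::rand(); }
        std::vector<double> b = a;   // same data for both sorts

        auto t0 = std::chrono::steady_clock::now();
        bubblesort(a);
        auto t1 = std::chrono::steady_clock::now();
        std::sort(b.begin(), b.end());   // O(n log n) on average
        auto t2 = std::chrono::steady_clock::now();

        std::chrono::duration<double> slow = t1 - t0, fast = t2 - t1;
        std::cout << "n = " << n
                  << "   bubblesort: " << slow.count() << " s"
                  << "   std::sort: "  << fast.count() << " s\n";
    }
    return 0;
}
------------------------------------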