COS 226 Lecture 1: Introduction
R. Sedgewick rs@cs.princeton.edu
Intermediate-level survey course
prerequisite: COS 126 programming/problem solving
Algorithm: method for solving a problem
Data structure: a way to store information
Efficient algorithms use good data structures
Types of algorithms considered
* Sorting
* Searching
* Strings
* Geometric
* Graph
* Mathematical
* Other
----------
Why study algorithms?
Using a computer?
* want it to go faster
* want it to process more data
* want to do something that would otherwise be impossible
Technology improves things by a constant factor
Good algorithm design can do better
Abacus, Slide Rule, Calculator, Computer, Supercomputer
A bad algorithm on a supercomputer can be worse than a good algorithm on an abacus
Algorithms as a field of study
* old enough that basics are known
* new enough that new discoveries arise
* burgeoning application areas: science, engineering, commercial
* philosophical implications . . .
----------
Analysis of algorithms
Compare algorithms by comparing estimated costs
Refine and verify estimates for the best ones
N: size of the input
Running time usually proportional to one of
1, log N, N, N log N, N^2, 2^N
Other functions sometimes arise
* loglog N [log(log N)]
* log* N [number of times log must be taken before 1 is reached]
Often can write down a formula for running time
* worst case (guarantee)
* average case (prediction)
Analysis depends on detailed information
* costs of basic operations
* properties of input
Suppress details in approximate analysis
* "within a constant factor"
* O-notation
Predict time for 2N from time for N
----------
Example: Online connectivity
Input: sequence of pairs of integers (p, q)
p "is connected to" q
Output: nothing if p and q are already connected
(p, q) otherwise
--
in    out    evidence
3 4   3 4
4 9   4 9
8 0   8 0
2 3   2 3
5 6   5 6
2 9          (2--3--4--9)
5 9   5 9
7 3   7 3
4 8   4 8
5 6          (5--6)
0 2          (2--3--4--8--0)
6 1   6 1
---
Example of application
* integers represent computers
* pairs represent network connections
* can p and q communicate through the network?
----------
Network connection example
Number of nodes and edges can be huge for practical problems
Ex: telephone network, computer chip
Disconnected piece may be hard to spot
(particularly for a computer!)
----------
Quick-find algorithm
Maintain array with names for components
if i and j are connected, id[i] and id[j] are the same
To maintain this property for a p-q connection
* ignore if id[p] = id[q]
* change all entries with p's id to q's id
--
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
  { int i, p, q, t, N = atoi(argv[1]);
    int *id = malloc(N*sizeof(int));
    for (i = 0; i < N; i++) id[i] = i;
    while (scanf("%d %d", &p, &q) == 2)
      {
        if (id[p] == id[q]) continue;
        t = id[p];
        for (i = 0; i < N; i++)
          if (id[i] == t) id[i] = id[q];
        printf(" %d %d\n", p, q);
      }
    return 0;
  }
---
QUICK-FIND name due to constant-time test to find out if an edge makes a new connection
SLOW-UNION?
----------
Quick-find example
Trace contents of id array after each connection
Ex:
--
3 4   0 1 2 4 4 5 6 7 8 9
4 9   0 1 2 9 9 5 6 7 8 9
8 0   0 1 2 9 9 5 6 7 0 9
2 3   0 1 9 9 9 5 6 7 0 9
5 6   0 1 9 9 9 6 6 7 0 9
5 9   0 1 9 9 9 9 9 7 0 9
7 3   0 1 9 9 9 9 9 9 0 9
4 8   0 1 0 0 0 0 0 0 0 0
6 1   1 1 1 1 1 1 1 1 1 1
---
Problem with quick-find:
* changes too many entries for union
* too slow for huge problems
----------
Quick-find example
Use graphical representation
* numbers in nodes
* node i points up to id[i]
* top nodes represent components
----------
Problem size and computation time
Rough standard for 2000
* 10^9 operations per second
* 10^9 words of memory
* touch each word in approximately 1 second
(a ratio roughly unchanged since at least 1950)
Bigger memories --> bigger problems
Ex: huge problem for quick-find
* 10^10 edges connecting 10^9 nodes
(edges need not fit in memory)
* examples: telephone network, computer chip
Quick-find might take 10^20 operations
(relabel every node, at about 10 ops each, for every edge)
3000 years of computer time (too much)
Use rough estimate
* number of edges and nodes both O(N)
* running time of quick-find O(N^2)
Gap grows as scale increases: a new computer
may be 10 times faster
and have 10 times as much memory
but (with a quadratic algorithm) takes 10 times as long to finish!
----------
Quick-union algorithm
Maintain array with names for components
if i and j are connected, (id[i])* and (id[j])* are the same
where (id[i])* = id[id[id[...id[i]]]] (go until it doesn't change)
To maintain this property for a p-q connection
* ignore if (id[p])* = (id[q])*
* otherwise, for roots i = (id[p])* and j = (id[q])*, set id[i] to j
--
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
  { int i, j, p, q, N = atoi(argv[1]);
    int *id = malloc(N*sizeof(int));
    for (i = 0; i < N; i++) id[i] = i;
    while (scanf("%d %d", &p, &q) == 2)
      {
        i = p; j = q;
        while (i != id[i]) i = id[i];
        while (j != id[j]) j = id[j];
        if (i == j) continue;
        id[i] = j;
        printf(" %d %d\n", p, q);
      }
    return 0;
  }
---
QUICK-UNION: constant time to make a new connection
SLOW-FIND?
----------
Quick-union example
Graphical representation exposes tree structure
FIND operation is traveling up the tree to the root
(check if two nodes hit the same root)
UNION operation links one tree to another
Ex:
--
3 4   0 1 2 4 4 5 6 7 8 9
4 9   0 1 2 4 9 5 6 7 8 9
8 0   0 1 2 4 9 5 6 7 0 9
2 3   0 1 9 4 9 5 6 7 0 9
5 6   0 1 9 4 9 6 6 7 0 9
5 9   0 1 9 4 9 6 9 7 0 9
7 3   0 1 9 4 9 6 9 9 0 9
4 8   0 1 9 4 9 6 9 9 0 0
6 1   1 1 9 4 9 6 9 9 0 0
---
Faster than quick-find for "random" input
(is real input "random"?)
Problem with quick-union:
* paths to the root can grow too long
* no guarantee for huge problems
----------
Quick-union example
----------
Weighted quick-union algorithm
Modify quick-union to avoid imbalance
* keep track of the size of each component
* balance by linking the small one below the large one
--
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
  { int i, j, p, q, N = atoi(argv[1]);
    int *id = malloc(N*sizeof(int));
    int *sz = malloc(N*sizeof(int));
    for (i = 0; i < N; i++) id[i] = i;
    for (i = 0; i < N; i++) sz[i] = 1;
    while (scanf("%d %d", &p, &q) == 2)
      {
        for (i = p; i != id[i]; i = id[i]) ;
        for (j = q; j != id[j]; j = id[j]) ;
        if (i == j) continue;
        if (sz[i] < sz[j])
          { id[i] = j; sz[j] += sz[i]; }
        else
          { id[j] = i; sz[i] += sz[j]; }
        printf(" %d %d\n", p, q);
      }
    return 0;
  }
---
Fast enough?
----------
Weighted quick-union example
Avoids long paths in trees
Ex:
--
3 4   0 1 2 3 3 5 6 7 8 9
4 9   0 1 2 3 3 5 6 7 8 3
8 0   8 1 2 3 3 5 6 7 8 3
2 3   8 1 3 3 3 5 6 7 8 3
5 6   8 1 3 3 3 5 5 7 8 3
5 9   8 1 3 3 3 3 5 7 8 3
7 3   8 1 3 3 3 3 5 3 8 3
4 8   8 1 3 3 3 3 5 3 3 3
6 1   8 3 3 3 3 3 5 3 3 3
---
Is performance improved?
To answer this question
* run empirical studies
* analyze the algorithm
[need to do both]
----------
Weighted quick-union example
----------
Weighted quick-union worst case
Height of the biggest tree increases only when it links to an equally big tree
Ex:
--
0 1   0 0 2 3 4 5 6 7 8 9
2 3   0 0 2 2 4 5 6 7 8 9
4 5   0 0 2 2 4 4 6 7 8 9
6 7   0 0 2 2 4 4 6 6 8 9
8 9   0 0 2 2 4 4 6 6 8 8
0 2   0 0 0 2 4 4 6 6 8 8
4 6   0 0 0 2 4 4 4 6 8 8
0 4   0 0 0 2 0 4 4 6 8 8
6 8   0 0 0 2 0 4 4 6 0 8
---
Good news: Worst case is O(lg N) steps per edge
Better news: Average case is O(1) steps per edge
Ex: 10^10 edges connecting 10^9 nodes
reduces time from 3000 years to 1 minute
----------
Weighted quick-union worst case
----------
Path compression
Modify weighted quick-union to compress the tree:
make every node touched point to the new root
(no reason not to!)
--
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
  { int i, j, p, q, t, x, N = atoi(argv[1]);
    int *id = malloc(N*sizeof(int));
    int *sz = malloc(N*sizeof(int));
    for (i = 0; i < N; i++) id[i] = i;
    for (i = 0; i < N; i++) sz[i] = 1;
    while (scanf("%d %d", &p, &q) == 2)
      {
        for (i = p; i != id[i]; i = id[i]) ;
        for (j = q; j != id[j]; j = id[j]) ;
        if (i == j) continue;
        if (sz[i] < sz[j])
          { id[i] = j; sz[j] += sz[i]; t = j; }
        else
          { id[j] = i; sz[i] += sz[j]; t = i; }
        for (i = p; i != id[i]; i = x)
          { x = id[i]; id[i] = t; }
        for (j = q; j != id[j]; j = x)
          { x = id[j]; id[j] = t; }
        printf(" %d %d\n", p, q);
      }
    return 0;
  }
---
Worth doing?
----------
Weighted quick-union with path compression
Avoids long paths in trees
--
3 4   0 1 2 3 3 5 6 7 8 9
4 9   0 1 2 3 3 5 6 7 8 3
8 0   8 1 2 3 3 5 6 7 8 3
2 3   8 1 3 3 3 5 6 7 8 3
5 6   8 1 3 3 3 5 5 7 8 3
5 9   8 1 3 3 3 3 5 7 8 3
7 3   8 1 3 3 3 3 5 3 8 3
4 8   8 1 3 3 3 3 5 3 3 3
6 1   8 3 3 3 3 3 3 3 3 3
---
THM: Worst-case tree height is O(lg* N)
Proof: Extremely difficult
(but the *algorithm* is still simple!)
Note: lg* N is constant in this world
--
N                     lg* N
2                       1
4                       2
16                      3
65536                   4
any practical value     5
---
Reduces the cost to within a constant factor of gathering the data
----------
Path compression example
----------
Union-find summary
Worst-case cost per edge is proportional to
--
quick-find         N
quick-union        N
weighted           lg N
path compression   5
---
Online algorithm: can solve the problem while collecting the data, for "free"
Set-merging abstraction
FIND: is A in the same set as B?
UNION: merge A's set and B's set
useful for many other applications
Lessons
* start with a simple algorithm
* don't use a simple algorithm for large problems
* can't use a simple algorithm for huge problems
* a higher level of abstraction (the tree) is helpful
* fast performance on real data is OK, but
* strive for worst-case performance guarantees
* identify the fundamental abstraction
A "trivial" algorithm can be both useful and interesting
----------
SORTING
Elementary algorithms, Shellsort
Quicksort
Mergesort
Priority queues
Radix sorts
* Sort an array that fills memory
* Make the union of M spelling dictionaries
* Priority queue abstract data type
----------
SEARCHING
Tree searching
Hashing
Trie searching
* Oxford English Dictionary
* Internet search engines
* DNA subsequence library
* Dictionary abstract data type for other algorithms
----------
STRINGS
String searching
Pattern matching
File compression
* file systems, audio and video
* cryptology
----------
GEOMETRIC ALGORITHMS
Elementary algorithms
Convex hull
Multidimensional searching
* N-body simulation
* "Toy Story"
----------
GRAPH ALGORITHMS
Properties of graphs
Searching in graphs
Advanced graph algorithms
* Cycle detection
* matching (e.g.
medical residents to jobs)
* networks
----------
OTHER TOPICS
Mathematical algorithms
Dynamic programming
Parallel algorithms
Randomized algorithms
Intractable problems
----------
COURSEWORK
Text
* Algorithms, 3rd edition, in C
* new book; old book after midterm
Lecture notes
Programming Assignments
* weekly, eleven in all
* electronic submission
* programs due Thursdays 11:59PM, first one due Thursday NEXT week
* README due the following Sunday at 11:59PM
Problem Sets
* weekly, nine in all
* due in precept
* first one due NEXT Monday or Tuesday
Exams
* open book, notes
* midterm in class Wednesday before break
* final at scheduled time
Online course information on homepage
READ (and do not lose) HANDOUT ONE
READ ONLINE INFORMATION
COS 226 Lecture 2: Elementary sorting algs
Insertion sort
Selection sort
Bubble sort
Shellsort
-----
Why study elementary algorithms?
* Easy to code
* Fastest for small files
* Context for developing ground rules
* Fastest in some special situations
* May not be so elementary
-----
Ground rules
FILES of RECORDS containing KEYS
File fits in memory
Use abstract comparison, exchange
--
typedef int Item;
#define less(A, B) (A < B)
#define exch(A, B) { Item t = A; A = B; B = t; }
---
Macros or subroutines?
Macros: low cost, simple
Subroutines: more general
-----
Selection sort
--
void selection(Item a[], int l, int r)
  { int i, j;
    for (i = l; i < r; i++)
      { int min = i;
        for (j = i+1; j <= r; j++)
          if (less(a[j], a[min])) min = j;
        exch(a[i], a[min]);
      }
  }
---
-----
Insertion sort
--
void insertion(Item a[], int l, int r)
  { int i, j;
    for (i = l+1; i <= r; i++)
      { Item v = a[i];
        j = i;
        while (j > l && less(v, a[j-1]))
          { a[j] = a[j-1]; j--; }
        a[j] = v;
      }
  }
---
-----
Bubble sort
--
#define compexch(A, B) if (less(B, A)) exch(A, B)
void bubble(Item a[], int l, int r)
  { int i, j;
    for (i = l; i < r; i++)
      for (j = r; j > i; j--)
        compexch(a[j-1], a[j]);
  }
---
Improvements:
* add a test to exit if no exchanges
* go back and forth
-----
Properties of elementary sorts
All quadratic running time
Selection sort
* comparisons: N-1 + N-2 + ... + 2 + 1 = about N^2/2
* exchanges: N
Insertion sort
* comparisons: (N-1 + N-2 + ... + 1)/2 = about N^2/4
* exchanges: about N^2/4
Bubble sort
* comparisons: N-1 + N-2 + ... + 2 + 1 = about N^2/2
* exchanges: about N^2/2
Large records, small keys
* selection sort is linear in the amount of data
* N records, M words each (1-word keys)
* comparison cost N^2/2; exchange cost NM
* if N is about equal to M, costs and amount of data are both about M^2: a LINEAR sort
Files nearly in order
* bubble and insertion sort can be linear
* (quicksort can be quadratic)
-----
Pointer sort
Sort large records by swapping references to the records, not the records themselves
--
i    a[i]   record
1     9     Fox      1   [associated info]
2     6     Quilici  1   ...
3     8     Chen     2   ...
4     3     Furia    3   ...
5     1     Kanaga   3   ...
6     4     Andrews  3   ...
7    10     Rohde    3   ...
8     5     Battle   4   ...
9     2     Aaron    4   ...
10    7     Gazsi    4
---
Trivial to implement: change the abstract comparison
Array indices
--
typedef int Item;
#define less(A, B) (data[A].key < data[B].key)
#define exch(A, B) { Item t = A; A = B; B = t; }
---
True pointers
--
typedef dataType *Item;
#define less(A, B) ((A)->key < (B)->key)
#define exch(A, B) { Item t = A; A = B; B = t; }
---
-----
Stable sorting for two-key records
Sort on the first key, then on the second
--
Aaron 4      Fox 1
Andrews 3    Quilici 1
Battle 4     Chen 2
Chen 2       Furia 3
Fox 1        Kanaga 3
Furia 3      Andrews 3
Gazsi 4      Rohde 3
Kanaga 3     Battle 4
Quilici 1    Aaron 4
Rohde 3      Gazsi 4
---
Stable sort: stays sorted on the first key where equal on the second
--
Fox 1
Quilici 1
Chen 2
Andrews 3
Furia 3
Kanaga 3
Rohde 3
Aaron 4
Battle 4
Gazsi 4
---
Which of the elementary methods are stable?
-----
4-sorting
Divide into 4 subfiles
* every 4th element starting at the 1st
* every 4th element starting at the 2nd
* every 4th element starting at the 3rd
* every 4th element starting at the 4th
-----
Interleaved 4-sorting
Use insertion sort with an "increment" of 4
--
h = 4;
for (i = l+h; i <= r; i++)
  { Item v = a[i];
    j = i;
    while (j >= l+h && less(v, a[j-h]))
      { a[j] = a[j-h]; j -= h; }
    a[j] = v;
  }
---
-----
Shellsort
Use a decreasing sequence of increments
Each pass makes the next easier
[figure: successive passes with increments 13, 4, 1]
-----
Shellsort implementation
--
void shellsort(Item a[], int l, int r)
  { int i, j, k;
    int incs[16] = { 1391376, 463792, 198768, 86961,
                     33936, 13776, 4592, 1968,
                     861, 336, 112, 48, 21, 7, 3, 1 };
    for (k = 0; k < 16; k++)
      { int h = incs[k];
        for (i = l+h; i <= r; i++)
          { Item v = a[i];
            j = i;
            while (j >= l+h && less(v, a[j-h]))
              { a[j] = a[j-h]; j -= h; }
            a[j] = v;
          }
      }
  }
---
Need a sort routine, fast? Use Shellsort!
* not much code
* best method for small and medium files
* still OK even for giant files
How do we know what increments to use?
* plenty of proven winners to use
* easiest: 1, 4, 13, 40, 121, 364, 1093, ...
-----
Relatively prime increment sequences
When we h-sort a file that is k-sorted, it stays k-sorted
(Know an easy proof? SEND MAIL)
Only 18N comparisons are needed to 1-sort a file that is 4-sorted and 13-sorted
[figure: elements to the left of x that could be greater than x]
In general, if h and k are relatively prime:
* Only (h-1)(k-1)N comparisons to 1-sort a file that is h-sorted and k-sorted
* Only (h-1)(k-1)N/g comparisons to g-sort a file that is h-sorted and k-sorted
Big increments, small files: h(N/h)^2 = N^2/h
Small increments, use the theorem: h^2 N/h = Nh
Tradeoff to get the best bounds: N^(3/2) total
Similar methods (harder proofs) give exponents 4/3, 5/4, 6/5, ...
-----
More increment sequences
On the other hand, common divisors are good:
* Only N comparisons to 1-sort a file that is 2-sorted and 3-sorted
* Only N comparisons to 2-sort a file that is 4-sorted and 6-sorted
* Only N comparisons to 3-sort a file that is 6-sorted and 9-sorted
--
1
2 3
4 6 9
8 12 18 27
16 24 36 54 81
32 48 72 108 162 243
64 96 144 216 324 486 729
---
Total time: N (log N)(log N)
Too many increments for real sizes
* start with bigger numbers than 2 and 3
* mix up with primes
Have a better idea for an increment sequence?
SEND MAIL if it beats 1 2 7 21 48 112 336 ...
COS 226 Lecture 3: Quicksort
To sort an array, first divide it so that
* some element a[i] is in its final position
* no larger element is left of i
* no smaller element is right of i
Then sort the left and right parts recursively
----------
Partitioning
To partition an array, pick a partitioning element
* scan from the right for a smaller element
* scan from the left for a larger element
* exchange
* repeat until the pointers cross
----------
Partitioning implementation
v: partitioning element
i: left-to-right pointer
j: right-to-left pointer
--
int partition(Item a[], int l, int r)
  { int i, j; Item v;
    v = a[r]; i = l-1; j = r;
    for (;;)
      {
        while (less(a[++i], v)) ;
        while (less(v, a[--j])) if (j == l) break;
        if (i >= j) break;
        exch(a[i], a[j]);
      }
    exch(a[i], a[r]);
    return i;
  }
---
Issues
* stop pointers on keys equal to v?
* sentinels or explicit tests for array bounds?
* details of pointer crossing
----------
Quicksort implementation
--
void quicksort(Item a[], int l, int r)
  { int i;
    if (r > l)
      {
        i = partition(a, l, r);
        quicksort(a, l, i-1);
        quicksort(a, i+1, r);
      }
  }
---
Issues
* overhead for recursion?
* running time depends on input
* worst-case time cost (quadratic: a problem)
* worst-case space cost (linear: a serious problem)
----------
Nonrecursive Quicksort
Use an explicit stack instead of recursive calls
Sort the smaller of the two subfiles first
--
#define push2(A, B) push(A); push(B);
void quicksort(Item a[], int l, int r)
  { int i;
    stackinit(); push2(l, r);
    while (!stackempty())
      {
        r = pop(); l = pop();
        if (r <= l) continue;
        i = partition(a, l, r);
        if (i-l > r-i)
          { push2(l, i-1); push2(i+1, r); }
        else
          { push2(i+1, r); push2(l, i-1); }
      }
  }
---
Guaranteed less than lg N extra space cost
Still quadratic worst-case time cost
----------
Analysis
Total running time is the sum of cost*frequency for all the basic operations
Cost depends on the machine
Frequency depends on the algorithm and the input
For Quicksort
* A -- number of partitioning stages
* B -- number of exchanges
* C -- number of comparisons
Cost on a typical machine: 35A + 11B + 4C
Number of comparisons in the worst case
N + (N-1) + (N-2) + ... = N(N-1)/2
----------
Average-case analysis
Assume the input is randomly ordered
* each element equally likely to be the partitioning element
* subfiles randomly ordered (if partitioning is "blind")
Average number of comparisons satisfies
C(N) = N+1 + ( (C(1)+C(N-1)) + (C(2)+C(N-2)) + ... + (C(N-1)+C(1)) )/N
C(N) = N+1 + 2( C(1) + C(2) + ... + C(N-1) )/N
NC(N) = N(N+1) + 2( C(1) + C(2) + ... + C(N-1) )
NC(N) - (N-1)C(N-1) = 2N + 2C(N-1)
NC(N) = (N+1)C(N-1) + 2N
C(N)/(N+1) = C(N-1)/N + 2/(N+1)
           = 2( 1/3 + 1/4 + ... + 1/(N+1) )   [telescope down to C(1) = 0]
           = 2 ln N + (small error term)
THM: Quicksort uses about 2N ln N = (1.38...)N lg N comparisons
(and 1/6 as many exchanges) on average.
Randomized algorithm * pick random partitioning element * analysis then valid for any input ---------- Inner loop Use profiler * find instructions executed most often * verify analysis * streamline program by iterating -- quicksort(int a[], int l, int r) <132659>{ int v, i, j, t; if (<132659>r > l) { <66329>v = a[r]; <66329>i = l-1; <66329>j = r; for (<66329>;<327102>;<327102>) { while (<1033228>a[++i] < v) <639797>; while (<1077847>a[--j] > v) <684416>; if (<393431>i >= j) <66329>break; <327102>t = a[i]; a[i] = a[j]; a[j] = t; } <66329>t = a[i]; a[i] = a[r]; a[r] = t; <66329>quicksort(a, l, i-1); <66329>quicksort(a, i+1, r); } <132659>} --- ---------- Another partitioning method Seeking an easier-to-understand method (detailed justification omitted) -- quicksort(int a[], int l, int r) <133395>{ int v, i, k, t; if (<133395>r <= l) return; <66697>v = a[l]; <66697>k = l; for (<66697>i=l+1; <1976624>i<=r; <1909927>i++) if (<1909927>a[i] < v) { <934565>t = a[i]; a[i] = a[++k]; a[k] = t; } <66697>t = a[k]; a[k] = a[l]; a[l] = t; <66697>quicksort(a, l, k-1); <66697>quicksort(a, k+1, r); } <133395>} --- Not much simpler, three times as many exchanges ---------- Improvements to Quicksort Median-of-three * partitioning element closer to center * estimate median with median of sample * number of comparisons close to N lg N with large enough sample * FEWER LARGE FILES * slightly more exchanges, more overhead Insertion sort small subfiles * even Quicksort has too much overhead for files of a few elements * use insertion sort for tiny files * (can wait until the end) Analysis gives best value of parameters * median of 3 elements * cut to insertion sort for < 10 elements ---------- Improvements to Quicksort Standard * might have bad partition (1st one in ex.) * might have good partition (2nd one in ex.) 
Cutoff for small subfiles
Median-of-three
----------
Three-way partitioning
Equal keys can affect performance
Suppose all keys are the same
* plain quicksort takes N lg N comparisons (!)
* change partitioning to take N comparisons
* a naive method might use N^2 comparisons (!!)
Suppose there are two distinct key values
* reduces to the above case for one subfile
* better to complete the sort with one partition:
  stop the right pointer on the left value,
  stop the left pointer on the right value,
  exchange
Large file, small number of key values
* reduces to the above cases
Partition into three parts
* elements between i and j equal to v
* no larger element left of i
* no smaller element right of j
----------
Three-way partitioning implementation
--
void quicksort(Item a[], int l, int r)
  { int i, j, k, p, q; Item v;
    if (r <= l) return;
    v = a[r]; i = l-1; j = r; p = l-1; q = r;
    for (;;)
      {
        while (less(a[++i], v)) ;
        while (less(v, a[--j])) if (j == l) break;
        if (i >= j) break;
        exch(a[i], a[j]);
        if (eq(a[i], v)) { p++; exch(a[p], a[i]); }
        if (eq(v, a[j])) { q--; exch(a[q], a[j]); }
      }
    exch(a[i], a[r]);
    j = i-1; i = i+1;
    for (k = l; k < p; k++, j--) exch(a[k], a[j]);
    for (k = r-1; k > q; k--, i++) exch(a[k], a[i]);
    quicksort(a, l, j);
    quicksort(a, i, r);
  }
---
----------
Sorting strings
Use three-way partitioning on key characters
--
actinian      coenobite     actinian
jeffrey       conelrad      bracteal
coenobite     actinian      coenobite
conelrad      bracteal      conelrad
secureness    secureness    cumin
cumin         dilatedly     chariness
chariness     inkblot       centesimal
bracteal      jeffrey       cankerous
displease     displease     circumflex
millwright    millwright    millwright
repertoire    repertoire    repertoire
dourness      dourness      dourness
centesimal    southeast     southeast
fondler       fondler       fondler
interval      interval      interval
reversionary  reversionary  reversionary
dilatedly     cumin         secureness
inkblot       chariness     dilatedly
southeast     centesimal    inkblot
cankerous     cankerous     jeffrey
circumflex    circumflex    displease
---
Not many disadvantages
(to be continued: this is a radix sort)
COS 226
Lecture 4: Mergesort
Prototypical divide-and-conquer algorithm
Guaranteed to run in O(N log N) steps
Drawback: linear extra space for the merge
(can only sort half the memory)
An "optimal" sorting method
Leads us to consider
* recurrence relationships
* computational complexity
* deep hacking
* fractals
----------
Merging two sorted files
--
#define T Item
void merge(T c[], T a[], int N, T b[], int M)
  { int i, j, k;
    for (i = 0, j = 0, k = 0; k < N+M; k++)
      {
        if (i == N) { c[k] = b[j++]; continue; }
        if (j == M) { c[k] = a[i++]; continue; }
        if (less(a[i], b[j]))
             c[k] = a[i++];
        else c[k] = b[j++];
      }
  }
---
Seems trivial, but try doing it without using linear extra space
----------
Abstract inplace merge
--
Item aux[maxN];
void merge(Item a[], int l, int m, int r)
  { int i, j, k;
    for (i = m+1; i > l; i--) aux[i-1] = a[i-1];
    for (j = m; j < r; j++) aux[r+m-j] = a[j+1];
    for (k = l; k <= r; k++)
      if (less(aux[i], aux[j]))
           a[k] = aux[i++];
      else a[k] = aux[j--];
  }
---
Tricky way (copy the second half in reverse order) to avoid special tests for the ends of the arrays
Still uses extra space, but easier for the calling routine to assume an inplace merge
----------
Mergesort
Sort a file by (recursively) sorting two halves, then merging the results
----------
Mergesort implementation
Direct recursive code
Uses abstract inplace merge
--
void mergesort(Item a[], int l, int r)
  { int m = (r+l)/2;
    if (r <= l) return;
    mergesort(a, l, m);
    mergesort(a, m+1, r);
    merge(a, l, m, r);
  }
---
Tree structures describe merge file sizes
----------
Recurrences
Direct relationship to recursive programs
(most programs are "recursive")
Easy telescoping recurrences
T(N) = T(N-1) + 1         -->  T(N) = N
T(2^n) = T(2^(n-1)) + 1   -->  T(N) = lg N if N = 2^n
Short list of important recurrences
T(N) = T(N/2) + 1         -->  T(N) = lg N
T(N) = T(N/2) + N         -->  T(N) = N
T(N) = 2T(N/2) + 1        -->  T(N) = N
T(N) = 2T(N/2) + N        -->  T(N) = N lg N
Details in Chapter 2
THM: Mergesort uses N lg N comparisons
(guaranteed worst-case bound)
Divide-and-conquer "master theorem" gives similar results for other algorithms of the form
T(N) = aT(N/b) + N^c(lg N)^d
Interested in learning more?
* Stay tuned for a few more in COS 226
* Take COS 341, COS 423
* Read "Introduction to the Analysis of Algorithms" by Sedgewick and Flajolet
----------
Computational Complexity
Framework to study the efficiency of algorithms
Machine model: count fundamental operations
Average case: predict performance (need input model)
Worst case: guarantee performance (any input)
Upper bound: algorithm to solve the problem
Lower bound: proof that no algorithm can do better
Caveats:
* worst case may be unrealistic
* average-case bounds difficult to derive
* costs ignored in analysis may dominate
* machine model may be restrictive
Complexity studies provide
* starting point for practical implementations
* indication of approaches to be avoided
----------
Complexity of sorting
N lg N comparisons necessary and sufficient
Upper bound: N lg N (Mergesort)
Lower bound: N lg N - N/(ln 2) + lg N
THM: All comparison-based sorting methods must use at least N lg N comparisons
Proof: DECISION TREE (all sequences of comparisons)
* Claim 1: at least N! leaves
* Claim 2: height at least lg N!
* Claim 3: (Stirling's formula for lg N!) height at least N lg N - N/(ln 2) + lg N
Caveat: what if we don't use comparisons??
(stay tuned for radix sort)
----------
Deep hacking on the Mergesort inner loop
Avoid the move with a recursive argument switch on the call to straight merge
--
void mergesort(T a[], T b[], int l, int r)
  { int m = (l+r)/2;
    if (r-l <= 10) { insertion(a, l, r); return; }
    mergesort(b, a, l, m);
    mergesort(b, a, m+1, r);
    merge(a+l, b+l, m-l+1, b+m+1, r-m);
  }
void sort(Item a[], int l, int r)
  { int i;
    for (i = l; i <= r; i++) aux[i] = a[i];
    mergesort(a, aux, l, r);
  }
---
Avoid sentinels with the "up-down" trick
Combine the two?
Doable, but mindbending
Inner loop: compare, store, two increments
(Quicksort: compare, increment)
----------
Bottom-up mergesort
Pass through the file
* merge adjacent subfiles
* size doubles each time through
----------
Bottom-up mergesort implementation
--
void mergesort(Item a[], int l, int r)
  { int i, m;
    for (m = 1; m < r-l; m = m+m)
      for (i = l; i <= r-m; i += m+m)
        merge(a, i, i+m-1, min(i+m+m-1, r));
  }
---
Different set of merges than for top-down unless N is a power of two
----------
Mergesort and numbers
Relationship to the binary representation of numbers
THM: The number of comparisons used by Mergesort for a file of size N is the same as the number of bits in the binary representations of all the numbers less than N (plus N-1).
--
0 0 0 0
0 0 0 1
0 0 1 0
0 0 1 1
0 1 0 0
0 1 0 1
0 1 1 0
0 1 1 1
1 0 0 0
1 0 0 1
1 0 1 0
1 0 1 1
---
Proof: They satisfy the same recurrence
C(2N) = C(N) + C(N) + 2N
C(2N+1) = C(N) + C(N+1) + 2N+1
----------
Mergesort and fractals
Divide-and-conquer algorithms can exhibit erratic periodic behavior
number of bits in numbers less than N
= number of 0 bits + number of 1 bits
= (N lg N)/2 + periodic term + (N lg N)/2 + periodic term
= N lg N + periodic term
Details in Sedgewick-Flajolet
----------
Merging linked lists
Problem: sort data on a linked list
(rearrange the list so items are in order)
--
typedef struct node *link;
struct node { Item item; link next; };
---
First step: merge implementation
--
link merge(link a, link b)
  { struct node head; link c = &head;
    while ((a != NULL) && (b != NULL))
      if (less(a->item, b->item))
           { c->next = a; c = a; a = a->next; }
      else { c->next = b; c = b; b = b->next; }
    c->next = (a == NULL) ?
b : a;
    return head.next;
  }
---
Next step: sort implementation
* top-down and bottom-up methods both apply
----------
List mergesorts
Top-down: split, sort, and merge
--
link mergesort(link c)
  { link a, b;
    if (c->next == NULL) return c;
    a = c; b = c->next;
    while ((b != NULL) && (b->next != NULL))
      { c = c->next; b = b->next->next; }
    b = c->next; c->next = NULL;
    return merge(mergesort(a), mergesort(b));
  }
---
Bottom-up: cycle through a circular list
--
link mergesort(link t)
  { link u;
    for (initQ(); t != NULL; t = u)
      { u = t->next; t->next = NULL; putQ(t); }
    t = getQ();
    while (!emptyQ())
      { putQ(t); t = merge(getQ(), getQ()); }
    return t;
  }
---
Mergesort is the method of choice for lists
COS 226 Lecture 5: Priority Queues
Abstract data types
* client, interface, implementation
Priority queue ADT
* insert
* remove the largest
Brute-force implementations
HEAPS
Heapsort
BINOMIAL QUEUES
-----
Abstract data types
Separate INTERFACE and IMPLEMENTATION
* easier maintenance of large programs
* build layers of abstraction
* reuse software
* elementary example: pushdown stack
INTERFACE: description of data type, basic operations
CLIENT: program using operations defined in the interface
IMPLEMENTATION: actual code implementing the operations
Client can't know details of the implementation
(many implementations to choose from)
Implementation can't know details of client needs
(many clients use the same implementation)
Modern programming languages support ADTs
C++, Modula-3, Oberon, Java (C)
Caveats
* perspective needed
* performance matters
-----
Priority queue ADT
Records with keys (priorities)
Two basic operations
* insert
* delete the largest
Generic operations common to many ADTs
* create
* test if empty
* destroy (often ignored if not harmful)
INTERFACE for basic operations
--
. void PQinit();
. void PQinsert(Item);
. Item PQdelmax();
.
int PQempty();
---
Should also specify constraints and error conditions
Example applications
* simulation
* numerical computation
* compression algorithms
* graph-searching algorithms
-----
Sample client program
Find the M SMALLEST of N items
Typical values: M = 100, N = 1000000
--
PQinit();
for (k = 0; k < M; k++)
  PQinsert(nextItem());
for (k = M; k < N; k++)
  { PQinsert(nextItem()); t = PQdelmax(); }
for (k = 0; k < M; k++)
  a[k] = PQdelmax();
---
Cost? Depends on the PQ implementation
Reasonable assumption for any implementation: space proportional to M
Time bounds for standard implementations
* brute force: N M
* best: N lg M
-----
Typical brute-force implementation
Use an unordered array
--
static Item *pq;
static int N;
void PQinit(int maxN)
  { pq = malloc(maxN*sizeof(Item)); N = 0; }
int PQempty()
  { return N == 0; }
void PQinsert(Item v)
  { pq[N++] = v; }
Item PQdelmax()
  { int j, max = 0;
    for (j = 1; j < N; j++)
      if (less(pq[max], pq[j])) max = j;
    exch(pq[max], pq[N-1]);
    return pq[--N];
  }
---
Should also test for and report errors
Other options for brute-force implementations
* ordered array
* unordered singly-linked list
* ordered singly-linked list
-----
Client/Interface/Implementation
Separate compilation
(Freudian slip: "complication")
INTERFACE
* define data types
* declare functions
* in C, use a ".h" file (no executable code)
CLIENT
* include the ".h" file
* call functions
IMPLEMENTATION
* include the ".h" file
* give code for the functions
Client and implementation can be compiled at different times, then function calls are LINKED to their implementations
Details: Sedgewick, Chapter 4; COS 217
Modular programming
* separate compilation
* abstraction
* essential in implementing big systems
* useful in putting good algorithms to work
-----
Priority queue ADT (continued)
Other useful operations
* construct a PQ from N items
* return the value of the largest
* delete a specified item
* change an item's priority
* merge together two PQs
Interface more complicated
* need HANDLES for records
* need HANDLES for
priority queues
Where's the data? (client, implementation, or both?)
First-class ADT solution
--
. typedef struct pq* PQ;
. typedef struct PQnode* PQlink;
. PQ PQinit();
. int PQempty(PQ);
. PQlink PQinsert(PQ, Item);
. Item PQdelmax(PQ);
. void PQchange(PQ, PQlink, Item);
. void PQdelete(PQ, PQlink);
. PQ PQjoin(PQ, PQ);
---
PQ and PQlink are pointers to structures to be specified in the implementation
More info: section 4.8 in Sedgewick; lecture 7
-----
PQ implementations cost summary
Worst-case per-operation time cost as a function of PQ size
--
.                          delete           find    change
.                  insert   max    delete    max     key    join
. ordered array      N       1       N       1       N      N
.          list      N       1       1       1       N      N
. unordered array    1       N       1       N       1      N
.            list    1       N       1       N       1      1
. heap             lg N    lg N    lg N      1     lg N     N
. binomial queue   lg N    lg N    lg N    lg N    lg N   lg N
. best in theory     1     lg N    lg N      1       1      1
---
Algorithm design success story:
* nearly optimal worst-case cost
* simple (but ingenious!) algorithms
Costs lower in practice
* average-case analysis
* amortized analysis
-----
Heap-ordered complete binary trees
COMPLETE BINARY TREE: leaves on two levels, at left on the bottom level
HEAP-ORDERED: parent larger than both children
* therefore, largest at the root
* can define for any tree, not just complete ones
Array representation:
* root in a[1]
* children of the root in a[2] and a[3]
* ...
* children of i in a[2i] and a[2i+1]
* parent of i in a[i/2]
No explicit links needed for the tree
--
. i     0  1  2  3  4  5  6  7  8  9 10 11 12
. a[i]     X  T  O  G  S  M  N  A  E  R  A  I
---
-----
Promotion (bubbling up in a heap)
Change the key in a node at the bottom of the heap
To restore the heap condition: exchange with the parent if necessary
"How to Succeed" model (Peter Principle)
--
fixUp(Item a[], int k)
  { while (k > 1 && less(a[k/2], a[k]))
      { exch(a[k], a[k/2]); k = k/2; }
  }
---
(easy to make the code twice as fast, as with insertion sort)
-----
Demotion (sifting down in a heap)
Change the key in the node at the top of the heap
To restore the heap condition: exchange with the larger child if necessary
"Boss's child" model (Royal succession?)
-- void fixDown(Item a[], int k, int N) { int j; while (2*k <= N) { j = 2*k; if (j < N && less(a[j], a[j+1])) j++; if (!less(a[k], a[j])) break; exch(a[k], a[j]); k = j; } } --- ----- PQ implementation with heaps PQinsert: add node at bottom PQdelmax: exch root with node at bottom -- static Item *pq; static int N; void PQinit(int maxN) { pq = malloc((maxN+1)*sizeof(Item)); N = 0; } int PQempty() { return N == 0; } void PQinsert(Item v) { pq[++N] = v; fixUp(pq, N); } Item PQdelmax() { exch(pq[1], pq[N]); fixDown(pq, 1, N-1); return pq[N--]; } --- No more than lg N steps (GUARANTEED) ----- Constructing a heap (top-down) ----- Sorting down a heap ----- Heapsort Abandon ADT concept to save space Faster to construct heap from the bottom up -- #define pq(A) a[l-1+A] void heapsort(Item a[], int l, int r) { int k, N = r-l+1; for (k = N/2; k >= 1; k--) fixDown(&pq(0), k, N); while (N > 1) { exch(pq(1), pq(N)); fixDown(&pq(0), 1, --N); } } --- Widely used sorting method * in-place (no extra space) * guaranteed N lg N time ----- Binomial queues Support ALL PQ operations in lgN steps (Heaps have slow merge) Def: In a LEFT HEAP-ORDERED tree, each node is larger than all nodes in left subtree Def: A POWER-OF-2 TREE is a binary tree with * left subtree of root complete * right subtree empty (therefore, 2^n nodes) Def: A BINOMIAL QUEUE of size N is a forest of left heap-ordered power-of-2 trees one for each 1 bit in binary rep.
of N Corresponds to heap-ordered forest: ----- Joining power-of-2 heaps Constant-time operation * larger of two roots at top * left subtree to right subtree of other root * result is left-heap-ordered if inputs are ----- Joining power-of-2 heaps (code) Representation * two pointers per node * need HANDLE (pointer to node) -- struct PQnode { Item key; PQlink l, r; }; struct pq { PQlink *bq; }; --- Join operation * takes two BQ nodes as input * returns one BQ node as output -- PQlink pair(PQlink p, PQlink q) { PQlink t; if (less(p->key, q->key)) { p->r = q->l; q->l = p; return q; } else { q->r = p->l; p->l = q; return p; } } --- ----- Joining binomial queues Corresponds to adding binary numbers * 1 bits correspond to power-of-2 heaps * 1+1=10 corresponds to carry (Ex: 011 + 111 = 1010) * lgN time (guaranteed) * all PQ operations easy to implement with join ----- Joining binomial queues (code) -- . c b a new-a new-c . 0 0 0 a 0 . 0 0 1 a 0 . 0 1 0 b 0 . 0 1 1 0 a+b . 1 0 0 c 0 . 1 0 1 0 a+c . 1 1 0 0 b+c . 1 1 1 a b+c --- -- #define test(C, B, A) 4*(C) + 2*(B) + 1*(A) void PQjoin(PQlink *a, PQlink *b) { int i; PQlink c = z; for (i = 0; i < maxBQsize; i++) switch(test(c != z, b[i] != z, a[i] != z)) { case 2: a[i] = b[i]; break; case 3: c = pair(a[i], b[i]); a[i] = z; break; case 4: a[i] = c; c = z; break; case 5: c = pair(c, a[i]); a[i] = z; break; case 6: case 7: c = pair(c, b[i]); break; } } --- ----- Binomial queues summary BQ of size N is array of power-of-two heaps one for each 1 bit in binary rep. of N Joining two BQs is like adding binary numbers * insert is like incrementing * delete, delmax are like decrementing * heap-like promotion, demotion works for "change priority" All operations guaranteed lgN Actually faster, in "amortized" sense Example: PQinsert N items, then one more * N even, just insert item * N = ...01, just two steps * N = ..011, just three steps ... total cost LINEAR N/2 + 2(N/4) + 3(N/8) + 4(N/16) + ...
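The linear total in the series above can be checked by direct count: model each PQinsert as a binary counter increment, where an insert that triggers k carries (joins) costs k+1 steps. A small self-contained sketch (the function name is ours, not part of the course interface):

```c
#include <assert.h>

/* Total steps for n binomial-queue inserts, modeled as binary
   counter increments: inserting into a queue of size i costs
   one step plus one join per trailing 1 bit of i (the carries). */
long bq_insert_steps(long n) {
    long total = 0;
    for (long i = 0; i < n; i++) {
        long v = i, steps = 1;
        while (v & 1) { v >>= 1; steps++; }  /* count the carries */
        total += steps;
    }
    return total;
}
```

For any n the total stays just under 2n, matching the series N/2 + 2(N/4) + 3(N/8) + ... < 2N, i.e., constant amortized cost per insert.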
Basis for modifications getting most operations down to constant worst-case time (some of which are practical) Good candidate for "system" PQ routine COS 226 Lecture 6: Radix sorting Bits and digits Binary Quicksort MSD radix sort Three-way radix Quicksort LSD radix sort Sorting in linear time ----- Bits and digits Extracting bits easy in C Radix: base of number system Power of 2 radix: groups of bits binary (radix-2): 1 bit at a time hexadecimal(radix-16): 4 bits at a time ascii(radix-256): 8 bits at a time -- bin 01100001011000100110001101100100 hex 6 1 6 2 6 3 6 4 ascii a b c d --- Easy to extract digits with macros -- #define bitsword 32 #define bitsbyte 8 #define bytesword 4 #define R (1 << bitsbyte) #define digit(A, B) ((A >> (bitsword-(B+1)*bitsbyte)) & (R-1)) --- Ex: x = 0X61626364 digit(x, 2) = (x >> 8) & 255 = c Ex: Single-bit access: bitsbyte = 1 digit(x, 11) = (x >> 20) & 1 = 0 ----- Binary Quicksort Partition file into two pieces * all keys with first bit 0 * all keys with first bit 1 Sort two pieces recursively -- quicksortB(int a[], int l, int r, int w) { int i = l, j = r; if (r <= l || w > bitsword) return; while (j != i) { while (digit(a[i], w) == 0 && (i < j)) i++; while (digit(a[j], w) == 1 && (j > i)) j--; exch(a[i], a[j]); } if (digit(a[r], w) == 0) j++; quicksortB(a, l, j-1, w+1); quicksortB(a, j, r, w+1); } --- Equivalent to partitioning on the VALUE 2^(bitsword-w+1) instead of some key in the file. Bad partition if all keys have same leading bit one subfile of size N, BUT keys one bit shorter ----- Binary Quicksort (continued) Problems: leading 0 bits cost of inner loop could be advantage if carefully done Worst case: all keys equal 32N passes on a 32-bit machine 64N passes on a 64-bit machine Random bits? should sort out after lgN bits examined Nonrandom bits? take bigger chunks ----- MSD radix sort Partition file into M buckets * all keys with first byte 0 * all keys with first byte 1 * all keys with first byte 2 ... 
* all keys with first byte M-1 Sort M pieces recursively Take M=2^bitsbyte Tradeoff large M space for buckets (too many empty buckets) small M too many passes (too many keys per bucket) Upper bound on running time: (bitsword/bitsbyte)*N (Worst case: all keys equal) 32 bits, 8 bits/byte: 4N 100-byte keys: could be 100N ----- MSD radix sort example -- now a|ce ac|e ace| for a|go ag|o ago| tip a|nd an|d and| ilk b|et be|t bet| dim c|ab ca|b cab| tag c|aw ca|w caw| jot c|ue cu|e cue| sob d|im di|m dim| nob d|ug du|g dug| sky e|gg eg|g egg| hut f|or fe|w fee| ace f|ee fe|e few| bet f|ew fo|r for| men g|ig gi|g gig| egg h|ut hu|t hut| few i|lk il|k ilk| jay j|am ja|y jam| owl j|ay ja|m jay| joy j|ot jo|t jot| rap j|oy jo|y joy| gig m|en me|n men| wee n|ow no|w nob| was n|ob no|b now| cab o|wl ow|l owl| wad r|ap ra|p rap| caw s|ob sk|y sky| cue s|ky so|b sob| fee t|ip ta|g tag| tap t|ag ta|p tap| ago t|ap ta|r tar| tar t|ar ti|p tip| jam w|ee wa|d wad| dug w|as wa|s was| and w|ad we|e wee| --- ----- Key-indexed counting Method for sorting file on keys with R values Basis for radix sorts Basic method * count number of keys with each value * take sums to turn counts into indices * move keys to auxiliary array using indices -- void keycount(int a[], int l, int r) { int i, j, cnt[R+1]; int b[maxN]; for (j = 0; j < R; j++) cnt[j] = 0; for (i = l; i <= r; i++) cnt[a[i]+1]++; for (j = 1; j < R; j++) cnt[j] += cnt[j-1]; for (i = l; i <= r; i++) b[cnt[a[i]]++] = a[i]; for (i = l; i <= r; i++) a[i] = b[i-l]; } --- Keys must be integers (in a small range)!
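As a runnable illustration of the method above, here is key-indexed counting specialized to keys in a small fixed range (R = 4 and the array bound are assumptions for the demo):

```c
#include <assert.h>

/* Key-indexed counting for integer keys in 0..R-1:
   count each value, turn counts into starting indices,
   distribute stably into an auxiliary array, copy back. */
enum { R = 4, MAXN = 64 };

void keycount(int a[], int l, int r) {
    int i, j, cnt[R + 1] = {0}, b[MAXN];
    for (i = l; i <= r; i++) cnt[a[i] + 1]++;        /* counts */
    for (j = 1; j < R; j++) cnt[j] += cnt[j - 1];    /* counts -> indices */
    for (i = l; i <= r; i++) b[cnt[a[i]]++] = a[i];  /* distribute */
    for (i = l; i <= r; i++) a[i] = b[i - l];        /* copy back */
}
```

Sorting {2, 0, 3, 1, 2, 0} this way gives {0, 0, 1, 2, 2, 3}; the distribution step is stable, which is exactly the property LSD radix sort relies on.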
Need one counter for each different key value ----- Key-indexed counting example ----- MSD radix sort code Modify key-indexed counting to access bytes as keys Add a loop to do recursive calls -- #define bin(A) l+count[A] void radixMSD(Item a[], int l, int r, int w) { int i, j, count[R+1]; if (w > bytesword) return; if (r-l <= M) { insertion(a, l, r); return; } for (j = 0; j < R; j++) count[j] = 0; for (i = l; i <= r; i++) count[digit(a[i], w) + 1]++; for (j = 1; j < R; j++) count[j] += count[j-1]; for (i = l; i <= r; i++) b[l+count[digit(a[i], w)]++] = a[i]; for (i = l; i <= r; i++) a[i] = b[i]; radixMSD(a, l, bin(0)-1, w+1); for (j = 0; j < R-1; j++) radixMSD(a, bin(j), bin(j+1)-1, w+1); } --- Variable-length keys terminated with 0 (C strings) * remove first if statement * remove first recursive call ----- Recursive structure of MSD radix sort Tree structure to describe recursive calls Paths in tree give keys Problem: algorithm touches empty nodes Tree can be as much as M times bigger Could use linked lists for nonempty buckets Better solution: 3-way Quicksort partition on bytes ----- 3-way radix Quicksort code -- #define ch(A) digit(A, D) void quicksortX(Item a[], int l, int r, int D) { int i, j, k, p, q; int v; if (r-l <= M) { insertion(a, l, r); return; } v = ch(a[r]); i = l-1; j = r; p = l-1; q = r; while (i < j) { while (ch(a[++i]) < v) ; while (v < ch(a[--j])) if (j == l) break; if (i > j) break; exch(a[i], a[j]); if (ch(a[i])==v) { p++; exch(a[p], a[i]); } if (v==ch(a[j])) { q--; exch(a[j], a[q]); } } if (p == q) { if (v != '\0') quicksortX(a, l, r, D+1); return; } if (ch(a[i]) < v) i++; for (k = l; k <= p; k++, j--) exch(a[k], a[j]); for (k = r; k >= q; k--, i++) exch(a[k], a[i]); quicksortX(a, l, j, D); if ((i == r) && (ch(a[i]) == v)) i++; if (v != '\0') quicksortX(a, j+1, i-1, D+1); quicksortX(a, i, r, D); } --- ----- 3-way radix Quicksort example -- now gig ace ago a|go for for bet bet a|ce tip dug dug and a|nd ilk ilk cab ace b|et dim
dim dim c|ab tag ago ago c|aw jot and and c|ue sob fee egg egg nob cue cue dug sky caw caw dim hut hut f|ee ace ace f|or bet bet f|ew men cab ilk egg egg gig few few hut jay j|ay ja|m owl j|ot ja|y joy j|oy jo|y rap j|am jo|t gig owl owl m|en wee wee now owl was was nob nob cab men men now wad wad r|ap caw sky sky sky sky cue nob was tip sob fee sob sob sob t|ip ta|r tap tap tap tap t|ap ta|p ago tag tag tag t|ag ta|g tar tar tar tar t|ar ti|p dug tip tip w|as and now wee w|ee jam rap wad w|ad --- ----- LSD radix sort Ancient (older than computers) method used for card-sorting Consider digits from right to left use key-indexed counting (has to be stable) -- void radixLSD(Item a[], int l, int r) { int i, j, w, count[R+1]; for (w = bytesword-1; w >= 0; w--) { for (j = 0; j < R; j++) count[j] = 0; for (i = l; i <= r; i++) count[digit(a[i], w) + 1]++; for (j = 1; j < R; j++) count[j] += count[j-1]; for (i = l; i <= r; i++) b[count[digit(a[i], w)]++] = a[i]; for (i = l; i <= r; i++) a[i] = b[i-l]; } } --- Running time: N*(bitsword/bitsbyte) Disadvantage: doesn't work for variable-length keys totally out of order until MSD encountered ----- LSD radix sort example -- now so|b c|ab |ace for no|b w|ad |ago tip ca|b t|ag |and ilk wa|d j|am |bet dim an|d r|ap |cab tag ac|e t|ap |caw jot we|e t|ar |cue sob cu|e w|as |dim nob fe|e c|aw |dug sky ta|g r|aw |egg hut eg|g j|ay |fee ace gi|g a|ce |few bet du|g w|ee |for men il|k f|ee |gig egg ow|l m|en |hut few di|m b|et |ilk jay ja|m f|ew |jam owl me|n e|gg |jay joy ag|o a|go |jot rap ti|p g|ig |joy gig ra|p d|im |men wee ta|p t|ip |nob was fo|r s|ky |now cab ta|r i|lk |owl wad wa|s a|nd |rap tap jo|t s|ob |raw caw hu|t n|ob |sky cue be|t f|or |sob fee yo|u j|ot |tag raw no|w y|ou |tap ago fe|w n|ow |tar tar ca|w j|oy |tip jam ra|w c|ue |wad dug sk|y d|ug |was you ja|y h|ut |wee and jo|y o|wl |you --- ----- Binary LSD radix sort example Cannot use Quicksort-style partitioning 0-1 sort has to be stable
stable inplace 0-1 sort? (possible, but not easy) Proof that it works: if two keys differ on first bit 0-1 sort puts them in proper relative order if two keys agree on first bit stability keeps them in proper relative order Another proof that it works: if the bits not yet examined differ doesn't matter what we do now if the bits not yet examined agree later pass won't affect their order ----- Linear sorting method LSD radix sort! To sort N 64-bit keys take bitsbyte=16 4N steps, linear extra memory (plus 2^16) Does not violate NlgN lower bound because comparisons are not used LSD radix sort liabilities inner loop has a lot of instructions accesses memory "randomly" wastes time on low-order bits Therefore, use just "enough" bits MSD radix sort also linear Use LSD-MSD hybrid for random keys (assume fixed-size keys) use (lgN)/2 < bitsbyte < lgN Three passes LSD radix sort on 2nd byte LSD radix sort on 1st byte insertion sort to clean up COS 226 Lecture 7: Searching overview Abstract Data Types Revisited Symbol Table, Dictionary items with keys INSERT SEARCH Searching = "implement a symbol table ADT" ADT orientation integral to modern languages implementation in C possible Refs: Sedgewick section 4.8 Hanson, "C Interfaces and Implementations" COS 217: ADTs control complexity in large systems COS 226: ADTs precisely define implementation Both: ADTs help us build layers of abstraction ----- Items and keys ADT for use by clients and implementations of symbol table ADTs Sample INTERFACE for char items and keys -- . typedef char Item; . typedef char Key; . #define key(A) A . #define less(A, B) (A < B) . #define eq(A, B) ((A) == (B)) . #define maxKey 256 . #define NULLitem 0 . Item ITEMrand(void); . int ITEMscan(Item *); . 
void ITEMshow(Item); --- CLIENTS use functions to create and write keys manipulate items Search algorithms (ST IMPLEMENTATION) store and retrieve items use "key", "less" to extract, compare keys "key" and "less" could be in implementation GOAL: implementations independent of item type ----- Symbol table ADT Items with keys Two basic operations * insert * search (find item with given key) generic operations common to many ADTs * create * test if empty * destroy (often ignored if not harmful) other operations * construct (batch inserts) * sort (often a byproduct) * delete * find kth largest * join (ST > PQ) INTERFACE for basic and generic operations -- . void STinit(int); . void STinsert(Item); . Item STsearch(Key); . int STcount(); --- Could combine with datatype interface Should also specify constraints and error conditions ----- Typical brute-force implementation Use linear search in an ordered array -- static Item *st; static int N; void STinit(int maxN) { st = malloc((maxN)*sizeof(Item)); N=0; } int STcount() { return N; } void STinsert(Item item) { int j = N++; Key v = key(item); while (j>0 && less(v, key(st[j-1]))) { st[j] = st[j-1]; j--; } st[j] = item; } Item STsearch(Key v) { int j; for (j = 0; j < N; j++) { if (eq(v, key(st[j]))) return st[j]; if (less(v, key(st[j]))) break; } return NULLitem; } --- Other options for brute-force implementations * unordered array * unordered singly-linked list * ordered singly-linked list ----- Binary search Maintain ordered array Do search in lgN steps by divide-and-conquer -- Item search(int l, int r, Key v) { int m = (l+r)/2; if (l > r) return NULLitem; if eq(v, key(st[m])) return st[m]; if (l == r) return NULLitem; if less(v, key(st[m])) return search(l, m-1, v); else return search(m+1, r, v); } Item STsearch(Key v) { return search(0, N-1, v); } --- Ex: search for I -- . 0 1 2 3 4 5 6 7 8 9 10 11 12 . A A E G I M N O R S T X --- -- . l r m . 1 12 6 . 1 5 3 . 4 5 4 . 
5 5 --- Worst-case number of steps for binary search = number of bits in binary representation of N Proof: they satisfy the same recurrence ----- Interpolation search Maintain ordered array Improve upon binary search by using key values to guess where to look Ex: search for I (I-A)/(X-A) = (9-1)/(24-1) = .35 therefore divide at .35, not .5 next probe will be on or close -- . 0 1 2 3 4 5 6 7 8 9 10 11 12 . A A E G I M N O R S T X --- -- . l r m . 1 12 5 --- THM: Interpolation search takes lglgN steps for random files Proof: difficult Less than 5 probes in practice (2^2^5 = billions) Caveats: files not random calculations cost more than comparisons ----- ST implementations cost summary Per-operation time costs as function of ST size -- . worst-case average-case . insert search insert search . ordered . array N N N/2 N/2 . list N N N/2 N/2 . unordered . array 1 N 1 N/2 . list 1 N 1 N/2 . binary search N lgN N/2 lgN . BST N N lgN lgN . red-black lgN lgN lgN lgN . randomized (N) (N) lgN lgN . hashing N N 1 1 --- Situation further complicated by * need to have many other operations * operations mix * space costs Another algorithm design success story: * nearly optimal worst-case cost * fast average-case cost * simple (but ingenious!) algorithms Costs lower in practice average-case analysis amortized analysis ----- First-class symbol-table ADT Basic operations * initialize * insert * search * select kth smallest * delete * join two STs * test if empty * visit items in order Handles (pointers to unspecified structs) for delete, client needs to handle recs for join, client needs to handle STs INTERFACE for basic operations -- . typedef struct st* ST; . typedef struct STnode* STlink; . ST STinit(int); . int STcount(int); . STlink STinsert(ST, Item); . Item STsearch(ST, Key); . Item STselect(ST, int); . void STdelete(ST, STlink); . 
ST STjoin(ST, ST); --- could have STsearch and STselect return handles ----- First-class ADT implementation Concrete implementation of abstract ADT specify data representation for records [ STlink is pointer to record ] specify data representation for table [ ST is pointer to symbol table ] Ex: Binary search trees -- struct STnode { Item item; STlink l, r; }; struct st { STlink head; }; --- "What is a record in a BST?" a node (handle is pointer to record) "What is a symbol table?" a pointer to a node (root) (handle is pointer to struct with pointer to node) Providing true "abstract" data type challenging * precise ADT specification worth the trouble * new implementations are "plug-ins" * compiler type-checking can be helpful * not so challenging in some modern languages More details: Sedgewick, sections 4.8, 9.5 ----- Interface for standalone ADT Use if only one ADT in application (typical) -- . void STinit(int); . int STcount(); . void STinsert(Item); . Item STsearch(Key); . void STdelete(Item); . Item STselect(int); . void STsort(void (*visit)(Item)); --- * eliminate ST handles in args * can specify DELETE in terms of item types * can specify JOIN in terms of ST types Stylized code setup gives true ADT (built-in support in C++, Java ...)
Standalone ADT useful in most applications allows us to focus on algorithms ----- Symbol table ADT client example DEDUP a file of integers (remove duplicates) -- #include <stdio.h> #include <stdlib.h> #include "Item.h" #include "ST.h" int main(void) { int v; Item item; STinit(maxN); while (scanf("%d", &v) == 1) { if (STsearch(v) != NULLitem) continue; key(item) = v; STinsert(item); printf("%d\n", v); } } --- Goal: independence from implementation Space costs: same options as for PQs copy of data for implementation implementation manipulates data via indices implementation manipulates pointers Examples of other clients: online dictionary compiler symbol table ----- BST data types NODES consist of item with key left link to BST for smaller keys right link to BST for larger keys count of number of nodes in BST (optional) -- typedef struct STnode* link; struct STnode { Item item; link l, r; int N; }; static link head, z; link NEW(Item item, link l, link r, int N) { link x = malloc(sizeof *x); x->item = item; x->l = l; x->r = r; x->N = N; return x; } void STinit() { head = (z = NEW(NULLitem, 0, 0, 0)); } int STcount() { return head->N; } --- ----- Recursive BST implementation Search and insertion code directly follows from BST definition -- Item searchR(link h, Key v) { Key t = key(h->item); if (h == z) return NULLitem; if eq(v, t) return h->item; if less(v, t) return searchR(h->l, v); else return searchR(h->r, v); } Item STsearch(Key v) { return searchR(head, v); } link insertR(link h, Item item) { Key v = key(item), t = key(h->item); if (h == z) return NEW(item, z, z, 1); if less(v, t) h->l = insertR(h->l, item); else h->r = insertR(h->r, item); (h->N)++; return h; } void STinsert(Item item) { head = insertR(head, item); } --- ----- Nonrecursive BST implementation Travel down the tree move left if smaller than search key move right if larger than search key -- void STinsert(Item item) { Key v = key(item); link p = head, x = p; if (head == z) { head = NEW(item, z, z, 1); return; } while (x != z) { p =
x; if less(v, key(x->item)) x = x->l; else x = x->r; } x = NEW(item, z, z, 1); if less(v, key(p->item)) p->l = x; else p->r = x; } --- Equivalent to recursive version (can derive automatically) Recursive versions of tree algs simpler to implement and understand ----- BST insertion example ----- BST construction ----- Cost of search in BSTs One-to-one correspondence between BST and Quicksort partitioning Total cost of searching for all nodes satisfies same recurrence as Quicksort C(N) = N+1 + 2( C(1) + C(2) + ... + C(N-1) )/N ... = 2N ln N THM: Search and insertion in BSTs involves examining 2lnN nodes on the average. Worst case: nodes in order degenerates to linked list, insertion cost N ----- Other operations in BSTs SORT: traverse tree in inorder -- void sortR(link h, void (*visit)(Item)) { if (h == z) return; sortR(h->l, visit); visit(h->item); sortR(h->r, visit); } void STsort(void (*visit)(Item)) { sortR(head, visit); } --- same cost as Quicksort (+ space for links) Useful as "free" ST operation GENERALIZED PQ: add tree size to nodes -- Item selectR(link h, int k) { int t = h->l->N; if (h == z) return NULLitem; if (t > k ) return selectR(h->l, k); if (t < k ) return selectR(h->r, k-t-1); return h->item; } Item STselect(int k) { return selectR(head, k); } --- special case: go left for smallest Delete arbitrary node? Join? ----- Deletion in BSTs To delete a node at the bottom (A E L P X) remove it To delete a node with one child (A A C G I M) pass the child up To delete a node with two children (S E H N) find the "next" node use right-left* OR left-right* swap, then remove (reduces to A or B) Problem: strategy not symmetric clumsy algorithm (see code in book!) Serious problem: trees not random after long sequence of insertions and deletions (!!) 
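The recursive search and insert above can be exercised standalone. The sketch below specializes Item to int and keeps the sentinel z for empty links, as in the lecture code; the names (bst_init, bst_insert, bst_search) are illustrative, not the course ST interface:

```c
#include <stdlib.h>
#include <assert.h>

/* Minimal BST in the style of the lecture code: a sentinel
   node z stands for every empty link, so walking down the
   tree needs no NULL checks. */
typedef int Item;
typedef struct node* link;
struct node { Item item; link l, r; };

static link z;                       /* sentinel for empty links */

static link NEW(Item item, link l, link r) {
    link x = malloc(sizeof *x);
    x->item = item; x->l = l; x->r = r;
    return x;
}

void bst_init(void) { z = NEW(0, NULL, NULL); z->l = z->r = z; }

link bst_insert(link h, Item item) { /* returns subtree root */
    if (h == z) return NEW(item, z, z);
    if (item < h->item) h->l = bst_insert(h->l, item);
    else                h->r = bst_insert(h->r, item);
    return h;
}

int bst_search(link h, Item v) {     /* 1 if v in tree, else 0 */
    if (h == z) return 0;
    if (v == h->item) return 1;
    return bst_search(v < h->item ? h->l : h->r, v);
}
```

Starting from an empty tree (root = z after bst_init), repeated bst_insert calls build the tree; as the analysis above shows, search and insert examine about 2 ln N nodes on average for random keys.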
----- BSTs by root insertion Idea: Insert such that new node stays at root Nonrecursive implementation possible pass link that would cross to parent Recursive implementation easy based on ROTATION Faster search for recently inserted nodes ----- Nonrecursive BST root insertion -- link BSTinsert(Item item) { Key v = key(item); int lr; struct STnode **t, **u, *x; x = NEW(item, z, z, 1); if (head == z) return x; if (less(v, key(head->item))) { x->r = head; head = x; t = &x->l; x = x->r; lr = 0; } else { x->l = head; head = x; t = &x->r; x = x->l; lr = 1; } while (x != z) if (less(v, key(x->item))) { if ( lr) { lr = !lr; *t = *u; t = u; } u = &x->l; x = *u; } else { if (!lr) { lr = !lr; *t = *u; t = u; } u = &x->r; x = *u; } *t = z; return head; } --- Challenge for the bored: check that this code is correct Conclusion for the rest of us: recursive implementations *are* simpler! ----- ROTATE operation in trees Fundamental operation rearrange nodes in trees local transformation (two links) maintain sorted order -- link rotR(link h) { link x = h->l; h->l = x->r; x->r = h; return x; } link rotL(link h) { link x = h->r; h->r = x->l; x->l = h; return x; } --- Right rotate Left rotate ----- Recursive BST root insertion code To insert at root insert at root of subtree (recursively) rotate to bring inserted key to root -- link insertT(link h, Item item) { Key v = key(item); if (h == z) return NEW(item, z, z, 1); if (less(v, key(h->item))) { h->l = insertT(h->l, item); h = rotR(h); } else { h->r = insertT(h->r, item); h = rotL(h); } return h; } void STinsert(Item item) { head = insertT(head, item); } --- Equivalent to: insert the node, then use rotations to bring it up to the top ----- Rotate to the root ----- Top-down BST construction (root insertion) ----- Searching (to be continued) SUMMARY use elementary methods for small cases binary search guarantees lgN steps for search (but requires N steps for insertion) interpolation search can speed up search (but requires N steps for insertion)
binary search trees give fast search and insert, average-case Binary search trees can do more * insert * search * delete * select kth smallest * ... all in O(lgN) steps, average-case with O(NlgN) sort as a byproduct Goal: Symbol table implementation with O(lgN) GUARANTEED performance for both search and insert COS 226 Lecture 8: Balanced trees Symbol Table, Dictionary records with keys INSERT SEARCH Goal: Symbol table implementation with O(lgN) GUARANTEED performance for both search and insert (and other ST operations) Approach 1: probabilistic "guarantee" Approach 2: amortized "guarantee" Approach 3: worst-case guarantee ----- Randomized BSTs Idea: new node should be at root with probability 1/(N+1), so put it there! -- link insertR(link h, Item item) { Key v = key(item), t = key(h->item); if (h == z) return NEW(item, z, z, 1); if (rand()< RAND_MAX/(h->N+1)) return insertT(h, item); if less(v, t) h->l = insertR(h->l, item); else h->r = insertR(h->r, item); (h->N)++; return h; } void STinsert(Item item) { head = insertR(head, item); } --- Trees have same shape as random BSTs ** no matter what the input ** Random BSTs are well studied exponentially small chance of bad balance ----- Randomized BST example Insert keys in order: tree shape random! ----- Other operations in randomized BSTs Find kth largest another use of size field already there Join disjoint STs straightforward recursive implementation to join STs A (of size M) and B (of size N) use A root with probability M/(M+N) use B root with probability N/(M+N) join other tree with subtree recursively Delete remove the node, do join (above) THM: Trees still random after delete (!!) 
Bottom line: randomized BSTs provide full symbol table ADT straightforward implementation O(log N) average case bad cases provably unlikely ----- Skip lists Idea: Add fast tracks to linked lists Nodes have variable number of links -- typedef struct STnode* link; struct STnode { Item item; link* next; int sz; }; link NEW(Item item, int k) { int i; link x = malloc(sizeof *x); x->next = malloc(k*sizeof(link)); x->item = item; x->sz = k; for (i = 0; i < k; i++) x->next[i] = z; return x; } --- Problems: how to maintain structure under insertion how many links in a particular node? ----- Skip list implementation Idea: put a RANDOM number of links in each node j links with probability 1/2^j -- void STinit(int max) { N = 0; lgN = 0; z = NEW(NULLitem, 0); head = NEW(NULLitem, lgNmax); } void insertR(link t, link x, int k) { Key v = key(x->item); if (less(v, key(t->next[k]->item))) { if (k < x->sz) { x->next[k] = t->next[k]; t->next[k] = x; } if (k == 0) return; insertR(t, x, k-1); return; } insertR(t->next[k], x, k); } void STinsert(Key v) { insertR(head, NEW(v, randX()), lgN); N++; } --- Bottom line: same properties as randomized BSTs plus: easier to understand minus: more pointer-chasing ----- Splay trees Idea: slight modification to root insertion Check two links above current node Orientations differ: same as root insertion Orientations match: do top rotation first Brings new node to root *Also* brings all nodes on search path closer to root ----- Splay tree example Splay rotations halve the search path Long paths can build up, BUT tree rebalances whenever we traverse them ----- Splay tree implementation (hl, hr, hll, hlr, hrl, hrr abbreviate h->l, h->r, h->l->l, h->l->r, h->r->l, h->r->r) -- link splay(link h, Item item) { Key v = key(item); if (h == z) return NEW(item, z, z, 1); if (less(v, key(h->item))) { if (hl == z) return NEW(item, z, h, h->N+1); if (less(v, key(hl->item))) { hll = splay(hll, item); h = rotR(h); } else { hlr = splay(hlr, item); hl = rotL(hl); } return rotR(h); } else { if (hr == z) return NEW(item, h, z, h->N+1); if
(less(key(hr->item), v)) { hrr = splay(hrr, item); h = rotL(h); } else { hrl = splay(hrl, item); hr = rotR(hr); } return rotL(h); } } --- Guaranteed performance over SEQUENCE of ops ----- 2-3-4 trees Allow one, two, or three keys per node Keep link for every interval between keys 2-node: one key, two children 3-node: two keys, three children 4-node: three keys, four children SEARCH compare search key against keys in node find interval containing search key follow associated link (recursively) INSERT search to bottom for key 2-node at bottom: convert to a 3-node 3-node at bottom: convert to a 4-node 4-node at bottom: ?? ----- Top-down 2-3-4 trees Transform tree on the way down to ensure that last node is not a 4-node "Split" any 4-nodes encountered using one of two transformations Invariant: "current" node is not a 4-node Insertion at bottom therefore easy. ----- Top-down 2-3-4 tree construction Trees grow up from the bottom ----- Balance in 2-3-4 trees In top-down 2-3-4 trees, all paths from top to bottom are equal length (tree height) Tree height: worst case: lgN (all 2-nodes) best case: lgN/2 (all 4-nodes) between 10 and 20 for a million nodes between 15 and 30 for a billion nodes Comparisons within nodes not accounted for ----- Top-down 2-3-4 tree implementation Fantasy code (sketch) -- ST BSTinsert(ST head, recordType rec) { Key v = key(rec); struct BSTnode *x = head->r; while (x != z) { x = therightlink(x, v); if fourNode(x) then split(x); } if twoNode(x) then makeThree(x, v); else if threeNode(x) then makeFour(x, v); else return head; } --- Direct implementation complicated because of * "therightlink(x, v)" * maintaining multiple node types * large number of cases for "split" Search also more complicated than for BST ----- Red-black trees Represent 2-3-4 trees as binary trees with "internal" edges for 3- and 4-nodes Correspondence between 2-3-4 and RB trees Not 1-1 because 3-nodes swing either way SEARCH: use plain BST search (!)
tree balance gives O(lgN) performance guarantee all comparisons accounted for INSERT: split reduces to fewer cases ----- Splitting nodes in red-black trees Two cases are easy (need only to switch colors) Two cases require ROTATIONS Can use red-black abstraction directly Invariant: never two consecutive red nodes on any path. ----- RB tree node split example ----- Red-black tree implementation Search code same as BST; insert code balances (hl, hr, hll, hrr abbreviate h->l, h->r, h->l->l, h->r->r) -- link RBinsert(link h, Item item, int sw) { Key v = key(item); if (h == z) return NEW(item, z, z, 1, 1); if ((hl->red) && (hr->red)) { h->red = 1; hl->red = 0; hr->red = 0; } if (less(v, key(h->item))) { hl = RBinsert(hl, item, 0); if (h->red && hl->red && sw) h = rotR(h); if (hl->red && hll->red) { h = rotR(h); h->red = 0; hr->red = 1; } } else { hr = RBinsert(hr, item, 1); if (h->red && hr->red && !sw) h = rotL(h); if (hr->red && hrr->red) { h = rotL(h); h->red = 0; hl->red = 1; } } return h; } void STinsert(Item item) { head = RBinsert(head, item, 0); head->red = 0; } --- ----- Red-black tree construction ----- Balance in red-black trees In red-black trees, the longest path is at most twice the length of the shortest path worst case: less than 2lgN Comparisons within nodes are accounted for ----- Summary Goal: Symbol table implementation with O(lgN) GUARANTEED performance for both search and insert probabilistic guarantee: random BSTs, skip lists amortized guarantee: splay trees optimal guarantee: red-black trees Algorithms are variations on a theme rotations during insertion Different abstractions, but equivalent Ex: skip-list representation of 2-3-4 tree Are balanced trees optimal? worst-case: no (can get ClgN for C>1) average-case: open Extensions to search structures for huge files (stay tuned) COS 226 Lecture 9: Hashing Symbol Table, Dictionary records with keys INSERT SEARCH Balanced trees, randomized trees use O(lgN) comparisons Is lgN required? (no, and yes) Are comparisons necessary?
(no) ---------- Hashing: basic plan Save keys in a table, at a location determined by the key HASH FUNCTION * method for computing table index from key COLLISION RESOLUTION STRATEGY * algorithm and data structure to handle two keys that hash to the same index Time-space tradeoff * No space limitation: trivial hash function with key as address * No time limitation: trivial collision resolution: sequential search * Limitations on both time and space hashing ---------- Hash Functions Short keys (fit in a word) treat key as integer h(K) = K mod M Use prime table size Ex: four-character keys, table size 101 -- bin 01100001011000100110001101100100 hex 6 1 6 2 6 3 6 4 ascii a b c d --- 0x61626364 = 1633837924 1633837924 % 101 = 11 Key "abcd" hashes to 11 0x64636261 = 1684234849 1684234849 % 101 = 57 Key "dcba" hashes to 57 "Collision" 0x61626263 = 1633837667 1633837667 % 101 = 57 Key "abbc" also hashes to 57 Obvious point: millions of keys, small table: most collide! ---------- Hash Functions (continued) Long keys (strings) treat key as integer h(K) = K mod M Multiprecision arithmetic calculation Use Horner's method Ex: (check with 4 chars; works for any length) -- hex 6 1 6 2 6 3 6 4 ascii a b c d --- 0x61626364 = 256*(256*(256*97+98)+99)+100 take mod after each multiplication 256*97+98 = 24930 % 101 = 84 256*84+99 = 21603 % 101 = 90 256*90+100 = 23140 % 101 = 11 -- int hash(char *v, int M) { int h, a = 117; for (h = 0; *v != '\0'; v++) h = (a*h + *v) % M; return h; } --- scramble by replacing 256 by 117 Uniform hashing: use a different random value for each digit ---------- Collisions Table size M How many insertions until the first collision? BIRTHDAY PARADOX (classical probability theory) Assume hash function "random" Expected insertions to first collision: sqrt(pi M/2) -- . M sqrt(pi M/2) . 100 12 . 1000 40 .
10000 125 --- N keys Option 1: Allow N >> M: put keys hashing to i in a list about N/M keys per list Option 2: Keep N < M: put keys somewhere in table complex collision pattern ---------- Collisions (continued) Experiment 1: generate random probes between 0 and 100 84 35 45 32 89 1 58 16 38 69 5 90 16 16 53 61 ... collision at 13th as predicted Experiment 2: use hash function to scatter 4-char keys -- bcba 47 dbcb 24 bddc 43 dcbd 60 dabc 85 dbab 17 dbdb 78 bccc 2 ccad 1 dbdd 80 babb 74 adbc 31 cdcc 95 addd 39 bcbd 50 bcda 55 bdac 83 abad 4 baca 26 cada 85 dabb 84 dabd 86 --- collision after 20 probes still as predicted (standard dev. not small) ---------- Separate chaining Simple, practical, widely used Cuts search time by a factor of M over sequential search Method: M linked lists, one for each table slot -- . 0: * . 1: L A A A * . 2: M X * . 3: N C * . 4: * . 5: E P E E * . 6: * . 7: G R * . 8: H S * . 9: I * . 10: * --- Insert cost: 1 Avg. search cost (successful): N/2M Avg. search cost (unsuccessful): N/M M large: CONSTANT avg. search time independent of how keys are distributed (!) Worst case: N ("probabilistically" unlikely) Keep lists sorted? increases insert time to N/2M cuts unsuccessful search time to N/2M ---------- Linear Probing No links, keep everything in table Method: start linear search at hash position (stop when empty position hit) Still get O(1) avg. 
search time if table sparse
--
void STinit(int max)
  { int i;
    N = 0; M = 2*max;
    st = malloc(M*sizeof(Item));
    for (i = 0; i < M; i++) st[i] = NULLitem;
  }
void STinsert(Item item)
  { Key v = key(item);
    int i = hash(v, M);
    while (!null(i)) i = (i+1) % M;
    st[i] = item; N++;
  }
Item STsearch(Key v)
  { int i = hash(v, M);
    while (!null(i))
      if eq(v, key(st[i])) return st[i];
      else i = (i+1) % M;
    return NULLitem;
  }
---
Very sparse table: like separate chaining
As table fills up: CLUSTERING occurs
(infinite loop on full table)
----------
Linear probing performance
CLUSTERING bad phenomenon:
items clump together
long clusters tend to get longer
avg. search cost grows to M as table fills
Precise analysis very difficult.
THM (Knuth)
Insert cost: approx. (1 + 1/(1-N/M)^2)/2
Search cost (hit): approx. (1 + 1/(1-N/M))/2
Search cost (miss): same as insert
Too slow when table gets 70%-80% full
----------
Double Hashing
Avoid clustering by using 2nd hash
to compute increment for seq. search
Analysis extremely difficult
Roughly equivalent to ideal (random probe)
THM (Guibas-Szemeredi)
Insert cost: approx. 1/(1-N/M)
Search cost (hit): approx. ln(1/(1-N/M))/(N/M)
Search cost (miss): same as insert
Not too slow until table gets 90%-95% full
----------
Amortized analysis of algorithms
Measure running time for X operations by
(total cost of all X operations)/X
Ex: insert N elements in a heap:
(lg1 + lg2 + ... + lgN)/N = lgN + O(1)
insert N elements in a binomial queue:
(1*N/2 + 2*N/4 + 3*N/8 + ...)/N < 2
Hashing: grow table while keeping search cost O(1)
when number of keys in table doubles
rebuild to double the size of the table
Ex: separate chaining, avg search cost < 2
4M keys in table of size M
proof by induction: amortized cost < x for some constant x
cost to build: x*4M
cost to rebuild to new table size 2M: 4M
amortized cost of first 8M insertions:
(x*4M + 4M + 4M)/8M = x/2 + 1 < x (for any x > 2)
Same argument works for other basic ADTs!
Ex: stacks, queues in arrays, double hashing Another way to evaluate algorithms worst-case > average-case > amortized ---------- Separate chaining vs. double hashing Assume same amount space for keys, links pointers for long or variable-length keys Space for separate chaining w/ rehashing 4M keys (or links to keys) M table links 4M links in nodes Total space: 9M words for 4M items Avg search time: 2 Double hashing in same space 4M items table size 9M avg search time: 1/(1-4/9) = 1.8 10% faster Double hashing in same time 4M items avg search time 2 space needed: 8M words (1/(1-4/8) = 2) 11% less space Separate chaining advantages idiot-proof (doesn't break) no large chunks of memory (is that good?) could avoid links to nodes ---------- Other ST ADT operations Deletion Separate chaining: trivial Linear probing: rehash keys in cluster or use indirect method (see below) Double hashing: no easy direct method mark deleted nodes as "deadwood" rebuild periodically to clear deadwood Sort, find kth largest Separate chaining w/ sorted lists Linear probing/double hashing have to do full sort Join Separate chaining: easy Linear probing/double hashing: rehash whole table ---------- Reasons not to use hashing Hashing implements ST ADT search and insert in constant time. Why use anything else? 
* no performance guarantee * too much arithmetic on long keys * takes extra space * doesn't support all ADT ops efficiently * compare abstraction works for partial order (searching without keys) ---------- Other hashing variants Perfect hashing fixed set of keys hash function with no collisions good hack for small tables not practical for large tables totally static Coalesced hashing properly account for link space mix hash table, storage allocation Ordered hashing cut costs in half as with ordered lists Brent's variation guarantee constant search cost up to M insert cost COS 226 Lecture 10: Radix trees and tries Symbol Table, Dictionary records with keys INSERT SEARCH Balanced trees, randomized trees use O(lgN) comparisons Hashing uses O(1) probes but time proportional to key length Is key length required? (no) Are comparisons necessary? (no) Best possible: examine lgN BITS ---------- Digital search trees (DSTs) Easy way to balance tree Use bits of key to direct search Otherwise identical to BST -- #define bit(A, B) digit(A, B) Item searchR(link h, Key v, int w) { Key t = key(h->item); if (h == z) return NULLitem; if eq(v, t) return h->item; if (bit(v, w) == 0) return searchR(h->l, v, w+1); else return searchR(h->r, v, w+1); } Item STsearch(Key v) { return searchR(head, v, 0); } link insertR(link h, Item item, int w) { Key v = key(item), t = key(h->item); if (h == z) return NEW(item, z, z, 1); if (digit(v, w) == 0) h->l = insertR(h->l, item, w+1); else h->r = insertR(h->r, item, w+1); return h; } void STinsert(Item item) { head = insertR(head, item, 0); } --- ---------- Digital tree insertion Trees NOT ordered can't support SORT and SELECT ---------- Tries Branch according to bits in keys No keys in internal nodes Records/keys in external nodes Structure depends on keys, not insertion order Examine lgN BITS to distinguish key independent of key length! 
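The DST and trie code on these slides lean on a `bit`/`digit` abstraction that is left implicit. One plausible definition, a sketch assuming unsigned keys that fit in `bitsword` bits (the name `bitsword` and the 8-bit width are assumptions made here, not the lecture's):

```c
#include <assert.h>

/* bit(A, B): extract bit B of key A, counting from the most
   significant of bitsword bits, so w = 0 directs the branch
   at the root.  Assumes unsigned keys of at most bitsword bits. */
#define bitsword 8
#define bit(A, B) (((A) >> (bitsword - 1 - (B))) & 1)
```

With this definition, key 0x80 branches right at the root (bit 0 is 1) and left at every level below it, which is what makes the tree shape depend on key bits rather than on insertion order.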
Problems:
44% space waste from 1-way branching
multiple node types
----------
Trie search implementation
Branch according to bits, as in DST
Three outcomes:
null link
key in external node matches search key
key in external node differs from search key
--
#define external(A) (((A)->l == z) && ((A)->r == z))
Item searchR(link h, Key v, int w)
  { Key t = key(h->item);
    if (h == z) return NULLitem;
    if (external(h))
      return eq(v, t) ? h->item : NULLitem;
    if (digit(v, w) == 0)
      return searchR(h->l, v, w+1);
    else return searchR(h->r, v, w+1);
  }
Item STsearch(Key v)
  { return searchR(head, v, 0); }
---
----------
Trie insertion implementation
Two possible search miss outcomes
null link: replace with link to new node
ext. node: recursive split to distinguish new key
--
void STinit()
  { head = (z = NEW(NULLitem, 0, 0, 0)); }
link split(link p, link q, int w)
  { link t = NEW(NULLitem, z, z, 2);
    switch(bit(key(p->item), w)*2 + bit(key(q->item), w))
      {
        case 0: t->l = split(p, q, w+1); break;
        case 1: t->l = p; t->r = q; break;
        case 2: t->r = p; t->l = q; break;
        case 3: t->r = split(p, q, w+1); break;
      }
    return t;
  }
link insertR(link h, Item item, int w)
  { Key v = key(item);
    if (h == z) return NEW(item, z, z, 1);
    if (external(h))
      { return split(NEW(item, z, z, 1), h, w); }
    if (digit(v, w) == 0)
      h->l = insertR(h->l, item, w+1);
    else h->r = insertR(h->r, item, w+1);
    return h;
  }
void STinsert(Item item)
  { head = insertR(head, item, 0); }
---
----------
Patricia tries
Digression: cute nomenclature
Practical Algorithm To Retrieve Information
Coded In Alphanumeric
information reTRIEval (but pronounced "try")
Patricia:
* collapse one-way branches in tries
* "thread" tree to eliminate multiple node types
Quintessential search algorithm
----------
Patricia trie search and insertion
SEARCH
branch according to specified bit
INSERTION
search, then find leftmost bit that
distinguishes new key from key found
Easy (external) case
Harder (internal) case
Ingenious algorithm, worth learning (see book)
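The three trie search outcomes (null link, external match, external mismatch) can be checked on a tiny hand-built binary trie. This is an illustrative sketch, not the lecture's ADT: `node`, `ext`, `in`, and `trie_demo` are made-up names, and 2-bit keys stand in for real keys.

```c
#include <assert.h>
#include <stdlib.h>

/* A binary trie over 2-bit keys holding {0, 2, 3}: external nodes
   carry keys, internal nodes only branch.  Search returns 1 on an
   external-node match, 0 on a null link or an external mismatch. */
typedef struct node { int key; struct node *l, *r; } node;

static node *ext(int key)                 /* external node */
  { node *x = malloc(sizeof *x);
    x->key = key; x->l = NULL; x->r = NULL;
    return x; }

static node *in(node *l, node *r)         /* internal node */
  { node *x = malloc(sizeof *x);
    x->key = -1; x->l = l; x->r = r;
    return x; }

static int trie_search(node *h, int v, int w)
  { if (h == NULL) return 0;
    if (h->l == NULL && h->r == NULL) return h->key == v;
    return trie_search(((v >> (1 - w)) & 1) ? h->r : h->l, v, w + 1); }

int trie_demo(int v)                      /* search the fixed trie */
  { node *root = in(ext(0), in(ext(2), ext(3)));
    return trie_search(root, v, 0); }
```

Note how key 0 sits one level down (no other key starts with bit 0) while keys 2 and 3 need a second bit to be told apart: externals sit at the shortest distinguishing depth, independent of insertion order. The demo leaks its nodes, which is fine for a one-shot sketch.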
---------- R-way digital tree Generalize to R links per node R-way branching gives fast search -- struct STnode { Item item; link next[26]; }; link NEW(Item); Item searchR(link h, Key v, int w) { int i = digit(v, w); if (h == z) return NULLitem; if eq(v, key(h->item)) return h->item; return searchR(h->next[i], v, w+1); } Item STsearch(Key v) { return searchR(head, v, 0); } --- Simple, fast (but uses a lot of space) ---------- R-way digital tree example ---------- R-way digital tree analysis N keys N internal nodes R links per node Space: N*R Time: lgN/lgR comparisons Ex: R=26, N=20000 500,000 links tree height 3-4 Ex: R=16, N=1M 16M links tree height 5 Plus: one node type, easy implementation Minus: full comparison at each node can't support SORT, SELECT ---------- R-way trie Generalize trie to R-way branching Nodes contain characters R-way branching on next character -- typedef struct STnode* link; struct STnode { Item item; link next[R]; }; Item searchR(link h, Key v, int w) { int i = digit(v, w); if (h == z) return NULLitem; if (internal(h)) return searchR(h->next[i], v, w+1); if eq(v, key(h->item)) return h->item; return NULLitem; } Item STsearch(Key v) { return searchR(head, v, 0); } --- Return match on prefix, or make keys prefix-free end-of-key character suffix trie ---------- R-way trie example ---------- R-way trie existence table No data in trie Keys encoded in trie structure Easy to implement May use excessive space -- Item searchR(link h, Key v, int w) { int i = digit(v, w); if (h == z) return NULLitem; if (i == NULLdigit) return v; return searchR(h->next[i], v, w+1); } Item STsearch(Key v) { return searchR(head, v, 0); } --- ---------- R-way trie implementation Generalize binary trie implementation -- link split(link p, link q, int w) { link t = NEW(NULLitem); int pd = digit(p->item, w), qd = digit(q->item, w); if (pd == qd) { t->next[pd] = split(p, q, w+1); } else { t->next[pd] = p; t->next[qd] = q; } return t; } link insertR(link h, Item item, int w) 
{ Key v = key(item);
    int i = digit(v, w);
    if (h == z) return NEW(item);
    if (!internal(h))
      { return split(NEW(item), h, w); }
    h->next[i] = insertR(h->next[i], item, w+1);
    return h;
  }
void STinsert(Item item)
  { head = insertR(head, item, 0); }
---
----------
R-way trie analysis
Assume one-way links collapsed
link to remainder of key in external nodes
N keys, total of C characters in keys
approx. N internal nodes
R links per node
Space: N*R + C
Time: lgN/lgR CHARACTER comparisons
Ex: R=26, N=20000
520,000 links
tree height 3-4
Ex: R=16, N=1M
16M links
tree height 5
Faster than hashing
particularly for unsuccessful search
Problems
* eliminating 1-way branches
* multiple node types
* too much space for null links
----------
Correspondence with sorting algs
BSTs correspond to Quicksort
recursive structure
Roles of TIME and SPACE interchanged
TIME:
path in tree
size of subfile in sort
SPACE:
stack size in sort
branching factor in tree
Tries correspond to MSD radix sort
3-way tries correspond to 3-way Quicksort
----------
Three-way radix tries
Nodes contain characters
three-way branching on next character
left: key character less
middle: key character equal
right: key character greater
Equivalent to replacing trie node
with BST on character
Space savings: can use existence trie
Adds factor of 2lnM to search cost
constant no. of BYTES for unsucc.
search
Easy recursive sort
Most important advantage:
adapts well to strange keys
Faster than hashing
----------
Three-way radix search trie example
--
Item searchR(link h, Key v, int w)
  { int i = digit(v, w);
    if (h == z) return NULLitem;
    if (i == NULLdigit) return v;
    if (i < h->d) return searchR(h->l, v, w);
    if (i == h->d) return searchR(h->m, v, w+1);
    return searchR(h->r, v, w);
  }
Item STsearch(char *v)
  { return searchR(head, v, 0); }
---
----------
Three-way trie insertion implementation
--
typedef struct STnode* link;
struct STnode { Item item; int d; link l, m, r; };
void STinit()
  { head = z; }
link NEW(int d)
  { link x = malloc(sizeof *x);
    x->d = d; x->l = z; x->m = z; x->r = z;
    return x;
  }
link insertR(link h, Item item, int w)
  { Key v = key(item);
    int i = digit(v, w);
    if (h == z) h = NEW(i);
    if (i == NULLdigit) return h;
    if (i < h->d) h->l = insertR(h->l, item, w);
    if (i == h->d) h->m = insertR(h->m, item, w+1);
    if (i > h->d) h->r = insertR(h->r, item, w);
    return h;
  }
void STinsert(Item item)
  { head = insertR(head, item, 0); }
---
----------
Empirical studies
random 32-bit keys
--
         BUILD                 SEARCH
.        bst  dst trie  pat    bst  dst trie  pat
.   5000   4    5    7    7      3    2    3    2
.  12500  18   15   20   18      8    7    9    7
.  25000  40   36   44   41     20   17   20   17
.  50000  81   80   99   90     43   41   47   36
. 100000 176  167  269  242    103   85  101   92
. 200000 411  360  544  448    228  179  211  182
---
words in Moby Dick
--
         BUILD                 SEARCH
.        bst hash  rst fast    bst hash  rst fast
.   5000   5    4    3    3      3    2    2    1
.  12500  11    8    9    7      9    5    5    3
.  25000  23   15   17   13     19   12   10    7
.  50000  50   29   31   25     43   25   21   15
---
library call numbers
--
         BUILD                 SEARCH
.        bst hash  rst fast    bst hash  rst fast
.   5000  19   16   21   20     10    8    6    4
.  12500  48   48   54   97     29   27   15   14
.  25000 118   99  188  156     67   59   36   30
.
50000 230  191  333  255    137  113   70   65
---
COS 226 Lecture 11: External sort/search
Huge file:
1967: 1 million bytes
1997: 1 trillion bytes
Use high-level abstraction
local memory
random access
faster sequential access
external memory
far slower than local memory
read/write in large blocks
faster sequential access
restrictions on access
Huge file: thousands of times
the size of local memory
SORTING
N: number of records
M: size of memory
2P: number of external devices
SEARCHING
N: number of records
M: page size
-----
Algorithms
SORTING
balanced merge
polyphase merge
SEARCHING
indexed sequential
B-trees
skip lists
extendible hashing
Many ancient algorithms still relevant
-----
Balanced multiway merge
Make multiple passes over the file
* can MERGE huge sorted blocks in memory
* use half the devices for "output"
pass 0: sort the file into
sorted blocks of size M
on devices 0, 1, 2, ... P-1
pass 1: make blocks of size M*P
P-way merge out to devices P, P+1, P+2, ... 2P-1
pass 2: make blocks of size M*P^2
P-way merge out to devices 0, 1, 2, ... P-1
...
pass t: make blocks of size M*P^t
File is sorted when M*P^t > N
t = log_P (N/M) passes through the data
--
.           PC sort        workstation
. file      1 billion      1 trillion
. memory    1 million      1 billion
. devices   2              10
. passes    lg 1000 = 10   log_10 1000 = 3
---
Costs
no. of passes (no. times each datum touched)
no. of devices
Bottom line: can sort a file in 3-10 times
the amount of time it takes to read or write it
-----
Balanced merge example
input
--
T H E Q U I C K B R O W N F X J M P S V L A Z Y D G
---
sorted runs of size 3
--
E H T  I Q U  B C K  O R W  F N X  J M P  L S V  A Y Z  D G *
---
sorted runs of size 9
--
B C E H I K Q T U  F J M N O P R W X  A D G L S V Y Z *
---
output
--
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z *
---
sorted runs (lengths) on devices
--
.   0     1     2     3     4     5
. --------------------------------
. 3(3)  3(3)  3(3)  0     0     0
. 0     0     0     1(9)  1(9)  1(9)   27
.
1(27)  0     0     0     0     0      27
---
-----
Polyphase Merge
Concept: cut the number of devices needed in half
MERGE-UNTIL-EMPTY
one device empty
different number of runs on other devices
merge to empty device
until another device is empty
iterate
Strategically place runs on devices to make it
so that last merge makes all devices empty at once
Work backwards to figure placement
Ex: 3 devices
--
. 1 0 0 0
. 0 1 1 1
. 1 0 2 2
. 3 2 0 4
. 7 6 4 0
---
distribution for 17 runs
General pattern is complicated!
-----
Polyphase merge example
input
--
T H E Q U I C K B R O W N F X J M P S V L A Z Y D G
---
sorted runs of size 3
--
0 0 0 1 1 3 3 3 3
E H T  I Q U  B C K  O R W  F N X  J M P  L S V  A Y Z  D G *
---
result of phase 1
--
0 3 3 2 2
B C K  A Y Z  D G *  E H J M O P R T W  F I L N S Q U V X
---
result of phase 2
--
3 2 1
D G *  F I L N S Q U V X  A B C E H J K M O P R T W Y Z
---
output
--
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z *
---
sorted runs (lengths) on devices
--
.   0      1      2      3
. ----------------------------
. 3(3)   2(3)   0      4(3)
. 1(3)   0      2(9)   2(3)    18
. 0      1(15)  1(9)   1(3)    15
. 1(27)  0      0      0       27
---
-----
Polyphase merge analysis
Ex. 34 initial sorted runs, 3 devices
--
.  0 21 13
. 13  8  0
.  5  0  8
.  0  5  3
.  3  2  0
.  1  0  2
.  0  1  1
.  1  0  0
---
Fibonacci numbers
phi = 1.618... (golden ratio)
phi^N runs in N phases,
each phase touching 1/phi of the data
THM: Total number of passes is
log_phi(N/M)/phi
--
.           PC sort    workstation
. file      1 billion  1 trillion
. memory    1 million  1 billion
. devices   3          3
.
passes    8          8
---
Using half as many devices as balanced merge,
still completes sort in fewer passes
Precise analysis extremely complex
-----
Virtual memory
Another idea: use quicksort
Paging system "automatically" takes care
of reading data in blocks
Requires ability to address huge files
locality of reference in program
Partitioning on such a system:
read a block from the left
read a block from the right
write small elements on left
write large elements on right
Paging system takes care of details
Estimate of costs
two devices working at a time
2 ln (N/M) passes
Can customize to partition-based sort
distribute files a la MSD sort
utilize multiple devices
-----
Basic external searching
N records stored in pages of size M
Goal: find item with given key
read as few pages as possible
INDEXED SEQUENTIAL ACCESS
Store file sequentially on devices
build index for each device
build master index for devices
Like encyclopedia
Problem: index is hard to maintain
under insertion/deletion
real goal: *dynamic* index
-----
B-trees
Generalize 2-3-4 trees: up to M links per node
Split full nodes on the way down
Used for external search
node size = page size
typical: M=1000
search and insert 1 trillion records
in four or five page accesses
Key point: flexibility to do fast insert
Red-black abstraction still works
BUT might not use internal links
(use binary search instead)
Space--time tradeoff
M large: only a few levels in tree
M small: less wasted space
Normally build as separate index
Main issue is to minimize wasted space
Bottom line
roughly log_M N page accesses
3 or 4 in practice
-----
B tree example
-----
B tree example (continued)
-----
B tree growth
-----
Skip Lists
Generalize skip lists
nodes hold M records
use sequential search within nodes
INSERT algorithm
node not full: add record to node
node full: split into two half-full nodes
give new node t links with probability 1/M^t
Bottom line
roughly log_M N page accesses
3 or 4 in practice
randomized algorithm OK?
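The "roughly log_M N page accesses" bottom line can be sanity-checked with integer arithmetic. A rough sketch (`btree_levels` is an illustrative name; this ignores node splits, partial pages, and keeping the root in memory):

```c
#include <assert.h>

/* Smallest t with M^t >= N: the rough number of levels, hence page
   accesses, for N records in pages holding M links each. */
int btree_levels(unsigned long long N, unsigned long long M)
  { int t = 0;
    unsigned long long cap = 1;
    while (cap < N) { cap *= M; t++; }
    return t; }
```

For M = 1000 this gives 4 levels for a trillion records, in line with the slide's four-or-five page accesses once real-world slack is added.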
-----
Key-indexed directory
Use leading t bits of keys
to find page with key
capacity: M*2^t
To increase capacity, double directory
Problems when >M keys have same leading bits
Ex: M+1 equal keys
-----
Extendible hashing
Generalize trie, hash search
Hash keys to make them random bits
Build two-level trie on hashed keys
directory size 2^t
page size M
SEARCH
i = first t bits of hashed key
directory entry i gives page name
search that page for the key
INSERT
same as search
put key on page accessed
if page full
split into two pages
**may need to split directory**
Bottom line
2 page accesses per search
44 percent extra space
randomized algorithm OK?
-----
Extendible hashing example
-----
Extendible hashing example (continued)
COS 226 Lecture 12: String searching
TEXT: N characters
PATTERN: M characters
(C occurrences of pattern in text)
Existence
Any occurrence of pattern in text?
Enumerate
How many occurrences of pattern in text?
Search
Find an occurrence of pattern in text
Search all
Find all occurrences of pattern in text
Three parameters N, M, C
start with N >> M >> C
Ex: N = 100,000, M = 100, C = 5
Other factors
multiple patterns (preprocessing)
binary vs.
ascii text
avoid backup in text
----------
String searching examples
Text string: find gjkmxorzeoa in
--
kvjlixapejrbxeenpphkhthbkwyrwamnugzhppfx
iyjyanhapfwbghxmshwrlyujfjhrsovkvveylnbx
nawavgizyvmfohigeabgksfnbkmffxjffqbualey
tqrphyrbjqdjqavctgxjifqgfgydhoiwhrvwqbxg
rixydzbpajnhopvlamhhfavoctdfytvvggikngkw
zixgjtlxkozjlefilbrboignbzsudssvqymnapbp
qvlubdoyxkkwhcoudvtkmikansgsutdjythzlapa
wlvliygjkmxorzeoafeoffbfxuhkzukeftnrfmoc
ylculksedgrdivayjpgkrtedehwhrvvbbltdkctq
---
Binary: find 101001111010111011 in
--
1000100101011010001001101011010110110110
0111111110101001110101000000010111101001
1011100000011000110000100110000111101101
1100000011011110101111101111000111100001
1011111001110011001111011001000000011101
1111100001011001110011010010110101001001
0001111111111011100111101100101010011110
1111001110110010010010011000100010011100
0010000011100010101110101001101100100011
---
Long text and pattern
* fixed pattern, random text: return "not found"
* fixed text, random pattern: return "not found"
----------
Brute-force string searching
Check for pattern at every text position
--
int brutesearch(char *p, char *a)
  { int i, j;
    for (i = 0; i < strlen(a); i++)
      for (j = 0; j < strlen(p); j++)
        {
          if (a[i+j] != p[j]) break;
          if (j == strlen(p)-1) return i;
        }
    return strlen(a)+1;
  }
---
DON'T USE THIS PROGRAM! (Why?)
In C, strlen takes time proportional to string length
running time at least N^2
even for simpler programs (count the a's)
Performance matters in ADT design!
Exercise: implement string ADT with fast strlen
need space to store length
need to update length when changing string ...
----------
Brute-force algorithm (continued)
Easy alternative (if bug is discovered!)
-- int brutesearch(char *p, char *a) { int i, j; int M = strlen(p), N = strlen(a), cnt = 0; for (i = 0; i < N; i++) for (j = 0; j < M; j++) { if (a[i+j] != p[j]) break; if (j == M-1) return i; } return N+1; } --- Different implementation (same algorithm) char match: increment both i and j char mismatch: set j to 0, reset i -- int brutesearch(char *p, char *a) { int i, j, M = strlen(p), N = strlen(a); for (i = 0, j = 0; j < M && i < N; i++, j++) while (a[i] != p[j]) { i -= j-1; j = 0; } if (j == M) return i-M; else return i; } --- ---------- Analysis of brute-force algorithm Running time depends on text and pattern Worst case: search for 000001 -- . 000000000000000000000000000001 . 00000* . 00000* . 00000* . 00000* . 00000* --- M*N bit compares Average case (fixed pattern, random text) -- . 10011010010100010100111000111 . * . 00* . 0* . * . * . 0* . * . 001 --- long pattern: 2*N compares short pattern: * precise cost depends on pattern * (first 001 appears before first 000, on average) Ref: Flajolet and Sedgewick ---------- Rabin-Karp algorithm Idea: Use hashing * compute hash function for each text position * no table! (just compare with pattern hash) Ex: "table" size 97, M = 5 Search for 15926 = 18 (mod 97) -- . 31415926535897932384626433 . 31415 = 84 (mod 97) . 14159 = 94 (mod 97) . 41592 = 76 (mod 97) . 15926 = 18 (mod 97) . 59265 = 95 (mod 97) --- Problem: hash function depends on M characters Solution: compute hash for i+1 from hash for i (improves running time from N*M to N+M) -- 31415 = 84 (mod 97) 14159 = (31415 - 30000)*10 + 9 = (84 - 3*9)*10 + 9 (mod 97) = 579 = 94 (mod 97) 41592 = (94 - 1*9)*10 + 2 = 76 (mod 97) 15926 = (76 - 4*9)*10 + 6 = 18 (mod 97) 59265 = (18 - 1*9)*10 + 5 = 95 (mod 97) --- Problem: need full compare on collision -- 92653 = (95 - 5*9)*10 + 3 = 18 (mod 97) --- Solution: use giant (virtual) table! 
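The incremental hash update can be replayed exactly on the digits above. A sketch with made-up names (`rk_first_hit`), decimal digits, and the toy modulus 97; a real implementation would use a large random prime and confirm each hash match with a full character compare:

```c
#include <assert.h>

/* Rolling hash over decimal digits, table size 97, M = 5: compute the
   hash for position i+1 from the hash for position i.  Returns the
   first window position whose hash equals target, or -1 if none. */
int rk_first_hit(const char *a, int target)
  { const int q = 97, d = 10, M = 5;
    int i, h = 0, dM = 1;
    for (i = 1; i < M; i++) dM = (dM * d) % q;        /* d^(M-1) mod q */
    for (i = 0; i < M; i++) h = (h * d + (a[i]-'0')) % q;
    for (i = M; ; i++)
      { if (h == target) return i - M;                /* hash "match"  */
        if (a[i] == '\0') return -1;
        h = (h + d*q - (a[i-M]-'0')*dM) % q;          /* drop leading  */
        h = (h * d + (a[i]-'0')) % q;                 /* add trailing  */
      }
  }
```

Running it on the slide's digits reproduces the trace: hash 84 first appears at position 0 ("31415"), 18 at position 3 ("15926"), 95 at position 4 ("59265").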
---------- RK algorithm implementation Take table size as big as possible (not so big that overflow happens) -- #define q 3355439 #define d 256 int rksearch(char *p, char *a) { int i, j, dM = 1, h1 = 0, h2 = 0; int M = strlen(p), N = strlen(a); for (j = 1; j < M; j++) dM = (d*dM) % q; for (j = 0; j < M; j++) { h1 = (h1*d + p[j]) % q; h2 = (h2*d + a[j]) % q; } for (i = M; i < N; i++) { if (h1 == h2) return i-M; h2 = (h2 + d*q - a[i-M]*dM) % q; h2 = (h2*d + a[i]) % q; } return N; } --- Randomized algorithm: take random (big) table size ---------- Knuth-Morris-Pratt algorithm Text characters that match are in the pattern! Can precompute what to do on mismatch Ex: searching for 000001 mismatch 00000* implies * text had 000000 * if next text char is 1, full match found * if next text char is 0, back in same state mismatch 000* implies * text had 0001 * restart search on next text char Finite-state automaton for 000001 -- . 0 1 2 3 4 5 . 0 1 2 3 4 5 5 . 1 0 0 0 0 0 6 --- worst-case text -- . 000000000000000000000000000001 . 012345555555555555555555555556 --- random text -- . 100111010010100010100111000111 . 001200010120101230101200012300 --- ---------- KMP algorithm (continued) Finite-state automaton for 10100110 -- . 0 1 2 3 4 5 6 7 . 0 0 2 0 4 5 0 2 8 . 1 1 1 3 1 3 6 7 1 --- Search example: -- . 100111010010101010100110000111 . 0120111234562343434345678 --- KMP algorithm: build FSA (or C program) from pattern run FSA on text N+M running time no backup in text ---------- KMP automaton construction FSA builds itself (!) 
Successors to state i
* match: i+1
* mismatch: FSA state corresponding to
next putative text match
* take mismatch into account
Ex: FSA for 10100110
State 6
1010011: go to state 7
1010010: go to state for 010010 (state 2)
remember state for 010011 (X = state 1)
State 7
10100110: go to state 8
10100111: go to state for 0100111 (state 1)
remember state for 0100110 (X = state 2)
main point: these are successors of old X
Algorithm:
X: mismatch state
match: FSA[match][i] = i+1
mismatch: FSA[mismatch][i] = FSA[mismatch][X]
X = FSA[match][X]
----------
KMP automaton construction example
Pattern: 10100110
--
.   0 1
. 0 0 2   1
. 1 1 1   01
---
--
.   0 1 2
. 0 0 2 0   00
. 1 1 1 3   000
---
--
.   0 1 2 3
. 0 0 2 0 4   011
. 1 1 1 3 1   0011
---
--
.   0 1 2 3 4
. 0 0 2 0 4 5   0101
. 1 1 1 3 0 3   00123
---
--
.   0 1 2 3 4 5
. 0 0 2 0 4 5 0   01000
. 1 1 1 3 0 0 6   001200
---
--
.   0 1 2 3 4 5 6
. 0 0 2 0 4 5 0 2   010010
. 1 1 1 3 0 0 6 7   0012012
---
--
.   0 1 2 3 4 5 6 7
. 0 0 2 0 4 5 0 2 8   0100111
. 1 1 1 3 0 0 6 7 1   00120111
---
----------
KMP implementation
Build FSA from pattern, run FSA on text string
Improvement:
match state in FSA unnecessary (always i+1)
use mismatch state only ("next" array)
small hack: next[0] = -1
--
int kmpsearch(char *p, char *a)
  { int i, j, M = strlen(p), N = strlen(a);
    initnext(p);
    for (i = 0, j = 0; j < M && i < N; i++, j++)
      while ((j >= 0) && (a[i] != p[j])) j = next[j];
    if (j == M) return i-M; else return i;
  }
---
Ultimate search program for pattern:
machine language version of FSA
--
int kmpsearch(char *a)
  { int i = 0;
    s0: if (a[i] != '1') goto s0; i++;
    s1: if (a[i] != '0') goto s1; i++;
    s2: if (a[i] != '1') goto s0; i++;
    s3: if (a[i] != '0') goto s1; i++;
    s4: if (a[i] != '0') goto s3; i++;
    s5: if (a[i] != '1') goto s0; i++;
    s6: if (a[i] != '1') goto s2; i++;
    s7: if (a[i] != '0') goto s1; i++;
    return i-8;
  }
---
----------
Right-left pattern scan
Can get sublinear algorithms by moving
right to left in pattern
(left to right in text)
Ex: Find string of 9
consecutive 0s
--
. 100111010010100010100111000111
. 0
. 1
. 1
. 0
. 0
. 1
---
Same idea effective in large alphabet
for every character in alphabet
compute skip when mismatch on that char
(skip at least to next text char!)
Search time proportional to N/M
for practical problems
Time decreases as pattern length increases (!)
Yet another parameter: alphabet size
binary: 2 (group bits for this alg)
text string: 256 (probably less)
----------
Right-left pattern scan examples
Text character not in pattern
skip forward M chars
--
. now is the time for all good people to come
. * * * * e
. l
. p
. o
. e
. p
---
Text character in pattern
skip forward to putative pattern end
--
. you can fool some of the people some of the time
. * * * o
. e
. l
. p
. o
. e
. p
---
----------
Right-left pattern scan implementation
--
void initskip(char *p)
  { int j, M = strlen(p);
    for (j = 0; j < 256; j++) skip[j] = M;
    for (j = 0; j < M; j++) skip[p[j]] = M-j-1;
  }
#define max(A, B) (((A) > (B)) ? (A) : (B))
int mischarsearch(char *p, char *a)
  { int i, j;
    int M = strlen(p), N = strlen(a);
    initskip(p);
    for (i = M-1, j = M-1; j >= 0; i--, j--)
      while (a[i] != p[j])
        {
          i += max(M-j, skip[a[i]]);
          if (i >= N) return N;
          j = M-1;
        }
    return i+1;
  }
---
Boyer-Moore algorithm
mismatched character heuristic
KMP-like computation for skip (pick the largest)
----------
Multiple patterns and text
Related to symbol table ADT
DICTIONARY
build trie from text (preprocess)
pattern lookups in O(lgN) steps
EXCEPTION DICTIONARY
build trie from set of patterns (preprocess)
find patterns in given text in N steps
generalization of KMP
Ex: FSA for 000, 011, and 1010
--
.   0 1 2 3 4 5 6
. 0 1 3 5 7 5 3 9
. 1 2 4 2 4 8 6 8
---
Simultaneous search for all patterns in text
--
. 111100100100101110100000
.
02222534534534568
---
COS 226 Lecture 13: Pattern matching
Generalize string-searching problem to include
incompletely specified patterns
TEXT: N characters
PATTERN: M-character REGULAR EXPRESSION
(C occurrences of pattern in text)
Ex: Is some substring of
cdbcaaaaabcddbbc
in the language described by the RE
(a*b + ac)d ?
As with string matching, extend to
count matches, return all matches
egrep
Practical application of core principles
of the theory of computation
Based on simulating NONDETERMINISTIC
finite-state automata
Essential concept in computer science
build intermediate abstractions
pick the right ones!
-----
"man grep" excerpts
--
Name
grep - search file for regular expression
Syntax
grep [option...] expression [file...]
Description
Commands of the grep family search the
input files (standard input default) for
lines matching a pattern.  Normally, each
line found is copied to the standard output.
Take care when using the characters
$ * [ ^ | ( ) and \ in the expression
because they are also meaningful to the
Shell.  It is safest to enclose the entire
expression argument in single quotes ' '.
---
Basic elements of expressions (grep)
c    any non-special char matches itself
r*   zero or more occurrences of r
"Extended" regular expressions (egrep)
(r)
r1 | r2
-----
Regular expression
Theoretician:
language accepted by FSA
Programmer:
compact description of multiple patterns
Concatenate
--
abcda        abcda
---
OR
--
a+b          a b
c(a+b)d      cad cbd
(ac+b)d      acd bd
(a+c)(b+d)   ab ad cb cd
---
Closure
--
a*           . a aa aaa aaaa aaaaa
ca*b         cb cab caab caaab caaaab
(a+b)*       . a b aa ab ba bb aaa aab aba ...
c(a+b)*d     cd cad cbd caad cabd cbad cbbd ...
---
-----
String searching FSAs
Recall:
KMP algorithm: FSA for one pattern
AC algorithm: FSA for multiple patterns
purpose of FSA is to avoid backing up in text
RE searching is much harder
check each text position for RE match
N(N-1)/2 possible matches
Ex: Is some prefix of
--
. cdbcaaaaabcddbbc
. dbcaaaaabcddbbc
. bcaaaaabcddbbc
.
caaaaabcddbbc
. ...
---
in the language described by the RE
--
(a*b + ac)d ?
---
THM: There exists an FSA for every RE
Ex: FSA for acb (string match)
--
.   0 1 2 3
. a 1 1 1 3
. b 0 0 3 3
. c 0 2 0 3
---
-----
C program to simulate FSAs
Ex: FSA for c(a+b)d
--
.   0 1 2 3 4
. a 0 3 0 0 4
. b 0 2 0 0 4
. c 1 1 1 1 4
. d 0 0 4 4 4
---
--
struct state { char c; int next[256]; };
struct state fsa[1000];
int t = start;
while ((c = getchar()) != EOF)
  t = fsa[t].next[c];
if (t == final)
  printf("Accepted\n");
else printf("Rejected\n");
---
Different type of FSA can avoid large next table
character in state
one state on match
another state on mismatch
Possible approach to implementing grep:
build FSA from RE
use C program to simulate FSA
Problems:
FSA can be exponentially large
how to build FSA from RE?
-----
Nondeterministic FSAs
Appropriate intermediate abstraction for egrep
small in size
can build directly from RE
FSA that can guess the right answer
More than one successor state
Simplest version
character in state
next state on match
NULL states
two possible successor states
Ex: NDFSA for ca*b
--
.       0 1 2 3 4
. ch      c a b
. next1 1 2 1 4 4
. next2 1 3 1 4 4
---
Ex: NDFSA for c(a+b)d
--
.       0 1 2 3 4 5
. ch      c a b d
. next1 1 2 4 4 5 5
. next2 1 3 4 4 5 5
---
-----
NDFSAs and FSAs
Nondeterminism does not extend
descriptive power of FSAs
THM: Given a nondeterministic FSA, there is
a deterministic FSA that recognizes
the same language.
Possible approach to implement grep: build NDFSA from RE transform to FSA use C program to simulate FSA Standard construction for NDFSA to FSA make one FSA state for each possible subset of NDFSA states Problem: FSA may be exponential in size Easier approach: use C program to simulate NDFSA ----- Building a NDFSA from a regular expression Each RE construct corresponds to the construction of a piece of the NDFSA Single character: Concatenation: Closure (*): OR: ----- NDFSA construction example A A* A*B A*B+AC (A*B+AC)D ----- NDFSA representation NDFSA for (A*B+AC)D -- . 0 1 2 3 4 5 6 7 8 9 . ch a b a c d . next1 5 2 3 4 8 6 7 8 9 0 . next2 5 2 1 4 8 2 7 8 9 0 --- ----- Recursive descent RE parser Goal: program to build NDFSA from RE Solve easier problem first: is RE legal? Use context-free grammar to describe REs -- <expression> := <term> | <term> + <expression> <term> := <factor> | <factor><term> <factor> := ( <expression> ) | c | ( <expression> )* | c* --- PARSER: determine structure of legal strings COMPILER: add semantic information to translate Top-down recursive descent parser: recursive programs directly derived from the grammar Expression -- <expression> := <term> | <term> + <expression> --- -- expression() { term(); if (p[j] == '+') { j++; expression(); } } --- ----- Recursive descent RE parser (continued) Term -- <term> := <factor> | <factor><term> --- -- term() { factor(); if ((p[j] == '[') || letter(p[j])) term(); } --- Factor: -- <factor> := ( <expression> ) | c | ( <expression> )* | c* --- -- factor() { if (p[j] == '[') { j++; expression(); if (p[j] == ']') j++; else error(); } else if (letter(p[j])) j++; else error(); if (p[j] == '*') j++; } --- Doesn't always work: -- <expression> := v | <expression> + <term> --- -- badexpression() { if (letter(p[j])) j++; else { badexpression(); if (p[j] == '+') { j++; term(); } else error(); } } --- leads to infinite recursive loop! 
----- Parsing example RE: (A*B+AC)D -- expression() term() factor() [ expression() term() factor() A * term() factor() B + expression() term() factor() A term() factor() C ] term() factor() D --- ----- Recursive compiler to build FSA from RE Augment parser to put out state table for FSA state: next state to be filled in setstate: procedure to fill in state values Recursive routines return index of initial state -- int expression() { int t1 = term(), t2, r = t1; if (p[j] == '+') { j++; state++; t2 = state; r = t2; state++; setstate(t2, ' ', expression(), t1); setstate(t2-1, ' ', state, state); } return r; } --- -- next1[0] = expression(); setstate(0, ' ', next1[0], next1[0]); setstate(state, ' ', 0, 0); --- NDFSA for (a*b+ac) -- . 0 1 2 3 4 5 6 7 8 . ch a b a c . next1 5 2 3 4 8 6 7 8 0 . next2 5 2 1 4 8 2 7 8 0 --- ----- Simulating NDFSAs Maintain a DEQUE (doubly ended queue) with * possible states for CURRENT char * a scan marker, * * possible states for NEXT char -- 3 6 1 * 4 7 2 --- Take next state off beginning (stack, queue) * null: put both states at beginning (stack) * match: put new state at end (queue) -- int match(char *a) { int j = 0, N = strlen(a), state = next1[0]; dequeinit(); put(scan); while (state) { if (state == scan) { j++; put(scan); } else if (ch[state] == a[j]) put(next1[state]); else if (ch[state] == ' ') { push(next1[state]); push(next2[state]); } if (dequeempty() || j==N) return 0; state = pop(); } return j; } --- Don't bother checking next1==next2. Stay tuned. ----- NDFSA simulation example Simulation of NDFSA for pattern (a*b+ac)d running on text string aabd -- . 5 a . 2 6 a . 1 3 6 a . 1 3 6 a . 3 6 a 2 . 6 a 2 . a 2 7 . 2 7 a . 1 3 7 a . 3 7 a 2 . 7 a 2 . a 2 . 2 b . 1 3 b . 3 b . b 4 . 4 d . 8 d . d 9 . 9 * . 0 * --- ----- Problem: exponential growth NDFSA for (a*a)*b -- . 0 1 2 3 4 5 6 . ch a a b . next1 4 2 3 4 2 6 0 . next2 4 2 1 4 5 6 0 --- Simulation of machine running on aaaaaaab -- . 4 a . 5 2 a . 2 a . 1 3 a . 3 a 2 . a 2 4 . 
2 4 a . 1 3 4 a . 3 4 a 2 . 4 a 2 4 . ... a 2 4 2 4 2 4 2 4 a ... . a 2 4 2 4 2 4 2 4 --- Deque can get exponentially large Easy fix: no duplicate states on deque keep "existence table", indexed by state also fixes next1=next2 code hack ----- Worst-case analysis of grep match N: length of text M: length of RE Number of states in NDFSA: single character: 2 concatenation AB: |A| + |B| - 1 closure A*: |A| + 1 OR A+B: |A| + |B| + 1 Total: less than or equal to M+1 Time for match: not more than M+1 states for each char run alg on each text position total not more than (M+1)(N + N-1 + N-2 + ...) = O(MN^2) (linear on real problems) Quintessential tool, classic algorithm ----- Context string searching brute force implementation O(MN) KMP FSA saves a factor of M grep implementation abstract machine: NDFSA pattern (word in CFL) --> NDFSA simulate NDFSA compiler abstract machine: computer program (word in CFL) --> machine code run program (hardware simulation of abstract machine) grep is simple compiler converts CFL program to machine code parser: check if pattern is in CFL grep: check if text is in RE Same function at different levels of hierarchy Faster algorithm than grep? Yes, using tries (ref. Gonnet and Baeza-Yates) COS 226 Lecture 14: File compression Compression: reduce the size of a file to save TIME when transmitting the file to save SPACE when storing it away Widely used despite technological advances (Why?) Basic concepts ancient (1952) Best technology recently developed Simple methods Huffman codes Information theory Arithmetic codes Adaptive (LZ) codes ----- Simple methods Step 1: avoid doing something dumb Step 2: try to do something clever Character count encoding Ex: ASCII binary file Ex: ACGT molecular biology sequences Ex: five-bit code for text Run length encoding Ex: blanks in text files Ex: black-and-white graphics How to represent counts? 
escape characters separate alphabet (may waste bits) different language entirely General representation problem annoying practical problems usually work with either * bitstream/bytestream * separate language entirely Special hacks Ex: encode /usr/dict/words ----- Run-length encoding Remove blanks in files ex: tabular data without tabs ex: (ancient) card images Record lengths of 0s and 1s in binary files ex: black-and-white graphics -- .000000000000011111111111111000000000 13 14 9 .000000000001111111111111111110000000 11 18 7 .000000001111111111111111111111110000 8 24 4 .000000011111111111111111111111111000 7 26 3 .000001111111111111111111111111111110 5 30 1 .000011111110000000000000000001111111 4 7 18 7 .000011111000000000000000000000011111 4 5 22 5 .000011100000000000000000000000000111 4 3 26 3 .000011100000000000000000000000000111 4 3 26 3 .000011100000000000000000000000000111 4 3 26 3 .000011100000000000000000000000000111 4 3 26 3 .000001111000000000000000000000001110 5 4 23 3 1 .000000011100000000000000000000111000 7 3 20 3 3 .011111111111111111111111111111111111 1 35 .011111111111111111111111111111111111 1 35 .011111111111111111111111111111111111 1 35 .011111111111111111111111111111111111 1 35 .011111111111111111111111111111111111 1 35 .011000000000000000000000000000000011 1 2 31 2 --- Compression factor increases with resolution No help for random files! 
----- Huffman code Variable length code, defined by trie Ex: abracadabra 88 bits in byte code 55 bits in 5-bit code 33 bits in 3-bit code (only 5 different letters) Huffman code: -- a 1 5 5 b 001 2 6 c 0000 1 4 d 0001 1 4 r 01 2 4 23 bits 1*001*01*0*0000*0*0001*0*001*01*0 10010110000100011001011 --- not necessarily unique -- a 0 5 5 b 110 2 6 c 100 1 3 d 101 1 3 r 111 2 6 23 bits 01101110100010101101110 --- Depends on encoding frequently used chars with fewer bits than infrequently used chars ----- Huffman code implementation ENCODE: count frequency of occurrence of characters build trie by combining two smallest frequencies trie defines code pass through source, output code DECODE: use bits to guide trie search output char when reached, restart at top Have to transmit trie! suffices to transmit code relatively insignificant for big messages Issues: trie representation data structure to hold nodes being processed ----- Huffman code construction example ----- Huffman code construction example (cont.) ----- Huffman code implementation (continued) Trie representation parent links right-left bit Data structure for nodes being processed "indirect" priority queue To build code: put all chars with nonzero counts in PQ while PQ has two or more nodes take two smallest off PQ make new node with sum of counts put on PQ Another algorithm to build code: sort initial frequencies merge with generated frequencies no PQ needed THM (Huffman, 1952): Huffman code is optimal (no variable-length code uses fewer bits) Problems: have to transmit code expensive computationally two-pass not optimal (!) 
----- a simple string to be encoded using a minimal number of bits -- c f p d g l r u a b o t m s e n i * 1 1 1 2 2 2 2 2 3 3 3 3 4 4 5 5 6 11 1 1 1 2 2 2 2 2 3 3 3 3 4 4 5 5 6 11 1 2 2 2 2 2 3 3 3 3 4 4 5 5 6 11 * 2 2 2 2 2 3 3 3 3 4 4 5 5 6 11 * 2 3 2 2 3 3 3 3 4 4 5 5 6 11 * 2 3 4 3 3 3 3 4 4 5 5 6 11 * 2 3 4 4 3 3 3 4 4 5 5 6 11 * 3 4 4 5 3 4 4 5 5 6 11 * 3 4 4 5 6 4 4 5 5 6 11 * 4 4 5 6 6 5 5 6 11 * 4 4 5 6 6 8 5 5 6 11 * 5 6 6 8 8 6 11 * 5 6 6 8 8 10 11 * 6 6 8 8 10 11 11 * 8 8 10 11 12 11 * 10 11 12 16 * 11 12 16 21 16 21 23 37 23 60 --- ----- Huffman trie representation Links point up the tree one index per node to parent one bit per node (left or right child) -- .28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 .27 27 28 28 29 29 30 31 31 32 32 34 35 36 38 39 . * * * * * * * * --- To get code for given char * move up the trie * collect left-right bits * reverse string when top reached ----- Entropy Basic concept of information theory formal model of source formal concept of "information content" First order model: chars generated randomly with fixed probability (independently) THM: Huffman is optimal in first order model Entropy concept confirms intuition * random data cannot be compressed * data compressed by one optimal compressor won't be compress further by another * no guarantee of specific performance on all data BUT * "average" case results * depends on mathematical model * real data may not fit any model Example: short program generates large data file compression alg has to discover the program! 
Statistical modeling find short program (or model parameters) ex: speech recognition ----- A difficult file to compress One million random characters -- ylculksedgrdivayjpgkredehwhrvvbbltdkctqtoctnnhylmsgvwachzjvwhzabzrhsyqxnzgkpufx dytmoajjsorphchnxlygnezneofgidqfyqfmvldxreuseufgryyvuzsrdreilknwvordwsfblvhinaw ezslumidtljqydtbprazipawngrubtqyjfiowyaktrwuopoohtkuaenzxcuivbjekdimqdqtxxcwkau ndcobuekavqepoxmidcmdrahhuzdjmjxkeoyimflsoyxknqkqsgwgjofhakqrhmbfhgvtgsousmvcuw uungwqzzhllznusllaepdphwojuiorbpsuxxebnszlktzbapzpscprlltgiklkgczvxmrgoywywjdga pspqslilkpktzjarozmfvaqjmnlqpesbvdgpkkxrxeaphrlfabegwprtqnywwxhoysbionurepogfuc qfnehrquwdyiuurcrpylxzzkaystirtemsgaplcpcfnwmgfokcfyigsoudbsemxcenhofagogdidfqn ... --- A lower bound for compression: 130 chars! -- #include "math.h" main () { int i, j; char x; for (i = 0; i < 12500; i++) { for (j = 0; j < 80; j++) { x = 'a' + 26*drand48(); printf("%c", x); } } } --- Ex: Postscript file?? ----- Arithmetic Coding Represent source with a real number! Ex: abracadabra -- a 0.00000 .45454 5/11 chars are a's b .45454 .63636 2/11 chars are b's c .63636 .72727 d .72727 .81818 r .81818 1.00000 --- ENCODE: rescale interval for each letter Ex: code for "abracadabra" is .2787887 -- a .00000000 .45454547 (0, 4/11) in (0, 1) b .20661159 .28925622 (5/11, 7/11) in (0, 4/11) r .27422991 .28925622 a .27422991 .28106004 c .27857634 .27919728 a .27857634 .27885857 d .27878159 .27880725 a .27878159 .27879325 b .27878690 .27878901 r .27878863 .27878901 a .27878863 .27878881 --- Can rescale and output code digit when leading digits of intervals match (have to do so to handle long messages) ----- Arithmetic Coding (decode) Given code (real number between 0 and 1) it falls in one of the intervals (l, r) associated with the characters output that character rescale to (0,1) -- code = (code-l)/(r-l); --- repeat to get next char Ex: -- .27878872 a since .27... 
is in (0, 5/11) .61333513 b (.27...- 0)/(5/11); in (5/11, 7/11) .87334329 r (.61...- 5/11)/(2/11); in (9/11, 1) .30338812 a .66745383 c .34199208 a .75238258 d .27620819 a .60765803 b .84211922 r .13165572 a --- Need to adjust algorithm to match rescaling used on decode (can be tricky to do) Bottom line: slightly better than Huffman like having fractional bits ----- Pure Adaptive Coding (LZ) To encode the next character(s) * if no match of length >1 in previous string, output the next char * otherwise, output pointer to longest previous match length of longest previous match char causing the mismatch -- . 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 . t h i s i s t h e t h i r d t i m e . . t h i s * 2 3 t h e 4 3 i r d 6 2 i m e . . . 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 . 0 1 1 0 0 1 1 0 0 0 0 1 1 0 1 1 1 1 0 1 . . 0 1 1 0 4 4 0 6 5 1 4 2 1 5 3 --- Problems: big dictionary (N^2 prev substrings) online dictionary update not easy ----- Dictionary-based Adaptive Coding (LZW) -- . 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 . t h i s i s t h e t h i r d t i m e . . 0 t t . 1 h h . 2 i i . 3 s s . 4 * * . 5 is 2 s . 6 *t 4 t . 7 he 1 e . 8 *th 6 h . 9 ir 2 r . 10 d d . 11 *ti 6 i . 12 m m . 13 e e . . t h i s * 2 s 4 t 1 e 6 h 2 r d 6 i m e --- ----- Binary example of LZW -- . 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 . 0 1 1 0 0 1 1 0 0 0 0 1 1 0 1 1 1 1 0 1 . . 0 0 1 0 . 1 1 1 1 . 2 2 2 10 1 0 . 3 4 2 01 0 1 . 4 2 3 100 2 0 . 5 9 2 00 0 0 . 6 11 2 11 1 1 . 7 13 3 011 3 1 . 8 16 3 110 6 1 . . 1 0 0 1 2 0 0 0 1 1 3 1 6 1 --- ----- LZW implementation Use trie for dictionary (standard representation) -- .0001111001110110110110001001101111000101111010 --- ENCODE loop lookup string suffix in trie output dictionary index at bottom add new node to trie DECODE options build trie (faster, takes space) build dictionary (slower) ----- Empirical results Methods are available as UNIX tools pack: Huffman compress: LZ gzip: LZW Experiment: Compress the text of "Moby Dick" -- . 
original 1220K . 5-bit 778K 63% . Huffman 692K 56% . LZ 523K 42% . LZW 493K 40% --- gzip the results pack the results -- . Huffman 692K 624K 10% 692K 687K 1% . LZ 523K 520K 523K "no savings" . LZW 493K 493K 493K "no savings" --- Experiment: Compress a 1MB file of random chars -- . original 1012K . 5-bit 638K 63% . Huffman 610K 60% . LZ 695K 68% . LZW 646K 63% --- ----- Summary Don't ignore the obvious Huffman: represent fixed-length bitstrings with variable-length codes Lempel-Ziv: represent variable-length bitstrings with fixed-length codes Not covered: codes like JPEG "lossy" codes for graphics (don't really need all the bits) Other questions compute on compressed data? archiving compression code must be truly portable Ex: replace processor Ex: send files to different processor massive compression needed on some data COS 226 Lecture 15: Geometric algorithms Important applications involve geometry * models of physical world * computer graphics * mathematical models Ancient mathematical foundations Most geometric algorithms less than 20 years old Knowledge of fundamental algorithms is critical * use them directly * use the same design strategies * know how to compare and evaluate algs ----- Warning: intuition may mislead Humans have spatial intuition in 2D and 3D computers do not! neither have good intuition in high dimensions Ex: Is a polygon convex? we think of this alg sees this or even this Ex: Find intersections among set of rectangles we think of this algorithm sees this ----- Geometric algorithms: overview New primitives points, lines, planes; polygons, circles Primitive operations distance, angles "compare" point to line do two line segments intersect? 
Problems extend to higher dimensions (algorithms sometimes do, sometimes don't) Higher level intrinsic structures arise Basic problems intersection proximity point location range search Standard approaches incremental (brute-force) divide-and-conquer sweep line algs multidimensional tree structures randomized algs discretized algorithms online and dynamic algs ----- Algorithm design paradigms Draw from knowledge about fundamental algs Move up one level of abstraction use fundamental algs and data structures know their performance characteristics More complicated primitives lead to wider range of problems Some problems too complex to admit simple algorithms For many important problems classical approaches give good algorithms need research to find "best" algorithms no excuse for using "dumb" algorithms Progression of algorithm design -- all possibilities double recursion 2^N brute force nested for loops N^2 divide-and-conquer recursion, trees N log N elegant idea 1 for loop N randomization random choices N --- Oversimplified scenario, but fits many problems Many examples in geometric algorithms ----- Geometric primitives point two numbers (x, y) line two numbers a and b [ax + by = 1] line segment four numbers (x1, y1) (x2, y2) polygon sequence of points No shortage of other geometric shapes triangle, square, circle 3D and higher dimensions more complex Primitive operations can be challenging is polygon simple? is point on line? is point inside polygon? do two line segments intersect? do two polygons intersect? Algorithms search through SETS of primitives all points in specified range closest pair in set of points intersecting pairs in set of line segments overlapping areas in set of polygons ----- Line segment intersection Do two line segments intersect? To implement INTERSECT(l1, l2) use simpler primitive SAME(p1, p2, l): Given two points p1, p2 and a line l, are p1 and p2 on the same side of l? 
yes no no no To implement SAME use simpler primitive CCW(p1, p2, p3): Given three points p1, p2, p3, is the route p1-p2-p3 a ccw turn? yes no two ccw tests to implement SAME four ccw tests to implement INTERSECT ----- CCW implementation Compare dx, dy -- int ccw(struct point p0, struct point p1, struct point p2) { int dx1, dx2, dy1, dy2; dx1 = p1.x - p0.x; dy1 = p1.y - p0.y; dx2 = p2.x - p0.x; dy2 = p2.y - p0.y; if (dx1*dy2 > dy1*dx2) return 1; if (dx1*dy2 < dy1*dx2) return -1; return 0; } --- Still not quite right! Bug in degenerate case points on the same line Does AB intersect CD? on the line in the order ABCD: NO on the line in the order ACDB: YES Can't just return 0 if dx1*dy2 = dx2*dy1 (see book) CCW is a useful basic primitive Ex: is point inside convex N-gon? N CCW tests Lesson: geometric primitives are tricky to implement can't ignore degenerate cases ----- Convex hull of a point set Basic property of a set of points CONVEX HULL: * smallest convex polygon enclosing the points * shortest path surrounding the points * intersection of halfplanes defined by point pairs -- --- Running time of algorithm can depend on N: number of points M: number of points on the hull point distribution ----- Package-wrap algorithm Operates like selection sort Abstract idea sweep line anchored at current point CCW first point hit is on hull Implementation compute angle to all points pick smallest angle larger than current one -- int wrap(struct point p[], int N) { int i, min, M; float th, v; struct point t; for (min = 0, i = 1; i < N; i++) if (p[i].y < p[min].y) min = i; p[N] = p[min]; th = 0.0; for (M = 0; M < N; M++) { t = p[M]; p[M] = p[min]; p[min] = t; min = N; v = th; th = 360.0; for (i = M+1; i <= N; i++) if (theta(p[M], p[i]) > v) if (theta(p[M], p[i]) < th) { min = i; th = theta(p[M], p[min]);} if (min == N) return M; } } --- Use pseudo-angle to save time ----- Package-wrap example ----- Graham Scan Sort points on angle with bottom point as origin forms simple closed 
polygon Proceed through polygon discard points that would cause a CW turn -- int grahamscan(struct point p[], int N) { int i, min, M; struct point t; for (min = 1, i = 2; i <= N; i++) if (p[i].y < p[min].y) min = i; for (i = 1; i <= N; i++) if (p[i].y == p[min].y) if (p[i].x > p[min].x) min = i; t = p[1]; p[1] = p[min]; p[min] = t; shellsort(p, N); p[0] = p[N]; for (M = 3, i = 4; i <= N; i++) { while (ccw(p[M],p[M-1],p[i]) >= 0) M--; M++; t = p[M]; p[M] = p[i]; p[i] = t; } return M; } --- Running time: N log N for sort; N for scan ----- Graham scan example ----- Divide-and-conquer convex hull algorithms divide points divide space switch dimensions to avoid strange degeneracies Relatively complex to code in both cases ----- Incremental convex hull algorithm Consider next point if inside hull of previous points, ignore if outside, update hull Two subproblems to solve test if point inside or outside polygon update hull for outside points Both subproblems can be solved by looking at all hull points can be improved with binary search Randomized algorithm consider points in random order N + M log M ----- "Sweep line" convex hull algorithm Sort points on x-coordinate first Eliminates inside test Total time proportional to N log N (for sort) ----- Quick elimination Improve the performance of any convex hull algorithm by quickly eliminating most points (known not to be on the hull) Use points at "corners" max, min x+y, x-y Check if point inside quadrilateral four CCW tests Check if point inside rectangle four comparisons Almost all points eliminated if points random number of points left proportional to N^(1/2) Total time proportional to N ----- Summary of 2D convex hull algs Package wrap NM Graham scan N log N (sort time) Divide-and-conquer N log N (with work) Quick elimination N (fast average-case) Brute-force N log M Sweep line N log N (sort time) How many points on the hull? 
Worst case: N Average case: depends on distribution * uniform in a convex polygon: log N * uniform in a circle: N^(1/3) Average-case analysis requires understanding of basic properties of DATA, not just algs ----- Higher dimensions Multifaceted convex polytope encloses points NOT a simple object Ex: N points d dimensions d=2: convex hull d=3: Euler's formula (v - e + f = 2) d>3: exponential number of facets at worst EXTREME POINTS return points on the hull, not necessarily in order Package-wrap Divide-and-conquer Randomized Interior elimination ----- Geometric models of mathematical problems Impact of geometric algorithms extends far beyond models of physical objects Two important examples: Geometric problem find point where two lines intersect in 2D find point where three planes intersect in 3D ... Mathematical equivalent solve simultaneous equations algorithm: Gaussian elimination Geometric problem find convex polytope defined by intersecting half-planes find vertex hit by line of given slope moving in from infinity Mathematical equivalent linear programming algorithm: simplex (stay tuned) ----- Linear programming example Maximize a+b subject to the constraints b - a < 5 a + 4b < 45 2a + b < 27 3a - 4b < 24 a > 0 b > 0 COS 226 Lecture 16: Geometric search Extend search ADT to geometric data Range search Intersections among geometric objects Near neighbor search Point location Multidimensional problems Trees Divide-and-conquer Discretized algorithms ----- Range searching Possible addition to symbol-table ADT: -- . void STinit(); . void STinsert(Item x); . Item STsearch(Key v); . int STempty(); . 
int STrange(Key v1, Key v2); --- Count the number of records with key values between v1 and v2 Options to actually process the records * pass a procedure to be called for each record in the range * return list of records (possibly sorted) Depends on how many records expected (count them first) Implementations ARRAY: do binary search on both v1 and v2 return difference between indices HASH TABLE: no easy algorithm BST, TRIE: recursive traversal works ----- BST 1D range searching Recursive traversal * eliminate subtrees with no keys in interval * root may or may not be in interval * traverse both subtrees if it is -- Key v1, v2; int count = 0; int BSTrangeR(link h) { int tx1 = (h->key >= v1); int tx2 = (h->key <= v2); if (tx1 && (h->l != z)) BSTrangeR(h->l); if (tx1 && tx2) count++; if (tx2 && (h->r != z)) BSTrangeR(h->r); } --- ----- 2D Range searching Same basic method works in higher dimensions (!!) (discovered by an undergraduate) interval in 1D is rectangle in 2D 2D TREE: alternate between x and y for the key Recursive traversal * eliminate subtrees with no keys in rectangle * root may or may not be in rectangle * traverse both subtrees if it is Corresponds to recursive subdivision of the plane alternating horizontal and vertical lines Trivial generalization to k dimensions ----- 2D tree construction Each node corresponds to an area in the plane Each internal node divides its area into two subdivisions Switch between horizontal and vertical dividing lines Quad tree: use 4-way tree divide on both coordinates at once ----- 2D tree range searching -- int x1, y1, x2, y2, count = 0; TDTrangeR(link h, int d) { int t1,t2,tx1,tx2,ty1,ty2; if (h == z) return; tx1 = x1 < h->p.x; tx2 = h->p.x <= x2; ty1 = y1 < h->p.y; ty2 = h->p.y <= y2; t1 = d ? tx1 : ty1; t2 = d ? 
tx2 : ty2; if (t1 && (h->l != z)) TDTrangeR(h->l, !d); if (tx1 && tx2 && ty1 && ty2) count++; if (t2 && (h->r != z)) TDTrangeR(h->r, !d); } --- ----- Manhattan line intersection problem N lines, either horizontal or vertical How many pairs intersect? Call a given procedure for each intersection As with other search problems usually no harder to REPORT all intersections (call a given procedure for each) ----- Manhattan line intersection Dynamic SWEEP LINE algorithm Horizontal line sweeps from bottom to top At any given instant * vertical data line represents "point" * horizontal data line represents "interval" There is an h-v intersection if "point" is in "interval" Reduces 2D line intersection problem to 1D range searching! ----- Sweep line implementation Uses both PQ and ST (with range search) ADT Consider y coordinates of endpoints in increasing order (use PQ) Three types of "events" B: bottom of vertical line T: top of vertical line H: horizontal line Maintain dynamic ST where an item with key x means that some vertical line intersects the sweep line at x B: INSERT x T: DELETE x H: RANGE (x1, x2) Generalizes to give fast algorithm for * rectangles * general lines, circles * convex polygons Generalizes to higher dimensions "sweep hyperplane" many applications ----- Near neighbor searching Another possible addition to Search ADT: -- . void STinit(); . void STinsert(Item x); . Item STsearch(Key v); . int STempty(); . 
Item STnearest(Key v); --- Find the record with key value closest to v Need a concept of "distance", not just "less" easy if keys are numbers, or points in space Implementations ARRAY: do binary search on v compare with points on either side HASH TABLE: no easy algorithm BST, TRIE: recursive traversal works ----- 1D BST near neighbor searching Test search value for proximity with every node touched Go down both subtrees if close to the edge -- void BSTnear(link h) { if (h == z) return; if (dist(v, h->key) < min) { best = h; min = dist(v, best->key); } if (v < h->key || (v - h->key) < min) BSTnear(h->l); if (v > h->key || (h->key - v) < min) BSTnear(h->r); } --- Multidimensional near neighbor searching: same algorithm, but switch dimensions at each level, just as for range searching ----- Voronoi diagram Given: set S of N points VORONOI DIAGRAM: for each point x, identify the set of points in the plane closer to x than to any other point y in S Ultimate "closest point" structure Voronoi edges perpendicular bisectors of point pairs Voronoi points circumcenters of point triples Challenge to compute Representation? Degenerate cases? ----- Delaunay triangulation Given: set S of N points DELAUNAY TRIANGULATION: connect x to every point y in S with a Voronoi edge separating x and y Outer boundary is convex hull Representation easier: no extra points Voronoi diagram and Delaunay triangulation can be computed in N log N steps (!!) 
Approaches * divide and conquer * sweep line * randomize * discretize ----- 2D divide-and-conquer Ex: To find the closest pair in N points * sort on x * divide into two sets of N/2 points * find closest pair in each half * find closest pair crossing boundary After pairs in halves found, checking boundary crossing seems easy Actually need to sort on y (comes for free) to make boundary check easy Boundary check MUST be carefully done because it terminates recursion ----- 2D divide-and-conquer code Use the same code (mergesort) to * sort on x * sort on y and work on boundary Call this twice (change key from x to y): -- struct node *sort(struct node *c, int N) { int i; struct node *a, *b; float middle; struct point p1, p2, p3, p4; a = c; if (c->next == z) return c; for (i = 2; i <= N/2; i++) c = c->next; b = c->next; c->next = z; if (pass == 2) middle = b->p.x; c = merge(sort(a, N/2), sort(b, N-(N/2))); if (pass == 2) { p1 = z->p; p2 = z->p; p3 = z->p; p4 = z->p; for (a = c; a != z; a = a->next) if (fabs(a->p.x - middle) < min) { check(a->p, p1); check(a->p, p2); check(a->p, p3); check(a->p, p4); p1 = p2; p2 = p3; p3 = p4; p4 = a->p; } } return c; } --- Same general strategy works to compute Voronoi ----- Grid methods Grids : geometric search :: tries : search ADT Grid method * define uniform grid of fixed-size squares * put points in lists associated with squares * ignore points in faraway grid squares Time-space tradeoff like MSD sort * grid too fine: empty cells * grid too coarse: lists too long Use 2- or 3-level grids, or recurse ala quad trees Ex: range searching For graphics applications ultimate grid is PIXEL ARRAY leads to "discretized algorithms" ----- Point location problem Ex: find state corresponding to point on map Planar subdivision * 2D tree planar decomposition * N lines * Voronoi diagram * grid Which division contains the given point? 
Difficult in general if only because of difficulty of representing planar subdivisions ----- Discretized line intersection p-by-p bit raster, p^2 pixels N lines Draw rasterized version of line report intersection if pixel already 1 Cost: p^2 to initialize pixels to 0 number of pixels on lines Cost dominated by p^2 Line intersection same cost as drawing blank picture! ----- Discretized Voronoi diagram put 1 pixels on a priority queue priority: distance to closest point Algorithm: * remove pixel from priority queue * check all neighbor pixels * closer or same: ignore * farther: check pixel value * if 0, set to 1 and put back on pq * if 1, must be on a Voronoi edge! Time proportional to number of pixels on diagram times diameter of largest cell (plus time to initialize pixels to 0) Idea: use or refine triangulation from discretized diagram to compute real diagram COS 226 Lecture 17: Graph algorithms GRAPH: a set of OBJECTS with CONNECTIONS Interesting and useful abstract concept Study of graph algorithms a challenging branch of computer science Study of mathematical properties a challenging branch of discrete math Hundreds of interesting algorithms known Important applications abound * operations research * circuit simulation * software systems ----- Glossary of terms Node Edge Graph Dense Sparse Path Cycle Tree Spanning tree Connected Connected component Directed Undirected Weighted Network ----- Graph examples (geometric problems) ----- More graph examples Anything involving relationships among items can be modeled as a graph Ex: Transportation network Nodes: cities; Edges: roads Ex: Electric circuit Nodes: devices; Edges: wires Warning: geometric intuition may mislead Ex: Airline fares (triangle inequality might not hold) Graphs are ABSTRACT objects, and model other abstractions Ex: Scheduling Nodes: tasks Edges: must do task A before task B Ex: Programming system Nodes: subroutines; Edges: A can call B Ex: Finite state automaton Ex: CFG Nodes: symbols; Edges: 
productions Ex: Game graphs Nodes: board positions; Edges: moves ----- Representing graphs Graphs are abstract mathematical objects Algorithms have to work with concrete representations Many different representations possible Efficiency depends on matching representation with algorithm Nothing new here: standard first step in algorithm design is to pick representation (data structure) * array vs. linked list * integers * points, lines, polygons * trees ----- Representing graphs (examples) Node names A B C D E F G H I J K L M Adjacency lists A: F C B G B: A C: A D: F E E: G F D F: A E D G: E A H: I I: H J: K L M K: J L: J M M: J L ----- Representing graphs (more examples) Set of edges AB AG AC LM JM JL JK ED FD HI FE AF GE Adjacency matrix -- A B C D E F G H I J K L M A 1 1 1 0 0 1 1 0 0 0 0 0 0 B 1 1 0 0 0 0 0 0 0 0 0 0 0 C 1 0 1 0 0 0 0 0 0 0 0 0 0 D 0 0 0 1 1 1 0 0 0 0 0 0 0 E 0 0 0 1 1 1 1 0 0 0 0 0 0 F 1 0 0 1 1 1 0 0 0 0 0 0 0 G 1 0 0 0 1 0 1 0 0 0 0 0 0 H 0 0 0 0 0 0 0 1 1 0 0 0 0 I 0 0 0 0 0 0 0 1 1 0 0 0 0 J 0 0 0 0 0 0 0 0 0 1 1 1 1 K 0 0 0 0 0 0 0 0 0 1 1 0 0 L 0 0 0 0 0 0 0 0 0 1 0 1 1 M 0 0 0 0 0 0 0 0 0 1 0 1 1 --- Implicit edges "calculate" edges only when needed ----- Summary of costs E edges, V vertices Space requirements: Adjacency lists: V+E Adjacency matrix: V^2 Set of edges: E (+V) Choice of representation affects algorithm efficiency Trivial examples: Is there an edge from A to B? lists: O(E) matrix: O(1) Is there an edge from A to anywhere? lists: O(1) matrix: O(V) ----- Basic graph problems (short list) Is there a path from A to B? CYCLES Does the graph contain a cycle? CONNECTIVITY (SPANNING TREE) Is there a way to connect the vertices? BICONNECTIVITY Will the graph become disconnected if one vertex is removed? PLANARITY Is there a way to draw the graph without edges crossing? SHORTEST PATH What is the shortest way from A to B? LONGEST PATH What is the longest way from A to B? MINIMAL SPANNING TREE What is the best way to connect the vertices? 
HAMILTONIAN CIRCUIT Is there a way to visit all the vertices without visiting the same vertex twice? TRAVELING SALESMAN What is the shortest route to connect the vertices without visiting the same vertex twice? ISOMORPHISM Do two given representations represent the same abstract graph? ----- Traversing graphs Goal: "visit" every node on the graph Depth-first search (DFS) Recursive traversal algorithm * Mark all nodes as "unvisited" * To visit a node k * mark it * (recursively) visit all unmarked nodes connected to k by an edge * Visit node 1, then visit any unvisited node, continue until all nodes visited Solves some simple graph problems * connectivity * cycles Basis for solving difficult graph problems * biconnectivity * planarity ----- DFS implementation (recursive) -- int val[maxV]; int id = 0; dfs() { int k; for (k = 1; k <= V; k++) val[k] = 0; for (k = 1; k <= V; k++) if (val[k] == 0) visit(k); } --- Adjacency matrix -- visit(int k) { int t; val[k] = ++id; for (t = 1; t <= V; t++) if (a[k][t] != 0) if (val[t] == 0) visit(t); } --- Adjacency list -- visit(int k) { struct node *t; val[k] = ++id; for (t = adj[k]; t != z; t = t->next) if (val[t->v] == 0) visit(t->v); } --- ----- DFS example (adjacency lists) A: F C B G B: A C: A D: F E E: G F D F: A E D G: E A H: I I: H J: K L M K: J L: J M M: J L ----- DFS example (adjacency lists) visit A visit F (first on A's list) check A on F's list (been there) visit E (second on F's list) visit G (first on E's list) check E on G's list (been there) check A on G's list (been there) check F on E's list (been there) visit D (third on E's list) check F on D's list (been there) check E on D's list (been there) check D on F's list (done that) visit C (second on A's list) visit B (third on A's list) check G on F's list (done that) ... 
"been there": currently working on it "done that": totally finished dealing with it ----- DFS example (adjacency matrix) Adjacency matrix -- A B C D E F G H I J K L M A 1 1 1 0 0 1 1 0 0 0 0 0 0 B 1 1 0 0 0 0 0 0 0 0 0 0 0 C 1 0 1 0 0 0 0 0 0 0 0 0 0 D 0 0 0 1 1 1 0 0 0 0 0 0 0 E 0 0 0 1 1 1 1 0 0 0 0 0 0 F 1 0 0 1 1 1 0 0 0 0 0 0 0 G 1 0 0 0 1 0 1 0 0 0 0 0 0 H 0 0 0 0 0 0 0 1 1 0 0 0 0 I 0 0 0 0 0 0 0 1 1 0 0 0 0 J 0 0 0 0 0 0 0 0 0 1 1 1 1 K 0 0 0 0 0 0 0 0 0 1 1 0 0 L 0 0 0 0 0 0 0 0 0 1 0 1 1 M 0 0 0 0 0 0 0 0 0 1 0 1 1 --- ----- DFS example (adjacency matrix) visit A check A on A's list (been there) visit B (second on A's list) check A on B's list (been there) check B on B's list (been there) visit C (third on A's list) check A on C's list (been there) check C on C's list (been there) visit F (fourth on A's list) check A on F's list (been there) visit D (second on F's list) check D on D's list (been there) visit E (second on D's list) check D on E's list (been there) check E on E's list (been there) check F on E's list (been there) visit G (fourth on E's list) check A (been there) check E (been there) check G (been there) check F on D's list (been there) check E on F's list (done that) check F on F's list (been there) check G on A's list (done that) ... 
----- DFS implementation (stack-based)

Use explicit stack instead of recursive calls
--
int val[maxV];
int id = 0;

visit(int k)
  { struct node *t;
    push(k);
    while (!stackempty())
      { k = pop(); val[k] = ++id;
        for (t = adj[k]; t != z; t = t->next)
          if (val[t->v] == 0)
            { push(t->v); val[t->v] = -1; }
      }
  }

listdfs()
  { int k;
    stackinit();
    for (k = 1; k <= V; k++) val[k] = 0;
    for (k = 1; k <= V; k++)
      if (val[k] == 0) visit(k);
  }
---

----- DFS example (adjacency lists, stack-based)

visit A
  push F, push C, push B, push G
visit G
  push E, been to A
visit E
  been to G, been to F, push D
visit D
  been to F, been to E
visit B
  been to A
visit C
  been to A
visit F
  been to A, done with E, done with D

----- Stack-based search

NOT the same as recursive DFS. Why?

Algs differ in treatment of nodes
adjacent to partially visited nodes
  DFS: visits such a node
  stack-based: treats it as "visited"

Automatic recursion removal would stack
local variable (next node on adj. list)

Can get same effect by deleting (if present)
before pushing (more general ADT than stack)

No particular reason to use stack
  other ADTs work as well (stay tuned)

Analogous to traversing a maze

----- Graphs and mazes

Nodes: intersections
Edges: hallways

DFS
  mark ENTRY and EXIT halls at each vertex
  leave by ENTRY when no unmarked halls

Stack-based?
  light each intersection
  if you can see a light when arriving
    at an intersection, turn back
  leave by ENTRY when no unmarked halls

What characterizes a "recursive" method??
----- Breadth-first search

Put unvisited nodes on a QUEUE, not a stack
--
visit(int k)
  { struct node *t;
    put(k);
    while (!queueempty())
      { k = get(); val[k] = ++id;
        for (t = adj[k]; t != z; t = t->next)
          if (val[t->v] == 0)
            { put(t->v); val[t->v] = -1; }
      }
  }

bfs()
  { int k;
    queueinit();
    for (k = 1; k <= V; k++) val[k] = 0;
    for (k = 1; k <= V; k++)
      if (val[k] == 0) visit(k);
  }
---

----- BFS vs DFS example

Depth-first search    Breadth-first search
DFS: "sequential"     BFS: "parallel"

----- DFS example

----- DFS example (continued)

Same graph, different order of
edges on adjacency lists

----- DFS example (continued)

Same graph, "randomized" choice of
edges on adjacency lists

COS 226 Lecture 18: Minimal Spanning Trees

Classic algorithms for solving
"network" problems

MINIMAL SPANNING TREE
  Kruskal's algorithm
  Prim's algorithm
SHORTEST PATHS
  Dijkstra's algorithm
  Floyd's algorithm

Related to generalized graph search
using a priority queue

Ex: telephone wiring
Ex: supplies to troops

----- Generalized search (Priority-first)

Use PQ, rather than stack or queue

TREE vertices:   visited
FRINGE vertices: adjacent to visited
UNSEEN vertices: others

Put all fringe vertices on PQ

Search algorithm
  * take next vertex off PQ
  * change status to TREE
  * for each adjacent vertex
      UNSEEN: move to FRINGE
      FRINGE: update priority

Encompasses several basic graph algorithms
  depth-first search
    (use decreasing counter for priority)
  breadth-first search
    (use increasing counter for priority)
  minimal spanning tree
  shortest path tree

----- Minimal spanning tree

Weighted graph (edges have values)
Ex:
  Vertices: points in space
  Edges: lines connecting them
  Weights: lengths of lines

MST: minimal weight set of edges
that connects all the vertices

Theorem: Divide the vertices into two
sets A and B. Shortest edge X connecting
a vertex in A with a vertex in B
must be in the MST

Proof: by contradiction.
If X not in MST, add it to MST
  * creates a cycle
  * some other edge Y from A to B
    must be on cycle
Delete Y and add X to get better tree.

----- sample graph
      a spanning tree
      minimal spanning tree

----- Priority-first search MST algorithm

Set A: TREE
Set B: UNSEEN+FRINGE
Smallest edge in FRINGE must be in MST
--
#define P t->w

visit(link* adj, int k)
  { link t;
    dad[k] = k; PQreduce(k, 0);
    while (PQminkey() < maxNum)
      { k = PQdelmin();
        printf("%d-%d\n", k, dad[k]);
        for (t = adj[k]; t != z; t = t->next)
          if (onPQ(t->v) && (P < val[t->v]))
            { PQreduce(t->v, P); dad[t->v] = k; }
      }
  }

listpfs(link* adj, int V, int E)
  { int k;
    val = malloc((V+1)*sizeof(weightType));
    dad = malloc(V*sizeof(int));
    PQinit(val, V);
    for (k = 0; k < V; k++)
      if (onPQ(k)) visit(adj, k);
  }
---

----- MST example

use line lengths as edge weights

Edges          Adjacency lists
--
0-6 5.3        0: 7 5 2 1 6
0-1 3.1        1: 7 0
0-2 3.6        2: 7 0
4-3 2.2        3: 5 4
5-3 2.2        4: 6 5 7 3
7-4 2.2        5: 0 4 3
5-4 3.1        6: 4 0
0-5 6.0        7: 1 2 0 4
6-4 3.6
7-0 3.6
7-2 1.4
7-1 1.0
---
MST
--
1-0 7-1 2-7 4-7 3-4 5-3 6-4
---

----- PFS MST example
--
. 0 1 2 3 4 5 6 7 min out updates
. ---------------------------------------
. 0 * * * * * * * 0 0-0 0-1 < 0-*, etc.
. 0 0 * * 0 0 0 1 1-0 1-7 < 7-0
. 0 * * 0 0 1 7 7-1 7-2 < 2-0, 7-4 < 4-0
. 7 * 7 0 0 2 2-7 none
. * 7 0 0 4 4-7 4-6 < 6-0, 4-5 < 5-0, 4-3 < 3-*
. 4 4 4 3 3-4 3-5 < 5-4
. 3 4 5 5-3 none
. 4 6 6-4 none
---

----- Priority queue issues

Need to implement
  DELETE MINIMUM
  UPDATE

Priority queue holds non-tree vertices
  * priority is shortest edge to tree vertex
  * for each vertex added to the tree
    each of its edges may give a shorter
    connection to a non-tree vertex

Requires handles!
given vertex, have to change its value

Example: heap implementation
  priorities move around in heap
  INDIRECT heap: keep track of indices
  (see book for details)
--
| 9.9 | 9.9 | 1.4 | 3.1 | 3.1 | 5.5 | 1.1 | 1.4 | 6.7 | 6.7 | 4.2 | 4.2 | 5.5 |
---

Another approach
  INDEXED TOURNAMENT
  Separate array on top of data
  keep track of winners
  do not move data itself

----- Indexed tournament implementation
--
int *pq; int pqV; numType* pqval;

void PQinit(numType val[], int V)
  { int i;
    pqV = V; pqval = val; pqval[V] = maxNum;
    pq = malloc((V+V)*sizeof(link));
    for (i = 0; i < V; i++) pq[V+i] = V;
    for (i = V-1; i > 0; i--) pq[i] = pq[i+i];
  }

int onPQ(int v)
  { return pq[pqV+v] == v; }

numType PQminkey()
  { return val[pq[1]]; }

void promote(int k)
  { int j, l, r;
    pq[pqV+k] = k;
    for (j = (pqV+k)/2; j > 0; j = j/2)
      if (val[pq[j+j]] < val[pq[j+j+1]])
           pq[j] = pq[j+j];
      else pq[j] = pq[j+j+1];
  }

int PQdelmin()
  { int k = pq[1];
    pq[pqV+k] = pqV; promote(k);
    return k;
  }

PQreduce(int v, numType x)
  { val[v] = x; promote(v); }
---

----- Indexed tournament example
--
| 0.0 0 | 8 | 0 | 1 | *** 8 | 3.1 1 | 0 | 1 | *** 8 | 3.6 2 | 8 | 2 | *** 8 | *** 8 | 0 | 1 | *** 8 | *** 8 | 8 | 5 | *** 8 | 6.0 5 | 8 | 7 | *** 8 | 5.3 6 | 8 | 7 | *** 8 | 3.6 7 | 8 | 8 | 8 | 0 | 8 | 8 | 2 | 2 | 3.6 2 | 1.4 2 | 2 | 2 | *** 8 | *** 8 | 7 | 2 | *** 8 | 2.2 4 | 5 | 4 | 6.0 5 | 6.0 5 | 7 | 4 | 5.3 6 | 5.3 6 | 7 | 6 | 1.0 7 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 3 | 8 | 8 | 8 | 3 | *** 8 | 2.2 3 | 4 | 3 | 2.2 4 | 8 | 4 | 5 | 6.0 5 | 3.1 5 | 4 | 5 | 5.3 6 | 3.6 6 | 6 | 6 | *** 8 | *** 8
---

----- Larger PFS MST example

----- Prim's algorithm

Use adjacency matrix

While going through the matrix to find the
vertices connected to the current one,
find the minimum at the same time

val[k]
  neg: FRINGE; pos: TREE; -unseen: UNSEEN
--
matrixpfs()
  { int k, t, min = 0;
    for (k = 1; k <= V; k++)
      { val[k] = -unseen; dad[k] = 0; }
    val[0] = -(unseen+1); // sentinel
    for (k = 1; k != 0; k = min, min = 0)
      { val[k] = -val[k];
        if
(val[k] == unseen) val[k] = 0;
        for (t = 1; t <= V; t++)
          if (val[t] < 0)
            { if (a[k][t] && (val[t] < -priority))
                { val[t] = -priority; dad[t] = k; }
              if (val[t] > val[min]) min = t;
            }
      }
  }
---
Invented in 1957

----- Prim's algorithm vs. PFS

Depends on whether graph is DENSE or SPARSE

To within a constant factor
  PFS:    (E + V) log V
  Prim's: V*V

Ex: 2 thousand vertices, 1 million edges
  PFS: 11 million steps
  Prim's: 4 million steps
Ex: 1 million vertices, 2 million edges
  PFS: 60 million steps
  Prim's: 1 trillion steps
Ex: 100 thousand vertices, 1 million edges
  PFS: 17 million steps
  Prim's: 10 billion steps

Clever algorithms can reduce
PFS to E + V log V

Bottom line:
  * Prim's optimal for dense graphs
  * don't use Prim's for sparse graphs
    (space might be prohibitive, anyway)

----- Kruskal's algorithm

UF algorithms from lecture 1
find a spanning tree

THEOREM: If the edges come in order of their
length, the spanning tree from UF algs is a MST!

Algorithm:
  * sort the edges
  * use weighted quick union

In Postscript:
--
/kruskal
{ ufwinit
  0 2 E E add 1 sub
  { /i exch def
    edges i get /v1 exch def
    edges i 1 add get /v2 exch def
    v1 v2 2 copy
    ufw { pop pop }{ drawedge } ifelse
  } for
} def
---
Invented in 1956.

----- Kruskal MST example

Take edges in order,
skip those that make cycles
--
7-1 1.0 *
7-2 1.4 *
4-3 2.2 *
5-3 2.2 *
7-4 2.2 *
5-4 3.1
0-1 3.1 *
0-2 3.6
6-4 3.6 *
7-0 3.6
0-6 5.3
0-5 6.0
---

----- Larger Kruskal example

----- Kruskal's algorithm vs PFS

Depends on structure of graph

To within a constant factor
  PFS:       E log V + V log V
  fancy PFS: E + V log V
  Kruskal:   E + M log E
    where M is the number of edges in the
    graph shorter than the longest edge
    in the MST

Do Kruskal with heap built from the bottom up
  otherwise E log E for the sort

Performance differences depend upon:
  * dense vs. sparse
  * structure of graph
  * priority queue algorithm
  * priority queue implementation

All require space for whole graph
  (UF finds a spanning tree in O(V) space)

MST for huge graphs??
COS 226 Lecture 19: Shortest paths

Classic algorithms for solving
"network" problems

MINIMAL SPANNING TREE
  Kruskal's algorithm
  Prim's algorithm
SHORTEST PATHS
  Dijkstra's algorithm
  Floyd's algorithm

Related to generalized graph search
using a priority queue

Ex: telephone wiring
Ex: supplies to troops

----- Generalized search (Priority-first)

Use PQ, rather than stack or queue

TREE vertices:   visited
FRINGE vertices: adjacent to visited
UNSEEN vertices: others

Put all fringe vertices on PQ

Search algorithm
  * take next vertex off PQ
  * change status to TREE
  * for each adjacent vertex
      UNSEEN: move to FRINGE
      FRINGE: update priority

Encompasses several basic graph algorithms
  depth-first search
    (use decreasing counter for priority)
  breadth-first search
    (use increasing counter for priority)
  minimal spanning tree
  shortest path tree

----- Shortest paths

SINGLE-SOURCE shortest paths:
shortest path from x to each other node

Algorithm: PFS starting at x

val[k]: shortest distance from x to k
priority on FRINGE:
  shortest known distance to node

smallest fringe value: shortest distance
to *any* node not on tree
(won't find a shorter path)
--
#define priority val[k]+t->w
---
DFS and BFS also immediate from PFS
  BFS: priority is "time since start"
  DFS: priority is "time to finish"

----- PFS SPT example

----- PFS BFS and DFS examples

----- Shortest paths in Euclidean graphs

Problem: find shortest path from x to y

Algorithm:
  start shortest-path PFS at x
  stop when reaching y

val[k]: shortest distance from x to k
in TREE PLUS distance from k to y
as the crow flies

priority on FRINGE:
  shortest possible way from x to y
--
#define priority val[k]+t->w + distance(t->v, y)
---

----- All shortest paths

Problem: Given weighted graph, compute a
table giving the shortest weighted path
from each node to each other node

Ex: map of New England
  nodes: cities
  edges: roads connecting cities
--
.              P   W   L   N
. Providence   0  53  54  48
. Westerly    53   0  18 101
. New London  54  18   0  12
. 
Norwich     48 101  12   0
---
Norwich-Westerly: 101 miles??

  12 miles  Norwich-New London
  18 miles  New London-Westerly
  ---------
  30 miles

----- Floyd-Warshall algorithm

Another ancient algorithm (1962)

Want shorter path from x to j?
  take x to y, then y to j, if shorter
For each intermediate y, do all x-j pairs
--
for (y = 1; y <= V; y++)
  for (x = 1; x <= V; x++)
    for (j = 1; j <= V; j++)
      if (a[x][y] + a[y][j] < a[x][j])
        a[x][j] = a[x][y] + a[y][j];
---
Correctness proof: induction on y
Running time: O(V^3)

Negative edge weights?
  * much more difficult to solve
  * could get negative cycles

COS 226 Lecture 20: Network Flow

Advanced graph problems
  NETWORK FLOW
  MATCHING

OPERATIONS RESEARCH
  Classical problems (1940s)
  Graph algorithm technology
  PQ and data structure design crucial
  Optimal solutions still not known

Examples of REDUCTION
  BIPARTITE MATCHING
    no harder than network flow
    (actually, much less general)
  LINEAR PROGRAMMING
    no easier than network flow
    (actually, much more general)

Network flow, matching, linear programming
are each fields of study in their own right

----- Network flow

NETWORK: weighted graph

Consider network as abstraction for
material FLOWING through the edges

Interpret
  weights as CAPACITIES
  nodes as FLOW CONTROL SWITCHES

Ex: oil flowing in pipes
Ex: commodities flowing on roads and rails
Ex: bits flowing in Internet

Identify
  SOURCE: node where all material originates
  SINK: node where all material goes

MAX FLOW PROBLEM: assign flows to edges
that maximize total flow through the network

  dumb    naive    best

----- Increasing flow in a network

Assume directed graph: all links one-way
(downhill in examples)

AUGMENTING PATH: path from source to sink
with unused capacity on every edge

Easy case
  all edges on path carry positive flow
  (up to capacity)

More complicated case
  negative flow in BACKWARD edges
  (up to amount of flow)

  ABDF  ACEF  ACDBEF

----- Ford-Fulkerson algorithm

Start with 0 flow everywhere

Increase the flow along any augmenting path
until no augmenting paths are left
Problem 0: Does this lead to the maximum flow? Problem 1: Not really an algorithm, since many details are unspecified * How do we find an augmenting path? * Which augmenting path is best? Problem 2: Cost can be proportional to max capacity Ex: ABCD ACBD ABCD ACBD ABCD ... ----- Max flow/Min cut theorem CUT: set of edges separating source from sink THM: Finding the maximum flow is equivalent to finding the minimum cut Proof: Omitted THM: Ford-Fulkerson method gives maximum flow Proof sketch: If there is no augmenting path, identify the first full forward or empty backward edge on every path That set of edges defines a min cut Applies to an entire family of algorithms No matter how the augmenting paths are chosen, the flow is maximal when no more can be found Algorithm design goals * find paths quickly * use as few iterations as possible ----- Edmonds-Karp algorithms Idea 1: use BFS to find augmenting path Idea 2: find path that increases the flow the most BOTH easy to implement with PQ graph search (!) Change basic graph search routine to * use positive weights for forward path, negative weights for backward path * ignore full forward, empty backward edges * take source, sink as args, return 0 if no path -- for (;;) { if (!matrixpfs(1,V)) break; y = V; x = dad[V]; while (x != 0) { flow[x][y] = flow[x][y]+val[V]; flow[y][x] = -flow[x][y]; y = x; x = dad[y]; } } --- ----- Network flow example ----- Analysis of network flow algorithms Edmonds-Karp worst-case running time BFS: V E^2 (VE iterations, each cost E) best path: C(M, F) E log V depends on cut size and flow amount (M/(M-1))^C = F Ex: M = 1000, F = 1000: C = 6900 Ex: M = 10, F = 1000: C = 72 New algorithms * use better priority queue implementations * remember facts about search for last path to find new path * find multiple paths on one search Running times 1970 V^2 E 1977 V^2 E^(1/2) 1978 V^3 1978 V^(5/3) E^(2/3) 1980 V E log V 1986 V E log(V^2/E) Objection! 
real graphs not "average" or "worst case" simple (but not dumb) algorithms may be preferred in practice BUT simple optimal algorithm could still exist ----- Matching MATCHING set of edges with no vertex included twice MAXIMUM MATCHING no matching contains more edges WEIGHTED MATCHING weighted graph no matching has greater sum of edge weights BIPARTITE GRAPH two types of nodes all edges connect nodes of differing type BIPARTITE MATCHING maximum matching in bipartite graph Ex: graph: E5 A2 A1 C1 B4 C3 D3 B2 A4 D5 E3 B1 matching: A1 B4 C3 D5 E6 F2 ASSIGNMENT PROBLEM weighted bipartite matching Matching is difficult to solve in general Related to network flow ----- Matching examples Matching Bipartite graph ----- Reducing A to B SOLUTION: efficient algorithm that can give the answer for any problem instance Ex: Given any network, Edmonds-Karp algorithms solve network flow problem for that network REDUCTION: a way to solve problems by converting them to other problems Ex: Bipartite matching reduces to network flow Convert instance of bipartite matching to instance of network flow Solve network flow instance Convert network flow solution to bipartite matching solution Ex: Network flow reduces to linear programming (stay tuned) Reduction is not necessarily EFFICIENT Ex: Everything computable reduces to Universal Turing Machine BUT reduction, in practice, * is better than brute force * is better than investing years of research to understand details of a problem * gives insight into difficulty (partially order problems by difficulty) ----- Reducing bipartite matching to network flow Create two new nodes SOURCE with edges to all nodes of first type SINK with edges to all nodes of second type Make all capacities 1 Network flow easier for unit capacities NOTE: Reduction doesn't work for weighted case (assignment problem) ----- Linear programming Maximize OBJECTIVE FUNCTION subject to CONSTRAINTS Ex: maximize x+y subject to the constraints -x + y <= 5 x + 4y <= 45 2x + y <= 27 
3x - 4y <= 24
  x, y >= 0

Vast number of applications

Early application: DIET problem
  min cost diet with adequate nutrients

"Programming": formulate the equations
(reduce given problem to linear programming)

----- Reducing network flow to linear programming

Express
  capacities as inequalities
  flow at nodes as equalities

Ex: maximize AB + AC
subject to the constraints
  AB <= 6    CD <= 3
  AC <= 8    CE <= 3
  BD <= 6    DF <= 8
  BE <= 3    EF <= 6
  BD + BE = AB
  CD + CE = AC
  BD + CD = DF
  BE + CE = EF
  all >= 0

----- Geometric interpretation

Ex: (two variables)
  objective function maximized at vertex
Ex: (three variables)

Higher dimensions
  multifaceted convex polytope: SIMPLEX
  objective function maximized at vertex

----- Approaches to linear programming

Simplex method
  classical approach
  like Gaussian elimination
  pivot steps correspond to moving
    on simplex vertices
  exponential in worst case
  fast in practice

Duality: is point inside simplex?

Ellipsoid methods
  new algorithmic approach
  geometric divide-and-conquer
  polynomial time (!)

Challenge: make new methods faster
than simplex for practical problems

----- Perspective

Nearing the deep waters of NP-completeness
  Ex: INTEGER programming NP-complete
  Ex: TRIpartite matching NP-complete

1960s Operations research
1970s Analysis of algorithms
1980s Geometric algorithms

All relevant to wide variety of
important practical problems

Consider ALL approaches for new problems

Reduction
  may help determine complexity
  may provide usable solution
    for practical instances
Specialized algorithms
  may provide elegant optimal
    solution for all instances
General methods
  may solve specific practical
    instance quickly
  (even if can't be proven fast)

COS 226 Lecture 21: Multiplication

How fast can we multiply?

Small numbers: basic machine operation

Approximate solution OK: "floating point"
  MIPS vs. FLOPS

Exact solution required for numbers
with millions of digits?
small ex:
    98734019238470981723409182730491827
  times
    66753875434605462654864811227186687
  equals
    6590878421402831216001745213641931592836606299109
    07873145346994326436

Basic question about difficulty of computations
  ARITHMETIC COMPLEXITY
also consider other operations
  add, multiply, exponentiate, divide, logarithm ...

Applications more numerous than we might imagine
  Arithmetic coding ($%#?!)
  Cryptography
  Signal processing

----- Multiplying complex numbers

Given two complex numbers
  x = a + b i
  y = c + d i
Product x*y
  real part:      a*c - b*d
  imaginary part: a*d + b*c
Four multiplications, two additions

Typical old-fashioned machine:
  1 time unit for add
  4 time units for multiply
therefore, 18 time units for complex multiply

Faster way to compute product (!)
  t = (a+b) * (c+d) = a*c + a*d + b*c + b*d
Product x*y
  real part:      a*c - b*d
  imaginary part: t - a*c - b*d
THREE multiplications, five additions
17 time units for complex multiply

Only works if multiply expensive relative to add
  software floating point
  old machines

----- Exponentiation

How many multiplications to compute x^N ?

If N is a power of 2, use successive squaring
  1, x, x^2, x^4, x^8, x^16, x^32, ...

If N is not a power of 2, do successive squaring
then pick out numbers to multiply together by
  * divide-and-conquer
  * binary representation of N

Ex: x = 2, N = 21
--
1 2 4 8 16 32 64 128 256 512 1024 2048
4096 8192 16384 32768 65536
---
binary rep. of N:   2^21 = 2^16*2^4*2^1 = 65536*16*2 = 2097152
divide-and-conquer: 2^21 = 2^10*2^11 = 2^5*2^5*2^5*2^6 = ...

Unrealistic to consider if x and/or N large
Realistic version (which is of practical use)
  THM: Less than 2 lg N multiplications are
  required to compute x^N mod M
Linear in the number of bits to store N

DISCRETE LOGARITHM problem (difficult)
  given x^N mod M, find x

----- Multiplying polynomials

To multiply integers
  may as well consider polynomial multiplication
  finish up with Horner's method

grade school
--
.       1 2 3 4
.       5 6 7 8
.       -------
.       9 8 7 2
.     8 6 3 8
.   7 4 0 4
. 6 1 7 0
. -------------
. 7 0 0 6 6 5 2
---
grade school, no carries
--
.     1   2   3   4
.     5   6   7   8
. ---------------------------
.             8*1 8*2 8*3 8*4
.         7*1 7*2 7*3 7*4
.     6*1 6*2 6*3 6*4
. 5*1 5*2 5*3 5*4
---
polynomial
--
.  x^3 + 2x^2 + 3x + 4
. 5x^3 + 6x^2 + 7x + 8
. ----------------------
.                         8x^3 + 16x^2 + 24x + 32
.                  7x^4 + 14x^3 + 21x^2 + 28x
.          6x^5 + 12x^4 + 18x^3 + 24x^2
. 5x^6 + 10x^5 + 15x^4 + 20x^3
. ------------------------------------------------
. 5x^6 + 16x^5 + 34x^4 + 60x^3 + 61x^2 + 52x + 32
---

----- Cost of multiplication

Polynomial multiplication
--
for (i = 0; i < 2*N-1; i++) r[i] = 0;
for (i = 0; i < N; i++)
  for (j = 0; j < N; j++)
    r[i+j] += p[i]*q[j];
---
N^2 coefficient multiplications

Integer multiplication
R-L:
--
.       32
.      52
.     61
.    60
.   34
.  16
. 5
. -------
. 7006652
---
L-R: use Horner's method to evaluate at x=10
--
.      5*10 =     50+16 =     66
.     66*10 =    660+34 =    694
.    694*10 =   6940+60 =   7000
.   7000*10 =  70000+61 =  70061
.  70061*10 = 700610+52 = 700662
. 700662*10 = 7006620+32 = 7006652
---
Extra factor of log N might be required

Bottom line: quadratic algorithms can
multiply thousand-digit numbers
but not million-digit numbers

----- Divide-conquer polynomial multiplication

Example
--
. ( x^2 (x+2) + (3x+4) ) * ( x^2 (5x+6) + (7x+8) )
.
.   x^4 (x+2)*(5x+6)
. + x^2 (x+2)*(7x+8)
. + x^2 (5x+6)*(3x+4)
. +     (3x+4)*(7x+8)
.
. 5x^6 + 16x^5 + 12x^4
.                 7x^4 + 22x^3 + 16x^2
.                15x^4 + 38x^3 + 24x^2
.                                21x^2 + 52x + 32
. ------------------------------------------------
. 5x^6 + 16x^5 + 34x^4 + 60x^3 + 61x^2 + 52x + 32
---
In general, if p and q are degree 2N
--
. p(x)*q(x) = (x^N ph(x) + pl(x)) * (x^N qh(x) + ql(x))
.
. p(x)*q(x) = x^2N ph(x)*qh(x)
.           + x^N (ph(x)*ql(x) + pl(x)*qh(x))
.           + pl(x)*ql(x)
---
Scalar multiplication count
  m(0) = 1
  m(2N) = 4*m(N)
Solution: m(N) = N^2, since (2N)^2 = 4N^2

No savings from divide and conquer?

----- Polynomial multiplication (continued)

Use same technique as for complex numbers

In general, if p and q are degree 2N
--
. p(x)*q(x) = (x^N ph(x) + pl(x)) * (x^N qh(x) + ql(x))
.
. t(x) = (ph(x) + pl(x)) * (qh(x) + ql(x))
.
. p(x)*q(x) = x^2N ph(x)*qh(x)
.           + x^N (t(x) - ph(x)*qh(x) - pl(x)*ql(x))
.           + pl(x)*ql(x)
---
Ex:
--
. ( x^2 (x+2) + (3x+4) ) * ( x^2 (5x+6) + (7x+8) )
.
.  (x+2)*(5x+6) =  5x^2 +  16x + 12
. (3x+4)*(7x+8) = 21x^2 +  52x + 32
. t(x) = (4x+6)*(12x+14) = 48x^2 + 128x + 84
.
.   x^4 ( 5x^2 + 16x + 12)
. + x^2 (22x^2 + 60x + 40)
. +      21x^2 + 52x + 32
. ------------------------------------------------
. 5x^6 + 16x^5 + 34x^4 + 60x^3 + 61x^2 + 52x + 32
---
Scalar multiplication count
  m(0) = 1
  m(2N) = 3*m(N)
Solution: m(N) = N^lg3, since (2N)^lg3 = 3N^lg3

Bottom line: can do tens of thousands of digits
but not millions of digits

----- Basic computations on polynomials

A polynomial of degree N-1
  p0 + p1 x + p2 x^2 + p3 x^3 + ... + p(N-1) x^(N-1)
is defined by N numbers (the coefficients)
  p0 p1 p2 ... p(N-1)

The polynomial associates any N numbers
  x0 x1 x2 ... x(N-1)
with another set of N numbers
  y0 y1 y2 ... y(N-1)
simply by
  p(x0) = y0
  p(x1) = y1
  p(x2) = y2
  p(x3) = y3
  ...

EVALUATION      find the y's, given the p's and x's
ROOT EXTRACTION find the x's, given the p's and y's
INTERPOLATION   find the p's, given the x's and y's

----- Fast polynomial multiplication

EVALUATE-MULTIPLY-INTERPOLATE

Ex: (1+x)*(2+3x) = 2 + 5x + 3x^2

Pick three distinct numbers: 0, 1, 2 (say)

EVALUATE
--
.              x   0   1   2
. ---------------------------
.          1 + x   1   2   3
.         2 + 3x   2   5   8
---
MULTIPLY
--
.                  x   0    1    2
. ---------------------------------
. (1 + x)*(2 + 3x)     2   10   24
---
INTERPOLATE
--
.   (x-1)(x-2)      (x-0)(x-2)      (x-0)(x-1)
. 2 ---------- + 10 ---------- + 24 ----------
.   (0-1)(0-2)      (1-0)(1-2)      (2-0)(2-1)
.
. (x^2 - 3x + 2) - 10 (x^2 - 2x) + 12 (x^2 - x)
.
. 3x^2 + 5x + 2
---
Bad news
  * evaluate, interpolate seem to take N^2 steps
Good news
  * method works for ANY set of distinct points

----- Complex roots of unity

Ex: (1-i)(1-i) = 1 - 2i - 1 = -2i
    (1-i)^4 = -4
    (.707.. - .707..i)^8 = 1

For each N, there are N different complex
numbers that evaluate to 1 when raised
to the Nth power
--
. W_8^0  W_8^1  W_8^2  W_8^3  W_8^4  W_8^5  W_8^6  W_8^7
---
w: PRINCIPAL Nth root of unity
  W_N^k = e^((2 pi i k)/N)

Squaring the 8th roots gives the 4th roots:
--
. x:    W_8^0  W_8^1  W_8^2  W_8^3  W_8^4  W_8^5  W_8^6  W_8^7
.     = W_8^0  W_8^1  W_8^2  W_8^3 -W_8^0 -W_8^1 -W_8^2 -W_8^3
. x^2:  W_4^0  W_4^1  W_4^2  W_4^3  W_4^0  W_4^1  W_4^2  W_4^3
---

----- EVALUATE in NlogN steps

p(x)
--
. p0 + p1 x + p2 x^2 + p3 x^3 + p4 x^4 + p5 x^5 + p6 x^6 + p7 x^7
---
split into even, odd coefficients
--
. p0 + p2 x^2 + p4 x^4 + p6 x^6  +  x (p1 + p3 x^2 + p5 x^4 + p7 x^6)
.
. p(x) = pe(x^2) + x po(x^2)
---
To evaluate at eighth roots of unity
use values of even, odd polynomials
at fourth roots of unity
--
. p(W_8^0) = pe(W_4^0) + W_8^0 po(W_4^0)
. p(W_8^1) = pe(W_4^1) + W_8^1 po(W_4^1)
. p(W_8^2) = pe(W_4^2) + W_8^2 po(W_4^2)
. p(W_8^3) = pe(W_4^3) + W_8^3 po(W_4^3)
. p(W_8^4) = pe(W_4^0) + W_8^4 po(W_4^0)
. p(W_8^5) = pe(W_4^1) + W_8^5 po(W_4^1)
. p(W_8^6) = pe(W_4^2) + W_8^6 po(W_4^2)
. p(W_8^7) = pe(W_4^3) + W_8^7 po(W_4^3)
---
multiplication count
  m(0) = 1
  m(N) = 2 m(N/2) + N
  m(N) = N lg N

----- EVALUATE implementation

Use the perfect shuffle
--
. p0 p1 p2 p3 p4 p5 p6 p7
---
--
. p0 p2 p4 p6 p1 p3 p5 p7
---
To evaluate in place
  unshuffle
  evaluate halves in place recursively
  use values to compute result

----- Interpolating at roots of unity

For the Nth roots of unity, it turns out that
INTERPOLATE is truly the inverse of EVALUATE

r(x)
--
. r0 + r1 x + r2 x^2 + r3 x^3 + r4 x^4 + r5 x^5 + r6 x^6 + r7 x^7
---
s(x)
--
. s0 + s1 x + s2 x^2 + s3 x^3 + s4 x^4 + s5 x^5 + s6 x^6 + s7 x^7
---
--
. r(W_8^0) = s0        s(W_8^-0) = r0
. r(W_8^1) = s1        s(W_8^-1) = r1
. r(W_8^2) = s2        s(W_8^-2) = r2
. r(W_8^3) = s3        s(W_8^-3) = r3
. r(W_8^4) = s4        s(W_8^-4) = r4
. r(W_8^5) = s5        s(W_8^-5) = r5
. r(W_8^6) = s6        s(W_8^-6) = r6
. r(W_8^7) = s7        s(W_8^-7) = r7
---
Fundamental property (Fourier inversion theorem)

To INTERPOLATE at roots of unity,
use the EVALUATE routine (!)

----- Fast Fourier transform

To multiply two polynomials of degree N-1
  EVALUATE both at (2N-1) roots of unity
  MULTIPLY (2N-1) results term-by-term
  INTERPOLATE by evaluating at (2N-1)
    inverse roots of unity

Total time proportional to N lg N
Shuffling keeps space proportional to N

Bottom line: can multiply numbers
with millions of digits

"multiply" = "convolution"
  basic mathematical operation on series
  basic operation in digital signal processing
  allows "real time" processing

COS 226 Lecture 22: Cryptology

Cryptology: Cryptography + Cryptanalysis

Protocols and algorithms
  * for sending and receiving secret messages
  * for reading someone else's secret messages

Traditional applications: military
Modern applications: commercial

Never can "prove" security:
too much can go wrong

Role of efficient algorithms
  * practical to encode messages?
  * implement codebreaking attacks
  * (new meaning to "easy", "hard")

Parallel universe (primarily secret until 1980s)
  Old world: good guys and bad guys
    generated $$ for supercomputers
    machine code and machines
    codes and attacks
  New world: everyone (Alice, Bob, and Eve)
    generates venture capital $$
    C++ and Java
    protocols

Refs: Codebreakers, D. Kahn
      Applied Cryptography, B.
Schneier ----- Unbreakable cryptosystem ONE-TIME PAD PROTOCOL * Alice and Bob exchange keys by messenger (time passes) * Bob encrypts message with key (ciphertext) * Bob sends ciphertext to Alice * Alice decrypts ciphertext with key * Bob and Alice destroy key Eve can read ciphertext, but not message Ex: -- A T T A C K A T D A W N --- encrypt with random key -- A T T A C K A T D A W N message M A K G A U E B H M C V V M add key N S D H D E E C A M G W R ciphertext --- send this message -- N S D H D E E C A M G W R --- decrypt with same key (communicated previously) -- N S D H D E E C A M G W R ciphertext M A K G A U E B H M C V V M subtract key A T T A C K A T D A W N message --- PROBLEMS * need random key as long as message * need secure protocol to distribute key ----- Linear feedback shift registers Solution to key length problem Small key, very large number of random bits Register with bit values at time T+1 determined from values at time T Ex: 11-bit register -- T 9 8 7 ... 1 0 T+1 10 9 8 2 10+3 (XOR) --- -- 0 1 1 0 1 0 0 0 0 1 0 initial value 1 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 1 0 0 1 0 0 1 1 0 0 1 0 0 1 0 0 1 1 0 0 1 0 0 1 0 0 1 1 0 0 1 0 0 1 0 0 1 1 0 0 1 0 0 1 0 0 1 1 0 0 1 0 --- Bit 0 values comprise pseudo-random bitstream N-bit initial value gives sequence of length 2^N provided bit taps are "primitive" ----- Stream cipher example To transmit the message "SEND FOOD" ENCODE with S = 10011, E = 00101, N = 01110, etc. -- . S E N D F O O D . 100100010101110001000000000110011100111000100 --- ENCRYPT with key from LFBSR message -- . 100100010101100001000000000110011100111000100 --- key -- . 001001100100001101010100001111010100011100101 --- ciphertext -- . 101101110001101100010100001001001000100100001 --- TRANSMIT ciphertext -- . V * M Q H I D I A --- DECRYPT with keystream from same initial value ciphertext -- . 101101110001101100010100001001001000100100001 --- key -- . 
001001100100001101010100001111010100011100101
---
message
--
100100010101100001000000000110011100111000100
---
decode
--
S E N D   F O O D
---
-----
Digression: random number generators
Can't generate random numbers on a computer
  von Neumann: "anyone who considers arithmetical methods
  of producing random digits is, of course, in a state of sin"
Linear congruential generators
Additive congruential generators
True random bits hard to find
  clock time between keystrokes
  cosmic rays
Pseudo-random more convenient for keystream
Cheap source of random bits: some encrypted message?
-----
Simple attack on LFBSR
If cryptanalyst knows part of the message,
the entire keystream may be available
Ex: operator puts two spaces at the beginning
--
          S    E    N    D         F    O
000000000010010001010111000100000000011001110
---
encrypts with key from LFBSR
--
000000000010010001010111000100000000011001110    message
001001100100001101010100001111010100011100101    key
001001100110011100000011001011010100000101011    ciphertext
---
transmits this ciphertext
--
001001100110011100000011001011010100000101011
D Y S P F K J A K
---
First 10 bits of ciphertext are the key!
Any 11 bits of key (plus a machine) gives ALL the key bits
Cryptanalyst
  * knows from spies that machine is 11-bit LFBSR
  * guesses that first two chars might be spaces
  * tries both possibilities for 11th initial bit
  * generates key
  * reads message
-----
WW-II ENIGMA
Typewriter-like machine
Symmetric code
  type A, it prints G
  type G, it prints A
Particular code depends on machine settings
KEY DISTRIBUTION: "codebook"
Sender types message, machine prints ciphertext
Ciphertext transmitted in the clear
Receiver types ciphertext, machine prints message
Typewriter-like technology + rotors and plugboards
Over 137629917937512000 (10^17) different settings
Thousands of machines, millions of messages
Good guys broke the code (and won the war) by
  * getting some machines
  * getting some rotors
  * getting a codebook
  * figuring out what some messages were
  * building a computer to try remaining possibilities
One of the good guys: Alan Turing (!)
Ref: "Alan Turing: The Enigma", A. Hodges
-----
Modern applications
Widespread application of computing creates opportunities
New research in cryptography provides necessary technology
  * ecommerce
  * communication among citizens
  * better military systems
Challenges (social and political issues):
  key distribution
  efficient encode/decode
-----
One-way trapdoor functions
Concept that opened the door to modern cryptography
  (Diffie-Hellman, 1975)
One-way functions
  easy to compute
  difficult to compute inverse
  Ex: write message on mirror, smash mirror
One-way trapdoor functions
  easy to compute
  difficult to compute inverse
  easy to compute inverse with key
  Ex: put message in locked mailbox
      person with key can get message
Key to cryptography is to find
  * good one-way functions
  * trapdoors
FACTORING
  p*q = N: easy
  find p given N: difficult
  find p given q, N: easy
DISCRETE LOG
  x^t mod p = M: easy
  find t given x, p, M: difficult
-----
Diffie-Hellman key exchange
Publicly available keys: large integers N and g (100 digits)
Alice chooses random x and computes
X = g^x mod N
Bob chooses random y and computes
Y = g^y mod N
Alice and Bob exchange X and Y
  (but keep x and y secret)
Alice computes Y^x = g^(xy) mod N = K
Bob computes X^y = g^(yx) mod N = K
Computations all "easy"
  hundreds of 100-digit multiplications
Both Alice and Bob have K
Eve needs to solve discrete logarithm
on 100-digit numbers to get K
-----
Public-key cryptosystem
Every registered user is issued
  P: public (encryption) key
  S: secret (decryption) key
Public keys are published (phone book)
User responsible for keeping private key secret
PROTOCOL
  * Alice computes C = P(M) using Bob's public key
  * Alice transmits ciphertext to Bob
  * Bob computes M = S(C) using his private key
Eve can read ciphertext, but not message
  (without Bob's private key)
Works if all (S, P) pairs satisfy
  1. S(P(M)) = M for every message M
  2. All (S, P) pairs are distinct
  3. Deriving S from P is as hard as reading M
  4. Both S and P are easy to compute
1. and 2. easily arranged
3. and 4. hard to design
-----
RSA encryption/decryption
Every registered user gets
  encryption key P: two integers N and p
  decryption key S: N and a third integer s
Assume all numbers 100s of digits
Encode the message as a number
  break into pieces smaller than N (less than lg N bits)
Alice computes C = P(M) using Bob's public key
  by computing M^p mod N
Bob computes M = S(C) using his private key
  by computing C^s mod N
N, p, and s chosen to make the system work as follows:
  * generate three large primes s, x, y
  * take N = x*y
  * choose p so that ps mod (x-1)(y-1) = 1
-----
RSA example
Suppose x = 47, y = 79, s = 97
Then compute N = 3713 and p = 37
--
A T T A C K   A T   D A W N
---
ENCODE: A = 01, T = 20, C = 03, K = 11, etc.
--
A T T A C K   A T   D A W N
0120200103110001200004012314
---
ENCRYPT using public key 37
  0120^37 = 1404, 2001^37 = 2932, ... (mod 3713)
--
0120 2001 0311 0001 2000 0401 2314
1404 2932 3536 0001 3284 2280 2235
---
TRANSMIT ciphertext
--
1404293235360001328422802235
N D * * * * A * * V * V *
---
DECRYPT using secret key 97
  1404^97 = 0120, 2932^97 = 2001, ... (mod 3713)
--
1404 2932 3536 0001 3284 2280 2235
0120 2001 0311 0001 2000 0401 2314
---
DECODE
--
0120200103110001200004012314
A T T A C K   A T   D A W N
---
-----
RSA algorithms
ENCRYPT, DECRYPT
  use successive squaring (see Lecture 20)
  THM: Less than 2 lg N multiplications are required
       to compute x^N mod M
FIND RANDOM 100-DIGIT PRIME
  hard way: generate random 100-digit number, try to factor it
  fast way: probabilistic method
FIND p GIVEN s, x, AND y
  extended Euclidean algorithm
RSA applicability rests on fast multiplication (!)
  present: use hybrid method (ex: PGP)
  future: unnoticed special-purpose chips
Bottom line: practical algorithm, widely applicable
  depends on fundamental research results
  in theoretical CS and number theory
  (RSA Inc. went public for $250 million)
-----
COS 226 Lecture 23: Dynamic Programming
DYNAMIC PROGRAMMING
  * old term from operations research
  * recursion revisited
Modern definition of dynamic programming:
  "Bottom-up implementation of recursive programs
   with overlapping subproblems"
Top-down implementation sometimes easier
Typically improves running time of an algorithm
from exponential time to polynomial time (!)
Many applications
-----
Fibonacci numbers
F(i) = F(i-1) + F(i-2), with F(0) = 0 and F(1) = 1
--
 i  F(i)     i  F(i)     i    F(i)
 0     0    10    55    20    6765
 1     1    11    89    21   10946
 2     1    12   144    22   17711
 3     2    13   233    23   28657
 4     3    14   377    24   46368
 5     5    15   610    25   75025
 6     8    16   987    26  121393
 7    13    17  1597    27  196418
 8    21    18  2584    28  317811
 9    34    19  4181    29  514229
---
Numbers grow exponentially
  F(i)/F(i-1) approaches 1.618... (golden ratio)
  F(45) more than one billion
Recursive program to compute F(i)
--
int F(int i)
  {
    if (i < 1) return 0;
    if (i == 1) return 1;
    return F(i-1) + F(i-2);
  }
---
Problem: program is slow!
Easy proof: running time satisfies the same recurrence as F itself,
  T(i) = T(i-1) + T(i-2), so it grows like F(i)
-----
Overlapping subproblems
Fibonacci computation is slow because subproblems
in recursive computation overlap
To compute F(5)
  compute F(4)
    compute F(3)
      compute F(2)
        compute F(1) = 1
        compute F(0) = 0
        return 1
      compute F(1) = 1
      return 2
    compute F(2)
      compute F(1) = 1
      compute F(0) = 0
      return 1
    return 3
  compute F(3)
    compute F(2)
      compute F(1) = 1
      compute F(0) = 0
      return 1
    compute F(1) = 1
    return 2
  return 5
Fix by remembering solutions already computed
-----
Avoid recomputing known results
Maintain array (indexed by parameter value)
  * zero if recursive routine not yet called for that value
  * result to be returned otherwise
First call on procedure for a given value:
  compute as before, save value
Second and subsequent calls:
  use value from first call
Ex: Fibonacci numbers less than 10^9
--
int known[46];   /* file-scope, so initialized to zero */
int F(int i)
  {
    int t;
    if (known[i]) return known[i];
    if (i == 0) t = 0;
    if (i == 1) t = 1;
    if (i > 1)  t = F(i-1) + F(i-2);
    known[i] = t;
    return t;
  }
---
LINEAR running time
LOGARITHMIC amount of extra space
  (45 array entries suffice for values less than 10^9)
-----
Recursive call structures
Straight recursive algorithm (traced above)
vs. remembering known results:
To compute F(5)
  compute F(4)
    compute F(3)
      compute F(2)
        compute F(1) = 1
        compute F(0) = 0
        return 1
      use known F(1) = 1
      return 2
    use known F(2) = 1
    return 3
  use known F(3) = 2
  return 5
-----
Bottom-up approach
DYNAMIC PROGRAMMING
  * Tabulate answers to subproblems
  * Build table in increasing order of problem size
  * Use early entries to compute later ones
    (no recomputation needed)
Ex: Fibonacci numbers
--
F[0] = 0; F[1] = 1;
for (j = 2; j <= i; j++)
  F[j] = F[j-1] + F[j-2];
---
"Dynamic programming solution to the problem
of computing Fibonacci numbers"
Perhaps the simplest nontrivial example of dynamic programming
Reduces running time from exponential to linear
Similar savings are available for a wide variety
of important practical problems
Applicable to *any* pure recursive algorithm with
  * no effects on global variables
  * enough space to save solutions to subproblems
No overlap: bottom-up
divide-and-conquer
-----
Knapsack problem
Given N types of food items, where each item
  * takes a certain amount of space
  * provides a certain amount of nutrients
KNAPSACK PROBLEM
  Fill a knapsack of capacity M with items
  such that total amount of nutrients is maximized
Ex:
--
            size  value
apple         3     4
banana        4     5
chocolate     7    10
donut         8    11
egg           9    13
---
In a knapsack of capacity 17
  two apples + banana + chocolate gives 23
  donut + egg gives 24 (maximum)
Applications
  * shipping cargos
  * job scheduling
  ...
-----
Recursive solution of knapsack problem
For each item type, compute the value obtained
if the last item added is of that type
Pick the maximum
  (also could return ID of last item added)
--
int knap(int capacity)
  {
    int i, j, max, t;
    for (i = 0, max = 0; i < N; i++)
      if ((j = capacity - size[i]) >= 0)
        if ((t = knap(j) + val[i]) > max)
          max = t;
    return max;
  }
---
DON'T USE THIS PROGRAM!
  overlapping subproblems
  excessive recomputation
  exponential time
Use dynamic programming (TD or BU) to avoid
recomputation for overlapping subproblems
-----
"Known answers" (TD) approach
To avoid recomputation in knapsack algorithm
remember, for every knapsack capacity
  * total cost for best way to fill knapsack
  * last item used in filling it
--
int knap(int C)
  {
    int i, j, max, maxi, t;
    if (costKnown[C]) return costKnown[C];
    for (i = 0, max = 0, maxi = 0; i < N; i++)
      if ((j = C - size[i]) >= 0)
        if ((t = knap(j) + val[i]) > max)
          { max = t; maxi = i; }
    costKnown[C] = max;
    bestKnown[C] = maxi;
    return max;
  }
---
Use bestKnown to recover set of items
Maintain elegance of recursive formulation
without sacrificing efficiency
-----
Bottom-up approach to knapsack problem
Generalized version of Fibonacci numbers program
Use previous solutions to compute next one
--
for (i = 0; i <= C; i++) cost[i] = 0;
for (i = 0; i <= C; i++)
  for (j = 1; j <= N; j++)
    if (i >= size[j])
      if (cost[i] < cost[i-size[j]] + val[j])
        { cost[i] = cost[i-size[j]] + val[j]; best[i] = j; }
---
--
i        3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
cost[i]  4  5  5  8 10 11 13 14 15 17 18 20 21 23 24
best[i]  a  b  b  a  c  d  e  a  a  a  a  c  a  c  a
---
Total time proportional to N*C
Another approach: N passes through the cost array
  [just interchange "for" loops]
-----
Dynamic programming summary
Many problem solutions are elegantly described
as recursive programs
If subproblems overlap, programs may require exponential time
TOP-DOWN (known answers) approach
  Mechanically modify recursive routine to
  * test if called yet for the given parameter value
  * if so, return previously computed value
  * if not, compute value and save it
BOTTOM-UP (standard DP) approach
  * Tabulate answers to subproblems
  * Build table in order of problem size
  * Use early entries to compute later ones
Further improvements often possible
  * do computations in particular order
  * use specialized knowledge to eliminate more computations
-----
Dynamic programming with two variables
OPTIMAL BST problem
  Given access frequencies, minimize average search cost
  (weighted path length) in BST
Ex:
  weighted path length 1*1 + 2*(4+2) + 3*(2+3+1) + 4*5 = 51
  weighted path length 1*3 + 2*(4+5) + 3*(2+2) + 4*(1+1) = 41
-----
Top-down solution to optimal BST problem
Recursive program with two arguments
  best subtree for ith through jth keys
--
int BST(int i, int j)
  {
    int k, min, t;
    if (i > j) return 0;
    for (k = i, min = INT_MAX; k <= j; k++)
      {
        t = BST(i, k-1) + BST(k+1, j);
        if (t < min) min = t;
      }
    for (k = i; k <= j; k++) min += f[k];
    return min;
  }
---
DON'T USE THIS PROGRAM
  overlapping subproblems, exponential time
Known values (top-down) approach
  * precisely the same as for one variable
  * eliminate recomputation for overlapping subproblems
    by saving and recalling known values in 2D array
-----
Bottom-up solution to optimal BST problem
Two-dimensional array
  * diagonal has subproblems of size 1
  * one up from diagonal has subproblems of size 2
  * two up from diagonal has subproblems of size 3
  ...
--
for (i = 1; i <= N; i++)
  for (j = i+1; j <= N+1; j++)
    cost[i][j] = MAX;
for (i = 1; i <= N; i++) cost[i][i] = f[i];
for (i = 1; i <= N+1; i++) cost[i][i-1] = 0;
for (j = 1; j <= N-1; j++)
  for (i = 1; i <= N-j; i++)
    {
      for (k = i; k <= i+j; k++)
        {
          t = cost[i][k-1] + cost[k+1][i+j];
          if (t < cost[i][i+j])
            { cost[i][i+j] = t; best[i][i+j] = k; }
        }
      for (k = i; k <= i+j; k++)
        cost[i][i+j] += f[k];
    }
---
-----
Optimal BST example
"cost" array
--
4  8 11 19 31 37 41
   2  4 10 20 25 28
      1  5 14 18 21
         3 11 15 18
            5  9 12
               2  4
                  1
---
"best" array
--
A A B D D D
  B D D E E
    D E E E
      E E E
        E E
          F
---
-----
Optimal recursive decomposition problems
Trees model many other computation problems
  "optimal divide-and-conquer"
MATRIX CHAIN PRODUCT problem
  Given a list of matrices, find optimal parenthesization
  for pairwise multiplication
TRIANGULATION
  Divide a convex polygon into triangles
  using minimal lines of total length
Correspondence between trees and
  * parenthesizations
  * triangulated polygons
  ...
-----
String processing DP example
LONGEST COMMON SUBSEQUENCE problem
  Find the longest common subsequence in 2 strings
Applications: molecular biology, text editors, ...
subsequence: subset of string chars, in order
  2^N different subsequences in string of length N
Ex: LCS of ABCBDAB and BDCABA is BCBA
Recursive program is a description
of a recursively defined solution
--
int LCSlen(int i, int j)
  {
    int t, u;
    if ((i == -1) || (j == -1)) return 0;
    if (a[i] == b[j]) return 1 + LCSlen(i-1, j-1);
    t = LCSlen(i, j-1);
    u = LCSlen(i-1, j);
    if (t > u) return t; else return u;
  }
---
As usual, DON'T USE THIS PROGRAM
It takes exponential time as is, but we can
  * change to use known values, OR
  * compute values bottom-up
-----
"Far too many" subproblems
Dynamic programming cannot be used unless space
is available to hold answers to subproblems
Three important cases illustrated by
different versions of knapsack problem
EXAMPLE 1: Sizes are reals, not integers
--
            size    value
apple      30000    .0004
banana     .0001    12345
chocolate  .0002    21244
donut      .1234    .2125
egg        .0012      123
---
Table size depends on precision, relative sizes
  typically much more precision than space
EXAMPLE 2: Allow at most one item of each type in knapsack
  different subproblem for each SUBSET of item types
  2^N subsets: EXPONENTIAL space
EXAMPLE 3: Bananas and eggs can't go on the bottom, etc.
  different subproblem for each PERMUTATION of item types
  N! permutations: superEXPONENTIAL space
  ( N! grows like 2^(N lg N) )
-----
NP-complete problems
For a large class of important problems,
no algorithms are known that are guaranteed to be fast
EFFICIENT, TRACTABLE
  running time bounded by polynomial on input size,
  no matter what the input
INEFFICIENT, INTRACTABLE
  running time exponential in input size on some input
Dynamic programming no help because exponential SPACE
required for DP solution
Can't guarantee solution to intractable problems
even with a supercomputer!!
  * tough to explain to the boss...
  * source of frustration for early programmers
  * theory provides some consolation
Includes a very large class of natural problems
  (all equivalent wrt difficulty)
Fundamental issue in the theory of computation
-----
Examples of NP-complete problems
* TRAVELING SALESMAN
  A salesman needs to visit a set of N cities.
  Find a route that minimizes travel distance.
* SCHEDULING
  A set of jobs of varying length need to be done
  on two identical machines before a certain deadline.
  Can the jobs be arranged so that the deadline is met?
* SEQUENCING
  A set of four-character fragments have been obtained
  by breaking up a long string into overlapping pieces.
  Can the fragments be reconstituted into the long string?
* TRIPARTITE MATCHING
  Given three sets of N individuals, and a set of triples
  with one from each set, is there a subset of the triples
  containing each individual just once?
* SATISFIABILITY
  Is there a way to assign truth values to a given
  logical formula that makes it true?