ALGORITHM DESIGN
Greedy algorithms. A greedy algorithm solves an
optimization problem by making locally optimal choices at each step.
Greedy algorithms typically do not compute globally optimal
solutions. In some cases they do (but typically require a non-trivial argument to show why), and in others they can still produce good (but not optimal) results.
Familiar examples (all of which do compute globally optimal solutions):
- Kruskal's minimum spanning tree algorithm (repeatedly add lowest-weight edge that does not create a cycle)
- Prim's minimum spanning tree algorithm (repeatedly add lowest-weight edge that connects a new vertex to the spanning tree)
- Huffman coding (repeatedly combine two smallest-frequency tries)
- Dijkstra's shortest path algorithm (repeatedly relax un-visited vertex with shortest distance to source)
- Ford-Fulkerson max flow algorithm (repeatedly add augmenting paths)
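As an illustration of the greedy pattern, here is a minimal sketch of Kruskal's algorithm. The function name `kruskal`, the `(weight, u, v)` edge format, and the small union-find helper are assumptions made for this example, not part of the notes above.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, mst_edges) for an undirected graph.

    edges is a list of (weight, u, v) tuples over vertices 0..num_vertices-1
    (a hypothetical input format chosen for this sketch).
    """
    parent = list(range(num_vertices))  # union-find forest

    def find(x):
        # Follow parent links to the component root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, mst = 0, []
    for weight, u, v in sorted(edges):  # greedy step: lowest weight first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # endpoints in different components: no cycle
            parent[root_u] = root_v
            total += weight
            mst.append((u, v, weight))
    return total, mst
```

For instance, on the four-vertex graph with edges `[(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 2, 3), (5, 1, 3)]`, the sketch keeps the edges of weight 1, 2, and 4 and skips the weight-3 edge because it would close a cycle.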
Network flow.
Many problems can be modeled as problems on edge-weighted graphs and digraphs:
- Shortest paths
- Maxflow
- Minimum spanning tree
Reducing a problem to one of these fundamental network
problems is often an effective strategy.
Familiar examples:
- Compute the mincut of a flow network by finding its maxflow
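- Bipartite matching (connect the source to every vertex on one side of the bipartition and every vertex on the other side to the sink, give every edge capacity 1, and compute a maximum flow; the edges between the two sides that carry non-zero flow form a maximum matching)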
- Find paths of least importance in an image via shortest paths (seam carving)
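The bipartite matching reduction can be sketched as follows. This is an illustrative version only: it uses the Edmonds-Karp variant of Ford-Fulkerson (shortest augmenting paths found by breadth-first search), and the names `max_flow` and `bipartite_matching`, the dictionary-based graph format, and the `"s"`/`"t"` node labels are all assumptions made for the example.

```python
from collections import defaultdict, deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow. capacity maps (u, v) -> capacity.

    Returns (value, flow), where flow maps each original edge to its flow.
    Assumes no pair of antiparallel edges (u, v) and (v, u) in capacity.
    """
    residual = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # reverse direction is needed in the residual graph
    value = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # no augmenting path left: the flow is maximum
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        value += bottleneck
    flow = {e: c - residual[e] for e, c in capacity.items()}
    return value, flow

def bipartite_matching(left, right, edges):
    """Return a maximum matching as a list of (left_vertex, right_vertex) pairs."""
    capacity = {}
    for u in left:
        capacity[("s", ("L", u))] = 1        # source -> left side
    for v in right:
        capacity[(("R", v), "t")] = 1        # right side -> sink
    for u, v in edges:
        capacity[(("L", u), ("R", v))] = 1   # left -> right
    _, flow = max_flow(capacity, "s", "t")
    return [(u, v) for u, v in edges if flow[(("L", u), ("R", v))] > 0]
```

Vertices are tagged with `"L"` and `"R"` so that a left vertex and a right vertex with the same label do not collide in the flow network.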
Divide-and-conquer. Divide-and-conquer algorithms solve a
problem by breaking it into subproblems, recursively solving
each subproblem, and combining the results.
Familiar examples:
- Mergesort (divide array in half, sort each half, and merge the sorted halves)
- Quicksort (select pivot, partition array into elements ≤ and ≥ the pivot, sort each in place)
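A minimal mergesort sketch shows the three divide-and-conquer steps (divide, recurse, combine) in code. The function name `merge_sort` and the choice to return a new list rather than sort in place are assumptions for this example.

```python
def merge_sort(items):
    """Return a new sorted list containing the elements of items."""
    if len(items) <= 1:                 # base case: already sorted
        return list(items)
    mid = len(items) // 2               # divide the array in half
    left = merge_sort(items[:mid])      # recursively sort each half
    right = merge_sort(items[mid:])
    merged, i, j = [], 0, 0             # combine: merge the sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:         # <= keeps equal elements in order (stable)
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])             # at most one of these is non-empty
    merged.extend(right[j:])
    return merged
```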
Dynamic programming. Dynamic programming is a design
strategy that is similar to divide-and-conquer. The defining
characteristic of dynamic programming is that the subproblems
overlap, and we store the solution to each subproblem to avoid the
cost of re-computing it.
Familiar examples:
- Shortest paths in directed acyclic graphs by relaxing
vertices in topological order
- Bellman-Ford
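The first example can be sketched as follows: the stored subproblem solutions are the distances `dist[v]`, and processing vertices in topological order guarantees that `dist[u]` is final before any edge leaving `u` is relaxed. The function name `dag_shortest_paths` and the adjacency format (each vertex mapped to a list of `(neighbor, weight)` pairs) are assumptions for this example, and the input is assumed to be acyclic.

```python
import math

def dag_shortest_paths(graph, source):
    """Shortest-path distances from source in a directed acyclic graph.

    graph maps each vertex to a list of (neighbor, weight) pairs.
    Edge weights may be negative, since a DAG has no cycles.
    """
    order, seen = [], set()

    def visit(u):
        # Depth-first search; a vertex is appended after all its descendants.
        seen.add(u)
        for v, _ in graph.get(u, []):
            if v not in seen:
                visit(v)
        order.append(u)

    for u in graph:
        if u not in seen:
            visit(u)
    order.reverse()  # reverse postorder is a topological order

    dist = {u: math.inf for u in order}
    dist[source] = 0
    for u in order:
        if dist[u] == math.inf:
            continue  # u is not reachable from source
        for v, w in graph.get(u, []):
            if dist[u] + w < dist[v]:   # relax edge u -> v
                dist[v] = dist[u] + w
    return dist
```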
Randomization.
A randomized algorithm is an algorithm whose running time (or output) depends on the results of random coin flips.
Randomized algorithms are typically evaluated on the basis of their expected running time
(the average of all possible running times, weighted by their probability).
Familiar examples:
- Quicksort (avoid worst-case performance in practice by shuffling input array)
- Quickselect (avoid worst-case performance in practice by shuffling input array)
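As a sketch of the second example, the following quickselect shuffles a copy of the input before partitioning, so no fixed input can reliably trigger the worst case. The function name `quickselect`, the zero-based rank `k`, and the use of Lomuto partitioning are choices made for this example only.

```python
import random

def quickselect(items, k):
    """Return the k-th smallest element of items (k = 0 is the minimum).

    Assumes 0 <= k < len(items).
    """
    a = list(items)
    random.shuffle(a)                    # randomization step
    lo, hi = 0, len(a) - 1
    while lo < hi:
        pivot = a[hi]                    # Lomuto partition around a[hi]
        i = lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]        # pivot lands at its final index i
        if k == i:
            return a[i]
        if k < i:
            hi = i - 1                   # answer is left of the pivot
        else:
            lo = i + 1                   # answer is right of the pivot
    return a[lo]
```

The output is the same on every run; only the running time depends on the coin flips.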
Recommended Problems
C level
- Let G be a directed graph. A coloring of G is a function mapping each vertex to a color, so that no two adjacent vertices are assigned the same color.
Greedy graph coloring is an algorithm that computes a coloring as follows. Call the available colors the palette, and suppose that there is a natural order on the colors (e.g., the smallest color is red, followed by blue, followed by green, and so on). Traverse the vertices of the graph (in any order), at each step assigning the vertex the lowest color in the palette that is not already assigned to one of its neighbors. Does greedy graph coloring produce a globally optimal result (i.e., using the fewest colors from the palette)? Why or why not?
Answers
Greedy coloring does not always produce globally optimal results. For example, consider a graph with four vertices v1, v2, v3, and v4, with
edges (v1,v3), (v2,v4), and (v3,v4). Coloring the vertices in ascending order: v1 is assigned red,
v2 is assigned red, v3 is assigned blue, and finally v4 is assigned green. However, two colors are sufficient (assign v1 and v4 red, v2 and v3 blue).
Determining whether there is a coloring of a graph that uses fewer
than a given number of colors is NP-complete. Greedy graph
coloring (with a few improvements) produces good results in
practice, and is the basis of the register allocation algorithms
used in many compilers.
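The counterexample can be checked with a short sketch. Colors are represented as integers (0 for red, 1 for blue, 2 for green); the function name `greedy_coloring` and the adjacency-dictionary format are assumptions for this example.

```python
def greedy_coloring(adjacency, order):
    """Color vertices in the given order with the lowest color unused by neighbors.

    adjacency maps each vertex to a list of its neighbors.
    Returns a dict mapping each vertex to a color number (0 is the lowest).
    """
    color = {}
    for v in order:
        used = {color[u] for u in adjacency[v] if u in color}
        c = 0
        while c in used:   # lowest color not taken by an already-colored neighbor
            c += 1
        color[v] = c
    return color

# The graph from the answer: edges (v1,v3), (v2,v4), (v3,v4).
adjacency = {1: [3], 2: [4], 3: [1, 4], 4: [2, 3]}
ascending = greedy_coloring(adjacency, [1, 2, 3, 4])   # uses three colors
better = greedy_coloring(adjacency, [1, 3, 4, 2])      # uses two colors
```

The second call shows that the result depends on the traversal order: visiting the vertices along the path v1, v3, v4, v2 yields an optimal two-coloring of the same graph.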
B level
- Design an algorithm that generates a number uniformly at random between 0 and n that is not divisible by 7 or 11, and analyze its running time.
Answers
Generate a random number between 0 and n uniformly at
random. If it is not divisible by 7 or 11, then return it; otherwise,
select a new random number.
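For large n, the probability of generating a number that is divisible by 7 or 11 is at most about 1/7 + 1/11 = 18/77, so each attempt succeeds with probability at least about 59/77. The expected number of attempts before finding a number that is not divisible by 7 or 11 is therefore at most 1/(1-18/77) = 77/59, or ~1.3, and the algorithm runs in expected constant time (assuming each random number is generated in constant time).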
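This retry loop (often called rejection sampling) can be sketched in a few lines. The function name `random_not_divisible` is hypothetical, and the sketch assumes that "between 0 and n" includes both endpoints and that n is at least 1, so that an acceptable number exists.

```python
import random

def random_not_divisible(n):
    """Return an integer in [0, n], uniform among those not divisible by 7 or 11.

    Assumes n >= 1; for n = 0 the only candidate is 0, which is divisible
    by both 7 and 11, so the loop would never terminate.
    """
    while True:
        x = random.randint(0, n)          # uniform over 0..n, endpoints included
        if x % 7 != 0 and x % 11 != 0:
            return x                      # accept
        # otherwise reject and draw again
```

Because every acceptable number is equally likely to be drawn on each attempt, the accepted number is uniform over the acceptable set.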
- Imagine that we have an n-by-n grid, such that some squares on the grid are filled with obstacles. A robot is located in the bottom left corner of the grid, and wishes to move to the top right corner by a sequence of moves (go up, go right, go diagonally up and to the right) while avoiding squares that contain obstacles. Design an algorithm to determine the number of ways a robot may accomplish this task.
Answers
Divide-and-conquer: let paths(i,j) be the number of ways to get to square (i,j) (taking the bottom left to be (0,0) and upper right to be (n-1,n-1)). Our goal is to compute paths(n-1,n-1). Compute paths(i,j) as follows:
- Base case: paths(0,0) = 1. For simplicity, define paths(-1,j) = paths(i,-1) = 0 for all i and j.
- Recursive case:
If square (i,j) contains an obstacle, paths(i,j) = 0. Otherwise,
paths(i,j) is the sum of paths(i-1,j), paths(i,j-1), and paths(i-1,j-1).
This algorithm runs in exponential time, but using dynamic programming it can be reduced to O(n^2) time.
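A bottom-up dynamic programming version of this recurrence might look like the sketch below. It assumes squares are indexed from (0,0) at the bottom left to (n-1,n-1) at the top right, and that the grid is given as a hypothetical n-by-n list of booleans `blocked`, where `blocked[i][j]` is true when square (i,j) contains an obstacle.

```python
def count_paths(blocked):
    """Count robot paths from (0, 0) to (n-1, n-1) avoiding obstacles.

    Allowed moves increase the first index, the second index, or both by one.
    """
    n = len(blocked)
    paths = [[0] * n for _ in range(n)]   # table of subproblem solutions
    for i in range(n):
        for j in range(n):
            if blocked[i][j]:
                continue                  # obstacle: zero ways to reach it
            if i == 0 and j == 0:
                paths[i][j] = 1           # base case: the starting square
                continue
            if i > 0:
                paths[i][j] += paths[i - 1][j]
            if j > 0:
                paths[i][j] += paths[i][j - 1]
            if i > 0 and j > 0:
                paths[i][j] += paths[i - 1][j - 1]
    return paths[n - 1][n - 1]
```

Each of the n^2 table entries is filled in once with constant work, and the explicit bounds checks play the role of the convention that out-of-range squares contribute zero paths.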
A level
- Suppose that G is a directed graph where each edge e is associated with a positive capacity c_e and each vertex v is associated with an integer demand d_v (note that d_v can be positive (indicating a demand), negative (indicating a supply), or zero). A circulation is a function f mapping edges to integers such that
  - For each edge e we have that 0 ≤ f(e) ≤ c_e
  - For each vertex v we have that the sum of the flows assigned to the incoming edges of v minus the sum of the flows assigned to the outgoing edges of v is exactly d_v
Design an algorithm to determine whether a graph has a circulation.
Answers
Reduce to maxflow. A circulation can exist only if the total supply equals the total demand (that is, the d_v sum to zero), so check this first. Then construct a flow network by adding a virtual source node with an edge to each vertex v with d_v < 0 that has capacity -d_v, and a virtual sink node with an edge from each vertex
u with d_u > 0 that has capacity d_u. Compute the max flow of the resulting network. If it is equal to the sum of d_u over u with d_u > 0, then the original graph has a circulation; otherwise it does not.
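The reduction can be sketched as follows, under several assumptions made only for this example: the names `max_flow_value` and `has_circulation`, the dictionary-based input format, the use of the Edmonds-Karp variant of Ford-Fulkerson for the max flow step, and the sign convention that a vertex with positive demand absorbs flow (inflow minus outflow equals its demand). The sketch also checks that the demands sum to zero before running max flow.

```python
from collections import defaultdict, deque

def max_flow_value(capacity, source, sink):
    """Value of a maximum flow (Edmonds-Karp). capacity maps (u, v) -> capacity."""
    residual = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # reverse direction is needed in the residual graph
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:   # BFS for an augmenting path
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value                      # no augmenting path remains
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        value += bottleneck

def has_circulation(capacity, demand):
    """capacity maps (u, v) -> c_e; demand maps v -> d_v (negative means supply)."""
    if sum(demand.values()) != 0:
        return False                          # supply and demand must balance
    source, sink = object(), object()         # fresh virtual nodes
    network = dict(capacity)
    for v, d in demand.items():
        if d < 0:
            network[(source, v)] = -d         # source feeds each supply vertex
        elif d > 0:
            network[(v, sink)] = d            # each demand vertex drains to sink
    total_demand = sum(d for d in demand.values() if d > 0)
    return max_flow_value(network, source, sink) == total_demand
```

For example, with edges a→b (capacity 3), b→c (capacity 2), and a→c (capacity 2), demands of -3 at a and 3 at c admit a circulation, while demands of -5 and 5 do not, because at most 4 units can enter c.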