Implement the classic Dijkstra's shortest path algorithm and optimize it for maps. Such algorithms are widely used in geographic information systems (GIS) including MapQuest and GPS-based car navigation systems.
Maps. For this assignment we will be working with maps, or graphs whose vertices are points in the plane and are connected by edges whose weights are Euclidean distances. Think of the vertices as cities and the edges as roads connected to them. To represent a map in a file, we list the number of vertices and edges, then list the vertices (index followed by its x and y coordinates), then list the edges (pairs of vertices), and finally the source and sink vertices. For example, input6.txt represents the map below:
Dijkstra's algorithm. Dijkstra's algorithm is a classic solution to the shortest path problem. It is described in Sedgewick, Chapter 21. The basic idea is not difficult to understand. We maintain, for every vertex in the graph, the length of the shortest known path from the source to that vertex, and we maintain these lengths in a priority queue. Initially, we put all the vertices on the queue with an artificially high priority and then assign priority 0.0 to the source. The algorithm proceeds by taking the lowest-priority vertex off the PQ, then checking all the vertices that can be reached from that node by one edge to see whether that edge gives a shorter path to the node from the source than the shortest previously-known path. If so, it lowers the priority to reflect that fact.
Here is a step-by-step description that shows how Dijkstra's
algorithm finds the shortest path 0-1-2-5 from 0 to 5 in the example above.
This process gives the length of the shortest path. To keep track of the
path, we also maintain for each vertex, its predecessor on the shortest
path from the source to that vertex.
The files
graph.h,
graph.c,
point.c,
point.h,
pqindex.c,
pqindex.h,
paths.c
and
plotit.c
provide a bare bones implementation of Dijkstra's algorithm for maps
which you should use as a starting point.
The client paths.c solves many shortest path problems;
it uses the function GRAPHshowsp() to print the shortest
paths.
The client plotit.c solves a single shortest path problem;
it uses the function GRAPHplot() to plot the shortest path
in red and the vertices examined by the algorithm in blue. It plots
using turtle graphics - use
turtle.c to translate from turtle graphics to PostScript.
process 0 (0.0)
lower 3 to 3841.9
lower 1 to 1897.4
process 1 (1897.4)
lower 4 to 3776.2
lower 2 to 2537.7
process 2 (2537.7)
lower 5 to 6274.0
process 4 (3776.2)
process 3 (3841.9)
process 5 (6274.0)
Your goal. Optimize Dijkstra's algorithm so that it can process thousands of shortest path queries for a given map. Once you read in (and optionally preprocess) the map, your program should solve shortest path problems in sublinear time. One method would be to precompute the shortest path for all pairs of vertices; however you cannot afford the quadratic space required to store the results. So, your goal is to reduce the amount of work involved per shortest path computation. We suggest a number of potential ideas below which you may choose to implement. Or you can develop and implement your own ideas.
Idea 1. The naive implementation of Dijkstra's algorithm examines all V vertices in the graph. An obvious strategy to reduce the number of vertices examined is to stop the search as soon as you discover the shortest path to the destination. With this approach, you can make the running time per shortest path query proportional to E' log V' where E' and V' are the number of edges and vertices examined by Dijkstra's algorithm.
Idea 2. You can cut down on the search time further by exploiting the Euclidean geometry of the problem, as described in Sedgewick 21.5. For general graphs, Dijkstra's algorithm maintains a priority for each vertex v on the PQ that is the length of some path from s to v. This satisfies the key property that any path from s to d which goes through v must have length greater than or equal to the priority of v. For maps, we can use the length of some path from s to v plus the Euclidean distance from v to d. The key property continues to hold if we examine the vertices according to these modified priorities.
Idea 3. Change the search to use a symmetric strategy: keep two priority queues, one for the source and one for the destination, and alternate picking the best point off each queue until they "meet in the middle." Determining exactly what to do at this point (and what is meant by when they meet the middle) to guarantee that you have a shortest path is the most difficult part of this task.
Idea 4. Use a faster priority queue. Consider a multiway heap as in Sedgewick Program 20.7.
Idea 5. Preprocess the graph by dividing it into a W-by-W grid and precompute the distance between each pair of grid cells i and j and use these distances to guide the search as described in Sedgewick exercise 21.77. This idea is probably most useful when the source and destination are far apart.
Idea 6. Many of the vertices have exactly two neighbors. If a vertex v has exactly two neighbors u and w, then you can replace the two edges u-v and v-w with a single edge u-w whose length is the sum of the lengths of the two original edges. This reduces the number of vertices and edges in the graph. The main challenge here is to print out the shortest path in the original graph after you've compute the shortest path in the reduced graph.
Testing. The file usa.txt contains 87,575 intersections and 121,961 roads in the continental United States. The graph is very sparse - the average degree is 2.8. Your main goal should be to answer shortest path queries quickly for pairs of vertices on this network. Your algorithm will likely perform differently depending on whether the two vertices are nearby or far apart. We provide input files that test both cases. You may assume that all of the x and y coordinates are integers between 0 and 10,000.
Deliverables. Improve the bare bones implementation by using some of the ideas described above together with your ingenuity. Your code should work with any of our three client programs. Submit all of the files needed to compile your program along with the client program distances.c. Also, submit a readme.txt file. Describe and justify your approach. Also, compare your approach to the bare bones implementation. In particular give the average number of vertices examined for the two types of US road networks.