COS 226 Programming Assignment Checklist: Point Location

Frequently Asked Questions

What should I do if many lines separate a query pair? Output any one such line.

Where can I get a PostScript viewer for Windows? Go to the Ghostscript home page. Download and install the latest version of AFPL Ghostscript and GSView.

Do I need to use the search tree approach described in the assignment? No. However, if you use another method, be sure to provide a good justification in your readme.txt.

If one or two of the points lie on an input line, does that line separates the two points? Our reference solution does not count this is as separation, although I suppose there could be applications where you would want to count it as a separation. So you can use your judgment on such cases, but you should try to be consistent.

OK, but in input5-99998yes.txt, the reference solution claims that the points (0.725, 0.644) and (0.461, 0.816) are separated by the line between (1.000, 0.200) and (0.300, 1.000), which appears inconsistent with the previous answer. Yes, you're right. The reference solution gives an inconsistent answer because of floating point precision. The line is given exactly by y = -8/7 x + 47/35 so (461/1000, 816/1000) falls exactly on the line. However, if you calculate -(8/7)*0.461 + 47/35 using floating point precision (in Maple) you get 0.8160000001 which would lead the computer to believe that the line does separate the two points.

What is meant by the external path length? See Sedgewick Definition 5.6.

Input, Output, and Testing

Input and output. Here are a number of sample input files. We also encourage you to create your own (possibly pathological) inputs to help test your program. We'll consider awarding extra credit if you submit inputs that cause your program (and other students' programs) problems. The input should be very small, and it should expose a potential flaw that other programs are likely to have. We'll post these input files for other students to use.

Reference solutions. For reference, we provide executable code for our solution in Windows, Solaris, and OS X. Because of degenerate cases (and varying ways to deal with them), we cannot guarantee that our solutions are always accurate. If you discover any potential bugs, please let us know.

Timing. Unix users may find the shell script timeit.csh useful for computing a table of CPU times.

Submission and readme

We will compile your program with "gcc *.c" so you are free to organize your program as you see fit. As usual, we encourage and reward sound design principles.

Here is a template readme.txt file. It should contain the following information:

A high-level description for the design of your algorithm, and what influenced your decisions.

A table of the number of external nodes, and average path length of these nodes for various size input files. Discuss the growth of these numbers.

A (brief) discussion of any vulnerability which your program has as far as degenerate cases or numerical inaccuracy.

Getting started

Download the directory location. This directory is mirrored on arizona at /u/cs226/pub/location. It contains reference solutions and sample input files.

Create a data type for points and lines. Read in the input lines and point pairs and print them out to make sure everything is working. If you like, use the program psout.c to create a PostScript diagram of the input lines and points.

As a warmup, write a brute force routine for answering queries (i.e. one which compares the query points individually to each input line). Although this will not involve preprocessing the input lines, it will force you to implement some of the necessary geometric primitives.

As a warmup, print out all of the intersection points among the N line segments. Once you have this working, you should be ready to build the tree.

Enrichment

What's the most regions that can be created with N intersecting lines in the plane? (N^2 + N + 2) / 2. This is the famous pancake cutting problem.

Can you get logarithmic query times in the worst case? Yes. This was first discovered at Princeton by Dobkin and Lipton. They designed an algorithm that achieves O(log N) query time with O(N^2) space and preprocessing time. They also showed that if binary decisions are used to locate the query points, then you can't do better than log N time per query.

Quadratic space seems excessive. Can you do better? Yes, there are several more complicated algorithms that achieve the logarithmic query time in only linear space. The first such method was discovered at Princeton by Lipton and Tarjan. Chazelle (another Princeton professor) created something called a "hive graph" that also solves the problem in this bound. A simpler algorithm using persistent search trees discovered by Sarnak and Tarjan is the focus of a COS 423 lecture. The preprocessing step uses O(N) space and O(N log N) time; queries take O(log N) time.

COS 226 Home Page
wayne@cs.princeton.edu

Last modified: April 1, 2003