COS 226 Programming Assignment Checklist: Pattern Recognition

Frequently Asked Questions

Can the same point appear more than once in the input file? You may assume the data has been preprocessed so that this does not happen.

Can I draw a line segment containing 4 (or more) points by drawing several subsegments? No, you should draw one (and only one) line segment for each set of collinear points discovered: If there is a line segment pqrs, you should draw it with either p.drawTo(s) or s.drawTo(p).

How do I sort a subarray? Arrays.sort(a, lo, hi) sorts the subarray from a[lo] to a[hi-1] according to the natural order of a[]. Use a Comparator as the fourth argument to sort according to an alternate order.

Where can I see examples of Comparable and Comparator? See the lecture slides and textbook. We assume this is new Java material for most of you, so don't hesitate to ask if you are unsure of how to use it.

My program works correctly except on (some) vertical line segments. What could be going wrong? Are you dividing by zero? With integers, this produces a runtime exception. With floating-point numbers, 1.0/0.0 is positive infinity and -1.0/0.0 is negative infinity.

I'm having trouble avoiding subsegments Fast.java when there are 5 or more points on a line segment. Any advice? Not handling the 5-or-more case will lead to only a minor deduction, so don't kill yourself over it.

Can I sort by slope instead of angle? Yes, the atan() function is monotonic, so sorting by either y/x or atan(y/x) leads to the same result. Just be careful when x is zero.

Does (Math.atan(y/x) == Math.atan((c*y) / (c*x))) if x, y, and c are integer-valued doubles between 0 and 32,767 with c not equal to zero? Yes. In IEEE floating point, integer are represented exactly (assuming no overflow). Thus c*x and c*y are represented exactly. The result of an IEEE floating-point operation is the nearest representable value. Thus, y/x and (c*x)/(c*y) will yield exactly the same value. Therefore, the two expression will be exactly equal. It's sometimes ok to compare floating-point numbers for exact equality (but be sure that you know what you're doing!)

How do I access the mathematical constant π? It's Math.PI, but don't use it or you'll likely encounter floating-point roundoff issues.

Why are most of the line segments in the input files horizontal and vertical? We generate most of the data sets at random. Since they have integer coordinates in a small range, there are more opportunities to form horizontal and vertical lines.

I created a nested Comparator class within Point. Within the nested Comparator class, the keyword this refers to the Comparator object. How do I refer to the Point instance of the outer class? Use Point.this instead of this. Note that you can directly refer to instance variables and methods of the outer class; with proper design, you shouldn't need this awkward notation.

Testing

Input.   The directory collinear contains some sample input files. Thanks to Jesse Levinson '05 for the remarkable input file rs1423.txt.

Reference solutions.   Some of the input files have associated .png files that contain the desired output. You can use these for checking your work.

Possible Progress Steps

These are purely suggestions for how you might make progress. You do not have to follow these steps.

  1. Getting started. Download the directory collinear to your system. It contains a number of sample input files and some helper programs.

    The files Point.java and PointPlotter.java comprise a program that reads in a list of points and plots the results. To plot the points, type the following at the command line

    % javac PointPlotter.java
    % java PointPlotter < input56.txt
    
    This opens a graphical window where you will see the points plotted.

  2. Point data type. Before writing any code, think carefully about the point data type operations that you will need to support the brute-force algorithm. A minimalistic approach might involve creating a function
    public static double angle(Point a, Point b)
    
    that returns the tangent of the angle that b makes with a. You can use the Java function Math.atan() for this: it returns an angle between -pi/2 and pi/2. Then, use this helper function to create a function
    public static boolean areCollinear(Point p, Point q, Point r)
    
    that returns true if the three points are collinear, false otherwise. Also, implement a 4-argument version.
    public static boolean areCollinear(Point p, Point q, Point r, Point s)
    

    Be sure to thoroughly test your code before continuing.

  3. Brute force. Now, iterate through all 4-tuples and check if the 4 points are collinear. To draw the line segment, you need to know the endpoints. One approach is only to print out a line segment if the 4 points are in ascending order (say relative to y-coordinate, breaking ties by x-coordinate), in which case, the endpoints are the first and last points.

    Hint: don't waste time micro-optimizing the brute-force solution.

  4. Sorting solution. Before writing any code, think carefully about the point data type operations that you will need to support. The complicating issue is that the function needed to compare the angles of two points q1 and q2 depends on a third point p, which changes from sort to sort. One solution is to include a public and final (but not static) instance variable BY_POLAR_ORDER in Point of type Comparator<Point>. This Comparator has a compare() method so that compare(q1, q2) compares the angle that q1 and q2 make with the invoking object p.

Enrichment

Can the problem be solved in quadratic time and linear space? Yes, but the only algorithm I know of is quite sophisticated. It involves converting the points to their dual line segments and topologically sweeping the arrangement of lines.

Can the decision version of the problem be solved in subquadratic time? The original version of the problem cannot be solved in subquadratic time because there might be a quadratic number of line segments to output. (See next question.) The decision version asks whether there exists a set of 4 collinear points. This version of the problem is a longstanding open research question that belongs to a group of problems that are known as 3SUM-hard. It is conjectured that such problems have no subquadratic algorithms. Thus, the sorting algorithm presented above is about the best we can hope for (unless the conjecture is wrong). Under a restricted decision tree model of computation, Erickson proved that the conjecture is true.

What's the maximum number of (maximal) collinear sets of points in a set of N points in the plane? It can grow quadratically as a function of N. Consider the N points of the form: (x, y) for x = 0, 1, 2, and 3 and y = 0, 1, 2, ..., N / 4.


wayne@cs.princeton.edu