Frequently Asked Questions (General)

What should I do if a point is inserted twice in the data structure? The data structure represents a symbol table, so you should replace the old value with the new value.

How do I return an Iterable<Point2D>? Add the Point2D objects you want to a Stack<Point2D> or Queue<Point2D> and return that. Of course, your client code should not depend on whether the iterable returned is a stack or queue (because it could be any iterable).

What should points() return if there are no points in the data structure? What should range() return if there are no points in the range? The API says to return an Iterable<Point2D>, so you should return an iterable with zero points.

What should nearest() return if there are two (or more) nearest points? The API does not specify, so you may return any nearest point (up to floating-point precision).

I run out of memory when running some of the large sample files. What should I do? Be sure to ask Java for additional memory, e.g., java-algs4 -Xmx1600m RangeSearchVisualizer input1M.txt.

Frequently Asked Questions (Point2D and RectHV)

Can I use the distanceTo() method in Point2D and RectHV? No, you may use only the subset of the methods listed in the assignment specification. You should be able to accomplish the same result (more efficiently) with distanceSquaredTo().

Can I use the X_ORDER() and Y_ORDER() comparators in Point2D? No, you may use only the subset of the methods listed in the assignment specification. You should be able to accomplish the same result by calling the methods x() and y().

Is a point on the boundary of a rectangle considered inside it? Do two rectangle intersect if they have just one point in common? Yes and yes. Here are the APIs for Point2D and RectHV.

How can I create a RectHV for the entire plane or a halfplane? You can use the values Double.POSITIVE_INFINITY or Double.NEGATIVE_INFINITY for one (or more) of the coordinates when you create a RectHV.

What does the notation [0.5, 0.75] × [0.25, 0.375] mean when specifying a rectangle? It is the Cartesian product of the x-interval [0.5, 0.75] and the y-interval [0.25, 0.375]: the rectangle that includes all points with both 0.5 ≤ x ≤ 0.75 and 0.25 ≤ y ≤ 0.375. Note that the arguments to the RectHV constructor are in the order xmin, ymin, xmax, and ymax but the toString() method uses the Cartesian product notation.

Frequently Asked Questions (PointST)

In which order should the points() method in PointST return the points? The API does not specify the order, so any order is fine.

Frequently Asked Questions (KdTreeST)

What makes KdTreeST difficult? How do I make the best use of my time? Debugging performance errors is one of the biggest challenges. It is very important that you understand and implement the key optimizations described in the assignment specification:

Do not begin range() or nearest() until you understand these rules.

I'm nervous about writing recursive search tree code. How do I even start on KdTreeST.java? Use BST.java as a guide. The trickiest part is understanding how the put() method works. You do not need to include code that involves storing the subtree sizes (since this is used only for ordered symbol table operations).

Will I lose points for a non-recursive implementation of range search? No. While we recommend using a recursive implementation (both for elegance and as a warmup for nearest-neighbor search), you are free to implement it without using recursion.

What should I do if a point has the same x-coordinate as the point in a node when inserting or searching in a 2d-tree? Go to the right subtree as specified in the assignment under Search and insert.

Testing

Sample input files.   Download kdtree.zip. It contains sample input files in the specified format.

Testing the bounding boxes.   If you include the RectHV bounding boxes in the k-d tree nodes, you want to make sure that you got it right. Otherwise, the mistake might not manifest itself until either range search and/or nearest neighbor search. Here are the bounding boxes corresponding to the nodes in input5.txt:

Here, we are following the toString() method format of RectHV which is \([x_{min}, x_{max}] \; \times \; [y_{min}, y_{max}]\) instead of \((x_{min}, y_{min})\) to \((x_{max}, y_{max})\).

Testing put() and points() in KdTreeST. The client KdTreeVisualizer.java reads a sequence of points from a file (given as a command-line argument) and draws the corresponding k-d tree. It does so by reconstruting the k-d tree from the level-order traversal returned by points(). Note that it assumes all points are inside the unit square.

  input5.txt  

input5.txt
  input8.txt  

input8.txt
  input10.txt  

input10.txt

Testing range() and nearest() in KdTreeST. A good way to test these methods is to perform the same sequence of operations on both the PointST and KdTreeST data types and identify any discrepancies. The key is to implement a reference solution in which you have confidence. The brute-force implementation PointST can serve this purpose in your testing.

Warning: both of these clients will be slow for large inputs because (1) the methods in the brute-force implementation are slow and (2) drawing the points is slow.

Frequently Asked Questions (Timing)

How do I measure the number of calls per second to nearest()? Here is one reasonable approach.

To get a reliable estimate, choose \(m\) so that the CPU time \(t\) is neither negligible (e.g., less than 1 second) nor astronomical (e.g., more than 1 hour). When measuring the CPU time, Do not include the time to read in the 1 million points or construct the k-d tree.

How do I generate a uniformly random point in the unit square? Make two calls to StdRandom.uniformDouble(0.0, 1.0)—one for the x-coordinate and one for the y-coordinate.

Possible Progress Steps

These are purely suggestions for how you might make progress on KdTreeST.java. You do not have to follow these steps.

  1. Implement PointST. This should be straightforward if you use either RedBlackBST or TreeMap and are familiar with the subset of the Point2D and RectHV APIs that you may use. After completing this step, you are only about 15% done with the assignment.

  2. Complete the k-d tree worksheet. Here is a set of practice problems for the core k-d tree methods. Here are the answers.

  3. Node data type. There are several reasonable ways to represent a node in a 2d-tree. One approach is to include the point, a link to the left/bottom subtree, a link to the right/top subtree, and an axis-aligned rectangle corresponding to the node.
    private class Node {
       private Point2D p;     // the point
       private Value val;     // the symbol table maps the point to this value
       private RectHV rect;   // the axis-aligned rectangle corresponding to this node
       private Node lb;       // the left/bottom subtree
       private Node rt;       // the right/top subtree
    }
    

  4. Writing KdTreeST.

Optimizations

These are many ways to improve performance of your 2d-tree. Here are some ideas.