COS 226 Kd-Trees |
Checklist assignment |
Is a point on the boundary of a rectangle considered inside it? Do two rectangle intersect if they have just one point in common? Yes and yes, consistent with the implementation of RectHV.java.
Can I use the distanceTo()
method in Point2D
and RectHV
?
No, you may use only the subset of the methods listed. You should be able to accomplish
the same result (more efficiently) with distanceSquaredTo()
.
Can I use the X_ORDER()
and Y_ORDER()
comparators
in Point2D
?
No, you may use only the subset of the methods listed. You should be able to accomplish
the same result by calling the methods x()
and y()
.
What should I do if a point is inserted twice in the data structure? The data structure represents a symbol table, so you should replace the old value with the new value.
What should points()
return if there are no points in the data structure?
What should range()
return if there are no points in the range?
the API says to return an Iterable<Point2D>
, so you should return
one with zero points.
What should nearest()
return if there are two (or more) nearest points?
The assignment does not specify, so any nearest point (up to floating-point precision) is fine.
How much memory does a Point2D
object use?
For simplicity, assume that each Point2D
object uses 32 bytes—in reality,
it uses a bit more because of the Comparator
instance variables.
How much memory does a RectHV
object use?
You should look at the code and analyze its memory usage.
I run out of memory when running some of the large sample files. What should I do?
Be sure to ask Java for additional memory,
e.g., java-algs4 -Xmx1600m RangeSearchVisualizer input1M.txt
.
In which order should the points()
method in PointST
return the points?
The assignment does not specify the order, so any order is fine.
What makes KdTreeST difficult? How do I make the best use of my time? Debugging performance errors is one of the biggest challenges. It is very important that you understand and implement the key optimizations described in the assignment specification:
range()
or nearest()
until you understand these rules.
I'm nervous about writing recursive search tree code. How do I even start on KdTreeST.java?
You can use BST.java
as a guide. The trickiest part is understanding how the put()
method works. You do not need to
include code that involves storing the subtree sizes
(since this is used only for ordered symbol table operations).
What should I do if a point has the same x-coordinate as the point in a node when inserting or searching in a 2d-tree? Go to the right subtree as specified in the assignment under Search and insert.
Testing.
A good way to test KdTreeST
is to perform
the same sequence of operations on both the PointST
and
KdTreeST
data types and identify any discrepancies. Indeed, this is how
most of the autograder test are performed. They key is to implement a reference
solution in which you have confidence. The brute-force implementation
PointST
can serve this purpose in your testing.
range()
in PointST
and blue the points returned by
the method range()
in KdTreeST
.
nearest()
in PointST
and blue the point returned by
the method nearest()
in KdTreeST
.
Sample input files. Download kdtree.zip. It contains sample input files in the specified format.
These are purely suggestions for how you might make progress on KdTreeST.java
.
You do not have to follow these steps.
Since we don't need to implement the rank and select operations, there is no need to store the subtree size.private class Node { private Point2D p; // the point private Value value; // the symbol table maps the point to this value private RectHV rect; // the axis-aligned rectangle corresponding to this node private Node lb; // the left/bottom subtree private Node rt; // the right/top subtree }
isEmpty()
and size()
. These should be very easy.
put()
which does everything except set up the
RectHV
for each node. Write the get()
and contains()
method, and use these to test that
put()
was implemented properly. Note that put()
and get()
are best implemented by using private helper methods analogous to those found on page 399
of the book or by looking at BST.java.
We recommend using the orientation (vertical or horizontal) as an argument to these helper methods.
A common error is to not rely on your base case (or cases). For example, compare the following two
get()
methods for searching in a BST:
In the latter method, extraneous null checks are made that would otherwise be caught by the base case. Trust in the base case. Your method may have additional base cases, and code like this becomes harder and harder to read and debug.private Value get(Node x, Key key) { if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp < 0) return get(x.left, key); else if (cmp > 0) return get(x.right, key); else return x.value }private Value overlyComplicatedGet(Node x, Key key) { if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp < 0) { if (x.left == null) return null; else return overlyComplicatedGet(x.left, key); } else if (cmp > 0) { if (x.right == null) return null; else return overlyComplicatedGet(x.right, key); } else return x.value }
points()
method. Use this to check the structure of your
kd-tree.
put()
which sets up the RectHV
for each Node
.
range()
method.
Test your
implementation using RangeSearchVisualizer.java,
which is described in the testing section.
nearest()
method.
Test your
implementation using
NearestNeighborVisualizer.java,
which is described in the testing section.
A video, worksheet, and coding tips are provided for those wishing additional assistance. Be forewarned that videos were made in early 2014 and is somewhat out of date. For example the API has changed.
These are many ways to improve performance of your 2d-tree. Here are some ideas.
distanceSquaredTo()
but not distanceTo()
.
RectHV
in each 2d-tree node (though it
is probably wise in your first version).