Assignment 5: k-d Tree
Objectives
- to implement a variant of binary search trees
Introduction


Imagine an invasive species tracking app that allows users to submit the geographic coordinates of the locations of invasive species. Monitoring organizations could then receive alerts when species they are tracking show up in new locations. But when invasive species are present in frequently-visited locations, there will be many reports around those locations, so the alert-generating system should have some way of filtering reports that are near other reports. To do so efficiently, it could make use of a data structure that allows for efficient nearest neighbor queries. When a new report comes in, the app would then query the data structure for the nearest neighbor to the new point among all other points previously recorded. If the distance to the nearest neighbor is above a certain threshold, then the location could be considered a "new" location for the invasive species, and resources could be sent to that new location to attempt to eradicate the invaders.
A kd-tree is a data structure that allows for efficient nearest neighbor queries. A kd-tree is a binary search tree that rotates which coordinate it compares at each level of the tree. So for 2-dimensional data, the root might compare on x-coordinate and then the children of the root would compare on y-coordinate, then grandchildren of the root on x-coordinate again, and so on.
Assignment
Complete the implementation of the functions described in kdtree.h
from the /c/cs223/hw5/Required/
directory.
The functions define a Set-of-2D-Points ADT with the ability to find
the nearest neighbor among points in the set to a given point.
Note that in order to keep the math simple, our points will be
2-D Cartesian coordinates rather than geographic coordinates with
latitudes and longitudes (although you should consider what would have
to change in your implementation to support geographic points, assuming
a suitable version of the location
module were provided).
The functions should run in the following time bounds (worst-case except as noted).
kdtree_create
: $O(n \log n)$kdtree_add
: $O(\log n)$ when the tree is balanced and $O(n)$ otherwisekdtree_contains
: $O(\log n)$ when the tree is balanced and $O(n)$ otherwisekdtree_for_each
: $O(n)$, assuming thatf
runs in $O(1)$ timekdtree_nearest_neighbor
: $O(\log n)$ expected time, assuming points (in the set and the query point) are uniformly distributed over some rectangular region and the tree is balanced and $O(n)$ otherwisekdtree_destroy
: $O(n)$
kdtree_create
(under
the same assumptions as for kdtree_nearest_neighbor
)
will receive
much more partial credit than $O(n^2)$ worst-case implemetations.
All time bounds assume that
malloc
and free
run in $O(1)$ time.
Proper use of your implementation must not result in valgrind errors.
Your implementation should work with the main
function in
kdtree_unit.c
also found in /c/cs223/hw5/Required/
along with the location
module. There is a makefile
in /c/cs223/hw5/Optional
that builds an executable called Unit
. The
Unit
executable runs a number of different unit tests and
you can select one by passing its index on the command line.
See the code for the unit tests for more details about each one.
There is no additional executable to write for this assignment.
Example
[jrg94@rattlesnake ~]$ for N in `seq 1 14`; do echo === TEST $N ===; ./Unit $N | sort -n; done === TEST 1 === PASSED === TEST 2 === PASSED === TEST 3 === PASSED === TEST 4 === PASSED === TEST 5 === PASSED === TEST 6 === === TEST 7 === 15.071 -163.957 15.357 -157.807 16.698 -164.924 17.079 -162.024 18.085 -162.655 18.863 -163.994 19.097 -160.178 19.353 -162.740 19.712 -161.068 20.376 -162.999 21.833 -161.960 22.189 -157.489 22.215 -158.356 22.589 -157.028 22.858 -158.387 22.998 -156.598 23.420 -162.370 23.858 -157.537 24.797 -157.050 24.904 -164.680 === TEST 8 === Distance: inf === TEST 9 === Distance: 0.000000 Point: 22.588670 -157.028378 === TEST 10 === Distance: 6.324555 Point: 1.000000 1.000000 === TEST 11 === Distance: 6.324555 Point: 1.000000 1.000000 === TEST 12 === PASSED === TEST 13 === Distance: 0.000000 Point: 22.588670 -157.028378 === TEST 14 === PASSED
Submissions
Submit your source code necessary to build the executable using the provided makefile and supporting code. There is no need to submit the provided makefile if you did not change it. You should not submit the files for thelocation
module, kdtree_unit.c
, or kdtree.h
and
you may not change those files.
Your makefile can assume that those supporting files have been copied
into the current directory, but should not assume they
are still in their original locations.