Assignment 5: k-d Tree
Objectives
- to implement a variant of binary search trees
Introduction


Imagine an invasive species tracking app that allows users to submit the geographic coordinates of
the locations of invasive species. Environmental organizations could
then use the data to organize eradication efforts. To best allocate
resources, the organizations may wish to send eradication teams to
the most heavily infested areas. It would be helpful to know,
for a set of candidate areas to treat, which areas have the most
reports of infestation.
A kd-tree is a data structure that allows for efficient range queries over a set of points: given the coordinates of the edge of a (spherical) rectangular region, determine (or simply count) which points in the set are within the region. A kd-tree is a binary search tree that rotates which coordinate it compares at each level of the tree. So for 2-dimensional data, the root might compare on x-coordinate and then the children of the root would compare on y-coordinate, then grandchildren of the root on x-coordinate again, and so on.
Overview
Complete the implementation of the functions described in kdtree.h
from the /c/cs223/hw5/Required/
directory.
The functions define a set-of-points ADT with the ability to create
a set from an array of points, add, find, and remove points, and to find
points that are with a certain range.
The functions should run in the following worst-case time bounds.
kdtree_create
: $O(n \log n)$kdtree_add
: $O(\log n)$kdtree_contains
: $O(\log n)$kdtree_remove
: $O(\sqrt{n})$kdtree_range
: $O(\sqrt{n} + p)$ where $p$ is the number of points in the rangekdtree_range_for_each
: $O(\sqrt{n} + p)$ where $p$ is the number of points in the range, and assuming thatf
runs in $O(1)$ timekdtree_destroy
: $O(n)$
kdtree_create
, which must always run in $O(n \log n)$ time).
All time bounds assume that
malloc
and free
run in $O(1)$ time.
Your implementation should work with the main
function in
kdtree_unit.c
also found in /c/cs223/hw5/Required/
along the location
module. There is no other executable
required for this assignment.
Details
- None of the k-d tree functions need to implement any coordinate normalization: it is the responsibility of the user to ensure that, for example, all longitudes are in the range -180 (exclusive) to 180 (inclusive), and all latitudes are in the range -90 (inclusive) to 90 (inclusive). Your implementation of the ADT will not be tested on calls that violate the preconditions.
- We will not test out-of-memory conditions, so you may assume that there is enough heap space to store all the locations in nodes with a reasonable amount of bookkeeping information.
- The ADT will treat two points at the same latitude with one at longitude -180 and one at longitude 180 as different points, and will treat two points at the same pole but given with different longitudes as different points.
- The ranges specified for
kdtree_range_for_each
andkdtree_range
will not overlap the -180/180 boundary between the eastern and western hemispheres; if the user wishes to perform a query over a range that crosses that boundary, then they will have to divide their query into two parts. -
kdtree_create
should create a tree that is as balanced as possible – for each node, the difference in the size of the left and right subtrees should be at most 1. Pseudocode for an algorithm that achieves that and an example of its execution is available on the examples page. As a simpler alternative, you may create a tree that is approximately balanced by adding the given points in random order. The difference between building a balanced and approximately balanced tree will be no more than 5 points in the private tests.
Suggestions
- The
location
module has functions to compare locations based on latitude or longitude (breaking ties on the other dimension). - The
kdtree_helpers
module in/c/cs223/hw5/Optional
contains a function the provides an inefficient implementation of the algorithm to find the node with the minimum or maximum x- or y-coordinate in a subtree. The inefficient algorithm will be sufficient to pass the correctness tests forkdtree_remove
but not the efficiency test forkdtree_remove
that will be included in the private tests.
Submissions
Submit your makefile, log file, and source code necessary to build theUnit
executable. Note that since there is no other main exectuable to
create for this assignment, the name of the executable used
for testit
is Unit
.
Your makefile should assume that the files from
/c/cs223/hw5/Required
have been copied into
the current directory but not that they are still in their original locations,
and you should not submit those files.