Assignment 5: k-d Tree
Objectives
- to implement a variant of binary search trees
Introduction
Imagine an invasive species tracking app that allows users to submit the geographic coordinates of
the locations of invasive species. Environmental organizations could
then use the data to organize eradication efforts. To best allocate
resources, the organizations may wish to send eradication teams to
the most heavily infested areas. It would be helpful to know,
for a set of candidate areas to treat, which areas have the most
reports of infestation.
A kd-tree is a data structure that allows for efficient range queries over a set of points: given the coordinates of the edge of a (spherical) rectangular region, determine (or simply count) which points in the set are within the region. A kd-tree is a binary search tree that rotates which coordinate it compares at each level of the tree. So for 2-dimensional data, the root might compare on x-coordinate and then the children of the root would compare on y-coordinate, then grandchildren of the root on x-coordinate again, and so on.
Overview
Complete the implementation of the functions described in kdtree.h
from the /c/cs223/hw5/Required/ directory.
The functions define a set-of-points ADT with the ability to create
a set from an array of points, add, find, and remove points, and to find
points that are with a certain range.
The functions should run in the following worst-case time bounds.
kdtree_create: $O(n \log n)$kdtree_add: $O(\log n)$kdtree_contains: $O(\log n)$kdtree_remove: $O(\sqrt{n})$kdtree_range: $O(\sqrt{n} + p)$ where $p$ is the number of points in the rangekdtree_range_for_each: $O(\sqrt{n} + p)$ where $p$ is the number of points in the range, and assuming thatfruns in $O(1)$ timekdtree_destroy: $O(n)$
kdtree_create, which must always run in $O(n \log n)$ time).
All time bounds assume that
malloc and free run in $O(1)$ time.
Your implementation should work with the main function in
kdtree_unit.c also found in /c/cs223/hw5/Required/
along the location module. There is no other executable
required for this assignment.
Details
- None of the k-d tree functions need to implement any coordinate normalization: it is the responsibility of the user to ensure that, for example, all longitudes are in the range -180 (exclusive) to 180 (inclusive), and all latitudes are in the range -90 (inclusive) to 90 (inclusive). Your implementation of the ADT will not be tested on calls that violate the preconditions.
- We will not test out-of-memory conditions, so you may assume that there is enough heap space to store all the locations in nodes with a reasonable amount of bookkeeping information.
- The ADT will treat two points at the same latitude with one at longitude -180 and one at longitude 180 as different points, and will treat two points at the same pole but given with different longitudes as different points.
- The ranges specified for
kdtree_range_for_eachandkdtree_rangewill not overlap the -180/180 boundary between the eastern and western hemispheres; if the user wishes to perform a query over a range that crosses that boundary, then they will have to divide their query into two parts. -
kdtree_createshould create a tree that is as balanced as possible – for each node, the difference in the size of the left and right subtrees should be at most 1. Pseudocode for an algorithm that achieves that and an example of its execution is available on the examples page. As a simpler alternative, you may create a tree that is approximately balanced by adding the given points in random order. The difference between building a balanced and approximately balanced tree will be no more than 5 points in the private tests.
Suggestions
- The
locationmodule has functions to compare locations based on latitude or longitude (breaking ties on the other dimension). - The
kdtree_helpersmodule in/c/cs223/hw5/Optionalcontains a function the provides an inefficient implementation of the algorithm to find the node with the minimum or maximum x- or y-coordinate in a subtree. The inefficient algorithm will be sufficient to pass the correctness tests forkdtree_removebut not the efficiency test forkdtree_removethat will be included in the private tests.
Submissions
Submit your makefile, log file, and source code necessary to build theUnit
executable. Note that since there is no other main exectuable to
create for this assignment, the name of the executable used
for testit is Unit.
Your makefile should assume that the files from
/c/cs223/hw5/Required have been copied into
the current directory but not that they are still in their original locations,
and you should not submit those files.