Assignment 3: Heatmap for GPS Track
Objectives
- to use an ADT implemented with opaque structs
- to implement an ADT using opaque structs
- to use dynamically allocated memory
Introduction
A heatmap is a graphical representation of a 2-D array using different colors to represent different ranges of values. Heatmaps are useful in a variety of applications, including web usability, sports analytics, and many geographic applications. Our application is search-and-rescue: given a GPS track, we want to generate a heatmap to see where in the search area the track hasn't been as often so that we can focus future search efforts on those locations.
A GPS track is a collection of trackpoints
each containing a location and timestamp. Trackpoints within tracks
are ordered by increasing timestamp and can be organized into multiple
segments within the same track (a tracking device might, for example,
start a new segment each time the user pauses the device, starts a new lap,
or when the devices loses the signal from the satellites).
Arrows indicate the direction of increasing time along
a segment, and the bold segment is the current (last) segment.
Operations on segments allow adding points to the current (last) segment,
starting a new segment, merging segments, and querying segments.
adding two track
points to it (the pink then the red).
up to but not including
4.
Given a track and a grid size, we can create a heatmap by counting the
number of track points within each grid cell (note that we count track
points and don't consider the time between those track points).
represent more track points
within the cell.
Overview
Write a program calledHeatmap
that produces an ASCII-art
heatmap from GPS track data read from standard input. We will run
the Heatmap
program with a command line like
./Heatmap 0.010 0.010 .:xX@# 1 < /c/cs223/hw3/Tests/example.inwhere the command-line arguments are
- the width and the height, in degrees of longitude and latitude, of each cell in the heatmap
- a string of the characters used to represent ranges of values in the heatmap; and
- a positive integer $n$ giving the size of the ranges of track point counts represented by the characters in the previous argument.
Standard input will contain track point data, one track point per line and with segments within a track separated by blank lines in a format like
-48.0020 121.0260 1500000000000 -48.0080 121.0220 1500010000000 -48.0120 121.0210 1500025000000 -48.0130 121.0170 1500040000000 -48.0110 121.0140 1500045000000 -48.0095 121.0130 1500052000000 -48.0105 121.0080 1500058000000 -48.0110 121.0070 1500065000000 -48.0115 121.0020 1500072000000 -48.0150 121.0000 1500080000000 -48.0170 121.0010 1500095000000 -48.0220 121.0000 1500201000000 -48.0225 121.0030 1500210000000 -48.0235 121.0120 1500240000000 -48.0210 121.0140 1500245000000 -48.0215 121.0190 1500252000000 -48.0205 121.0270 1500268000000 -48.0220 121.0310 1500280000000 -48.0175 121.0330 1500288000000 -48.0130 121.0310 1500295000000 -48.0070 121.0320 1500300000000 -48.0030 121.0280 1500305000000 -48.0000 121.0330 1500500000000 -48.0030 121.0420 1500505000000 -48.0080 121.0490 1500510000000 -48.0165 121.0490 1500520000000 -48.0245 121.0490 1500530000000 -48.0310 121.0480 1500540000000 -48.0335 121.0390 1500547000000 -48.0340 121.0290 1500554000000 -48.0380 121.0180 1500562000000 -48.0000 121.0250 1500700000000 -48.0035 121.0220 1500707000000
The corresponding heatmap is then written to standard output in a format like
.:#xx #x:x: xX::: .::::where, for this example
.
represents a cell containing 0 track points,
:
represents a cell containing 1 track points,
x
represents a cell containing 2 track points,
X
represents a cell containing 3 track points,
@
represents a cell containing 4 track points, and
#
represents a cell containing 5 or more track points.
If the command to run the program had been
./Heatmap 0.010 0.010 .o0 3 < /c/cs223/hw3/Tests/example.inthe the output would be
..o.. o.... .o... .....where
.
represents a cell containing between 0 and 2 track points
(inclusive),
o
represents a cell containing between 3 and 5 track points
(inclusive), and
0
represents a cell containing between 6 or more track points
(there aren't any in this example).
Implementing the track
ADT
The header file /c/cs223/hw3/Required/track.h
specifies
the public interface for a track
abstract
data type (ADT), which models a GPS track. You must implement
that ADT and it will be tested separately.
Refer to the documentation in the comments in the header file for
the required behavior of each function.
You may not change the header file,
but you may include other helper functions that are not visible to
other modules in your implementation file.
The /c/cs223/hw3/Required
directory also contains the required header and implementation files
for location
and trackpoint
ADTs and you
may not change those either.
Your track ADT will be tested with a series of unit tests that
test individual functions. An incomplete suite of unit tests is implemented
in /c/cs223/hw3/Required/track_unit.c
, which will be replaced
with a complete test suite for the private test script. The public
and private unit test modules will each
contain a main
function that you must link with your
track
implementation to produce an executable
called Unit
that executes the unit tests.
Examine the public track_unit.c
to see how the individual tests
work.
Details
Command-line arguments
- the width and height of the heatmap grid cells will be positive
and are given
in degrees of longitude and latitude in a format
readable by
atof
- the string of characters used in the heatmap will be non-empty
- the size of the ranges represented by each character is
given as a positive integer $n$ in a format readable by
scanf
with the%d
format specifier - if the length of the string of characters is $m$ then the ranges represented by each character are $kn, \ldots, (k+1)n-1$ for k = $0, \ldots, m-2$, and $(m-1)n, \ldots, \infty$.
Standard input
- each non-empty line of input read from standard input will contain one trackpoint
- the latitude and longitude for each trackpoint
will be given in degrees in a format
readable by
scanf
with the%lf
format specifier - the latitides and longitudes will satisfy the preconditions
of
trackpoint_create
- the time for each trackpoint is given as the number of
milliseconds since January 1, 1970 in a format readable by
scanf
with the%ld
format specifier - all three fields will be separated by single space characters, with no other whitespace on the line aside from a newline that marks the end of each line
- the track points will be given in order of increasing timestamp
- there will be a single empty line used to separate the second and subsequent segments from the previous segment
- if there is more than one segment then they will all be non-empty
Building the heatmap
- the top edge of the heatmap is the latitude of the northernmost point in the track
- Considering the left edge of the heatmap to be the longitude
of the track point with the lowest longitude would present problems
when the track crosses between the eastern hemisphere (positive
longitudes) and western hemisphere (negative longitudes).
For example, using the most negative longitude for the track
shown below
would result in something like
3000 ... 002 2000 001 0000 004 0000 002 0000 ... 003
23 12 40 20 30
Map generated by the Great Circle Mapper - copyright © Karl L. Swartz. A collection of points on the equator, and some wedges (ranges of longitudes, possibly wrapping around between hemispheres)
that contain all of them; the solid range is the smallest such
range and the heatmap's western edge would be along that
wedge's western edge. - the number of rows and columns in the heatmap is such that every track point is contained in a cell, and neither the easternmost column nor the southernmost row is empty (unless the track is empty, in which case the heatmap has one cell)
- heatmap cells include their northern and western edges, but not their southern or eastern edges, except for the bottom row and right column, which do include their southern and eastern edges respectively
- the above two rules mean that the output should not contain
an extra row or column just to contain track points that would
otherwise be on the edge of the heatmap; see the example below
a track for which the count array should be 7
41 11
- when the heatmap grid wraps all the way around the earth so the easternmost column overlaps the westernmost, count points only in the leftmost cell they belong to
- consider track points at the poles (where all longitudes refer to the same point) to have whatever longitude was read from standard input
Output
- the output should be a heatmap representing all trackpoints and all
segments read from standard input, as specified by the
track_heatmap
function in thetrack
ADT, using the cell width and height given on the command line. - each row of the heatmap should be written to standard output on a separate line, with the first character written corresponding to the first column of the first row of the heatmap
- each integer value in the heatmap should be replaced with a single character as determined by the last two command-line arguments as described in the command-line arguments section
- there is no space between characters in the same row and nothing at the end of a line except a newline character
- there should be no other output to standard output or standard error
Efficiency
-
track_create
must run in $O(1)$ worst-case time -
track_destroy
must run in $O(n)$ worst-case time -
track_count_segments
,track_count_points
, andtrack_get_point
must run in worst-case $O(1)$ time -
track_get_lengths
must run in worst-case $O(m)$ time -
track_add_point
andtrack_start_segment
must run in amortized $O(1)$ time (so any sequence of $k$ such operations must take $O(k)$ time total in the worst case) -
track_merge_segments
must run in worst-case $O(n)$ time -
track_heatmap
must run in worst-case $O(n \log n + p)$ time
malloc
and
free
work in $O(1)$ time (note that this assumption does
not extends to calloc
, realloc
or similar
functions).
Testing
When the input (standard input or command-line arguments) are invalid,
the output of your program is ignored and it must not crash or hang.
Your Heatmap
program
must not produce any errors when tested
with Valgrind, whether the input is
valid or not. The unit test program Unit
must
not result in errors from Valgrind for any test. None of the
public or private unit tests will violate the preconditions
of the track ADT functions, and so the behavior of your implementations
of those functions may be anything (including crashing and/or hanging).
You may assume that
memory allocations succeed, and so no error-checking is required
for calls to malloc
.
Violations of the abstraction provided by declaring
trackpoint
and track
as opaque structs
will be considered a violation of good programming style and design.
Hints
- Before you write any code, draw a memory diagram of your implementation of a track, and diagrams of how the various operations change a track.
- You might find it helpful to write a helper function
track_bounds
to determine the bounds of the track according to the rules of the bounds of the corresponding heatmap. You can start with an implementation that ignores the problem of points near the -180/180 degree boundary between hemispheres and just finds the minimum and maximum latitude and longitude. The next step might be a version that does the right thing for tracks with points near the boundary, but does so inefficiently. The last step would be a version that meets all of the requirements, including the time bound. That final version might have a specification that looks like/** * Determines: 1) the latitude of the northernmost and southernmost track points in the given track; and 2) * the meridian of longitude at the western edge of the smallest spherical wedge bounded by two meridians * that contains all the points in the track (the "western edge" for a nontrivial wedge being the one that, * when you move east from it along the equator, you stay in the wedge). When there are multiple such wedges, * this function finds the one with the lowest normalized (adjusted to the range -180 (inclusive) * to 180 (exclusive)) longitude. * * @param trk a pointer to a valid, non-empty track * @param west a pointer to a double in which to record the western edge of the containing wedge * @param east a pointer to a double in which to record the eastern edge of the containing wedge * @param north a pointer to a double in which to record the latitude of the northernmost point * @param south a pointer to a double in which to record the latitude of the southernmost point */ static void track_bounds(const track *trk, double *west, double *east, double *north, double *south);
- If the west edge and east edge of a heatmap with cell width $w$ are $(lon_{west},lon_{east})$ where $lon_{west} \le lon_{east}$ then (ignoring special cases) a point at longitude $lon$ belongs in column $\left\lfloor \frac{(lon - lon_{west})}{w} \right\rfloor$. If $lon_{west} \gt lon_{east}$ then the heatmap crosses the -180/180 longitude boundary and the same formula works for longitudes from the western edge to the -180/180 boundary ($lon_{west} \le lon \lt 180$). For other longitudes, use the formula $\left\lfloor \frac{(360 + lon - lon_{west})}{w} \right\rfloor$.
Submissions
Submit your makefile, log file, and source code necessary to build the executablesHeatmap
and Unit
using your makefile.
Your makefile should have rules with each of those executables as their targets.
Your makefile should assume that the files from
/c/cs223/hw3/Required
have been copied into
the current directory but not that they are still in their original locations,
and you should not submit those files (if you do, they will be overwritten with
the copies from /c/cs223/hw3/Required
).