Exercise 3: Mergesort
Objectives
- to use structs
- to implement binary search
- to implement mergesort
Assignment
We return to our implementation of heuristics for the Travelling Salesperson Problem. Instead of giving the coordinates of the cities in the input file, we now have acities
module that includes a list of cities
and their coordinates and a function find_city
that, given
the code for a city, returns its coordinates. And instead of storing cities'
coordinates in separate arrays for the latitudes and longitudes, we
store them in location
structs defined in the
location
module.
Code
Modify your tsp.c
program to use the cities
and location
modules. The cities
module must
be initialized by calling initialize_city_database
at the
beginning of main
.
Refactor (rewrite and/or reorganize to improve without changing the
behavior) main
and read_input
so that
1) instead of having separate arrays to hold latitudes and longitudes,
there is a single array that holds a location
struct for
each city; and 2) instead of reading latitudes and longitudes from
the input file, read_input
calls find_city
from the cities
module to get the coordinates and save
them in the array of location
s. You will also have to
change the code that initializes the array of distances to pass the
appropriate locations from the array of locations to the
location_distance
function from the location
module. (You may optionally take this a step further and store
all the data in an array of city
structs as defined
in cities.h
, but you will need to use dynamic memory
allocation to make space for the strings inside the structs.)
At this point, your program should run as it did before. But the array
that holds the city codes and their locations is unsorted, and so
find_city
has to use a slow sequential search in order
to find each city. If that array were sorted, we could use binary search
in find_city
.
So modify initialize_cities_database
so that it
sorts the cities
array. You can pattern your code
off of the implementation in the reading,
modifying it so it works with arrays of location
structs
instead of int
s. You can use statically allocated temporary
arrays instead of dynamically allocated arrays.
You may want to test your merge
and merge_sort
functions independently of the rest of the program. You can use something
like the code shown below in the "Unit Tests" section
to test merge
. We leave it to you to write a unit test
for merge_sort
if necessary.
The sequential search in find_city
will still work
with the array after it has been sorted by merege_sort
.
But change it to implement binary search for a more efficient solution.
Unit Tests and Efficiency Tests
In addition to testing the TSP
executable, we will test
some of the functions individually using the unit tests in
/c/cs223/hw3/Required/cities_unit.c
. The starter makefile
in /c/cs223/hw3/Optional/makefile
builds a second executable
called Unit
that runs the unit tests. You can examine
cities_unit.c
to see what the unit tests do.
The
Unit
executable also tests the efficiency of your
sort and search functions (initialize_city_database
and find_city
to determine whether they run in worst-case
time $O(n \log n)$ and $O(\log n)$ respectively. The efficiency tests
are empirical and so may give both false positives (your code passes
the tests but does not run within the asymptotic time bound; perhaps
you have exceptionally good constants for an $O(n^2)$ implementation
when the goal is $O(n \log n)$) and false negatives (your code
fails the tests but does in fact meet the asymptotic time bound;
perhaps you have an $O(n \log n)$ implementation with exceptionally
high constants). We will allow ourselves to be fooled by false positives.
But you should fix false negatives because constants do matter –
if your solution runs in at most $100n \log n$ instructions but it is
possible to reduce that to $10n \log n$ with reasonable effort, then
you should make that effort!
Although the test scripts do not have a unit test for merge
and merge_sort
, you may find it helpful to have such tests
(and also for any other functions you write). You can start with the
following test for merge
and modify it for other test cases.
void test_merge() { // locations don't matter for testing merge, so using 0.0 city us[] = { {"ABE", {0.0, 0.0}}, {"CHX", {0.0, 0.0}}, {"DEN", {0.0, 0.0}}, {"TLH", {0.0, 0.0}} }; int us_count = sizeof(us) / sizeof(city); city asia[] = { {"DAC", {0.0, 0.0}}, {"DAM", {0.0, 0.0}}, {"NRT", {0.0, 0.0}}, {"VVO", {0.0, 0.0}} }; int asia_count = sizeof(asia) / sizeof(city); int total_count = us_count + asia_count; city output[total_count]; // EXPECTED: both lists output in sorted order merge(us_count, us, asia_count, asia, output); printf("MERGED US AND ASIA\n"); for (int i = 0; i < total_count; i++) { printf("%s\n", output[i].name); } merge(asia_count, asia, us_count, us, output); printf("MERGED ASIA AND US\n"); for (int i = 0; i < total_count; i++) { printf("%s\n", output[i].name); } }
Examples and Output
Ifsea_6_dal.in
contains
8 SEA YVR MCI STL BWI DCA IAH DALthen the output of running your program should be
(base) [jrg94@newt code]$ ./TSP -given sea_6_dal.in -given: 9306.22 SEA YVR MCI STL BWI DCA IAH DAL SEA (base) [jrg94@newt code]$ ./TSP -nearest sea_6_dal.in -nearest: 10067.11 SEA YVR MCI STL DAL IAH DCA BWI SEA (base) [jrg94@newt code]$ ./TSP -insert sea_6_dal.in -insert: 9320.21 SEA DAL IAH BWI DCA STL MCI YVR SEA
Submissions
Submit your modifiedtsp.c
and cities.c
files, any other files you
created, a makefile with the TSP
executable as the default
target and Unit
as an additional target, and your log as assignment number 3. You need not submit
the location
module, the cities.h
header
file, or the cities_unit.c
unit tests
(and if you do, they will be ignored).