Exercise 3: Mergesort
Objectives
- to use structs
- to implement binary search
- to implement mergesort
Assignment
We return to our implementation of heuristics for the Travelling Salesperson Problem. Instead of giving the coordinates of the cities in the input file, we now have acities module that includes a list of cities
and their coordinates and a function find_city that, given
the code for a city, returns its coordinates. And instead of storing cities'
coordinates in separate arrays for the latitudes and longitudes, we
store them in location structs defined in the
location module.
Code
Modify your tsp.c program to use the cities
and location modules. The cities module must
be initialized by calling initialize_city_database at the
beginning of main.
Refactor (rewrite and/or reorganize to improve without changing the
behavior) main and read_input so that
1) instead of having separate arrays to hold latitudes and longitudes,
there is a single array that holds a location struct for
each city; and 2) instead of reading latitudes and longitudes from
the input file, read_input calls find_city
from the cities module to get the coordinates and save
them in the array of locations. You will also have to
change the code that initializes the array of distances to pass the
appropriate locations from the array of locations to the
location_distance function from the location
module. (You may optionally take this a step further and store
all the data in an array of city structs as defined
in cities.h, but you will need to use dynamic memory
allocation to make space for the strings inside the structs.)
At this point, your program should run as it did before. But the array
that holds the city codes and their locations is unsorted, and so
find_city has to use a slow sequential search in order
to find each city. If that array were sorted, we could use binary search
in find_city.
So modify initialize_cities_database so that it
sorts the cities array. You can pattern your code
off of the implementation in the reading,
modifying it so it works with arrays of location structs
instead of ints. You can use statically allocated temporary
arrays instead of dynamically allocated arrays.
You may want to test your merge and merge_sort
functions independently of the rest of the program. You can use something
like the code shown below in the "Unit Tests" section
to test merge. We leave it to you to write a unit test
for merge_sort if necessary.
The sequential search in find_city will still work
with the array after it has been sorted by merege_sort.
But change it to implement binary search for a more efficient solution.
Unit Tests and Efficiency Tests
In addition to testing the TSP executable, we will test
some of the functions individually using the unit tests in
/c/cs223/hw3/Required/cities_unit.c. The starter makefile
in /c/cs223/hw3/Optional/makefile builds a second executable
called Unit that runs the unit tests. You can examine
cities_unit.c to see what the unit tests do.
The
Unit executable also tests the efficiency of your
sort and search functions (initialize_city_database
and find_city to determine whether they run in worst-case
time $O(n \log n)$ and $O(\log n)$ respectively. The efficiency tests
are empirical and so may give both false positives (your code passes
the tests but does not run within the asymptotic time bound; perhaps
you have exceptionally good constants for an $O(n^2)$ implementation
when the goal is $O(n \log n)$) and false negatives (your code
fails the tests but does in fact meet the asymptotic time bound;
perhaps you have an $O(n \log n)$ implementation with exceptionally
high constants). We will allow ourselves to be fooled by false positives.
But you should fix false negatives because constants do matter –
if your solution runs in at most $100n \log n$ instructions but it is
possible to reduce that to $10n \log n$ with reasonable effort, then
you should make that effort!
Although the test scripts do not have a unit test for merge
and merge_sort, you may find it helpful to have such tests
(and also for any other functions you write). You can start with the
following test for merge and modify it for other test cases.
void test_merge()
{
// locations don't matter for testing merge, so using 0.0
city us[] = {
{"ABE", {0.0, 0.0}},
{"CHX", {0.0, 0.0}},
{"DEN", {0.0, 0.0}},
{"TLH", {0.0, 0.0}}
};
int us_count = sizeof(us) / sizeof(city);
city asia[] = {
{"DAC", {0.0, 0.0}},
{"DAM", {0.0, 0.0}},
{"NRT", {0.0, 0.0}},
{"VVO", {0.0, 0.0}}
};
int asia_count = sizeof(asia) / sizeof(city);
int total_count = us_count + asia_count;
city output[total_count];
// EXPECTED: both lists output in sorted order
merge(us_count, us, asia_count, asia, output);
printf("MERGED US AND ASIA\n");
for (int i = 0; i < total_count; i++)
{
printf("%s\n", output[i].name);
}
merge(asia_count, asia, us_count, us, output);
printf("MERGED ASIA AND US\n");
for (int i = 0; i < total_count; i++)
{
printf("%s\n", output[i].name);
}
}
Examples and Output
Ifsea_6_dal.in contains
8 SEA YVR MCI STL BWI DCA IAH DALthen the output of running your program should be
(base) [jrg94@newt code]$ ./TSP -given sea_6_dal.in -given: 9306.22 SEA YVR MCI STL BWI DCA IAH DAL SEA (base) [jrg94@newt code]$ ./TSP -nearest sea_6_dal.in -nearest: 10067.11 SEA YVR MCI STL DAL IAH DCA BWI SEA (base) [jrg94@newt code]$ ./TSP -insert sea_6_dal.in -insert: 9320.21 SEA DAL IAH BWI DCA STL MCI YVR SEA
Submissions
Submit your modifiedtsp.c and cities.c
files, any other files you
created, a makefile with the TSP executable as the default
target and Unit as an additional target, and your log as assignment number 3. You need not submit
the location module, the cities.h header
file, or the cities_unit.c unit tests
(and if you do, they will be ignored).