Exercise 2: Travelling Salesperson

Objectives

Introduction

The Travelling Salesperson Problem is the problem of, given a set of locations and the distances between every pair of them, determining the shortest tour that starts and ends at the same city and visits each other city exactly once. It is a very common problem in logistics: consider, for example, a delivery driver who needs to make stops at a list of 50 addresses and wants to minimize fuel use along the way.

Assignment

Write a program called TSP that calculates routes between cities by various methods. The name of the file containing the cities and their locations and the method to use to calculate the routes will be given as command-line arguments. For each method, the output should be the method used, the total distance, and the route.

The method to construct the tours (if any) will be given as the first command-line argument. In all cases, distances should be calculated with the location_distance function in the distance module that is provided as header and implementation files in /c/cs223/hw2/Required that you can compile and link with your code.

Code

Start with the code in /c/cs223/hw2/Optional/tsp.c, which declares some functions that you can define to complete the assignment. There is also a module /c/cs223/hw2/Required/distance.c with a corresponding header file that contains a function that will compute the great circle distance between two points on Earth given their latitudes and longitudes.

Complete the code to read the input file first. In main, check that there are the required number of command-line arguments. If there are not, print an error message and exit the program. Open the file whose name is given by the second command-line argument for reading; if this fails then print an error message and exit. Use fscanf to read the first line of the file – the number of cities in the input. If that is not at least two then print an error message and exit; otherwise declare three arrays: one to hold a three-character string for the code of each city, one to hold the latitude of each city, and one to hold the longitudes. Pass the file, the number of cities, and the arrays to read_input to read the input file.

In read_input, loop over the cities to read from the file, and in each iteration of the loop read the city code, and the latitude or longitude. If there was a problem reading the data or if the coordinates are out of range (ranges are -90 to 90 inclusive for latitudes and -180 to 180 inclusive for longitudes) then return the number of cities read successfully (main can then use that count to determine if the all the cities were read in). You may assume that there are no more than 500 cities, the city codes are three non-space characters, that the codes are distinct within the file, and that the coordinates define distinct points (so each pole may only appear once despite having an infinite number of possible representations as (latitude, longitude) pairs); if this is not the case then the only requirement is that your program does not crash or hang.

Back in main, if read_input didn't read all the cities, then print an error message and exit. This would be a good time to check that you're reading the data correctly by printing the values read_input stored in the arrays.

Once you're satisfied that you are reading the input correctly, add code to check the value of the first command-line argument and perform the necessary calculations for the selected method (and if the first argument is not one of the three methods then print an error message and exit). In all three cases it may be helpful to keep track of the tour in terms of the indices of the cities in the arrays rather than trying to rearrange those arrays. For example, if the input contains ACC, LOS, JNB, CAI, NBO, ADD in that order, then the tour CAI-ADD-NBO-ACC-LOS-JNB-CAI would be represented by an array [3, 5, 4, 1, 2, 0].

For the -given case, simply fill the tour array with $0, 1, 2, \ldots$, call the calculate_total function to compute the total distance of the tour (you will need to complete that function too), and output the result along with the tour.

The remaining two methods compute the distances between many pairs of cities, and may need the distance between the same pair more than once. The location_distance function is very slow, so it will speed up our program if we compute the distances between every pair and store them in a 2-D array for later use.

For the -nearest case, pass the city count, the uninitialized tour array, and the 2-D array of distances to the route_nearest function. The route_nearest function can start by initializing the tour to $0, 1, 2, \ldots$. It will need to keep track of the partially constructed tour and the indices of the cities that haven't been added yet. The tour array can serve both purposes: the first portion of the array can contain the partial tour and the rest can contain, in some order, the cities that haven't been added yet. At the beginning, the first element is the partial tour and the rest haven't been added. The rest of the function then repeatedly finds the minimum distance between the last city in the partial tour and the rest of the cities and swaps the index of the selected city into place to grow the partial tour. Back in main, after calling route_nearest, call calculate_total and output the results.

(Optional) After testing the code for -nearest, complete the case for -insert by passing the number of cities, the uninitialized tour array, and the distance array to route_insert. You can keep track of the partial tour and the cities that haven't been added yet in the same way as in the route_nearest function, except instead of initializing the partial tour to [0], you initialize it with the indices of the two cities that are closest together. Then, on each step, select the closest city among those not yet in the tour to those already in the tour, find the best insertion point (hint: it may be easier to minimize the distance added to the current tour than the total distance of the resulting tour), and update the partial tour and the cities not added yet. Finally, in main call (and write the code for) normalize to reorder the tour as specified.

As usual, if the input (both the file and the command-line arguments) does not conform to this specification then the only requirement is that your program does not crash or hang. The suggestions to print an error message and exit the program will satisfy that condition for the corresponding violations, but there may be other cases that are not specifically mentioned here.

Examples and Output

If sea_6_dal.in contains
8
SEA 47.450000 -122.311667
YVR 49.194722 -123.183889
MCI 39.297500 -94.713889
STL 38.748611 -90.370000
BWI 39.175278 -76.668333
DCA 38.851944 -77.037778
IAH 29.984444 -95.341389
DAL 32.847222 -96.851667
then the output of running your program should be
(base) [jrg94@newt code]$ ./TSP -given sea_6_dal.in
  -given:   9306.22 SEA YVR MCI STL BWI DCA IAH DAL SEA
(base) [jrg94@newt code]$ ./TSP -nearest sea_6_dal.in
-nearest:  10067.11 SEA YVR MCI STL DAL IAH DCA BWI SEA
(base) [jrg94@newt code]$ ./TSP -insert sea_6_dal.in
 -insert:   9320.21 SEA DAL IAH BWI DCA STL MCI YVR SEA
Note that the method is right justified in 8 characters, the distance is given with two digits after the decimal point and 7 digits before (padded with spaces) and there is a single space before each city code and no space at the end of the line.

Submissions

Submit your completed tsp.c file, any other files you created, a makefile with the TSP executable as the default target, and your log as assignment number 2. You need not submit the distance module (and if you do, it will be ignored).