Exercise 6: Union/Find

Objectives

Assignment

We return yet again to our implementation of heuristics for the Travelling Salesperson Problem. We wish to add the following greedy heuristic to the TSP program: We can implement this heuristic with a union/find (disjoint set) data structure. A union/find data structure keeps track of a partition of $0, \ldots, n-1$ into disjoint sets. The partition can be changed by unioning two parts of the partition together. Each part of the partition has a representative such that each element in the same part will have the same representative, which is returned by the find operation.

To use a union/find data structure for the greedy heuristic, we start with each city in its own singleton part of the partition. To determine if two cities have already been connected, we use a find operation to see if they have the same representative. When two cities are connected, we union their two parts.

As with the insert method, reorder the tour so that it starts and ends at the first city in the input, and then reverse it if necessary so that the second city in the tour is alphabetically before the penultimate city.

Code

Union/Find

Implement the union/find data structure by implementing the functions declared in /c/cs223/hw6/Required/union_find.c as described below.

The union_find_create function should create two structs for each element: a header and a linked list node. Each part of the partition will be represented by one linked list. The element at the head of the linked list is the representative of the corresponding part of the partition. The header for an element points to the node for that element, and, if the element is the representative of its part of the partition, the size of the part and the tail node of the linked list for that part (if an element is not the representative of its part of the partition then the size and tail elements of its header are meaningless). Since each element starts in its own singleton part of the partition, all head pointers point backwards along the node pointers, all next pointers should initially be NULL, and all tail pointers will be the same as the node pointers.

The union_find_find function should go to the header for the element to find the representative of, follow the node pointer from there, and the head pointer from there to find the representative.

The union_find_union function should first check if the two parts to be unioned are in fact distinct parts. If they are, find the representative of the two parts, concatenate the shorter of the two corresponding linked lists to the longer (if tied then choose arbitrarily), and change the head pointers in the shorter one to point to the head of the longer one. Update the tail so the new tail of the longer list is the old tail of the shorter list. Also, update the size of the longer list (the shorter one has been absorbed, so its size and tail won't be used any more).

The union_find_destroy function should destroy all the nodes and headers, and finally the meta-data struct.

TSP

Modify your tsp.c program to apply the greedy heuristic when the -greedy option is given as the method.

Start by creating a list of all possible pairs of city indices and sort them in order of increasing distance using the generic merge_sort in the mergesort module in /c/cs223/hw6/Required.

Declare and initialize an array to keep track of how many selected segments each city is in.

Declare and initialize a union/find data structure with one element per city, each in its own part of the partition.

Go through the sorted list of city pairs. If both cities are in at most one selected segment and they are in different parts, then add the corresponding segment to the partial tour, union their parts together in the union/find data structure, and increment how many segments they are in.

Once all the segments have been selected, find either of the two cities that is in only one segment (one of the endpoints of the tour). Make that the first city in the tour.

Until the final tour is complete, repeatedly find the segment that contains the current last city and that isn't already in the tour, and add the other city in that segment to the end of the tour.

When the tour is complete, normalize it as for the -insert method.

Unit Tests and Efficiency Tests

There are some unit tests and efficiency tests in /c/cs223/hw6/Required/union_find_unit.c, and your makefile must build an executable called Unit that runs those tests with your implementation of the union/find data structure.

Any sequence that starts with a partition of $n$ elements in singleton sets and performs $n-1$ union operations to create a single set containing all the elements must run in $O(n \log n)$ time.

Examples and Output

If sea_6_dal.in contains
SEA
YVR
MCI
STL
BWI
DCA
IAH
DAL
then the output of running your program should be
(base) [jrg94@termite code]$ ./TSP -greedy sea_6_dal.in
 -greedy:   9711.84 SEA IAH DAL MCI STL DCA BWI YVR SEA

Submissions

Submit your modified tsp.c and your union_find.c files, any other files you created, a makefile with the TSP executable as the default target and Unit as an additional target, and your log as assignment number 6. You need not submit the location and mergesort modules, the union_find.h header file, or the union_find_unit.c unit tests (and if you do, they will be ignored).