Assignment 6: Analyzing a Social Network

Objectives

Introduction

Social networks are complex, data-rich entities in which many different fields, ranging from anthropology to biology, take special interest. Their study and use are truly staggering—you've probably engaged with a service within the past hour that leverages, in some way, information gleaned from a digital implementation of such an entity (possibly for the service's profit). In this assignment, we'll be modeling a social network using an undirected graph, which is a type of graph for which associativity is the defining relationship between entities––not directionality. More specifically, we'll be probing two different queries:

Assignment

There are two parts to this assignment:

Implementing the graph ADT

Complete the implementation of the functions declared in /c/cs223/hw6/Required/graph.h to implement a graph that performs each of the following functions in accordance with their respective bounds: You may make the usual assumptions about running time—for example, that malloc and free run in $O(1)$ time. Note that, in the above statements, V represents the number of vertices in the graph, E the number of edges in the graph, D the degree of one particular vertex, and S the length of the longest string connected to the relevant vertex. For the purposes of this assignment, we let S be bounded by a constant, and we ignore it.

Your ADT will be subjected to a gauntlet of unit tests that examine individual functions. An incomplete suite of unit tests is implemented in /c/cs223/hw6/Required/graph_unit.c, which will be replaced with a complete test suite for the private test script. Your makefile must include a target GraphUnit that is built from those two files and your implementation of the graph module. All unit tests must pass with no errors reported from valgrind, and any use of your graph module that doesn't violate the functions' preconditions must result in no errors from valgrind. No public or private tests will violate the preconditions.

Note that although you may not use all of the functions provided by the graph module when writing the Friends program, you must implement all of them for this part of the assignment.

Writing the Friends program

Use your graph ADT implementation to write a program that performs the two functions specified above.

Input

The command-line arguments will be as follows:

More specifically, the relevant file, here referred to as inputs, will contain a number of lines each with two non-empty strings of characters, or names, separated by a single comma. Each name represents a vertex of the graph (a person in the social network), and each line represents an edge ("friendship"). Each edge (that is, vertex-vertex combination, not permutation) is guaranteed to be unique, and each name is guaranteed to consist of 20 or fewer characters. Edges in inputs cannot extend from a vertex to itself.

Each query will be supplied from standard input in the same manner as name pairs above (comma-delimited, each less than or equal to 20 characters in length), giving names of people on which one (or both) of the two tasks are to be performed. Each name must also be present in inputs. Both input sources will be terminated by a newline character, and there will not be any other whitespace throughout, aside from other implicitly necessary newline characters. Commas will not appear in any names.

Output

If the second command-line argument is distance, the Friends program will then output for each query q:

If the second command-line argument is mutual, the Friends program will then output for each query q:

If the second command-line argument is both, then the output should be the specified output for distance, followed by the specified output for mutual, for each query supplied in standard input.

The output should be formatted as in the example below. Note that, if distance is specified and some query consists of vertices between which there exists no path, the string v, v' are not connected should be printed to standard output followed by a newline character. Similarly, if two identical vertices are provided in some query under distance the string v, v' are the same person should be printed to standard output followed by a newline character. Similarly, if mutual is specified and some query consists of names that have no mutual friends, the string v, v' have no mutual friends should be printed to standard output followed by a newline character. If any of the input (standard input, inputs, or the remaining command-line argument) is invalid for any reason, then the output can be anything and your program must not crash or hang.

Note that the required runtime bounds given above are specified for the graph ADT. If we wish to express required runtime as a function of graph size and queries, the bounds are necessarily different.

Several files have been made available to you in /c/cs223/hw6/Required/ and /c/cs223/hw6/Optional/. In particular, consider the incomplete graph.c, which features at its heart an adjacency matrix. You will probably want to change what's given iteratively, gradually working towards a solution that satisfies all of the specified time bounds. Additionally, note that gmap is a dependency of graph.c, but only its header file accompanies. Finally, you may wish to incorporate gqueue, an ADT implementing a generic queue, into your submission in some capacity.

If we let Q be the number of provided queries, Friends must run in expected $O(Elog(E) * Q)$ if mutual is given as the second command-line argument and in expected $O((E + V) * Q)$if distance is given as the second command-line argument.

Examples

[crb84@turtle hw6]$ cat tests/graph_alpha_subgraphs_2.in 
A,B
B,C
D,E
[crb84@turtle hw6]$ cat tests/query_alpha_4.in 
A,B
A,C
B,D
B,E
[crb84@turtle hw6]$ ./Friends tests/graph_alpha_subgraphs_2.in distance < tests/query_alpha_4.in
A, B have minimum path length 1
A, C have minimum path length 2
B, D are not connected
B, E are not connected
[crb84@turtle hw6]$ cat tests/graph_alpha_mini_11.in 
A,B
B,C
A,C
A,F
F,B
F,E
E,D
B,D
C,D
D,G
G,E
B,E
[crb84@turtle hw6]$ cat tests/query_alpha_3.in 
A,E
C,F
B,G
[crb84@turtle hw6]$ ./Friends tests/graph_alpha_mini_11.in both < tests/query_alpha_3.in 
A, E have minimum path length 2
A, E have mutual friends B F
C, F have minimum path length 2
C, F have mutual friends A B
B, G have minimum path length 2
B, G have mutual friends D E

Submissions

Submit your makefile, log file, and source code necessary to build the executables Friends and GraphUnit using your makefile. Do not submit graph.h or graph_unit.c, as you may not change those files. Your makefile should assume that the files from /c/cs223/hw6/Required have been copied into the current directory but not that they are still in their original locations, and you should not submit those files.