Exam #2 Practice - CPSC 223 - Data Structures and Programming Techniques

Problem 0: Review old programming assignments, lectures, and readings. Use Piazza to suggest a question you feel is missing from this collection of practice problems.

Problem 1: Comment on the quality of the two suggested hash functions for arrays of integers, considering the conditions that are required for hash table operations to work in $O(1)$ expected time.

the hash value is the bitwise exclusive-or of all the integers in the array
the hash value is the sum of all the integers in the array
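To make the two candidates concrete, here is a minimal sketch of each in C. The names hash_xor and hash_sum, the unsigned int return type, and the choice to accumulate in unsigned arithmetic (so that overflow wraps rather than being undefined) are assumptions for illustration and are not specified by the problem.

```c
#include <stddef.h>

/* Hypothetical helper: bitwise exclusive-or of all n integers in a. */
unsigned int hash_xor(const int *a, size_t n)
{
  unsigned int h = 0;
  for (size_t i = 0; i < n; i++)
    {
      h ^= (unsigned int) a[i];
    }
  return h;
}

/* Hypothetical helper: sum of all n integers in a.
   Assumption: the sum is taken in unsigned arithmetic so it wraps on overflow. */
unsigned int hash_sum(const int *a, size_t n)
{
  unsigned int h = 0;
  for (size_t i = 0; i < n; i++)
    {
      h += (unsigned int) a[i];
    }
  return h;
}
```

Evaluating both functions by hand on a few small arrays that are closely related to each other is one way to start judging how well they spread keys.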

Problem 2: Consider the problem of checking whether the integers in a given list are all different. Give an algorithm for this problem that runs in $O(n)$ time in the average case under reasonable assumptions. Then give an algorithm that runs in $O(n \log n)$ time in the worst case.

Problem 3: Suppose we have the following hash function:

s	hash(s)
ant	43
bat	23
cat	64
dog	19
eel	26
fly	44
gnu	93
yak	86

Show the results of adding the strings in alphabetical order into a hash table of size 8, first using chaining, then using open addressing with linear probing.
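The problem does not say how a hash value is mapped to a slot or how probing advances, so the sketch below states one common convention as an assumption: the home slot is the hash value modulo the table size, and linear probing moves to the next slot with wraparound. The function names home_slot and next_probe are hypothetical.

```c
#include <stddef.h>

/* Assumption: a key's home slot is its hash value modulo the table size. */
size_t home_slot(unsigned int hash, size_t table_size)
{
  return hash % table_size;
}

/* Assumption: linear probing tries the following slot, wrapping to 0
   after the last slot. */
size_t next_probe(size_t slot, size_t table_size)
{
  return (slot + 1) % table_size;
}
```

Under this convention, chaining appends each string to the list at its home slot, while linear probing calls next_probe repeatedly from the home slot until it reaches an empty slot.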

Problem 4: Add the ismap_remove operation to the implementation from Oct. 30 and modify the existing functions as necessary.

Problem 5: Draw the binary search tree that results from adding SEA, ARN, LOS, BOS, IAD, SIN, and CAI in that order. Find an order in which to add those keys that results in a tree of minimum height. Find an order that results in a tree of maximum height. For your original tree, show the result of removing SEA, then show the order in which the nodes would be processed by an inorder traversal.

Problem 6: Repeat the original adds and delete from the previous problem for an AVL tree.

Problem 7: Repeat the original adds from Problem 5 for a Red-Black tree.

Problem 8: Repeat the original adds and delete from Problem 5 for a Splay tree using bottom-up splaying.

Problem 9: Implement the expand operation for isset, which implements a set of integers using a binary search tree where the nodes contain disjoint intervals of integers and adjacent intervals are merged into a single node. So if 3, 4, 5, 6, 7, 9, and 10 were all in the set then they would be represented by two nodes: one containing the interval [3, 7] and another containing [9, 10]. Adding 8 would merge those two nodes into one containing the interval [3, 10].

The expand operation takes an integer as its parameter and expands the interval containing it to be as large as possible without requiring a merge. It does not change the left endpoint of the leftmost interval in the tree or the right endpoint of the rightmost interval, and it does nothing if the integer is not in the set. For example, if a set contains [3, 7], [10, 14], [20, 24], [90, 91] then expand(1) would do nothing, expand(4) would change [3, 7] into [3, 8], expand(22) would change [20, 24] into [16, 88], and expand(90) would change [90, 91] into [26, 91]. Make sure your implementation runs in $O(\log n)$ time (worst case for AVL trees, amortized for splay trees).
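The endpoint arithmetic in those examples can be checked against a small reference version that works on a sorted array of intervals instead of a tree. This is only a sketch of the intended semantics: it runs in linear time, so it is not a solution, and the interval type and the name expand_reference are hypothetical.

```c
#include <stddef.h>

/* Hypothetical interval type: the closed range [lo, hi]. */
typedef struct
{
  int lo;
  int hi;
} interval;

/* Reference semantics only (linear time, array instead of a tree).
   Assumes iv holds n intervals sorted by lo, disjoint and non-adjacent.
   Returns 1 if some interval contains x (and expands it), 0 otherwise. */
int expand_reference(interval *iv, size_t n, int x)
{
  for (size_t i = 0; i < n; i++)
    {
      if (iv[i].lo <= x && x <= iv[i].hi)
        {
          if (i > 0)
            {
              /* leave a one-integer gap after the predecessor */
              iv[i].lo = iv[i - 1].hi + 2;
            }
          if (i + 1 < n)
            {
              /* leave a one-integer gap before the successor */
              iv[i].hi = iv[i + 1].lo - 2;
            }
          return 1;
        }
    }
  return 0;
}
```

The gap of one integer on each side is what keeps the expanded interval from becoming adjacent to a neighbor, which would force a merge. In the tree version the same two neighbors are the inorder predecessor and successor of the node containing the integer.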

Problem 10: Consider the remove_incoming operation, which takes a vertex $v$ in a directed graph and removes all incoming edges to that vertex. Explain how to implement remove_incoming when the graph is represented using an adjacency matrix and then for an adjacency list. What is the asymptotic running time of your implementations in terms of the number of vertices $n$ and the total number of edges $m$?

Problem 11: Show the DFS tree that results from running DFS on the following directed graph. Start at vertex $a$ and when you get to a vertex, consider its neighbors in alphabetical order.

Problem 12: Show the BFS tree that results from running BFS on the graph in the previous problem. Start at vertex $a$ and after dequeueing a vertex, consider its neighbors in alphabetical order.

Problem 13: The following function computes $C(n, k)$, where $C(n, 0) = 1$ and $C(n, n) = 1$ for all $n \ge 0$, and $C(n, k) = C(n - 1, k - 1) + C(n - 1, k)$ for all $n, k$ such that $1 \le k \le n - 1$. ($C(n, k)$ is the number of distinct size-$k$ subsets of $\{1, \ldots, n\}$.)

int choose(int n, int k)
{
  if (k == 0 || n == k)
    {
      return 1;
    }
  else
    {
      int including_n = choose(n - 1, k - 1);
      int not_including_n = choose(n - 1, k);

      return including_n + not_including_n;
    }
}

Explain why this code is inefficient.
Write more efficient code to compute $C(n, k)$. What is the running time of your code in terms of $n$ and $k$? What is the space requirement? Can you reduce the space requirement to $\Theta(n)$ (if you are not there already)?
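For the first question, one way to gather evidence is to count how many calls the recursion makes. The sketch below is the same recursion as choose with a counter added. The global choose_calls and the name choose_counted are hypothetical and are not part of the original code.

```c
/* Hypothetical counter: total number of calls made so far. */
static unsigned long choose_calls = 0;

/* Same recursion as choose, instrumented to count every call. */
int choose_counted(int n, int k)
{
  choose_calls++;
  if (k == 0 || n == k)
    {
      return 1;
    }
  else
    {
      int including_n = choose_counted(n - 1, k - 1);
      int not_including_n = choose_counted(n - 1, k);

      return including_n + not_including_n;
    }
}
```

Resetting choose_calls to 0 and then calling choose_counted for increasing values of $n$ with $k$ near $n / 2$ shows how the call count grows relative to the number of distinct $(n, k)$ pairs the recursion can reach.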