YALE UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE

 CPSC 223b: Data Structures and Programming TechniquesHandout #15
Professor M. J. Fischer  April 8, 2007



 

Problem Set 8

Due before midnight on Tuesday, April 17, 2007.

1 Symbol Tables and Heaps

Symbol tables and heaps are both generally useful data structures. A fairly general implementation of a symbol table and a heap were presented in class demos 16.1_symtab and 18.1_heap, respectively. Slightly cleaned up versions of this code have been placed in the course directory

/c/cs223/assignments/ps8

Unfortunately, these implementations both lack the ability to delete arbitrary elements from their data structures, which tends to limit their usefulness. Deletion is often needed in applications (and will be needed for Problem Set 9, which will use these data structures). This assignment is to add that missing functionality to both data structures.

2 Deletion from a Symbol Table

You should add the following prototype to the interface symtab.h:

  bool delete_symtab( symtab table, void⋆ value );

and a corresponding implementation of delete_symtab() to the code file symtab.c. If value is not present in symtab, delete_symtab() should simply return false. Otherwise, it should remove value from symtab, free value using the domain-specific free_element() function, and return true. A function pointer to free_element() is placed in the element descriptor that is passed to new_symtab() when the symbol table is first created. (See the code for further information about how this is actually done.)

3 Deletion from a Heap

You should add the following prototype to the interface heap.h:

  void⋆ delete_heap( heap h, int slot );

and a corresponding implementation of delete_heap() to the code file heap.c. Here, slot is an integer giving the node number (array index in the heap) that contains the element to be deleted. Recall that the root of the tree is slot 1, and slot 0 is unused. If slot is not present in the heap, delete_heap() should simply return NULL. Otherwise, it should remove and return the value contained in the specified slot. The value returned should not be freed.

3.1 Managing slot numbers

A natural question is, “How is the client supposed to know what slot an element is in that it wants to delete from the heap?” The answer is that the heap will notify the client about each element’s location in the heap when the element is first inserted into the heap, whenever its location changes, and when it is finally deleted from the heap.

The way the heap notifies the client is through a call-back function of type notify_fn. You should add the following type definition to heap.h:

  typedef void (⋆notify_fn) (void⋆ value, int slot);

You should also modify the prototype for new_heap to take a second function argument as follows:

  heap new_heap( comp_fn comp, notify_fn notify);

This function pointer is stored in the heap data structure in a new field notify. Whenever any heap operation moves an element value it will call notify( value, newslot ), where newslot is the new location of the element. When value is removed from the heap, notify is called with the new slot set to -1 (to mean “not in heap”).

It’s up to the client to keep track of the location information provided by notify() if it wants to be able to later delete specific elements. The most convenient way to manage this is to store the current location of each element into the element record itself. We do this by defining notify(value, newslot) to simply store newslot into the appropriate field of value.

3.2 Removing elements from the middle of a heap

An algorithm for deleting an element from the middle of a heap was presented in class. First, the element to be deleted is removed from the heap, leaving a hole. Next, the hole is moved up to the root by sliding each element down a level that lies on the path between the hole and the root. Now that the hole is at the root, the deletion is completed using the method already implemented by delete_min(), i.e., the last node in the heap is deleted and the element residing there is reinserted into the heap by starting at the root and moving the hole down until it reaches a place where that element can be inserted. In order to preserve the heap property, one always walks toward the smaller of the two sons.

This method can be improved still further. Rather than move the hole all the way up to the root and then back down, we’re smarter about which way to move. Let x be the element being deleted, and let v be the element in the last node of the heap. As before, we delete the last node of the heap and place the hole at the slot originally containing x, but now we compare v with x in order to decide what to do next. If v is larger than x, we walk down directly from the hole and reinsert v into the queue, just as delete_min() already does when the hole is at the root. On the other hand, if v is smaller than x, we insert it into the path from the hole to the root, just as insert_heap() already does when inserting a new element into the heap. You’ll notice that I have separated out the hole-moving code into two local functions move_up and move_down in the implementation file heap.c.

A couple of examples should help make this clear. Figure 1 shows a heap.


PIC

Figure 1: A heap.


Figure 2 illustrates the process of deleting node 14. First 14 is removed, becoming a hole, and 6 is removed from the bottom right leaf. Since 6 is smaller than 14, the hole is moved up a level by by moving 9 down. The node just vacated is then filled with 6.


PICPIC

Figure 2: Deleting node 14.


Figure 3 illustrates the process of deleting node 3. Since 6 is larger than 3, the hole must move down.


PICPIC

Figure 3: Deleting node 3.


4 Assignment

Implement and test the extended symtab and heap data structures, modify the given test programs to allow the new deletion operations to be exercised, and write some additional test programs and/or test data sets to thoroughly test the data structures.

5 Deliverables

You should submit complete source code and a Makefile for your modules. You should also submit your test programs and data sets.