{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## CS 200: Data Structures and HW4\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have seen a variety of basic data types in Python, including integers, strings, lists, tuples, and dictionaries.\n", "\n", "We have also seen how object-oriented programming allows us to define classes that have methods and properties to encapsulate data.\n", "\n", "Now, we will use classes to define additional data structures. If you consider the primitive data types as atomic elements, then data structures can be viewed as molecules that are formed by combining various elements.\n", "\n", "In this notebook, we shall define and discuss the following data structures:\n", "\n", "- stacks\n", "- queues\n", "- hash tables\n", "- heaps\n", "- trees\n", "- graphs\n", "\n", "\n", "### Stacks\n", "\n", "A common use of classes is to implement data structures.\n", "Below is an example of a stack,\n", "which is a LIFO (last in, first out) structure.\n", "It is a collection in which the most recently added item is the first one removed.\n", "\n", "Items are added to the stack with push and removed with pop.\n", "\n", "We will see that the Python virtual machine for interpreting\n", "bytecode is based on a stack architecture." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "class stack:\n", "    ''' A class for a stack data structure. It is LIFO - last in, first out. '''\n", "    \n", "    def __init__(self, items = []):\n", "        ''' Constructor for a stack. Initialize the list of items and the size. '''\n", "        ## Why not say: self.items = items ?\n", "        self.items = items[:]\n", "        self.size = len(items)\n", "\n", "    def __repr__(self):\n", "        ''' return a string that evaluates to the stack. 
'''\n", "        return \"stack({})\".format(list(self.items))\n", "\n", "    def isEmpty(self):\n", "        ''' predicate: is the stack empty?'''\n", "        return self.items == []\n", "\n", "    def push(self, item):\n", "        ''' add an item to the end of the stack. '''\n", "        self.items.append(item)\n", "        self.size += 1\n", "\n", "    def peek(self):\n", "        ''' return the end of the stack, if not empty. '''\n", "        if self.isEmpty():\n", "            print (\"Error: stack is empty\")\n", "        else:\n", "            return self.items[-1]\n", "\n", "    def pop(self):\n", "        ''' Remove and return the item at the end of the stack. \n", "        If the stack is empty, print error message. '''\n", "        if self.isEmpty():\n", "            print (\"Error: stack is empty\")\n", "        else:\n", "            self.size -= 1\n", "            return self.items.pop()\n", "\n", "    def rotate(self):\n", "        ''' swap the top two items in the stack. '''\n", "        if self.size < 2:\n", "            print (\"Error: stack has fewer than 2 elements\")\n", "        else:\n", "            self.items[-1], self.items[-2] = self.items[-2], self.items[-1]\n", "\n", "    def __iter__(self):\n", "        \"\"\"Return iterator for the stack. Used in for loop or list comprehension. \"\"\"\n", "        if self.isEmpty():\n", "            return None\n", "        else:\n", "            index = self.size -1\n", "            while index >= 0:\n", "                yield self.items[index]\n", "                index -= 1\n", "\n", "    def __eq__(self, other):\n", "        ''' equality predicate for stacks. (==) '''\n", "        if type(other) != type(self):\n", "            return False\n", "        if self.items == other.items:\n", "            return True\n", "        else:\n", "            return False\n", "    \n", "    def copy(self):\n", "        ''' copy constructor - clone the current instance. '''\n", "        s = stack(self.items)\n", "        return s" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on class stack in module __main__:\n", "\n", "class stack(builtins.object)\n", " | stack(items=[])\n", " | \n", " | A class for a stack data structure. 
It is LIFO - last in, first out.\n", " | \n", " | Methods defined here:\n", " | \n", " | __eq__(self, other)\n", " | equality predicate for stacks. (==)\n", " | \n", " | __init__(self, items=[])\n", " | Constructor for a stack. Initialize the list of items and the size.\n", " | \n", " | __iter__(self)\n", " | Return iterator for the stack. Used in for loop or list comprehension.\n", " | \n", " | __repr__(self)\n", " | return a string that evaluates to the stack.\n", " | \n", " | copy(self)\n", " | copy constructor - clone the current instance.\n", " | \n", " | isEmpty(self)\n", " | predicate: is the stack empty?\n", " | \n", " | peek(self)\n", " | return the end of the stack, if not empty.\n", " | \n", " | pop(self)\n", " | Remove and return the item at the end of the stack. \n", " | If the stack is empty, print error message.\n", " | \n", " | push(self, item)\n", " | add an item to the end of the stack.\n", " | \n", " | rotate(self)\n", " | swap the top two items in the stack.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data descriptors defined here:\n", " | \n", " | __dict__\n", " | dictionary for instance variables (if defined)\n", " | \n", " | __weakref__\n", " | list of weak references to the object (if defined)\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data and other attributes defined here:\n", " | \n", " | __hash__ = None\n", "\n" ] } ], "source": [ "help(stack)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take our stack out for a test drive." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "s = stack()\n", "s.push(1)\n", "s.push(2)\n", "s.push(3)\n", "s.push(4)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "stack([1, 2, 3, 4])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.isEmpty()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.peek()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.pop()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "stack([1, 2, 3])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "s.rotate()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "stack([1, 3, 2])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(s) == stack" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "s2 = s.copy()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, 
"outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s == s2" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 3, 2]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.items" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 3, 2]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2.items" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.items == s2.items" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "s2.rotate()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "stack([1, 2, 3])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s == s2" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "s2.rotate()" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s == s2" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "stack([1, 3, 2])" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, 
"outputs": [], "source": [ "def test(s):\n", " ''' test for the iterator. '''\n", " for i in s:\n", " print (i)\n", " return [x for x in s]" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "3\n", "1\n" ] }, { "data": { "text/plain": [ "[2, 3, 1]" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test(s)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3\n", "2\n", "1\n" ] }, { "data": { "text/plain": [ "[3, 2, 1]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2.rotate()\n", "test(s2)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def revstr(str):\n", " ''' revstr(str) uses a stack to reverse a string. \n", " It works on a copy and does not modify the original string.'''\n", " s = stack()\n", " for c in str:\n", " s.push(c)\n", " result = []\n", " while (not s.isEmpty()):\n", " result.append(s.pop())\n", " return ''.join(result)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'!dlrow olleh'" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "revstr('hello world!')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### hw4 problem 1 (8 points)\n", "\n", "\n", "Write a procedure balanced(string) that reads string, and determines\n", "whether its parentheses are \"balanced.\" \n", "\n", "Hint: for left delimiters,\n", "push onto stack; for right delimiters, pop from stack and check\n", "whether popped element matches right delimiter.\n", "\n" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "def balanced(string):\n", " pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will 
import the staff solution to demonstrate the functions." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "import hw4a" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.balanced('(()))')" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.balanced('()')" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.balanced('((())())')" ] }, { "cell_type": "code", "execution_count": 152, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 152, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.balanced('abcd(1234)dfg')" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.balanced('')" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.balanced('abcdef)')" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.balanced('abc(')" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } 
], "source": [ "hw4a.balanced(')(')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Queues\n", "\n", "In the homework, we ask you to write the queue class.\n", "\n", "Write a queue data structure, similar to the stack above.\n", "Whereas a stack is LIFO (last in, first out), a queue is \n", "FIFO (first in, first out).\n", "\n", "See Steven Skiena, The Algorithm Design Manual, page 71 (Yale online book).\n", "\n" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "class queue:\n", "    ''' A queue data structure: First In First Out FIFO.'''\n", "    def __init__(self, stuff=[]):\n", "        ''' Constructor for a queue object. '''\n", "        pass\n", "\n", "    def __str__(self):\n", "        ''' Render queue instance as a string. '''\n", "        pass\n", "\n", "    def __repr__(self):\n", "        ''' Render queue instance as a string that evaluates to the object. '''\n", "        pass\n", "\n", "    def isempty(self):\n", "        ''' Is the queue empty? true or false'''\n", "        pass\n", "\n", "    def enqueue(self, item):\n", "        ''' Add an item to the queue'''\n", "        pass\n", "\n", "    def dequeue(self):\n", "        ''' remove next item from the queue. error message if queue is empty'''\n", "        pass\n", "\n", "    def peek(self):\n", "        ''' return the next item without removing it.\n", "        Error message if queue is empty.'''\n", "        pass\n", "\n", "    def __iter__(self):\n", "        '''define the iterator for queue. Used in for or list comprehension\n", "        similar to iterator for stack. 
''' \n", "        pass\n", "\n", "    def __eq__(self, other):\n", "        ''' overload equality operator'''\n", "        pass\n", "\n", "    def copy(self):\n", "        ''' copy constructor - clone the current instance'''\n", "        pass" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "d = hw4a.queue()\n", "d.enqueue(9)\n", "d.enqueue(1)\n", "d.enqueue(2)" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d == d.copy()" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.peek()" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[9, 1, 2]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.data" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[9, 1, 2]" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in d]" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 in d" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "5 in d" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.dequeue()" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "1" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.dequeue()" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.isempty()" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.dequeue()" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.isempty()" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 in d" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'queue is empty'" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.dequeue()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### hw4 problem 3 (10 points)\n", "\n", "Create a queue using two stacks: s1 and s2.\n", "\n", "enqueue() pushes items on s1.\n", "\n", "dequeue() pops s2, unless s2 is empty, in which case\n", "keep popping s1 onto s2 until s1 is empty. Then pop s2.\n", "\n", "peek is similar to dequeue, except no final pop." ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "class queue2:\n", " ''' queue implemented using two stacks. '''\n", " \n", " def __init__(self, stuff1 = [], stuff2 = []):\n", " ''' initialize stacks. 
'''\n", " self.s1 = stack(stuff1[:])\n", " self.s2 = stack(stuff2[:])\n", "\n", " def __str__(self):\n", " pass\n", "\n", " def __repr__(self):\n", " pass\n", "\n", " def isempty(self):\n", " ''' is the queue empty? true or false'''\n", " return self.s1.isempty() and self.s2.isempty()\n", "\n", " def enqueue(self, item):\n", " ''' add an item to the queue'''\n", " pass\n", "\n", " def dequeue(self):\n", " ''' remove next item. error message if queue is empty'''\n", " pass\n", " \n", " def peek(self):\n", " ''' return the next item without removing it.\n", " return error message if queue is empty'''\n", " pass\n", "\n", " def __iter__(self):\n", " ''' define the iterator for queue2. Used in for or list comprehension\n", " HINT:\n", " convert stacks to lists.\n", " extend the stack 2 list with the reverse of the stack 1 list\n", " use a for loop to iterate through the extended list, \n", " yielding the item'''\n", " pass\n", " \n", " def __eq__(self, other):\n", " ''' overload equality operator\n", " true if both stacks are respectively equal\n", " use the convert stacks to list method given above for __iter__'''\n", " pass\n", "\n", " def copy(self):\n", " ''' copy constructor for queue '''\n", " pass" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [], "source": [ "d2 = hw4a.queue2()" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "d2.enqueue(9)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [], "source": [ "d2.enqueue(1)" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [], "source": [ "d2.enqueue(2)" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "queue2(stack([9, 1, 2]), stack([]))" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": 
{ "text/plain": [ "True" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2 == d2.copy()" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2 == hw4a.queue2([9,1,2])" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2 == hw4a.queue2([1,2])" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2.peek()" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "queue2(stack([]), stack([2, 1, 9]))" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[9, 1, 2]" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in d2]" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 in d2" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "5 in d2" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2.dequeue()" ] }, { "cell_type": "code", 
"execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "queue2(stack([]), stack([2, 1]))" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2.isempty()" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2.dequeue()" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2.dequeue()" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2.isempty()" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 in d2" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'queue is empty'" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2.dequeue()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### hw4 problem 4 (10 points)\n", "\n", "Write a procedure to reverse a queue. It modifies the original queue!\n", "It should work with either q implementation. That is, the function should use the standard methods, enqueue and dequeue which are common to both implementations.\n", "This demonstrates the value of encapsulation." 
] }, { "cell_type": "code", "execution_count": 194, "metadata": {}, "outputs": [], "source": [ "def reverseq(q):\n", " pass" ] }, { "cell_type": "code", "execution_count": 195, "metadata": {}, "outputs": [], "source": [ "q = hw4a.queue()\n", "q.enqueue(1)\n", "q.enqueue(2)\n", "q.enqueue(3)\n", "q.enqueue(4)" ] }, { "cell_type": "code", "execution_count": 196, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "queue([1, 2, 3, 4])" ] }, "execution_count": 196, "metadata": {}, "output_type": "execute_result" } ], "source": [ "q" ] }, { "cell_type": "code", "execution_count": 197, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "queue([4, 3, 2, 1])" ] }, "execution_count": 197, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.reverseq(q)" ] }, { "cell_type": "code", "execution_count": 198, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "queue([4, 3, 2, 1])" ] }, "execution_count": 198, "metadata": {}, "output_type": "execute_result" } ], "source": [ "q" ] }, { "cell_type": "code", "execution_count": 200, "metadata": {}, "outputs": [], "source": [ "q2 = hw4a.queue2()\n", "q2.enqueue(1)\n", "q2.enqueue(2)\n", "q2.enqueue(3)\n", "q2.enqueue(4)" ] }, { "cell_type": "code", "execution_count": 201, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "queue2(stack([4, 3, 2, 1]), stack([]))" ] }, "execution_count": 201, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.reverseq(q2)" ] }, { "cell_type": "code", "execution_count": 202, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "queue2(stack([4, 3, 2, 1]), stack([]))" ] }, "execution_count": 202, "metadata": {}, "output_type": "execute_result" } ], "source": [ "q2" ] }, { "cell_type": "code", "execution_count": 203, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "hw4a.queue2" ] }, "execution_count": 203, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(q2)" ] }, { "cell_type": "code", 
"execution_count": 204, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "hw4a.queue" ] }, "execution_count": 204, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(q)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Hash Tables. hw4 problem 5 (20 points)\n", "\n", "Python dicts are implemented as hash tables.\n", "\n", "Reading: Skiena pages 89-93\n", "\n", "Video: hash tables\n", "\n", "Create a hash table.\n", "It will be a list of size buckets.\n", "Each bucket will itself contain a list.\n", "If two items fall in the same bucket,\n", "the respective list will contain both items.\n", "\n", "See Skiena page 89\n", "\n", "Create a hash function using the \n", " djb2 algorithm.\n", " \n", "We will show you some bad hash functions below." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "class myhash:\n", "\n", " def __init__(self, size = 20):\n", " ''' construct a hash tble of a given size, \n", " that is, with size buckets. '''\n", " pass\n", "\n", " def __repr__(self):\n", " pass\n", "\n", " def __str__(self):\n", " pass\n", "\n", " def isempty(self):\n", " ''' is the hash table empty? 
'''\n", " return self.count == 0\n", "\n", " def put(self, key, value):\n", " ''' add an item with the given key and value\n", " if there is already an item with the given key, remove it.\n", " no duplicate keys'''\n", " pass\n", "\n", " def get(self,key):\n", " ''' retrieve the value for the given key'''\n", " pass\n", "\n", " def remove(self,key):\n", " ''' remove the item for the given key'''\n", " pass\n", "\n", " def hashfun(self,key, debug=False):\n", " ''' create a hash function using the djb2 algorithm\n", " http://www.cse.yorku.ca/~oz/hash.html\n", " If the optional debug parameter is true\n", " Print out the value of the hash\n", " '''\n", " pass\n", "\n", " def __iter__(self):\n", " ''' iterate through the buckets and their respective contents\n", " ''' \n", " pass\n", "\n", " def __eq__(self, other):\n", " ''' overload the equality operator '''\n", " pass\n", "\n", " def copy(self):\n", " ''' copy constructor - clone the current instance. '''\n", " pass" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "h = hw4a.myhash(20)\n", "h.put(\"one\",1)\n", "h.put(\"two\",2)\n", "h.put(\"three\",3)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"myhash([[], [], [], [], [], [], [], [('one', 1)], [], [], [], [], [], [], [], [], [], [('three', 3)], [], [('two', 2)]])\"" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(h)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "myhash(20)" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h == h.copy()" ] }, { "cell_type": "code", "execution_count": 35, 
"metadata": {}, "outputs": [], "source": [ "h2 = hw4a.myhash()\n", "h2.put(\"one\",1)\n", "h2.put(\"two\",2)\n", "h2.put(\"three\",3)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h == h2" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "h2.put(\"four\",4)" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h == h2" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"one\" in h" ] }, { "cell_type": "code", "execution_count": 213, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 213, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"zero\" in h" ] }, { "cell_type": "code", "execution_count": 214, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['one', 'three', 'two']" ] }, "execution_count": 214, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in h]" ] }, { "cell_type": "code", "execution_count": 216, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('one', 1), ('three', 3), ('two', 2)]" ] }, "execution_count": 216, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[(x,h.get(x)) for x in h]" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['one', 'four', 'three', 'two']" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in h2]" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "['__class__',\n", " '__delattr__',\n", " '__dict__',\n", " '__dir__',\n", " '__doc__',\n", " '__eq__',\n", " '__format__',\n", " '__ge__',\n", " '__getattribute__',\n", " '__gt__',\n", " '__hash__',\n", " '__init__',\n", " '__init_subclass__',\n", " '__iter__',\n", " '__le__',\n", " '__lt__',\n", " '__module__',\n", " '__ne__',\n", " '__new__',\n", " '__reduce__',\n", " '__reduce_ex__',\n", " '__repr__',\n", " '__setattr__',\n", " '__sizeof__',\n", " '__str__',\n", " '__subclasshook__',\n", " '__weakref__',\n", " 'copy',\n", " 'count',\n", " 'get',\n", " 'hashfun',\n", " 'isempty',\n", " 'put',\n", " 'remove',\n", " 'size',\n", " 'table']" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dir(h)" ] }, { "cell_type": "code", "execution_count": 217, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [('one', 1)],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [],\n", " [('three', 3)],\n", " [],\n", " [('two', 2)]]" ] }, "execution_count": 217, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.table" ] }, { "cell_type": "code", "execution_count": 218, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 218, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.hashfun(\"one\")" ] }, { "cell_type": "code", "execution_count": 219, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 219, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.hashfun(\"four\")" ] }, { "cell_type": "code", "execution_count": 220, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "17" ] }, "execution_count": 220, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.hashfun('three')" ] }, { "cell_type": "code", "execution_count": 222, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "19" ] }, "execution_count": 222, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.hashfun('two')" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.get('three')" ] }, { "cell_type": "code", "execution_count": 223, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 223, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.get('four')" ] }, { "cell_type": "code", "execution_count": 224, "metadata": {}, "outputs": [], "source": [ "h.remove('three')" ] }, { "cell_type": "code", "execution_count": 225, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 225, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.get('three')" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "dd = {}" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "ename": "KeyError", "evalue": "'a'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mdd\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'a'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mKeyError\u001b[0m: 'a'" ] } ], "source": [ "dd['a']" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "dd.get('a')" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Hash of: one = 193501607 ==> 7'" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "h.hashfun('one', True)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Hash of: two = 193507359 ==> 19'" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.hashfun('two', True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bad hash functions\n", "\n", "Hash functions are common and useful in many programming applications. They are critical in many cryptography systems. For examples, bitcoin depends on its hash function being (nearly) impossible to invert (one-way). We will return to the topic of cryptography in a few weeks.\n", "\n", "A crypto hash function h(x) must provide the following:\n", "\n", " \n", "Below are some bad hash functions." ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "def badhash(str):\n", " return len(str)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "12" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash('hello world!')" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "12" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash(\"this is test\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This function achieves the first three objectives: compression, efficiency, and one-way. However, it is not collision resistant. Lots of strings will end up in the same bucket. You want a function that will spread the keys around to different buckets.\n", "\n", "Dynamic hash tables can grow over time. The hash table will start with a table size of N, and once N/2 items have been inserted, the table will expand to 2N. Doing so insures that the table does not fill up. 
Of course this technique will not be effective if the hash function throws every key in the same handful of buckets.\n", "\n", "Below is another bad function. This one sums the ASCII values of the characters in the string." ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [], "source": [ "def badhash2(str):\n", "    result = 0\n", "    for c in str:\n", "        result += ord(c)\n", "    return result" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "97" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ord('a')" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "65" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ord('A')" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1149" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash2('hello world!')" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1172" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash2(\"this is test\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is a slight improvement over the first hash. However, it is still deplorable."
] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [], "source": [ "x = ''.join(sorted('hello world!'))" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "' !dehllloorw'" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1149" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash2(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you have the same characters in a different order, you get the same hash value. Let's try to fix that problem." ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "def badhash3(str):\n", "    result = 0\n", "    for c in str:\n", "        result += ord(c)\n", "        result *= 2\n", "    return result" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "845554" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash3('hello world!')" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "887900" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash3('this is test')" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "' !dehllloorw'" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "406942" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "badhash3(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By inserting the multiplication step, we have reduced 
the collision problem.\n", "\n", "The djb2 hash function follows this approach of combining addition with multiplication. Note that ((hash << 5) + hash) is the same as multiplying by 33. The shift form was traditionally preferred because multiplication was typically slower than\n", "shifts and addition. Here is the C code for djb2.\n", "\n", "
\n",
    "    unsigned long\n",
    "    hash(unsigned char *str)\n",
    "    {\n",
    "        unsigned long hash = 5381;\n",
    "        int c;\n",
    "\n",
    "        while (c = *str++)\n",
    "            hash = ((hash << 5) + hash) + c; /* hash * 33 + c */\n",
    "\n",
    "        return hash;\n",
    "    }\n",
    "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### hw4 problem 6 ** (10 points)\n", "\n", "Use your hash function to implement remove duplicates for strings. \n", "\n", "Hint: you want to use the hash table to answer the question: have I seen this character already?" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [], "source": [ "def removedups(string):\n", " pass" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'abc'" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.removedups('abcabcabc')" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'cba'" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.removedups('cbacbacba')" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'abcdef'" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.removedups('abcabcabcdef')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heaps\n", "\n", "Video: heap sort\n", "\n", "See heap data structure\n", "\n", "
\n", " In computer science, a heap is a specialized tree-based data structure which is essentially an almost complete tree that satisfies the heap property: in a max heap, for any given node C, if P is a parent node of C, then the key (the value) of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C. The node at the \"top\" of the heap (with no parents) is called the root node.\n", "
\n", "

Below is a max heap.\n", "

\n", " \n", " \n", "
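As a small sketch of the heap property described above, the helpers below check whether a flat list satisfies it. The names `is_max_heap` and `is_min_heap` are assumptions for illustration. The 0-based layout, with the children of index i at 2*i + 1 and 2*i + 2, is the one heapq uses, not the 1-based layout of the hw4 heap class.

```python
def is_max_heap(items):
    # Max heap: every non-root key is less than or equal to its parent's key.
    return all(items[(i - 1) // 2] >= items[i] for i in range(1, len(items)))


def is_min_heap(items):
    # Min heap: every non-root key is greater than or equal to its parent's key.
    return all(items[(i - 1) // 2] <= items[i] for i in range(1, len(items)))


print(is_max_heap([9, 7, 8, 3, 5]))                 # True
print(is_min_heap([0, 1, 2, 6, 3, 5, 4, 7, 8, 9]))  # True
```

A sorted list always passes the min heap check, but a valid heap does not have to be sorted, since only the parent and child relation is constrained.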

We use Python's heapq module, which implements a min heap." ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "from heapq import heappush, heappop, heapify\n", "\n", "heap = []\n", "data = [1,3,5,7,9,2,4,6,8,0]\n", "for item in data:\n", "    heappush(heap, item)" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 6, 3, 5, 4, 7, 8, 9]" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "heap" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [], "source": [ "ordered = []\n", "def h1():\n", "    while heap:\n", "        ordered.append(heappop(heap))" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ordered" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [], "source": [ "h1()" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ordered" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 3, 5, 7, 9, 2, 4, 6, 8, 0]" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [], "source": [ "heapify(data)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 6, 3, 5, 4, 7, 8, 9]" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [], "source": [ "data.sort()" ] }, { "cell_type": "code", "execution_count": 
77, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data == ordered" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### hw4 ** problem 7 ** (20 points)\n", "\n", "Reading: Skiena pages 109-115\n", "\n", "Skiena sorting chapter\n", "\n", "Implement a min heap per the description in Skiena." ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [], "source": [ "class heap:\n", "\n", "    def __init__(self, size = 10):\n", "        pass\n", "\n", "    def __str__(self):\n", "        pass\n", "\n", "    def __repr__(self):\n", "        pass\n", "\n", "    def isempty(self):\n", "        pass\n", "\n", "    def insert(self,item):\n", "        ''' add a new element to the heap and adjust as needed '''\n", "        pass\n", "\n", "    def bubbleup(self, n):\n", "        ''' This could be tricky. I am defining it for you. '''\n", "        if heap.parent(n) == -1:\n", "            return\n", "        if self.data[heap.parent(n)] > self.data[n]:\n", "            self.data[n],self.data[heap.parent(n)] = self.data[heap.parent(n)],self.data[n]\n", "            self.bubbleup(heap.parent(n))\n", "\n", "    def extractmin(self):\n", "        ''' remove the smallest element and adjust the heap '''\n", "        pass\n", "\n", "    def bubbleDown(self,p):\n", "        ''' This could be tricky. I am defining it for you. '''\n", "        c = self.child(p)\n", "        min_index = p\n", "\n", "        for i in [0, 1]:\n", "            if ((c + i) <= self.count):\n", "                if self.data[min_index] > self.data[c + i]:\n", "                    min_index = c+i\n", "\n", "        if min_index != p:\n", "            self.data[p], self.data[min_index] = self.data[min_index], self.data[p]\n", "            self.bubbleDown(min_index)\n", "\n", "    @staticmethod\n", "    def parent(n):\n", "        ''' I define this for you. '''\n", "        if (n == 1):\n", "            return (-1)\n", "        else:\n", "            return int(n/2)\n", "\n", "    @staticmethod\n", "    def child(n):\n", "        ''' I define this for you. '''\n", "        return (2 * n)\n", "\n", "\n", "    def __iter__(self):\n", "        ''' define the iterator for heap. Used in for or list comprehension'''\n", "        pass\n", "\n", "    def __eq__(self, other):\n", "        ''' overload equality operator'''\n", "        pass\n", "\n", "    def copy(self):\n", "        ''' copy constructor - clone the current instance '''\n", "        pass\n" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [], "source": [ "import hw4a" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [], "source": [ "hh = hw4a.heap(10)" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "heap(10)" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'heap( [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] )'" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(hh)" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [], "source": [ "hh.insert(12)" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'heap( [0, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0] )'" ] }, "execution_count": 86, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(hh)" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [], "source": [ "hh.insert(4)" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'heap( [0, 4, 12, 0, 0, 0, 0, 0, 0, 0, 0] )'" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(hh)" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, 
"outputs": [], "source": [ "hh.insert(8)" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'heap( [0, 4, 12, 8, 0, 0, 0, 0, 0, 0, 0] )'" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(hh)" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh == hh.copy()" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [], "source": [ "hh2 = hw4a.heap(10)\n", "hh2.insert(12)\n", "hh2.insert(4)\n", "hh2.insert(8)" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'heap( [0, 4, 12, 8, 0, 0, 0, 0, 0, 0, 0] )'" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(hh2)" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh == hh2" ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [], "source": [ "hh2.insert(40)" ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh == hh2" ] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'heap( [0, 4, 12, 8, 40, 0, 0, 0, 0, 0, 0] )'" ] }, "execution_count": 97, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(hh2)" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "4 in hh" ] }, { 
"cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "40 in hh" ] }, { "cell_type": "code", "execution_count": 100, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 12, 8]" ] }, "execution_count": 100, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in hh]" ] }, { "cell_type": "code", "execution_count": 101, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 12, 8, 40]" ] }, "execution_count": 101, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in hh2]" ] }, { "cell_type": "code", "execution_count": 102, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 102, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh.child(1)" ] }, { "cell_type": "code", "execution_count": 103, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 103, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh.child(2)" ] }, { "cell_type": "code", "execution_count": 104, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 104, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh.parent(4)" ] }, { "cell_type": "code", "execution_count": 105, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 105, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hh.extractmin()" ] }, { "cell_type": "code", "execution_count": 106, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'heap( [0, 8, 12, 0, 0, 0, 0, 0, 0, 0, 0] )'" ] }, "execution_count": 106, "metadata": {}, "output_type": "execute_result" } ], "source": [ "str(hh)" ] }, { "cell_type": "code", "execution_count": 107, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 107, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "hh.extractmin()" ] }, { "cell_type": "code", "execution_count": 108, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['__class__',\n", " '__delattr__',\n", " '__dict__',\n", " '__dir__',\n", " '__doc__',\n", " '__eq__',\n", " '__format__',\n", " '__ge__',\n", " '__getattribute__',\n", " '__gt__',\n", " '__hash__',\n", " '__init__',\n", " '__init_subclass__',\n", " '__iter__',\n", " '__le__',\n", " '__lt__',\n", " '__module__',\n", " '__ne__',\n", " '__new__',\n", " '__reduce__',\n", " '__reduce_ex__',\n", " '__repr__',\n", " '__setattr__',\n", " '__sizeof__',\n", " '__str__',\n", " '__subclasshook__',\n", " '__weakref__',\n", " 'bubbleDown',\n", " 'bubbleup',\n", " 'child',\n", " 'copy',\n", " 'count',\n", " 'data',\n", " 'extractmin',\n", " 'insert',\n", " 'isempty',\n", " 'parent',\n", " 'size']" ] }, "execution_count": 108, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dir(hh)" ] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 109, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.count" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "20" ] }, "execution_count": 111, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h.size" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### hw4 ** problem 8 ** (10 points)\n", "\n", "Write a function that takes in a list of positive integers of\n", "size n and returns a sorted list containing the n/2 smallest elements.\n", "Use a heap." 
 ] }, { "cell_type": "code", "execution_count": 112, "metadata": {}, "outputs": [], "source": [ "def smallest(lst = [4,2,5,6,8,11,99,6,77]):\n", "    pass" ] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[2, 4, 5, 6]" ] }, "execution_count": 113, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.smallest()" ] }, { "cell_type": "code", "execution_count": 114, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.smallest([])" ] }, { "cell_type": "code", "execution_count": 115, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 115, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.smallest([1])" ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1]" ] }, "execution_count": 116, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.smallest([1,2])" ] }, { "cell_type": "code", "execution_count": 117, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[2, 3, 3, 4, 4, 5, 5, 6]" ] }, "execution_count": 117, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hw4a.smallest([3,4,5,6,2,3,4,5,6,7,88,22,11,33,22,44])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Trees\n", "\n" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bst(15)" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "## binary search tree\n", "\n", "\n", "class bst:\n", "\n", "    def __init__(self, value, parent = None):\n", "        self.left = None\n", "        self.right = None\n", "        self.value = value\n", "        self.parent = parent\n", "\n", "    def __repr__(self):\n", "        return \"bst({})\".format(self.value)\n", "\n", "    def insert(self, value):\n", "        ''' no duplicates'''\n", "        if self.value == value:\n", "            return self\n", "        if self.value > value:\n", "            if self.left:\n", "                return self.left.insert(value)\n", "            self.left = bst(value, parent=self)\n", "            return self.left\n", "        else:\n", "            if self.right:\n", "                return self.right.insert(value)\n", "            self.right = bst(value, parent=self)\n", "            return self.right\n", "\n", "    def preorder(self, indent = 0):\n", "        # preorder: visit the node, then the left subtree, then the right subtree\n", "        print ('-' * indent, self)\n", "        if self.left: self.left.preorder(indent+1)\n", "        if self.right: self.right.preorder(indent+1)\n", "\n", "    def inorder(self, indent = 0):\n", "        # inorder: visit the left subtree, then the node, then the right subtree\n", "        if self.left: self.left.inorder(indent+1)\n", "        print ('-' * indent, self)\n", "        if self.right: self.right.inorder(indent+1)\n", "\n", "    def postorder(self, indent=0):\n", "        if self.left: self.left.postorder(indent+1)\n", "        if self.right: self.right.postorder(indent+1)\n", "        print ('-' * indent, self)\n", "\n", "    def find(self, value):\n", "        if self.value == value:\n", "            return self\n", "        if self.value > value:\n", "            if self.left:\n", "                return self.left.find(value)\n", "            return False\n", "        else:\n", "            if self.right:\n", "                return self.right.find(value)\n", "            return False\n", "\n", "    def successor(self):\n", "        if self.right:\n", "            return self.right.min()\n", "        # a root with no right subtree has no successor\n", "        if not self.parent:\n", "            return False\n", "        if self.parent.left == self:\n", "            return self.parent\n", "        if self.parent.right == self:\n", "            s = self\n", "            p = self.parent\n", "            while p and p.right and p.right == s:\n", "                s = p\n", "                p = s.parent\n", "                # print (s, p)\n", "            return p or False\n", "\n", "    def min(self):\n", "        if self.left:\n", "            return self.left.min()\n", "        return self\n", "\n", "    ## iterator uses inorder traversal\n", "    def __iter__(self):\n", "        if self.left:\n", "            yield from self.left\n", "        yield self.value\n", "        if self.right:\n", "            yield from self.right\n", "\n", "    ## there is a bug in this code\n", "    def dfs(self, value, trace=False):\n", "        if self.value == value:\n", "            return self\n", "        else:\n", "            if trace:\n", "                print (self)\n", "            if self.left:\n", "                return self.left.dfs(value, trace)\n", "            if self.right:\n", "                return self.right.dfs(value, trace)\n", "        return False\n", "\n", "    def height(self):\n", "        ''' get the height (or depth) of a tree - like earlier hw problem'''\n", "        if not self:\n", "            return 0\n", "        left = right = 0\n", "        if self.left:\n", "            left = self.left.height()\n", "        if self.right:\n", "            right = self.right.height()\n", "        return 1 + max(left, right)\n", "\n", "\n", "    # predicate to indicate if bst is balanced\n", "    def isbalanced(self):\n", "        if not self:\n", "            return True\n", "        left = right = True\n", "        hleft = hright = 0\n", "        if self.left:\n", "            left = self.left.isbalanced()\n", "            hleft = self.left.height()\n", "        if self.right:\n", "            right = self.right.isbalanced()\n", "            hright = self.right.height()\n", "        return left and right and abs(hleft - hright) <= 1\n", "\n", "\n", "    # convert unbalanced tree to balanced tree\n", "    def balance(self):\n", "        # create inorder list of nodes\n", "        nodes = []\n", "        for node in self:\n", "            nodes.append(node)\n", "        # recursively divide list in half, adding to balanced tree\n", "        return self.balanceutil(nodes,0,len(nodes)-1)\n", "\n", "    def balanceutil(self,nodes,start,end):\n", "        if start > end:\n", "            return None\n", "        mid = (start + end)//2\n", "        root = bst(nodes[mid])\n", "        root.left = self.balanceutil(nodes,start,mid-1)\n", "        root.right = self.balanceutil(nodes,mid+1,end)\n", "        return root" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bst(15)" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = bst(10)\n", "x.insert(5)\n", "x.insert(7)\n", "x.insert(6)\n", "x.insert(8)\n", "x.insert(9)\n", "x.insert(15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Graphs\n", "\n" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "\n", "## graph class\n", "\n", "class node:\n", "\n", " count = 0\n", " nodelist = 
[]\n", "\n", " def __init__(self,name,value):\n", " self.name = name\n", " self.value = value\n", " self.neighbors = []\n", " self.count = node.count\n", " node.count += 1\n", " node.nodelist.append(self)\n", "\n", " def __repr__(self):\n", " return \"node({}, {})\".format(self.name, self.value)\n", "\n", " def __str__(self):\n", " return \"node({}, {})\".format(self.name, self.value)\n", "\n", " def addneighbor(self, neighbor):\n", " if neighbor not in self.neighbors:\n", " self.neighbors.append(neighbor)\n", " neighbor.addneighbor(self)\n", "\n", " def connected(self):\n", " count = self.connectedaux({}, 0)\n", " return count == node.count\n", "\n", " def connectedaux(self, visited, count):\n", " # print (self, count)\n", " if not self in visited:\n", " visited[self] = True\n", " count += 1\n", " for n in self.neighbors:\n", " count = n.connectedaux(visited, count)\n", " return count\n", "\n", "\n", " def dfs(self, value):\n", " x = self.dfsaux(value, {})\n", " return x\n", "\n", " def dfsaux(self, value, visited):\n", " print (self, visited)\n", " if not self in visited:\n", " visited[self] = True\n", " if self.value == value:\n", " print (\"***\")\n", " return self\n", " for n in self.neighbors:\n", " n.dfsaux(value, visited)\n", " return None\n", "\n", " def bfs(self, value):\n", " x = self.bfsaux(value, {}, [])\n", " return x\n", "\n", " def bfsaux(self, value, visited, queue):\n", " print (self, visited, queue)\n", " if not self in visited:\n", " visited[self] = True\n", " if self.value == value:\n", " print (\"***\")\n", " return (self, value)\n", " queue.extend(self.neighbors)\n", " print (\":::\", queue)\n", " if queue != []:\n", " n = queue.pop(0)\n", " n.bfsaux(value, visited, queue)\n", " return None\n", "\n", " def astar(self, value):\n", " pass" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "n1 = node('n1',1)\n", "n2 = node('n2',2)\n", "n3 = node('n3',3)\n", "n4 = node('n4',4)\n", 
"n1.addneighbor(n2)\n", "n1.addneighbor(n3)\n", "n3.addneighbor(n4)\n", "n5 = node('n5',5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 4 }