{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Recursion\n", "\n", "We are no longer in the Google Python Course.\n", "\n", "\n", "\n", "\n", "#### Recursion allows you to define infinite objects, finitely\n", "\n", "We may define the positive integers as follows:\n", "\n", "> The number 1 is a positive integer.\n", "\n", "> Any positive integer plus 1 is a positive integer.\n", "\n", "Below we define a generator for integers. (We will discuss generators later.)\n", "\n", "These examples are found in the python file recursion.py" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def integers(n):\n", " yield n\n", " while True:\n", " n +=1\n", " yield n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "intg = integers(1)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "generator" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(intg)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(intg)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(intg)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(intg)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "newintg = integers(1000)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1000" ] }, "execution_count": 8, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "next(newintg)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1001" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(newintg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We may extend the definition to include all integers.\n", "\n", "\n", "> Any positive integer is an integer.\n", "\n", "> Any integer minus 1 is an integer.\n", "\n", "### Recursive Data Structures\n", "\n", "Lists and strings are recursive data structures.\n", "\n", "> The empty list [] is a list.\n", "\n", "> Appending an item to a list results in a list.\n", "\n", "Similarly for strings.\n", "\n", "> The empty string \"\" is a string.\n", "\n", "> Concatenating a character to a string results in a string.\n", "\n", "We will see similar definitions for other data structures, including stacks, queues, \n", "trees, and graphs. FYI, the Facebook friends network is a graph.\n", "\n", "Given a recursive data structure, it is convenient to use recursive functions to process the recursive structures. The two hallmarks of a recursive function are \n", "\n", "> A base case, such as the empty list or the empty string.\n", "\n", "> A traversal function for moving through the data structure.\n", "\n", "\n", "Below we compare recursion to iteration and introduce tail recursion as a more efficient version.\n", "\n", "\n", "### Iterative versions of total [sum], length (len), and max." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def itotal(lst):\n", "    total = 0\n", "    for n in lst:\n", "        total += n\n", "    return total" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def ilength(lst):\n", "    length = 0\n", "    for _ in lst:\n", "        length += 1\n", "    return length" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "def imax(lst):\n", "    m = lst[0]\n", "    for n in lst[1:]:\n", "        if n > m:\n", "            m = n\n", "    return m" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "l = list(range(1, 6))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "l" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "itotal(l)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ilength(l)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "imax(l)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Recursive versions of total [sum], length [len], and max\n" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "def rtotal(lst):\n", "    if not lst:\n", "        return 0\n", "    else:\n", "        return lst[0] + rtotal(lst[1:])" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "def rlength(lst):\n", "    if 
not lst:\n", "        return 0\n", "    else:\n", "        return 1 + rlength(lst[1:])" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "def rmax(lst):\n", "    if not lst:\n", "        return float('-inf')\n", "    else:\n", "        return max(lst[0], rmax(lst[1:]))" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "l = list(range(1, 6))" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "l" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum(l)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rtotal(l)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(l)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rlength(l)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "max(l)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rmax(l)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we define trace() 
which allows us to watch the execution of recursive functions. Later we will see that trace() can be used as a decorator." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "def trace(f):\n", "    f.indent = 0\n", "    def g(x):\n", "        print('| ' * f.indent + '|--', f.__name__, x)\n", "        f.indent += 1\n", "        value = f(x)\n", "        print('| ' * f.indent + '|--', 'return', repr(value))\n", "        f.indent -= 1\n", "        return value\n", "    return g" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "rtotal = trace(rtotal)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|-- rtotal [1, 2, 3, 4, 5]\n", "| |-- rtotal [2, 3, 4, 5]\n", "| | |-- rtotal [3, 4, 5]\n", "| | | |-- rtotal [4, 5]\n", "| | | | |-- rtotal [5]\n", "| | | | | |-- rtotal []\n", "| | | | | | |-- return 0\n", "| | | | | |-- return 5\n", "| | | | |-- return 9\n", "| | | |-- return 12\n", "| | |-- return 14\n", "| |-- return 15\n" ] }, { "data": { "text/plain": [ "15" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rtotal(l)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "rlength = trace(rlength)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|-- rlength [1, 2, 3, 4, 5]\n", "| |-- rlength [2, 3, 4, 5]\n", "| | |-- rlength [3, 4, 5]\n", "| | | |-- rlength [4, 5]\n", "| | | | |-- rlength [5]\n", "| | | | | |-- rlength []\n", "| | | | | | |-- return 0\n", "| | | | | |-- return 1\n", "| | | | |-- return 2\n", "| | | |-- return 3\n", "| | |-- return 4\n", "| |-- return 5\n" ] }, { "data": { "text/plain": [ "5" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rlength(l)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, 
"outputs": [], "source": [ "rmax = trace(rmax)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|-- rmax [1, 2, 3, 4, 5]\n", "| |-- rmax [2, 3, 4, 5]\n", "| | |-- rmax [3, 4, 5]\n", "| | | |-- rmax [4, 5]\n", "| | | | |-- rmax [5]\n", "| | | | | |-- rmax []\n", "| | | | | | |-- return -inf\n", "| | | | | |-- return 5\n", "| | | | |-- return 5\n", "| | | |-- return 5\n", "| | |-- return 5\n", "| |-- return 5\n" ] }, { "data": { "text/plain": [ "5" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rmax(l)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

### Tail Recursion

\n", "\n", "As the traces of the recursive functions above show, recursive calls carry a lot of overhead. Each caller's state stays on the call stack while it waits for its recursive call to return.\n", "\n", "One way to avoid this overhead, in languages that optimize tail calls, is to write tail-recursive functions. In a tail-recursive function the recursive call is the last thing the function does, so the caller has no work left once the call returns. The partial result is passed along as an argument to the recursive call.\n", "\n", "See https://en.wikipedia.org/wiki/Tail_call" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "def trtotal(lst):\n", "    return totalaux(lst, 0)\n", "\n", "def totalaux(lst, result):\n", "    if not lst:\n", "        return result\n", "    return totalaux(lst[1:], result + lst[0])" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "def trlength(lst):\n", "    return lengthaux(lst, 0)\n", "\n", "def lengthaux(lst, result):\n", "    if not lst:\n", "        return result\n", "    return lengthaux(lst[1:], result + 1)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "def trmax(lst):\n", "    return maxaux(lst, float('-inf'))\n", "\n", "def maxaux(lst, result):\n", "    if not lst:\n", "        return result\n", "    else:\n", "        if lst[0] > result:\n", "            result = lst[0]\n", "        return maxaux(lst[1:], result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are revised versions of the tail recursive functions that do\n", "not require helper functions. 
They use default parameter values instead.\n" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "def trtotalbest(lst, result=0):\n", "    if not lst:\n", "        return result\n", "    return trtotalbest(lst[1:], result + lst[0])" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "def trlengthbest(lst, result=0):\n", "    if not lst:\n", "        return result\n", "    return trlengthbest(lst[1:], result + 1)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "def trmaxbest(lst, result=float('-inf')):\n", "    if not lst:\n", "        return result\n", "    else:\n", "        if lst[0] > result:\n", "            result = lst[0]\n", "        return trmaxbest(lst[1:], result)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trtotalbest(l)" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trlengthbest(l)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trmaxbest(l)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try tracing these tail recursive functions." 
] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "trtotalbest = trace(trtotalbest)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|-- trtotalbest [1, 2, 3, 4, 5]\n" ] }, { "ename": "TypeError", "evalue": "g() takes 1 positional argument but 2 were given", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtrtotalbest\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ml\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m\u001b[0m in \u001b[0;36mg\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'| '\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindent\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;34m'|--'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m__name__\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindent\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mvalue\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'| '\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindent\u001b[0m \u001b[0;34m+\u001b[0m 
\u001b[0;34m'|--'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'return'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrepr\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvalue\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindent\u001b[0m \u001b[0;34m-=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m\u001b[0m in \u001b[0;36mtrtotalbest\u001b[0;34m(lst, result)\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mlst\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mresult\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 4\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mtrtotalbest\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlst\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mresult\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mlst\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: g() takes 1 positional argument but 2 were given" ] } ], "source": [ "trtotalbest(l)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Oops. We have a problem. The trace() function is assuming that the parameter function has a single parameter itself. The tr--best functions have a variable number of arguments due to the default parameter. We need to modify trace to handle the variable number of arguments." 
] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "def trace2(f):\n", "    f.indent = 0\n", "    def g(*x):\n", "        print('| ' * f.indent + '|--', f.__name__, x)\n", "        f.indent += 1\n", "        value = f(*x)\n", "        print('| ' * f.indent + '|--', 'return', repr(value))\n", "        f.indent -= 1\n", "        return value\n", "    return g" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "def trtotalbest(lst, result=0):\n", "    if not lst:\n", "        return result\n", "    return trtotalbest(lst[1:], result + lst[0])" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "trtotalbest = trace2(trtotalbest)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "|-- trtotalbest ([1, 2, 3, 4, 5],)\n", "| |-- trtotalbest ([2, 3, 4, 5], 1)\n", "| | |-- trtotalbest ([3, 4, 5], 3)\n", "| | | |-- trtotalbest ([4, 5], 6)\n", "| | | | |-- trtotalbest ([5], 10)\n", "| | | | | |-- trtotalbest ([], 15)\n", "| | | | | | |-- return 15\n", "| | | | | |-- return 15\n", "| | | | |-- return 15\n", "| | | |-- return 15\n", "| | |-- return 15\n", "| |-- return 15\n" ] }, { "data": { "text/plain": [ "15" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trtotalbest(l)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The sad fact is that Python does not optimize tail-recursive calls: each call still adds a stack frame, as the nested returns in the trace above show. See https://chrispenner.ca/posts/python-tail-recursion. As the linked page shows,\n", "you can work around this by creating a Recurse class. We will revisit this topic later.\n", "\n", "### More recursive functions\n", "\n", "Here are a couple of additional recursive functions. These examples are related to the homework problems." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# toord('abcd') - iterative\n", "def toord(text):\n", "    for c in text:\n", "        print(c, \" => \", ord(c))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# recursive version\n", "def rtoord(text):\n", "    if not text:\n", "        return\n", "    else:\n", "        c = text[0]\n", "        print(c, \" => \", ord(c))\n", "        rtoord(text[1:])" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a => 97\n", "b => 98\n", "c => 99\n", "d => 100\n" ] } ], "source": [ "toord('abcd')" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a => 97\n", "b => 98\n", "c => 99\n", "d => 100\n" ] } ], "source": [ "rtoord('abcd')" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def tochar(s, e):\n", "    for n in range(s, e):\n", "        print(n, \" => \", chr(n))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def rtochar(s, e):\n", "    if s >= e:\n", "        return\n", "    else:\n", "        print(s, \" => \", chr(s))\n", "        rtochar(s + 1, e)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "97 => a\n", "98 => b\n", "99 => c\n", "100 => d\n" ] } ], "source": [ "tochar(97,101)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "97 => a\n", "98 => b\n", "99 => c\n", "100 => d\n" ] } ], "source": [ "rtochar(97,101)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Context-Free Grammars\n", "\n", "Above we discussed the use of recursion in providing succinct definitions of mathematical objects, like integers, and data structures, like lists.\n", "\n", "Another 
important use of recursion is defining grammars for languages, both natural languages, like English and German, and computer languages, like Python and C.\n", "\n", "You are no doubt familiar with parts of speech such as noun, verb, and preposition. These are the non-terminal building blocks of a grammar. The terminal elements are words: nouns such as \"cat\" or \"dog\", and verbs such as \"ran\" or \"chased\".\n", "\n", "Below is a simple grammar (in Racket) for a trivial subset of English.\n", "\n", "
\n",
    "(define grammar-mcd\n",
    "  (cfg\n",
    "   '(a the mouse cat dog it slept swam chased evaded dreamed believed that)\n",
    "   '(s np vp det n pn vi vt v3)\n",
    "   's\n",
    "   (list\n",
    "    (rule 's '(np vp))\n",
    "    (rule 'np '(det n))\n",
    "    (rule 'np '(pn))\n",
    "    (rule 'det '(a))\n",
    "    (rule 'det '(the))\n",
    "    (rule 'n '(mouse))\n",
    "    (rule 'n '(cat))\n",
    "    (rule 'n '(dog))\n",
    "    (rule 'pn '(it))\n",
    "    (rule 'vp '(vi))\n",
    "    (rule 'vp '(vt np))\n",
    "    (rule 'vp '(v3 that s))\n",
    "    (rule 'vi '(slept))\n",
    "    (rule 'vi '(swam))\n",
    "    (rule 'vt '(chased))\n",
    "    (rule 'vt '(evaded))\n",
    "    (rule 'v3 '(dreamed))\n",
    "    (rule 'v3 '(believed)))))\n",
    "
\n", "\n", "We interpret the grammar as follows:\n", "\n", "Terminal values: a the mouse cat dog it slept swam chased evaded dreamed believed that\n", "\n", "Non-terminal values: s np vp det n pn vi vt v3\n", "\n", "Rewrite rules: (the left side can be rewritten as the right side)\n", "\n", "
\n",
    "s : np vp   # a sentence is a noun phrase followed by a verb phrase\n",
    "np : det n  # a noun phrase can be a determiner followed by a noun\n",
    "np : pn     # a noun can be a pronoun\n",
    "det : a     # a determiner can be a\n",
    "det : the   # a determiner can be the\n",
    "n : mouse   # a noun can be mouse\n",
    "n : cat     # a noun can be cat\n",
    "n : dog     # a noun can be dog\n",
    "pn : it     # a pronoun can be it\n",
    "vp : vi     # a verb phrase can be an intransitive verb\n",
    "vp : vt np  # a verb phrase can be a transitive verb followed by a noun phrase\n",
    "vp : v3 that s # a verb phrase can be a v3 verb followed by \"that\" followed by a sentence\n",
    "vi : slept  # an intransitive verb can be slept\n",
    "vi : swam   # an intransitive verb can be swam\n",
    "vt : chased # a transitive verb can be chased\n",
    "vt : evaded # a transitive verb can be evaded\n",
    "v3: dreamed # a v3 verb can be dreamed\n",
    "v3 : believed # a v3 verb can be believed\n",
    "
\n", "\n", "We can derive a sentence by expanding each non-terminal node of a tree until no non-terminal nodes remain. This is a recursive process.\n", "\n", "
\n",
    "S -> NP VP\n",
    "NP -> DET N\n",
    "DET -> the\n",
    "N -> cat\n",
    "VP -> VT NP\n",
    "VT -> chased\n",
    "NP -> PN\n",
    "PN -> it\n",
    "\n",
    "the cat chased it\n",
    "
\n", "\n", "Note that the v3 rule (vp : v3 that s) is recursive: it expands back to the sentence node s, so a sentence can be nested inside another sentence." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This type of grammar, with a single non-terminal on the left-hand side of every rule, is\n", "known as a context-free grammar. A more restrictive type of grammar, the regular grammar, describes the same languages as regular expressions, which we will examine later.\n", "\n", "There are also less restrictive grammars, including context-sensitive grammars and unrestricted grammars, which generate the recursively enumerable languages. We will not be discussing those.\n", "\n", "They all fall under the general category of formal languages.\n", "\n", "Context-free grammars are especially interesting for computer science as they provide a grammatical structure for most computer programming languages. The common notation for these grammatical descriptions of programming languages is Backus-Naur Form, or BNF. John Backus led the design of FORTRAN, and Peter Naur edited the ALGOL 60 report.\n", "\n", "See http://matt.might.net/articles/grammars-bnf-ebnf/ for a discussion of the language of languages." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "End of Recursion notebook, for now. Later we will discuss deep recursion." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 4 }