## CS 201: Introduction to Racket

<p>
    
See <a target=eex href="racket.html">racket.html</a>
    
### Acknowledgement
    
These notes were originally written by Professor Dana Angluin of the Yale Computer Science Department, who taught CS 201 for many years.  The present author has supplemented them over the years with the aim of preserving the tone and rigor of Professor Angluin.
    
### Why Racket?

First, check out the <a target=ww href="https://racket-lang.org/">racket web site</a>.

- Why do we use Racket?
- How many of you have studied Latin?

Racket, like Latin, is not widely used. It is not practical. 
How many classics scholars have been disappointed on their 
first visit to Latin America?

Many western languages have roots in Latin. 

Many languages can trace their syntax and vocabulary to Latin origins. 
Studying Latin provides a convenient way for studying language itself and 
the way language evolves.  In addition, a study of Latin informs our understanding of grammar in general and vocabulary for particular western languages.

The same is true of Racket. Racket itself is relatively new, dating from the 1990s. 
However, Racket is derived from Scheme (1970s) and LISP (1950s). 
LISP is one of the oldest and most influential programming languages around. 
Learning Latin helps you with French, Spanish, English, Italian, and Romanian. 
You will likewise see elements of Racket in many modern languages, such as Python.



Also, learning a programming language is less like learning a natural language, 
such as French or Chinese, and more like learning to drive a car. 
It should not matter much what kind of car you use when you learn to drive. 
(Exception: per P.J. O'Rourke, what is the best kind of car to use when learning 
to drive a stick shift?) 
You are learning generic driving skills. Once you learn to drive, it should not 
matter much what kind of car you are driving. At some point, you will 
fly somewhere and rent a car that you have never driven before. Within five 
minutes, you will be on the road because most of driving is independent of 
any specific model of car.

The same is true of programming. 
The programming skills you learn in this course will transfer 
to almost any language you use in the future.

- <i>Perlis epigram #26:</i> There will always be things we wish to 
say in our programs that in all known languages can only be said poorly.

### Evaluating Expressions

We consider rules for the evaluation of expressions in Racket.

We first have to tell Jupyter notebooks which version of racket we want.  As it turns out, we are using plain, vanilla racket.

In [1]:
(require racket)
(require racket/base)

You will likely be using DrRacket, an IDE (Integrated Development Environment) for Racket. Here is an image of a DrRacket session.

<img src="DrRacketScreenShot.png">

### 1. Constants evaluate to themselves.

Constants are expressions like the numbers 18 and -1, the string "hi there!", and the Boolean values <tt>#t</tt> and <tt>#f</tt> (for true and false). As examples of evaluating these expressions in the code cells of a Jupyter notebook, we have the following.

#### Integers

In [2]:
18

In [3]:
-1

#### Strings

In [4]:
"hi there!"

In [5]:
"" ;; empty string

In [6]:
"this ; is not a comment"

Note that the semi-colon is the comment character.  Everything to the right
of a semi-colon is ignored by racket, unless the semi-colon is in a string.

#### Booleans (true and false)

In [7]:
#t

In [8]:
#f

Numbers are actually a fairly complex subject in most programming languages, 
so for the time being we will consider only integers, that is, positive, negative, 
and zero whole numbers.

### Applications are procedure calls

The term "application" means procedure call. The syntax of an application is as follows.

<pre>
    (proc arg1 ... argn)
</pre>

where proc is an expression (which evaluates to a procedure), and arg1, arg2, ..., argn are expressions (which are evaluated to determine the arguments to the procedure.) 

The math formulas $$f(x)$$ and $$g(x,y,z)$$ 

correspond to the racket expressions
<code>(f x)</code> and <code>(g x y z)</code>.

The rule for evaluating an application is the following

<h3 id="h3application">2. An application is evaluated by evaluating the first expression
</h3>
(whose value must be a procedure) and each of the rest of the
expressions; the procedure is called with the values of the rest
of the expressions as its actual arguments; when the procedure
returns a value, that value is the value of the application.

As an example of an application, we have the following.

In [9]:
(+ 18 4)

In [10]:
+

In the application <code>(+ 18 4)</code>, + is an identifier which evaluates to a procedure, namely, the built-in procedure to add numbers. The expressions 18 and 4 are constants, which evaluate to themselves according to rule (1). Then the built-in procedure to add numbers is called with the arguments 18 and 4, and returns the value 22, which becomes the value of the whole application expression. Just how + evaluates to a procedure will be seen shortly. As further examples of applications, we have the following.

In [11]:
(- 18 4)

In [12]:
(* 6 3)

In [13]:
(quotient 22 6)

In [14]:
(remainder 22 6)

The identifiers <tt>-</tt>, <tt>*</tt>, <tt>quotient</tt>, and <tt>remainder</tt> evaluate to the built-in procedures  (respectively) subtract, multiply, take the integer quotient, and take the integer remainder. (Using the "division algorithm" we divide 6 into 22 getting a quotient of 3 and a remainder of 4, so 22 = 3*6+4.)
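The identity in the parenthetical remark can be checked directly. This is only a small sketch; the names <tt>q</tt> and <tt>r</tt> are introduced here for illustration and do not appear elsewhere in these notes.

```racket
#lang racket
;; q and r are illustrative names, not part of the original notes.
(define q (quotient 22 6))    ; 3
(define r (remainder 22 6))   ; 4

;; The division algorithm: dividend = quotient * divisor + remainder.
(= 22 (+ (* q 6) r))          ; #t
```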

Note that Racket is uncompromising: in an application, the expression for the procedure *always* comes first. What happens when (inevitably) we write something like (18 - 4)? We get an error message, something like:

In [15]:
(18 - 4)

application: not a procedure;
 expected a procedure that can be applied to arguments
  given: 18
  arguments...:
   #<procedure:->
   4
  context...:
   eval-one-top12
   /usr/share/racket/pkgs/sandbox-lib/racket/sandbox.rkt:510:0: call-with-custodian-shutdown
   /usr/share/racket/collects/racket/private/more-scheme.rkt:148:2: call-with-break-parameterization
   .../more-scheme.rkt:261:28
   /usr/share/racket/pkgs/sandbox-lib/racket/sandbox.rkt:878:5: loop


The first expression after the left parenthesis is 18, which is a number, not a procedure. The error message comes from the code that evaluates an application, and it is telling us that it expected a procedure but got 18 instead. Feel free to try things out in the interaction window to see what happens -- it will help you to interpret error messages if/when you see them in response to running your own programs.

Well, I told a lie.  There is an exception to prefix notation: when a pair of dots surrounds one element of a list, the Racket reader moves that element to the front, so the following is read as <code>(- 18 4)</code>.

In [16]:
(18 . - . 4)

We point this out not as an endorsement, but as a warning, much like telling you about poison ivy or rattlesnakes. You should know that they exist, but steer clear of them.

The third rule is simple but powerful.

#### 3. The rules apply recursively.
This means that sub-expressions are evaluated the same way as any other expressions. As an example, consider the following.

In [17]:
(+ (* 3 6) 4)

The outer set of parentheses indicates an application. The first expression is +, which evaluates to the built-in procedure to add numbers. The second expression is <tt>(* 3 6)</tt>, which is another application. We need to find its value to know what the first argument to the addition procedure should be. So (recursively) we evaluate this expression. It is also an application, with first expression <tt>*</tt>, which evaluates to the built-in procedure to multiply numbers. The other expressions are 3 and 6, which are constants and evaluate to themselves. Then the multiplication procedure is called with arguments 3 and 6 and returns 18. Now we know the value of the first argument to the addition procedure. The expression for the second argument is 4, which is a constant that evaluates to itself. Now we can call the addition procedure with arguments 18 and 4, and it returns 22, which becomes the value of the whole expression.

So what rule can we use to evaluate an identifier like <tt>+</tt>, <tt>*</tt>, or <tt>remainder</tt>? We need the concept of an "environment", which is a table with entries consisting of an identifier and its value. The identifier is said to be "bound" to the value in the environment. There is a top-level environment already defined when you start Dr. Racket; it contains identifiers such as <tt>+</tt>, <tt>*</tt>, and <tt>remainder</tt>, and gives their values as the built-in procedures to add numbers, multiply numbers, and take the remainder of two numbers. We could picture this top-level environment as a table as follows.

<pre>
     identifier   |    value
    --------------------------------
    |     +       |   built-in addition procedure
    --------------------------------
    |     *       |   built-in multiplication procedure
    --------------------------------
    | remainder   |   built-in remainder procedure
    ---------------------------------
</pre>

(Of course, there are many more than three entries in the top-level environment.) Then the rule for evaluating identifiers can be stated as follows.

#### 4. The value of an identifier is found by looking it up in the "relevant" environment.


This is well defined up to the specification of the "relevant" environment. For the moment, there is only one environment we will consider, namely, the top-level environment, so that will be the "relevant" one. Returning to the application <code>(+ 18 4)</code>, we see that the first expression in the application, the identifier +, is evaluated by looking up its value in the top-level environment, where its value is found to be the built-in addition procedure.
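To see rules 1 through 4 working together, here is a toy evaluator. It is a sketch for intuition only, not how Racket is actually implemented. The names <tt>toy-env</tt> and <tt>toy-eval</tt> are made up for this example, the environment is modeled as a hash table with just three entries, and the code uses <tt>cond</tt>, <tt>quote</tt>, and list procedures that are covered later.

```racket
#lang racket
;; toy-env is a hypothetical stand-in for the top-level environment:
;; a table from identifiers (symbols) to values.
(define toy-env
  (hash '+ + '* * 'remainder remainder))

;; toy-eval is a hypothetical evaluator covering only rules 1-4.
(define (toy-eval expr env)
  (cond
    ;; Rule 1: constants evaluate to themselves.
    [(or (number? expr) (string? expr) (boolean? expr)) expr]
    ;; Rule 4: identifiers are looked up in the environment.
    [(symbol? expr) (hash-ref env expr)]
    ;; Rules 2 and 3: evaluate every sub-expression (recursively),
    ;; then call the procedure on the argument values.
    [(pair? expr)
     (apply (toy-eval (car expr) env)
            (map (lambda (e) (toy-eval e env)) (cdr expr)))]
    [else (error 'toy-eval "unknown expression: ~a" expr)]))

(toy-eval '(+ (* 3 6) 4) toy-env)   ; 22
```

Note that an identifier missing from <tt>toy-env</tt> makes <tt>hash-ref</tt> raise an error, which loosely mirrors what happens when Racket cannot find a binding.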

Can we add entries to the top-level environment? Yes, we can do so using the "special form" whose keyword is <tt>define</tt>. The terminology "special form" means an expression that looks somewhat like an application, but actually has a different evaluation rule. The syntax of <tt>define</tt> is as follows.

<pre>
(define identifier expression)
</pre>
The evaluation rule for this expression is as follows.

#### 5. A define expression adds the identifier to the relevant environment with a value obtained by evaluating the expression.
If the identifier is already in the relevant environment, its binding is changed to the value obtained by evaluating the expression. A define expression may look a bit like an assignment statement, but that is the wrong way to think about it. You'll use it in your homework primarily to define procedures in Dr. Racket's definitions window. As an example, if we evaluate the following expression:

In [18]:
(define age 18)

then the identifier age is added to the top-level environment with the value of the expression 18 (namely 18 itself) as its value. So we can then picture the top-level environment as follows.
<pre>
     identifier   |    value
    --------------------------------
    |     +       |   built-in addition procedure
    --------------------------------
    |     *       |   built-in multiplication procedure
    --------------------------------
    | remainder   |   built-in remainder procedure
    ---------------------------------
    |    age      |   18
    ---------------------------------
</pre>

Then we can evaluate <code>age</code> as follows.

In [19]:
age

And we can proceed to use <code>age</code> in other expressions, for example the following.

In [20]:
(define new-age (+ age 4))

In [21]:
new-age

In this case, another identifier, <code>new-age</code> (note that the dash is part of the identifier), is added to the top-level environment, with the value 22, which is the result of evaluating the expression <code>(+ age 4)</code>. In detail, <tt>+</tt> is evaluated by looking it up in the top-level environment, where its value is found to be the built-in addition procedure. The identifier <tt>age</tt> is also evaluated by looking it up in the top-level environment, where its value is found to be the number 18. The expression 4 evaluates to itself, and the addition procedure is called with the arguments 18 and 4, and returns the value 22. This value is bound to the identifier new-age in the top-level environment, which we can now picture as follows.
<pre>
     identifier   |    value
    --------------------------------
    |     +       |   built-in addition procedure
    --------------------------------
    |     *       |   built-in multiplication procedure
    --------------------------------
    | remainder   |   built-in remainder procedure
    ---------------------------------
    |    age      |   18
    ---------------------------------
    |  new-age    |   22
    ---------------------------------
</pre>

When we evaluate <code>new-age</code>, its value is looked up in the top-level environment and found to be 22. Note that when we quit and restart Dr. Racket, the top-level environment returns to its initial contents, so <code>age</code> and <code>new-age</code> would no longer be in the top-level environment in that case.

<a name="special"><h3>Special Forms</h3></a>

How can we tell a "special form" from an ordinary application expression? They are both enclosed in parentheses, but a "special form" has one of a small number of keywords (e.g., <tt>define</tt>) as the first element of the list. 

<b>A special form, unlike a procedure, does not need to evaluate its arguments.</b>


Note that parentheses in Racket have a rather different function from their use in mathematics, where they can be used for grouping and are sometimes optional. It is important to retrain your intuition so that you do not think of parentheses in Racket as negligible or innocuous.

The Racket documentation lists <a target=tt href="https://docs.racket-lang.org/ts-reference/special-forms.html">a finite number of special forms</a>. In this course, we will focus on a handful: <code>define</code>, <code>if</code>, <code>cond</code>, <code>and</code>, <code>or</code>, <code>quote</code>, <code>let</code>, <code>let*</code>, <code>case</code>, <code>struct</code>, and <code>lambda</code>. No big drama.

Try evaluating the expression +, and you will see how Racket represents the built-in procedure +. Try defining * to be + and see what happens. (Remember that you can restore the initial top-level environment by quitting and re-starting Dr. Racket.)



In [22]:
(define someage 200)

In [23]:
(+ someage someage)

In [24]:
+

In [25]:
(define old+ +)
(define old* *)

In [26]:
(define + *)
(define * old+)

In [27]:
(+ 7 7)

In [28]:
(* 7 7)

Enough of this nonsense.

In [29]:
(require racket)
(require racket/base)

In [30]:
(define + old+)
(define * old*)

In [31]:
(* 7 7)

In [32]:
(+ 7 7)

### Can we please write some code?

We are programmers! When are we going to define our own procedures? There is a special form with keyword <tt>lambda</tt> that causes a procedure to be created. The syntax is as follows.
<pre>
    (lambda (arg1 ... argn) expression)
</pre>
The keyword <tt>lambda</tt> signals that this is a <b>lambda-expression</b>. The <code>(arg1 ... argn)</code> component is a finite sequence of identifiers <tt>arg1</tt>, <tt>arg2</tt>, and so on, up to <tt>argn</tt>, that gives names to the "formal arguments" of the procedure. The final expression is the "body" of the procedure and indicates how to compute the value of the procedure from its arguments. As an example, we can evaluate the following expression.

In [33]:
(lambda (n) (+ n 4))

Evaluating this expression creates a procedure of one formal argument, <tt>n</tt>, that takes a number, adds 4 to it, and returns the resulting sum. In fact, in this case, the procedure is created, but neither applied nor named, so it just drifts off into the ether. We could not only create it, but also apply it, as follows.

In [34]:
((lambda (n) (+ n 4)) 18)

What happened here? The outer parentheses are an application (after the first left parenthesis there is another left parenthesis, not a keyword.) The first expression, <code>(lambda (n) (+ n 4))</code>, is evaluated, which creates a procedure of one argument that adds 4 to its argument and returns the sum. The second expression, 18, is evaluated (to itself), and the procedure that we just created is called on the argument 18. The procedure adds 4 to 18 and returns 22, which is the value of the application. At least the procedure got applied in this case, but only once, and then got lost in the bit bucket. To use a procedure multiple times, we can give it a name, e.g., by using the <tt>define</tt> special form. We'll see more details of procedures, and more examples, below.

You may define functions that take a variable number of arguments.

In [35]:
(define x (lambda (a b) (+ a b)))

In [36]:
(x 3 4)

In [37]:
(define y (lambda n n))

In [38]:
(y 1 2 3)

In [39]:
(y 1 2 3 4 5 6)

In [40]:
(y)

The single quote is an abbreviation of another special form: <tt>quote</tt>, which instructs Racket <b>not</b> to evaluate its argument.

In [41]:
'(add 3 4)

In [42]:
(quote (add 3 4))
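As a brief sketch of the difference quoting makes, compare an application with its quoted counterpart. We use <tt>+</tt> here rather than <tt>add</tt>, since <tt>add</tt> is not defined and the unquoted form would be an error. The procedures <tt>length</tt>, <tt>car</tt>, and <tt>symbol?</tt> are built in and are discussed with lists later.

```racket
#lang racket
(+ 3 4)                            ; an application: evaluates to 7
'(+ 3 4)                           ; quoted: a list of three elements, not evaluated
(length '(+ 3 4))                  ; 3
(symbol? (car '(+ 3 4)))           ; #t: the first element is the symbol +, not a procedure
(equal? '(+ 3 4) (quote (+ 3 4)))  ; #t: the single quote is shorthand for quote
```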

In [43]:
(define z (lambda n (apply + n)))

In [44]:
(z 3 4)

In [45]:
(z 3 4 5 6)

In [46]:
(define (z2 . n) (apply + n))

In [47]:
(z2 3 4 5 6)

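In <code>z</code>, we simply follow the <code>lambda</code> keyword with a single identifier not enclosed in parentheses; that identifier is bound to a list of all the actual arguments. The definition of <code>z2</code> uses a dot before the identifier to say the same thing. The <code>apply</code> procedure calls the procedure given as its first argument, using the elements of the list given as its last argument as the arguments.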
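A few direct uses of <code>apply</code> may help. This is a minimal sketch; <tt>sum-all</tt> is a made-up name that mirrors the <code>z</code> procedure.

```racket
#lang racket
(apply + (list 3 4 5 6))     ; same as (+ 3 4 5 6), which is 18
(apply max (list 3 9 4))     ; same as (max 3 9 4), which is 9
(apply + 1 2 (list 3 4))     ; extra arguments may precede the list: 10

;; sum-all is a hypothetical name; nums is bound to a list of all arguments.
(define sum-all (lambda nums (apply + nums)))
(sum-all 3 4 5 6)            ; 18
(sum-all)                    ; nums is the empty list, and (+) is 0
```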

We can also specify functions with at least n arguments.

In [48]:
(define w (lambda (a b . c) (+ a b (apply + c))))

In [49]:
(w 1 2)

In [50]:
(w 1 2 3 4 5 6)

In [51]:
(w 1)

w: arity mismatch;
 the expected number of arguments does not match the given number
  expected: at least 2
  given: 1
  arguments...:
   1
  context...:
   eval-one-top12
   /usr/share/racket/pkgs/sandbox-lib/racket/sandbox.rkt:510:0: call-with-custodian-shutdown
   /usr/share/racket/collects/racket/private/more-scheme.rkt:148:2: call-with-break-parameterization
   .../more-scheme.rkt:261:28
   /usr/share/racket/pkgs/sandbox-lib/racket/sandbox.rkt:878:5: loop


Here we define function w to take at least two arguments, but allow more.
See [racket1.rkt](racket1.rkt) for examples of defining racket variables and functions with variable or optional parameters.
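The file racket1.rkt is not reproduced here, so the following is only a sketch of what an optional parameter can look like in Racket. The procedure name <tt>scale</tt> and its default value are invented for this example.

```racket
#lang racket
;; scale is a hypothetical example. A parameter written as [name default]
;; is optional; the default is used when the caller omits that argument.
(define (scale n [factor 2])
  (* n factor))

(scale 5)      ; factor defaults to 2, giving 10
(scale 5 3)    ; factor is 3, giving 15
```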

### Collatz Conjecture

<img src="https://imgs.xkcd.com/comics/collatz_conjecture.png">

Want to win a Fields Medal? Solve the Collatz Conjecture! (aka, Kakutani's Problem).
We define a function <code>(collatz n)</code> (where n is an arbitrary positive integer) which behaves as follows:

- If n is even, return n/2.
- If n is odd, return 3n + 1.

Note: Shizuo Kakutani was a popular Yale Math professor for many years.  His daughter, 
Michiko Kakutani, who is a Yale graduate, won the Pulitzer Prize for criticism as the <i>New York Times</i> book reviewer.

Let's write that function in Racket: <code>(collatz n)</code> See 
<a target=qq href="collatz.rkt">collatz.rkt</a> and <a target=cc href="Collatz.ipynb">notebook</a> and <a target=ww href="Collatz.html">HTML</a>. (Note use of <tt>trace</tt> and <tt>untrace</tt>)

In [52]:
(define (collatz n)
  (if (= (modulo n 2) 0) 
      (/ n 2)
      (+ 1 (* n 3))))

In [53]:
(collatz 11)

In [54]:
(collatz 34)

In [55]:
(collatz 17)

We have used the <code>if</code> special form:

<pre>
(if test then-expression else-expression)
</pre>

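If the <tt>test</tt> evaluates to a true value (in Racket, anything other than <tt>#f</tt>), then the <tt>then-expression</tt> is evaluated and its value is returned.  If the <tt>test</tt> evaluates to <tt>#f</tt>, the <tt>else-expression</tt> is evaluated and its value is returned.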

Note that if <tt>if</tt> were a procedure and not a special form, both the <tt>then-expression</tt> and the <tt>else-expression</tt> would always be evaluated.  This could be unfortunate.

<pre>
(if under-attack
    launch-missiles
    take-a-nap)
</pre>
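Short of launching missiles, we can watch the difference with a division by zero. In this sketch, <tt>my-if</tt> is a hypothetical procedure that imitates <tt>if</tt>, and <tt>with-handlers</tt> (not otherwise covered in these notes) is used only to catch the error so the example runs to completion.

```racket
#lang racket
;; my-if is a hypothetical procedure version of if. Because it is an
;; ordinary procedure, all three arguments are evaluated before it is called.
(define (my-if test then-value else-value)
  (if test then-value else-value))

;; The special form evaluates only the branch it needs, so no division occurs.
(define with-special-form
  (if #t 'fine (/ 1 0)))

;; The procedure version evaluates (/ 1 0) as an argument, which raises an
;; error before my-if is ever called; with-handlers catches that error.
(define with-procedure
  (with-handlers ([exn:fail:contract:divide-by-zero?
                   (lambda (e) 'division-by-zero)])
    (my-if #t 'fine (/ 1 0))))

with-special-form   ; 'fine
with-procedure      ; 'division-by-zero
```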

We can apply <code>collatz</code> to its own result.  

In [56]:
(collatz (collatz (collatz 11)))

Next, we use recursion to generate a sequence of Collatz numbers, such that the output of each call becomes the input of the next, unless and until you arrive at 1. The conjecture is that this sequence reaches 1 for every positive integer starting value; nobody has yet proved it. We define 
<code>(c-series n)</code>, which uses an <code>if</code> expression and a <code>let</code> expression to create a local variable. The other function, <code>(c-series2 n)</code>, does away with the local variable.

In [57]:
(define (c-series n)
  (print n)
  (newline)
  (if (equal? n 1) 'done
      (let ((next (collatz n)))
        (c-series next))))

In [58]:
(c-series 11)

11
34
17
52
26
13
40
20
10
5
16
8
4
2
1


In [59]:
(define (c-series2 n)
  (print n)
  (newline)
  (if (equal? n 1) 'done
      (c-series2 (collatz n))))

In [60]:
(c-series2 11)

11
34
17
52
26
13
40
20
10
5
16
8
4
2
1
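The note above mentions <tt>trace</tt> and <tt>untrace</tt>, which come from the <tt>racket/trace</tt> library. Since collatz.rkt is not reproduced here, the following is just a sketch of how they might be used. The helper <tt>c-steps</tt> is hypothetical: it counts how many applications of <code>collatz</code> it takes to reach 1, and it assumes its argument is a positive integer for which the sequence does reach 1.

```racket
#lang racket
(require racket/trace)

(define (collatz n)
  (if (even? n)
      (/ n 2)
      (+ 1 (* n 3))))

;; c-steps is a hypothetical helper: the number of collatz
;; applications needed to get from n down to 1.
(define (c-steps n)
  (if (= n 1)
      0
      (+ 1 (c-steps (collatz n)))))

(trace c-steps)     ; from now on, each call and return of c-steps is printed
(c-steps 6)         ; 8, preceded by the trace of the recursive calls
(untrace c-steps)   ; turn tracing off again
(c-steps 11)        ; 14, with no trace output
```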


## More Racket.

- How can you tell racket to import a file that has changed since the last time you imported it? Just "<code>enter!</code>" it again. It will reload. Note: <code>enter!</code> is not in our jupyter version of racket.

We recall the following principles of racket:

- Constants evaluate to themselves: numbers, strings, booleans (and actually procedures!)
- There are many numerical types in racket with corresponding predicates:

In [61]:
(number? 1)

In [62]:
(number? "hello")

Racket has a naming convention that procedures ending in <tt>?</tt> are predicates, that is, they ask a yes or no question. Thus, <tt>number?</tt> means, <i>is my argument a number</i>? By extension, racket programmers may write <tt>hungry?</tt> meaning <i>are you hungry</i>?

In [63]:
(complex? 2+3i)

In [64]:
(real? pi)

In [65]:
pi

In [66]:
(real? pi)

In [67]:
(real? +inf.0)

In [68]:
-inf.0

In [69]:
(rational? 1)

In [70]:
(integer? 1)

In [71]:
(integer? +inf.0)

In [72]:
(integer? 2.0)

In [73]:
(exact-integer? 2.0)

Representing numbers inside a computer can get complicated.  Clearly, the value of <i>pi</i> is an approximation.  Racket distinguishes between exact and inexact numbers: <tt>2.0</tt> is an integer, but it is not an exact integer.

In [74]:
(exact-nonnegative-integer? 0)

In [75]:
(exact-nonnegative-integer? -1)

In [76]:
(exact-positive-integer? 0)

In [77]:
(inexact-real? 3.4)

In [78]:
(inexact-real? 3.5)

In [79]:
(flonum? 3.4)

In [80]:
(double-flonum? 3.4)

In [81]:
(double-flonum? 3.4444444444)

In [82]:
(single-flonum? 3.4)

In [83]:
(zero? 0.0)

In [84]:
(positive? 1)

In [85]:
(negative? 1)

In [86]:
(even? 1)

In [87]:
(odd? 1)

In [88]:
(exact? pi)

In [89]:
(inexact? pi)

In [90]:
(inexact->exact pi)

In [91]:
(inexact->exact 2.0)
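The practical consequence of exactness shows up in arithmetic. Here is a short sketch using only built-in procedures; the particular numbers are chosen just for illustration.

```racket
#lang racket
(= 2 2.0)                  ; #t: numerically equal
(equal? 2 2.0)             ; #f: one is exact, the other inexact
(exact? 1/3)               ; #t: Racket has exact rational numbers
(+ 1/10 2/10)              ; 3/10, computed exactly
(= (+ 0.1 0.2) 0.3)        ; #f: floating-point round-off
(inexact->exact 0.5)       ; 1/2
(exact->inexact 1/3)       ; an inexact approximation of one third
```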

We will next load the file <a target=ww href="racket2.rkt">racket2.rkt</a> and execute the procedure <code>(demo)</code>.

In [92]:
(define examples
  '(
    (number? 1)
    (complex? 2+3i)
    (real? 3.14159)
    (real? +inf.0)
    (rational? 1)
    (integer? 1)
    (integer? +inf.0)
    (integer? 2.0)
    (exact-integer? 2.0)
    (exact-nonnegative-integer? 0)
    (exact-nonnegative-integer? -1)
    (exact-positive-integer? 0)
    (inexact-real? 3.4)
    (inexact-real? 3.5)
    (flonum? 3.4)
    (double-flonum? 3.4)
    (double-flonum? 3.4444444444)
    (single-flonum? 3.4)
    (zero? 0.0)
    (positive? 1)
    (negative? 1)
    (even? 1)
    (odd? 1)
    (exact? 3.14159)
    (inexact? 3.14159)
    (inexact->exact 3.14159)

    )
  )

In [93]:
(define (demo)
  (map
   (lambda (lst)
     (list (car lst)
           (cadr lst)
           '==>
           (apply (eval (car lst)) (cdr lst))))
   examples))

Note: <code>map, list, car, cadr</code> will be covered in the list section below.

In [94]:
(demo)

### Recapitulation of Rules for Evaluating Expressions


- In an application, the leftmost expression is evaluated to obtain a procedure, and the values of the remaining expressions are passed as arguments to the procedure.

- The value of the last expression in a procedure's body is the value returned by the procedure.

- If you want your procedure to return no value, have the tail position be 
<code>(void ...)</code>. This may be useful if your function is called merely for its side-effects, such as input/output.

- The rules apply recursively: <code>(* (+ 9 9) (- 10 2))</code>

- The value of an identifier is found by looking it up in the relevant environment. This is just a big table. Be careful: You can clobber definitions.
<code>(define + *)</code>

- A <tt>define</tt> expression adds the identifier to the relevant environment with a value obtained by evaluating the expression.

- <tt>define</tt> is a special form not a procedure. Another special form is <tt>lambda</tt> which allows you to define a procedure without adding it to the environment.
<pre>
(lambda (n) (+ n 4))
</pre>
creates a procedure which can be applied to arguments:
<pre>
((lambda (n) (+ n 4)) 18)
</pre>

- Note: <tt>lambda</tt> expressions are available these days in most programming languages, including Java, Python, Ruby, R, and Haskell.
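The points about return values can be sketched as follows. Both procedure names, <tt>report</tt> and <tt>last-wins</tt>, are invented for this example.

```racket
#lang racket
;; report is a hypothetical procedure called only for its side effect.
(define (report n)
  (display n)
  (newline)
  (void))          ; tail position: the procedure returns the void value

;; last-wins is hypothetical; its body has two expressions,
;; and the value of the last one is what the procedure returns.
(define (last-wins n)
  (* n 100)        ; evaluated, but its value is discarded
  (+ n 1))

(last-wins 3)          ; 4
(void? (report 3))     ; prints 3, then #t
```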

In [95]:
(define x (lambda (n) (+ n 1)))

In [96]:
(x 3)

In [97]:
(define (x2 n) (+ n 1))

In [98]:
(x2 3)

<h3 id="define">Defining Racket Procedures</h3>

The evaluation of a <tt>lambda</tt> expression creates a procedure. Above, we saw how to create and apply a (one time use) procedure in a single expression, for example:

In [99]:
((lambda (n) (+ n 4)) 18)

But we'd like to be able to re-use our procedures. We can do so by using <tt>define</tt>, as in the following example.

In [100]:
(define plus-four (lambda (n) (+ n 4)))

In [101]:
(plus-four 6)

In [102]:
(plus-four -1)

We'll now look at the details of what happened here. Recall that there is an initial top-level environment when you start Racket, which we may picture as follows.
<pre>
     identifier   |    value
    --------------------------------
                .....
    --------------------------------
    |     +       |   built-in addition procedure
    --------------------------------
    |     *       |   built-in multiplication procedure
    --------------------------------
    | remainder   |   built-in remainder procedure
    ---------------------------------
</pre>

(Here the dots are included to remind you that there are many other entries in the initial top-level environment.) In the above example, when the <tt>define</tt> special form is evaluated, the identifier <tt>plus-four</tt> is added to the top-level environment, and its value is the result of evaluating the expression <code>(lambda (n) (+ n 4))</code>, which is a procedure with formal arguments: n, and body: <code>(+ n 4)</code>. So the top-level environment becomes:

<pre>
     identifier   |    value
    --------------------------------
                .....
    --------------------------------
    |     +       |   built-in addition procedure
    --------------------------------
    |     *       |   built-in multiplication procedure
    --------------------------------
    | remainder   |   built-in remainder procedure
    ---------------------------------
    |  plus-four  |   user procedure with formal arguments: (n)
    |             |   and body: (+ n 4)
    ---------------------------------
</pre>


Note that the procedure has not been applied to any arguments yet. When the next expression, <code>(plus-four 6)</code>, is evaluated, the procedure we just created is applied, as follows. The left parenthesis is not followed by a keyword, so this is an application. Using the rules for an application, the identifier plus-four is looked up (in the top-level environment) and its value is found to be a procedure -- so far, so good. The rest of the expressions, in this case just 6, are evaluated, and the procedure is called on the values, again, just 6.

The process of calling the procedure on its argument can now be understood as follows. A *new* local environment is created using the formal arguments of the procedure (here, just the identifier n) and the corresponding actual arguments (here, just the integer 6). We can picture this new local environment as follows.
<pre>
     identifier   |    value
    --------------------------------
    |     n       |      6
    --------------------------------
</pre>

In addition, there is a "search pointer" that points from this environment to the top-level environment, which indicates where to look for the value of an identifier that is not found in this environment. On the blackboard, this is just an arrow labeled "search pointer" pointing from this environment to the top-level environment. In this medium, we will just use text to indicate the search pointer. So, at this point, the whole environment picture is as follows.
<pre>
    * top level environment *
    --------------------------------
     identifier   |    value
    --------------------------------
                .....
    --------------------------------
    |     +       |   built-in addition procedure
    --------------------------------
    |     *       |   built-in multiplication procedure
    --------------------------------
    | remainder   |   built-in remainder procedure
    ---------------------------------
    |  plus-four  |   user procedure with formal arguments: (n)
    |             |   and body: (+ n 4)
    ---------------------------------


    * local environment *  search pointer: top-level environment
    --------------------------------
     identifier   |    value
    --------------------------------
    |     n       |      6
    --------------------------------
</pre>
Now that the local environment is set up, the process of applying the procedure evaluates the body of the procedure, in this case, (+ n 4), in the local environment. Now we can understand the meaning of the "relevant" environment in the rule for evaluating identifiers. An expression is evaluated in a current environment; to find the value of an identifier, we first look in the current environment to see if it has a binding there -- if so, that is its value. If not, then we follow the environment's search pointer (if any) to another environment and see if it has a binding there -- if so, that is its value. If not, then we follow that environment's search pointer (if any) to another environment, and so on. If this process reaches the top-level environment (which has no search pointer) and does not find a binding for the identifier there, then an error message is generated. An example of such an error message follows.

<pre>
> x
 (..... stuff ..............)    x: undefined;
 cannot reference an identifier before its definition
</pre>

(The "stuff" tells you where in the Racket system the error was detected.)
Back to the application in progress: the body of the procedure, that is, the expression (+ n 4), is now evaluated in the local environment just created. The left parenthesis is not followed by a keyword, so this is an application. The first expression, +, is an identifier, and is evaluated by looking it up in the relevant environment. So first we look in the current environment, which is the local environment. It is not there, so we follow the search pointer to the top-level environment, where we find that it has as its value the built-in procedure to add numbers. The expression n is also an identifier, but in this case we find its value in the current environment, namely 6. Finally, 4 is evaluated (to itself) and we call the built-in addition procedure on 6 and 4; it returns 10, which is the value of the expression (+ n 4) in the local environment, and is the value of the application (plus-four 6).
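The lookup process just described can be modeled in a few lines. This is a sketch for intuition, not Racket's actual implementation. The names <tt>frame</tt>, <tt>top-env</tt>, <tt>local-env</tt>, and <tt>lookup</tt> are all hypothetical, and the top-level environment is cut down to three entries.

```racket
#lang racket
;; Each environment is modeled as a table of bindings plus a search
;; pointer (parent). All names in this sketch are hypothetical.
(struct frame (table parent))

;; The top-level environment has no search pointer, represented by #f.
(define top-env (frame (hash '+ + '* * 'remainder remainder) #f))

;; The local environment created by calling (plus-four 6).
(define local-env (frame (hash 'n 6) top-env))

(define (lookup id env)
  (cond
    ;; Ran past the top-level environment: no binding anywhere.
    [(not env) (error 'lookup "~a: undefined" id)]
    ;; Found in the current environment.
    [(hash-has-key? (frame-table env) id)
     (hash-ref (frame-table env) id)]
    ;; Otherwise follow the search pointer.
    [else (lookup id (frame-parent env))]))

(lookup 'n local-env)                            ; 6, found locally
((lookup '+ local-env) (lookup 'n local-env) 4)  ; 10, with + found at top level
```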

Once the application has been evaluated, what happens to the local environment we just created? It is no longer accessible, and becomes eligible for "garbage collection" or "recycling", which means that the Racket system may reclaim the memory it occupied and reuse it for other purposes. Conceptually, it is as though the local environment is erased immediately after the application completes, so that the environment picture returns to its previous situation, as follows.
<pre>

     identifier   |    value
    --------------------------------
                .....
    --------------------------------
    |     +       |   built-in addition procedure
    --------------------------------
    |     *       |   built-in multiplication procedure
    --------------------------------
    | remainder   |   built-in remainder procedure
    ---------------------------------
    |  plus-four  |   user procedure with formal arguments: (n)
    |             |   and body: (+ n 4)
    ---------------------------------
</pre>

Note that the binding for plus-four in the top-level environment remains as before. If next we evaluate the expression <code>(plus-four -1)</code>, the process is repeated analogously, creating a new local environment in which n is bound to -1, with its search pointer pointing to the top-level environment, and the body expression <code>(+ n 4)</code> is evaluated in this local environment and found to have value 3. Once evaluation of this application completes, its local environment is eligible for garbage collection (and may be thought of as erased).

Whew! This is a lot of detail, but it is intended to give you an inside view of how the Racket interpreter works, which in turn will enhance your understanding of functional programming. You will seldom need to think about this level of detail while you are writing your procedures in Racket.

In [103]:
plus-four

In [104]:
(plus-four 5)

In [105]:
(define n 10)

In [106]:
(plus-four 17)

In [107]:
n

In [108]:
(define x 11)

In [109]:
(define plus-five (lambda  (n) (+ x 5)))

In [110]:
(plus-five 2)
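
The lookup rule described above can be sketched as a toy model. This is only an illustration under stated assumptions, not how the Racket system is actually implemented: the names <code>env</code>, <code>lookup</code>, <code>top-level</code>, and <code>local-env</code> are hypothetical, and the sketch uses features (<code>struct</code>, <code>cond</code>, hash tables, quoted symbols) that these notes have not yet introduced.

```racket
#lang racket
;; Toy model (an assumption for illustration only): an environment is a
;; table of bindings plus a search pointer to another environment.
;; The top-level environment has #f in place of a search pointer.
(struct env (bindings parent))

;; Look up an identifier (represented as a symbol) by checking the current
;; environment first and then following search pointers outward.
(define (lookup id e)
  (cond
    [(not e) (error id "undefined")]        ; ran past the top level
    [(hash-has-key? (env-bindings e) id)
     (hash-ref (env-bindings e) id)]        ; bound here: this is its value
    [else (lookup id (env-parent e))]))     ; otherwise follow the pointer

;; A tiny top-level environment, and the local environment that the
;; application (plus-four 6) would create.
(define top-level (env (hash '+ + 'x 11) #f))
(define local-env (env (hash 'n 6) top-level))

(lookup 'n local-env)                            ; found locally => 6
((lookup '+ local-env) (lookup 'n local-env) 4)  ; + found at top level => 10
```

In this model, looking up <code>x</code> from <code>local-env</code> also succeeds, by following the search pointer to the top level, which mirrors why <code>(plus-five 2)</code> above can use the top-level value of <code>x</code>.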

### Alternate procedure definition syntax.

We now get a bit of "syntactic sugar" -- to let you avoid typing <tt>lambda</tt> all the time, and to make your code a little prettier, there is an alternate syntax for defining procedures. For example, we could define the <code>plus-four</code> procedure as follows.

In [111]:
 (define (plus-four n) (+ n 4))

Note that the <code>define</code> keyword is not followed by an identifier, but by a parenthesized list of identifiers. The first one is taken to be the procedure name, and the rest of them are taken to be the formal arguments to the procedure. (If there is more than one formal argument, they are separated by white space, not commas.) This is syntactic sugar in the sense that it is a little more user-friendly than the previous syntax, but is just an abbreviation for it. Though you won't be typing <tt>lambda</tt> all the time, you should understand how <tt>lambda</tt> expressions work.
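
To see the two syntaxes side by side with more than one formal argument, here is a small sketch. The procedure names <code>average-a</code> and <code>average-b</code> are made up for this illustration.

```racket
#lang racket
;; The same two-argument procedure, written with an explicit lambda ...
(define average-a
  (lambda (x y)
    (/ (+ x y) 2)))

;; ... and with the alternate syntax. The formal arguments x and y are
;; separated by white space, not commas.
(define (average-b x y)
  (/ (+ x y) 2))

(average-a 3 5) ; => 4
(average-b 3 5) ; => 4
```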

We'll write some more procedures. We'd like a procedure <code>(last-digit n)</code> that returns the last decimal digit of the positive integer <tt>n</tt>. As an example of its behavior we have the following.
<pre>
> (last-digit 437)
7
</pre>

Rather than picturing what happens in the interaction window, we introduce a shorthand (=>) for the concept that an expression evaluates to a value. Thus, we could indicate the above example by writing the following.

<pre>
(last-digit 437) => 7
</pre>

This is read as follows: the expression <code>(last-digit 437)</code> evaluates to 7. We'll first write it using <tt>lambda</tt>, and then give the abbreviated definition. Recall that quotient and remainder are built-in procedures giving the quotient and remainder of an integer division. If we divide 437 by 10, we get a quotient of 43 and a remainder of 7, so the remainder of the input and 10 will be exactly the last decimal digit of the input. Hence we can write the <code>last-digit</code> procedure as follows.

In [112]:
(define last-digit
  (lambda (x)
    (remainder x 10)))

In [113]:
(last-digit 437)

Note that I've used newlines and indentation to aid the readability of this procedure. DrRacket will help with indentation -- if you highlight a region of code and press tab, it will indent the code according to its parenthesis nesting. This can help you find errors in your parenthesis nesting. Note also that I chose to call the formal argument x -- the particular identifiers you choose for formal arguments are not important, except that naming things well will help you program well. In the alternate procedure definition syntax, this could be rewritten as follows.

In [114]:
(define (last-digit x)
  (remainder x 10))

In [115]:
(last-digit 438)
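
As a companion sketch, the built-in <code>quotient</code> procedure gives everything except the last digit, and the two built-ins can be combined to pick out other digits. The names <code>all-but-last-digit</code> and <code>tens-digit</code> are hypothetical helpers, not part of the course code.

```racket
#lang racket
;; Drop the last decimal digit of a positive integer.
(define (all-but-last-digit x)
  (quotient x 10))

;; The tens digit is the last digit of what remains after dropping the
;; last digit.
(define (tens-digit x)
  (remainder (quotient x 10) 10))

(all-but-last-digit 437) ; => 43
(tens-digit 437)         ; => 3
```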

## The factorial function.

We come to our first recursive procedure definition. (Not counting the Collatz series!) Recall from earlier educational experiences the definition of the factorial function, n!, for positive integers.
<pre>
    n!   =    if n = 1 then the value is 1
              otherwise, the value is n * (n-1)!

</pre>
So, for example, to compute 4!, we have the following.
<pre>
    4!   =    4 x 3!
         =    4 x 3 x 2!
         =    4 x 3 x 2 x 1!
         =    4 x 3 x 2 x 1
         =    24
</pre>

The case n = 1 is a "base case" -- it returns a value (1) without any further references to the factorial function. The other case (n not equal to 1) is a "recursive case" -- we need to compute the factorial function on another value (namely (n-1)) in order to find the value of the factorial function on n.

We'd like to write a procedure <code>(factorial n)</code> to compute the value of n! For example, we'd like <code>(factorial 4) => 24</code>. In order to write a procedure based on the definition above, we need two things: a way to test whether the input is equal to 1 or not, and a way to do one thing if it is, and something else if it is not. The testing can be done by using built-in predicates. A "predicate" is just a procedure that returns either <tt>#t</tt> (for true) or <tt>#f</tt>
(for false). The built-in predicates =, <, >, <=, >= can be used to compare two numbers to determine whether they are equal (=), or the first is less than (<), greater than (>), less than or equal to (<=), or greater than or equal to (>=), the second. As examples, we have the following. (Recall that the procedure invariably comes first.)

In [116]:
(= 3 4)

In [117]:
(= (+ 1 3) 4)

In [118]:
(< 3 4)

In [119]:
(<= 3 4)

In [120]:
(<= 4 4)

In [121]:
(> 3 4)

In [122]:
(> 4 3)

In [123]:
(>= 4 3)

In [124]:
(>= 4 4)

Every value in Racket has a type, and there are predicates to test the types of values. For example, the predicate <tt>number?</tt> tests whether its argument is a number. Note that the question mark is part of the identifier. As mentioned earlier, there is a convention in Racket (and Scheme) that ending the name of a procedure with a ? indicates that it is a predicate, that is, always returns <tt>#t</tt> or <tt>#f</tt>. The above predicates expect numbers as their arguments, and return an error message if this is not true. As an example, consider the following.

In [125]:
(= "hi" 7)

=: contract violation
  expected: number?
  given: "hi"
  argument position: 1st
  other arguments...:
   7


This error message is telling you that the built-in procedure <tt>=</tt> experienced a "contract violation", which means that its input didn't satisfy some requirement. In this case, it says it was expecting an argument of type number (indicated by the predicate <code>number?</code>), and was given instead the string "hi", and that this happened in the first argument position.

For testing equality of general Racket values, you can use the predicate <tt>equal?</tt> For this predicate we have the following.

In [126]:
(equal? "hi!" 7)

You get no error message, just the answer that "hi!" and 7 are not equal.
Now we know how to test whether the input n is equal to 1, namely the expression (= n 1). But we also need to be able to do one thing if it is and something else if it isn't. For this we can use the special form <tt>if</tt>. The syntax of <tt>if</tt> is as follows.

<pre>
    (if expression1 expression2 expression3)
</pre>
The keyword is <tt>if</tt>, and it must be followed by exactly three expressions. The evaluation rule is as follows. The condition, <tt>expression1</tt>, is evaluated. If the value is not <tt>#f</tt>, then <tt>expression2</tt> is evaluated and its value returned. If the value is <tt>#f</tt>, then <tt>expression3</tt> is evaluated and its value returned. Notice that (unlike in an application), we *don't* evaluate all three expressions: either we evaluate <tt>expression1</tt> and <tt>expression2</tt>, or we evaluate <tt>expression1</tt> and <tt>expression3</tt>, but not all three. (This is the "coffee or tea" exclusive or, as opposed to the "milk or sugar" inclusive or.)
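
A small sketch can make this evaluation rule concrete. The procedure <code>safe-inverse</code> is a hypothetical example: because only one branch of the <tt>if</tt> is evaluated, the division is never attempted when the argument is 0.

```racket
#lang racket
;; Return 1/x, or 0 when x is 0. The division in the else case is only
;; evaluated when the condition (= x 0) is #f.
(define (safe-inverse x)
  (if (= x 0)
      0
      (/ 1 x)))

(safe-inverse 0) ; => 0, with no division-by-zero error
(safe-inverse 4) ; => 1/4

;; Any value other than #f counts as true for the condition, even 0.
(if 0 "then case" "else case") ; => "then case"
```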

With this new special form, we can finally write <code>(factorial n)</code> as follows.

In [127]:
(define factorial
  (lambda (n)
    (if (= n 1)
        1
        (* n (factorial (- n 1))))))

In [128]:
(factorial 5)

In [129]:
(factorial 4)

Using the alternate procedure definition syntax, we have the following.

In [130]:
(define (fact n)
  (if (= n 1)
      1
      (* n (fact (- n 1)))))

In [131]:
(fact 5)

Note that the "condition", <tt>expression1</tt>, in the if expression is <code>(= n 1)</code>, which tests whether n is equal to 1. The "then case", <tt>expression2</tt>, is just 1, which is the value that is returned when n is equal to 1. The "else case", <tt>expression3</tt>, is an expression that multiplies n by the result of a recursive call to the factorial procedure on the value of <code>(- n 1)</code>, which is 1 less than n. Thus, this procedure definition mirrors the original definition we gave for n! above.

To understand how <code>(factorial 4) => 24</code>, we see that <code>(factorial 4)</code> first has to compute <code>(factorial 3)</code> and multiply it by 4. And <code>(factorial 3)</code> has to compute <code>(factorial 2)</code> and multiply it by 3. And <code>(factorial 2)</code> has to compute <code>(factorial 1)</code> and multiply it by 2. But in the application <code>(factorial 1)</code>, the value of the argument n is 1, so we reach the base case, and <code>(factorial 1)</code> simply evaluates to 1. Then <code>(factorial 2)</code> can multiply 1 by 2 and evaluate to 2. Then <code>(factorial 3)</code> can multiply 2 by 3 and evaluate to 6. Finally, <code>(factorial 4)</code> can multiply 6 by 4 and evaluate to 24.
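
The definition above assumes a positive integer input. With an input of 0, the test <code>(= n 1)</code> is never true, so the recursion never reaches the base case. Here is one possible variant, offered only as a sketch: the name <code>fact-guarded</code> is hypothetical, and treating every n less than or equal to 1 as the base case is a design choice of this sketch, not part of the definition of n! given earlier.

```racket
#lang racket
;; Variant of factorial whose base case covers every n <= 1, so that the
;; recursion terminates for 0 as well (using the convention 0! = 1).
;; For negative inputs it returns 1, which has no mathematical meaning.
(define (fact-guarded n)
  (if (<= n 1)
      1
      (* n (fact-guarded (- n 1)))))

(fact-guarded 4) ; => 24
(fact-guarded 1) ; => 1
(fact-guarded 0) ; => 1
```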

See <a target=ww href="racket3.rkt">racket3.rkt</a> and try out the trace facility.

In [132]:
(require racket/trace)

In [133]:
(trace fact)

In [134]:
(fact 5)

>(fact 5)
> (fact 4)
> >(fact 3)
> > (fact 2)
> > >(fact 1)
< < <1
< < 2
< <6
< 24
<120


Note that trace displays each recursive call to <code>fact</code> and the value each call returns, with the indentation showing the depth of the recursion.

## More Racket

Can you undefine a value in Racket? That is, can you remove it from the namespace? The answer appears to be no; however, I am willing to hear suggestions. Here is the <a target=ss href="https://stackoverflow.com/questions/3487138/how-to-undefine-a-variable-in-scheme">Stack Overflow rationale</a>.

- Perlis epigram #23: To understand a program you must become both the machine and the program.

## Combining Boolean values.

There is a built-in procedure <code>not</code> and two special forms (<code>and</code>, <code>or</code>) that can be used to combine Boolean values. The syntax of <code>not</code> is as follows.
<pre>
    (not exp)
</pre>

The expression <tt>exp</tt> is evaluated, and if its value is not <tt>#f</tt>, then the value <tt>#f</tt> is returned; if its value is <tt>#f</tt>, then the value <tt>#t</tt> is returned. As examples, we have the following.

In [135]:
(not #f)

In [136]:
(not #t)

In [137]:
(not (= (+ 1 3) (+ 2 2))) 

In [138]:
(not (> 2 4))

In [139]:
(not 1)

In [140]:
(not "")

Non-Boolean values are treated as not <tt>#f</tt>, so, for example <code>(not 13) => #f</code>.

## The special forms: and, or.

Recall that the meaning of "special form" is that the rules of evaluation are not those of an application. The syntax of these two special forms is as follows.
<pre>
    (and exp1 exp2 ... expn)
    (or exp1 exp2 ... expn)
</pre>

Each takes an arbitrary finite number of expressions as arguments, and evaluates them in order, left to right, possibly stopping early. 

The evaluation rule for <tt>and</tt> is as follows. First <tt>exp1</tt> is evaluated; if its value is <tt>#f</tt>, then the value <tt>#f</tt> is returned for the whole <tt>and</tt> expression, and no further expressions are evaluated. If the value of <tt>exp1</tt> is not <tt>#f</tt>, then <tt>exp2</tt> is evaluated, and if its value is <tt>#f</tt>, then the value <tt>#f</tt> is returned for the whole <tt>and</tt> expression (and no further expressions are evaluated). If this process continues until <tt>expn</tt> is evaluated, its value is simply returned as the value of the whole <tt>and</tt> expression. We have the following examples.

In [141]:
(and #f #f) 

In [142]:
(and #t #f)

In [143]:
(and #f #t)

In [144]:
(and #t #t)

In [145]:
(and (= (+ 1 2) 3) (> 4 3))

In [146]:
(and (> 4 3) (< 12 6)) 

In [147]:
(and (= (+ 1 1) 2) (> 4 3) (< 6 12))

Non-Boolean values are treated as not <tt>#f</tt>, so we have <code>(and 1 2 13) => 13</code>. (This is because 1 and 2 are not equal? to <tt>#f</tt>, so we return the value of the last expression, which is 13.) This last kind of behavior is sometimes convenient, but also confusing and somewhat deprecated.

The special form <tt>or</tt> is analogous to the special form <tt>and</tt>, but is looking for the first not <tt>#f</tt> value it can find, evaluating expressions left to right. When it finds a not <tt>#f</tt> value, it returns that value as the value of the whole <tt>or</tt> expression (not evaluating any further expressions). If all the expressions evaluate to <tt>#f</tt>, then <tt>#f</tt> is returned as the value of the <tt>or</tt> expression. Some examples follow.

In [148]:
(or #f #f)

In [149]:
(or #t #f)

In [150]:
(or #f #t) 

In [151]:
(or #t #t) 

In [152]:
(or (= (+ 1 1) 2) (< 4 3))

In [153]:
(or (= 2 3) (> 6 7) (<= 12 6))

Once again, because non-Boolean values are treated as not <tt>#f</tt>, we have the following behavior: <code>(or 1 2 13) => 1</code>. (This is because the first expression evaluated, namely 1, is not <tt>#f</tt>, so its value is returned as the value of the or expression.)
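
The "possibly stopping early" behavior is what makes <tt>and</tt> and <tt>or</tt> special forms rather than procedures, and it is useful for guarding an expression that would otherwise cause an error. The following is a sketch; the procedure name <code>positive-ratio?</code> is hypothetical.

```racket
#lang racket
;; Evaluation stops at the first #f (for and) or the first not-#f value
;; (for or), so the division by zero is never evaluated.
(and #f (/ 1 0)) ; => #f
(or #t (/ 1 0))  ; => #t

;; Guard a division: the second expression is evaluated only when b is
;; not 0.
(define (positive-ratio? a b)
  (and (not (= b 0))
       (> (/ a b) 0)))

(positive-ratio? 6 3) ; => #t
(positive-ratio? 6 0) ; => #f, with no division-by-zero error
```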

### Edge cases for the special forms: and, or. 

What happens when we evaluate (and) and (or)?

In [154]:
(and)

In [155]:
(or)

These expressions do not result in errors when they are evaluated. Why are the values chosen reasonable? If we think of <tt>and</tt>, <tt>or</tt> as operating on Boolean values, then <tt>#t</tt> makes sense as an initial value for <tt>and</tt>: we keep evaluating expressions, combining them with the current value, until either the value becomes <tt>#f</tt> (which is then the final value) or we run out of expressions (and the final value is <tt>#t</tt>). Dually, <tt>#f</tt> makes sense as an initial value for <tt>or</tt>. For the same reason, these are the mathematical conventions for AND and OR over a set of expressions -- if the set is empty, AND returns true, OR returns false.
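
The "initial value" idea can be sketched by combining a list of Boolean values one at a time. This sketch looks ahead: it uses lists and the built-in procedure <code>foldl</code>, neither of which has been covered at this point in the notes. The names <code>all-true?</code> and <code>any-true?</code> are hypothetical. Unlike the special forms, these procedures do not stop early; they only illustrate why <tt>#t</tt> and <tt>#f</tt> are the natural starting values.

```racket
#lang racket
;; Start from #t and combine each element with the running value using and.
(define (all-true? lst)
  (foldl (lambda (x acc) (and acc x)) #t lst))

;; Dually, start from #f and combine each element using or.
(define (any-true? lst)
  (foldl (lambda (x acc) (or acc x)) #f lst))

(all-true? '(#t #t #f)) ; => #f
(all-true? '())         ; => #t, the initial value, like (and)
(any-true? '(#f #t))    ; => #t
(any-true? '())         ; => #f, the initial value, like (or)
```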

<h3 id="lists">Lists.</h3>

A list is a finite sequence of values. There is a list of no values, the empty list, which can be represented as <code>'()</code> or as the built-in identifier <code>empty</code>. To determine whether the value of an expression is <tt>equal?</tt> to the empty list, we can use either of the following built-in predicates.
<pre>
    (empty? exp)
    (null? exp)
</pre>
Each of these evaluates the expression exp, and if its value is <tt>equal?</tt> to <tt>'()</tt>, then the value <tt>#t</tt> is returned; otherwise, the value <tt>#f</tt> is returned. For example we have the following.

In [156]:
(require racket)
(require racket/base)

In [157]:
(empty? '())

<code>null?</code> is the same as <code>empty?</code>.

In [158]:
(null? '())

<code>empty</code> is the same as <code>'()</code>.

In [159]:
empty

In [160]:
(null? 13)

In [161]:
(null? "")

In [162]:
(null? "this is not an empty list")

Here are examples of lists with a nonzero number of elements; each one is given as its "quoted" representation -- note the leading single quote. (The special form <tt>quote</tt>, abbreviated as a leading single quote, causes its argument not to be evaluated.) First, we have a list with one element, the number 17.

In [163]:
'(17)

In [164]:
'(+ 2 3)

Next, a list with two elements, the number 17 followed by the number 24 -- remember that order matters in a list.

In [165]:
'(17 24)

The separator for the two elements is "whitespace" -- blanks, tabs, newlines and the like. Another list, with three elements, 17 followed by 24 followed by 6.

In [166]:
'(17 24 6)

The elements of a list do not have to be of the same type; here is another list with three elements, a number, a string and a Boolean.

In [167]:
'(17 "hi!" #f)

The elements of a list may themselves be lists, as in the following example.

In [168]:
'((1 2) (3) 4)

This list has three elements: the list '(1 2) followed by the list '(3) followed by the number 4. Lists may be nested within lists to arbitrary depth.
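
To see what the leading single quote does, compare an expression with its quoted form. This sketch also uses two built-in procedures that these notes have not introduced yet: <code>length</code>, which returns the number of elements in a list, and <code>list</code>, which evaluates its arguments and returns a list of their values.

```racket
#lang racket
(+ 2 3)   ; an application => 5
'(+ 2 3)  ; quoted, so not evaluated: a list of three elements

(length '(+ 2 3))       ; => 3
(length '((1 2) (3) 4)) ; => 3, nested lists count as single elements

;; Unlike quote, list evaluates its arguments first.
(list 17 (+ 20 4) 6)    ; => '(17 24 6)
```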

### Selectors and constructors.

When we have a compound data structure, we expect to be able to extract parts of it. The procedures that do that are <b>selectors</b>. We also expect to be able to assemble parts into a compound data structure. Procedures that do that are <b>constructors</b>. 

There are two basic selectors for lists: one to return the first element of a non-empty list, and one to return the rest of the elements of a non-empty list, without the first one. Each selector has two names: the LISP-historical one, and one that is considerably more mnemonic. The procedure to return the first element of a list is <code>(car lst)</code> or <code>(first lst)</code>, and the procedure to return a list of all the elements except the first element is <code>(cdr lst)</code> or <code>(rest lst)</code>. Here are examples, using both names for both selectors.

In [169]:
(first '(17 24 6)) 

In [170]:
(car '(17 24 6))

In [171]:
(rest '(17 24 6))

In [172]:
(cdr '(17 24 6))

In [173]:
(first (rest '(17 24 6)))

In [174]:
(car (cdr '(17 24 6))) 

Compositions of <tt>car</tt> and <tt>cdr</tt> procedures (up to four deep, as in <code>cadddr</code>) have "syntactic sugar" in the form of abbreviations, so that <code>(car (cdr lst))</code> can be abbreviated <code>(cadr lst)</code>. Note that in a list of at least 2 elements, <code>cadr</code> returns the second element of the list. Racket has (also as "syntactic sugar") built-in procedures <code>second</code>, <code>third</code>, ... (up to some limit), so that <code>(second '(17 24 6)) => 24</code>. The procedure name <code>second</code> is generally easier for humans to process than <code>cadr</code>.

The names <tt>car</tt> and <tt>cdr</tt> come from assembly-language macros in the first implementation of LISP, on the IBM 704 computer.

- CAR => contents of the address part of the register
- CDR => contents of the decrement part of the register

In [175]:
(cadr '(17 24 6))

In [176]:
(second '(17 24 6))

In [177]:
(caddr '(17 24 6))

In [178]:
(third '(17 24 6))

In [179]:
(caar '((17) (24) (6)))

In [180]:
(first (first '((17) (24) (6))))

In [181]:
(caar '((17) (24) (6)))

### Further discussion: Constructors.

In the case of lists, there is one basic constructor, whose name is <code>cons</code>. Its syntax is
<pre>
    (cons item lst)
</pre>
If <tt>lst</tt> is a list and <tt>item</tt> is any value, then <code>(cons item lst)</code> returns a list equal to <tt>lst</tt> with <tt>item</tt> inserted as the first element. As examples, we have the following.

In [182]:
(cons 17 '(24 6)) 

In [183]:
(cons 6 '())

This allows us to construct the list '(17 24 6) using three applications of cons, as follows.

In [184]:
(cons 17 (cons 24 (cons 6 '())))

The innermost <tt>cons</tt> expression evaluates to the list <code>'(6)</code>, the middle <tt>cons</tt> expression adds 24 at the front of that list, to get <code>'(24 6)</code>, and the outermost <tt>cons</tt> expression adds 17 to that list, to get <code>'(17 24 6)</code>.

Note that <tt>cons</tt> does not require its second argument to be a list, though that will cover essentially all our uses of cons. If you cons together two numbers, you get a "dotted pair", as in the following example.

In [185]:
(cons 3 8)

This looks a bit like a list, but the dot (.) is an indicator that it is not -- sometimes such structures are called "improper lists". When it occurs in your program's output, it is generally an indication that you cons'ed an element onto something that is not a list, which is generally an error to be corrected. There is a built-in predicate <code>(list? exp)</code>, which returns <tt>#t</tt> if exp is a list, and <tt>#f</tt> otherwise. There is also a built-in predicate <code>(pair? exp)</code>, which returns <tt>#t</tt> if exp is a pair, that is, a non-empty list or an improper list such as a dotted pair, and <tt>#f</tt> otherwise; in particular, <code>(pair? '()) => #f</code>. As examples, we have the following.

In [186]:
(list? '())

In [187]:
(list? '(17 24 6))

In [188]:
(list? 17)

In [189]:
(list? (cons 3 '())) 

In [190]:
(list? "hi!") 

In [191]:
(list? (cons 3 8)) 

In [192]:
(pair? (cons 3 8))

In [193]:
(pair? '(3 4))

As another example, we will construct the list <code>'((1 2) (3) 4)</code> using <tt>cons</tt>, numbers, and the empty list <tt>'()</tt>. Because cons adds elements at the front of the list, we construct a list by working backwards. To get the list <code>'(4)</code>, we cons 4 to the empty list:

In [194]:
(cons 4 '()) 

We'd like to add the element <code>'(3)</code> to the front of this list. However, this is itself a list, so we need to construct it similarly to <code>'(4)</code>, namely, <code>(cons 3 '())</code>. Thus, we can get a list of the 2nd and 3rd elements of our target as follows.

In [195]:
(cons (cons 3 '()) (cons 4 '())) 

We just need to cons the element <code>'(1 2)</code> onto the front of this list, but we need to construct the list <code>'(1 2)</code> in order to do this. To construct the list <code>'(1 2)</code> we use the expression <code>(cons 1 (cons 2 '()))</code>. Putting all this together, we have

In [196]:
(cons (cons 1 (cons 2 '())) (cons (cons 3 '()) (cons 4 '())))
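
As a check on this construction, we can compare the result with the quoted list and take it apart again with the selectors. In this sketch the name <code>built</code> is made up, and the built-in procedure <code>list</code> (not introduced in these notes so far) is shown as a shorter way to get the same value.

```racket
#lang racket
;; The list '((1 2) (3) 4) built from cons, numbers, and the empty list.
(define built
  (cons (cons 1 (cons 2 '()))
        (cons (cons 3 '())
              (cons 4 '()))))

(equal? built '((1 2) (3) 4)) ; => #t
(first built)                 ; => '(1 2)
(rest built)                  ; => '((3) 4)

;; The built-in procedure list constructs the same value more directly.
(equal? built (list (list 1 2) (list 3) 4)) ; => #t
```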

Next we look at various forms of recursive procedures dealing with lists as arguments and return values.

End of Racket notebook.