Racket style guide for CPSC 201

Introduction

When you first start learning Racket (or any programming language, for that matter!) you'll typically start by learning its syntax and some of its reserved keywords (e.g., define, if, cond) and built-in procedures (e.g., max, map, apply). Both the Lecture Summaries and The Racket Guide are great resources for this. However, with all these things on your mind, it is easy to neglect, or even completely ignore, issues like formatting or efficiency.

By now, though, you should be familiar with the basics of Racket. The time is right for you to learn how to make your code clearer, more concise and easier to understand, both for you and for anyone else reading it. The more complex your code, the more important it becomes to adhere to these guidelines: you'll be surprised how much time you save when you're debugging your programs or trying to understand code that you wrote some time ago.

Everyone has a unique coding style, which is why this is a style guide and not a list of style requirements. However, we guarantee that once you get in the habit of following some of these suggestions, you'll quickly start to appreciate the power of well-written code.

Good luck, and happy coding!

- The CS201 Staff

Formatting

Because the syntax of Racket is so uniform, it is very important that you organize your code on the page in a way which makes its structure evident. Most Racket programmers employ the rules below. If you adopt them, you will never need to scan your code for matching parentheses; instead, you will recognize the nesting and content of an expression by the form of the code.

Note that DrRacket helpfully provides an automatic indentation feature: if you select a block of code and hit the Tab key, the horizontal alignment of each line in the block will be adjusted to match the nesting of your code.

Avoid long lines

Long lines in Racket programs are usually very hard to decipher. Consider even the simple procedure below:

(define (filter pred? lst)
  (cond
    [(null? lst) '()]
    [(pred? (car lst)) (cons (car lst) (filter pred? (cdr lst)))]
    [else (filter pred? (cdr lst))]))

The second cond clause is hard to read, because your eyes and brain need to figure out where the condition ends and the result expression begins, and even then the boundaries of the arguments to cons are not entirely obvious.

In general, you will want to break lines immediately after you close one or more parentheses:

(+ (* 2 3)
   (/ 17 5))

Notice how the closing parenthesis of the product ends the line as well. This is a good rule of thumb for keeping any single line from holding forms that are too complex to read easily. Reformatting filter above using this rule gives us the more readable:

(define (filter pred? lst)
  (cond
    [(null? lst)
     '()]
    [(pred? (car lst))
     (cons (car lst)
           (filter pred? (cdr lst)))]
    [else
     (filter pred? (cdr lst))]))

Parens tend to feel lonely

Don't put closing (or opening) parens on a line of their own. They get lonely easily. Seriously, it's superfluous information and takes up lines for nothing. Therefore, the following code is good Racket style:

(define (factorial n)
  (if (zero? n)
      1
      (* n (factorial (- n 1)))))

Notice the closing parens at the end. A seasoned Racket programmer won't see those, though - the expression ends there, the next one begins at column zero, so they know that everything is closed there.
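
For contrast, here is a sketch of the same procedure written in the style this guideline discourages. The name factorial-lonely is made up for this illustration so that it does not clash with factorial above. It computes exactly the same thing, but takes almost twice as many lines, and the extra lines tell you nothing that the indentation did not already show:

```racket
#lang racket

; Discouraged: closing parens sitting on lines of their own
(define (factorial-lonely n)
  (if (zero? n)
      1
      (* n (factorial-lonely (- n 1))
      )
  )
)
```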

Indent subexpressions equally

Basic indentation in Racket is to indent subexpressions equally. This is most easily seen in an example:

(list (foo)
      (bar)
      (baz))

As you can see, the expressions (foo), (bar) and (baz) are all lined up under each other. They're all on the same syntactic level - all are arguments to list - so they should be lined up under each other.

There are exceptions, such as define as shown above, or let. They get the body argument indented two spaces from the definition:

(let ((pi 3.14)
      (r 120))
  (* pi r r))

In general, DrRacket's automatic indentation does the right thing. Read more about indentation here.

Break for one - break for all

If you put subexpressions onto multiple lines, put every subexpression on a line of its own.

For example, you can write

(+ 1 foo bar baz)

but if one of those expressions gets more complicated, you may want it on a line of its own. If so, put all of the subexpressions on lines of their own:

(+ 1
   (foo 3.5 a)
   bar
   baz)

If an argument list is broken, a seasoned Racket programmer will expect to find every argument on a line of its own, so putting more than one on any of the lines will likely cause the extra argument to be missed.
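
To see why, compare the two layouts in the sketch below. The definitions of foo, bar, baz and a are placeholders invented only so that the example runs; what matters is the shape of the two sums. Both compute the same value, but in the first one bar is easy to overlook:

```racket
#lang racket

; Placeholder definitions, assumed only for this illustration
(define (foo x y) (* x y))
(define a 2)
(define bar 10)
(define baz 20)

; Discouraged: bar hides at the end of a line that already holds an argument
(define mixed-layout
  (+ 1
     (foo 3.5 a) bar
     baz))

; Preferred: once one argument gets its own line, they all do
(define one-per-line
  (+ 1
     (foo 3.5 a)
     bar
     baz))
```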

Other best practices

Consider this perfectly indented, perhaps technically correct piece of code:

(define (helper5 x)
  (cond
    [(null? x)
     'error]
    [(null? (cdr x))
     (length (car x))]
    [(> (length (cadr x))
        (length (car x)))
     (helper5 (cons (car x)
                    (cddr x)))]
    [else
     (helper5 (cdr x))]))

Because the formatting rules above were followed to the letter, with a little bit of training, you will have no trouble figuring out where the subexpressions begin and end, or which procedure is applied to which arguments. Still, even an experienced Racket programmer will be unable at first glance to understand what the procedure above does.

The guidelines below should help you make sure your code is as clear as possible.

Choose variable and procedure names carefully

Choosing descriptive names for your variables and procedures is much more important - and much harder - than you might suspect at first.

The reason it is important becomes apparent when you try to read the code of helper5 above. The name of something is the first, immediately available piece of information about what it is and how it behaves. Hence, the following procedure is somewhat easier to read than the previous one, even though it is identical as far as the Racket interpreter is concerned:

(define (shortest-length lists)
  (cond
    [(null? lists)
     'error]
    [(null? (cdr lists))
     (length (car lists))]
    [(> (length (cadr lists))
        (length (car lists)))
     (shortest-length (cons (car lists)
                            (cddr lists)))]
    [else
     (shortest-length (cdr lists))]))

In fact, the whole point of defining a procedure is abstraction: we create a black box, which can then be used as a building block in a more complex system, with no concern about its inner workings. From the name of a procedure (and the associated comment - see next section), one should be able to predict how the procedure will behave, without having to study the details of its body.

This is why naming procedures is hard: you need to reduce a potentially very complicated behavior into a few words at most. On the other hand, thinking of a good name can help you think about what it is that your procedure should be doing exactly. And if you have a hard time coming up with an appropriate name, then perhaps the outline of the black box you're trying to create is more complicated than it needs to be?

Comment your code

In the above example, it would be helpful to add a short description before our procedure definition. Unsurprisingly, even with good procedure and variable names, it's usually easier to read English than code. A descriptive sentence or two can be very helpful if you don't remember what your procedure is meant to do, or if someone else is reading your code.

Most of the time, you should try to avoid writing long procedures (see next section), but longer or more complicated procedures may also warrant some in-line comments. The previous procedure could be commented as follows:

; Length of the shortest element of a list of lists
(define (shortest-length lists)
  (cond
    [(null? lists)
     'error]
    [(null? (cdr lists))
     (length (car lists))]
    ; If second element is longer than the first, call shortest-length recursively
    ; on a new list with the second element removed
    [(> (length (cadr lists))
        (length (car lists)))
     (shortest-length (cons (car lists)
                            (cddr lists)))]
    ; Otherwise, call shortest-length on a new list with the first element removed
    [else
     (shortest-length (cdr lists))]))

Keep your procedures short

Shorter procedures are easier to get right, read, reason about and debug. Compare the above procedure with:

; Length of the shortest element in a list of lists
(define (shortest-length lists)
  (smallest-element (lengths-of lists)))

; Find the smallest element in a list of numbers
(define (smallest-element lst)
  (cond
    [(null? lst)
     'error]
    [(null? (cdr lst))
     (car lst)]
    [else
     (minimum (car lst)
              (smallest-element (cdr lst)))]))

; Choose the smaller one of two numbers
(define (minimum x y)
  (if (< x y)
      x
      y))

; Given a list of lists, build a list of their lengths
(define (lengths-of lists)
  (cond
    [(null? lists)
     '()]
    [else
     (cons (length (car lists))
           (lengths-of (cdr lists)))]))

Instead of doing everything in one procedure, we first call lengths-of, which returns a list containing the length of each element of lists. Then, we use smallest-element to find the smallest number in this list, which will be the length of the shortest element.

Avoid duplicating code

Consider this code:

(define (roll dice)
  (cond
    [(null? dice) 0]
    [else (+ (list-ref (car dice)
                       (random-integer
                        (length (car dice))))
             (roll (cdr dice)))]))

(define (choose-random-move sum state)
  (cond
    [(null? (possible-moves sum state))
     'none]
    [else
     (list-ref (possible-moves sum state)
               (random-integer
                (length (possible-moves sum state))))]))

Both of these procedures contain the subexpression (list-ref <something> (random-integer (length <something>))), and so we may want to split that into an auxiliary procedure, namely pick-random:

(define (roll dice)
  (cond
    [(null? dice) 0]
    [else (+ (pick-random (car dice))
             (roll (cdr dice)))]))

(define (choose-random-move sum state)
  (cond
    [(null? (possible-moves sum state))
     'none]
    [else
     (pick-random (possible-moves sum state))]))

(define (pick-random lst)
  (list-ref lst
            (random-integer (length lst))))

Code duplication is best avoided for several reasons. If you write the same thing several times, you're more likely to make a mistake. If you have to modify it later on, you might forget about some of the copies. Whenever you find yourself typing the same thing several times, it is generally a good indication that you should be writing an auxiliary procedure.
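
If you want to try pick-random on its own, the sketch below may help. It makes two assumptions that are not part of the code above: it swaps in Racket's built-in random for random-integer (given a positive integer k, (random k) returns an integer from 0 to k-1), and it represents each die as the list of its faces, with six-sided as a made-up example.

```racket
#lang racket

; pick-random, with the built-in random swapped in for random-integer
(define (pick-random lst)
  (list-ref lst
            (random (length lst))))

; Sum of one random face from each die in the list
(define (roll dice)
  (cond
    [(null? dice) 0]
    [else (+ (pick-random (car dice))
             (roll (cdr dice)))]))

; A die is assumed here to be the list of its faces
(define six-sided '(1 2 3 4 5 6))

; Rolling two dice gives some number from 2 to 12
(roll (list six-sided six-sided))
```

Because the random choice now lives in a single place, changing how it is made means editing pick-random only.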

Avoid duplicating computations

You may have noticed another benefit to rewriting the code above. Before, the choose-random-move procedure had to compute the list of all possible-moves three times: to check whether there were any, to count them, and finally to look up the nth one among them. Now, the last two uses of (possible-moves sum state) have been merged into one, with the result passed to the pick-random procedure we introduced. Hence, the resulting program will also be more efficient.

Now, it is tempting to apply the same principle in order to merge the two remaining uses of (possible-moves sum state) into one:

(define (choose-random-move sum state)
  (pick-random-if-there-is-one
   (possible-moves sum state)))

(define (pick-random-if-there-is-one lst)
  (cond
    [(null? lst) 'none]
    [else (pick-random lst)]))

However, since we use pick-random-if-there-is-one only once, and it is rather idiosyncratic (and therefore unlikely to be needed again), we can use a let expression instead of defining a new procedure:

(define (choose-random-move sum state)
  (let ((lst (possible-moves sum state)))
    (cond
      [(null? lst) 'none]
      [else (pick-random lst)])))

It is interesting to note that (let ((<x1> <v1>) ... (<xn> <vn>)) <body>) is actually just syntactic sugar for ((lambda (<x1> ... <xn>) <body>) <v1> ... <vn>), and so this new version pretty much amounts to replacing the name pick-random-if-there-is-one with its definition!
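
Here is a small sketch of that equivalence. The two procedures are invented for this illustration; they compute the same result, the first with let and the second with the let rewritten by hand as a lambda applied to the bound values:

```racket
#lang racket

; Using let to name two intermediate values
(define (sum-of-squares-let a b)
  (let ((a2 (* a a))
        (b2 (* b b)))
    (+ a2 b2)))

; The same computation, with the let written out as a lambda application
(define (sum-of-squares-lambda a b)
  ((lambda (a2 b2)
     (+ a2 b2))
   (* a a)
   (* b b)))

(sum-of-squares-let 3 4)      ; 25
(sum-of-squares-lambda 3 4)   ; 25
```

The let version is the one you would normally write; the lambda version is shown only to make the correspondence visible.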

Use higher-order procedures

Sometimes it is not entirely obvious how to package some redundant code into its own procedure. Consider the following:

(define (keep-moves-of-length len moves)
  (cond
    [(null? moves)
     '()]
    [(= len (length (car moves)))
     (cons (car moves)
           (keep-moves-of-length len (cdr moves)))]
    [else
     (keep-moves-of-length len (cdr moves))]))

(define (keep-possible-moves sum moves)
  (cond
    [(null? moves)
     '()]
    [(= (sum-of (car moves))
        sum)
     (cons (car moves)
           (keep-possible-moves sum (cdr moves)))]
    [else
     (keep-possible-moves sum (cdr moves))]))

These two procedures have a lot of code in common: both are meant to filter a list in order to keep only those elements which satisfy a certain property (such as being of length len, or adding up to sum). The only difference between them is the code used for testing which elements qualify. However, it may not be entirely obvious how to factor the common code out into an auxiliary procedure.

Luckily, Racket procedures are values like any other; passing a procedure as a parameter to another procedure can be used to inject some varying code into a common template. In this case, the common template is exactly the filter procedure which we saw earlier! For reference, we had:

(define (filter pred? list)
  (cond
    [(null? list)
     '()]
    [(pred? (car list))
     (cons (car list)
           (filter pred? (cdr list)))]
    [else
     (filter pred? (cdr list))]))

Then keep-moves-of-length and keep-possible-moves can be rewritten as:

(define (keep-moves-of-length len moves)
  (filter (lambda (m)
            (= len (length m)))
          moves))

(define (keep-possible-moves sum moves)
  (filter (lambda (m)
            (= (sum-of m)
               sum))
          moves))

Note that in each of the two above procedures, we used a lambda expression to create the unnamed procedure that gets passed to filter.
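
As a quick check, the sketch below runs keep-possible-moves on some sample data. It relies on two assumptions: it uses the filter procedure that Racket already provides rather than the hand-written one, and it supplies a guess at sum-of (adding up the numbers in a move), since that procedure is not defined in this guide.

```racket
#lang racket

; Assumed definition: a move is a list of numbers, and sum-of adds them up
(define (sum-of move)
  (apply + move))

; Keep only the moves whose numbers add up to sum
(define (keep-possible-moves sum moves)
  (filter (lambda (m)
            (= (sum-of m)
               sum))
          moves))

(keep-possible-moves 7 '((3 4) (1 2) (7) (5 2 1)))   ; '((3 4) (7))
```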

Procedures that take other procedures as arguments, or which return a procedure as a result, are called higher-order procedures, and they usually correspond to frequently-occurring patterns of code. In fact, one such pattern was also present in the lengths-of procedure that we saw earlier:

; Given a list of lists, build a list of their lengths
(define (lengths-of lists)
  (cond
    [(null? lists)
     '()]
    [else
     (cons (length (car lists))
           (lengths-of (cdr lists)))]))

The lengths-of procedure evaluates to a list consisting of the results of applying the length function to each element in an existing list. But we know that this pattern is captured by the map procedure, hence lengths-of can be written as:

; Given a list of lists, build a list of their lengths
(define (lengths-of lists)
  (map length lists))

In fact, to a seasoned Racket programmer, (map length lists) is much more precise and evocative than (lengths-of lists), so we may want to drop the lengths-of procedure altogether.

Closing remarks

You may have noticed that in filter above, the subexpression (filter pred? (cdr list)) appears twice. We could have used let to factor it out; however, in this particular case, it would have significantly complicated the code, defeating the purpose. Furthermore, the recursive call in the else clause would no longer be in tail position, so it could no longer be optimized as a tail call. The point is: all of these rules are guidelines. If you choose to break one, do so. But be aware that you are breaking one, and make sure that you have a good reason.
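
To make the trade-off concrete, here is one way the let version might look. This is only a sketch, and the names filter-with-let and rest-filtered are made up for it. Notice the extra level of nesting, and that the recursive call now happens before the test on (car lst), so neither branch ends in a tail call:

```racket
#lang racket

; Hypothetical variant of filter with the repeated recursive call factored out
(define (filter-with-let pred? lst)
  (cond
    [(null? lst)
     '()]
    [else
     (let ((rest-filtered (filter-with-let pred? (cdr lst))))
       (if (pred? (car lst))
           (cons (car lst) rest-filtered)
           rest-filtered))]))

(filter-with-let even? '(1 2 3 4))   ; '(2 4)
```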

Another way to avoid duplicating code is to use Racket's built-in procedures when one is available that suits a given purpose. Hence, in the above, we could have used min instead of writing minimum by hand. Even better, since min can accept any number of arguments, we could have used (apply min lst) instead of writing a smallest-element procedure. Racket and its set of built-in procedures are defined in The Racket Reference.
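
Putting these suggestions together, shortest-length might shrink to the sketch below, which uses only the built-in procedures map, apply, min and length. One difference to be aware of: on an empty list this version raises an error, because min needs at least one argument, whereas the hand-written version returned the symbol 'error.

```racket
#lang racket

; Length of the shortest element in a non-empty list of lists
(define (shortest-length lists)
  (apply min
         (map length lists)))

(shortest-length '((1 2 3) (4) (5 6)))   ; 1
```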

Acknowledgments

This document is distributed under the Creative Commons Attribution-ShareAlike 2.0 license. It is derived from the style guide on schemewiki.org (specifically, the Formatting section was written based on this version), and draws inspiration from this excellent document from Dartmouth as well. It was originally written for the R5RS variant of Scheme, and later adapted for Racket.

Most of the Other best practices section was written by Jérémie Koenig, with input from Patrick Paczkowski and the rest of the 2013-2014 CS201 staff at Yale University.