Lisp Notes for CS 188, Fall 2005
Introduction to Artificial Intelligence

These notes review the basic ideas of symbolic computation and functional programming as embodied in LISP. We will cover the basic data structures (s-expressions); the evaluation of functional expressions; recursion and loops; binding and equality; user-defined data structures (defstructs); and generic functions for object-oriented programming.

With minor differences which will be pointed out, LISP and Scheme (which you should recall from CS 61A) are very similar. These notes are intended mainly as a refresher for students who had 61A a while back. For a thorough introduction and a complete reference, I strongly recommend Paul Graham's ANSI Common Lisp.

Characteristics of LISP

The main characteristic of LISP is its capability for symbolic computation. Symbols (atoms) are the principal data type. The operations that can be performed on symbols include equality testing and building into symbol structures. Putting two symbols together creates a structure, which can then be accessed and taken apart.

Typical applications:
Language processing, using words as symbols; lists for sentences, trees for grammatical structure.
Mathematics, involving expressions and equations; trees for expressions.
Manipulating programs --- these are just pieces of (formal) language.

LISP programs are just symbol structures, so LISP programs can modify or create LISP programs. Hence programs can implement the results of learning by writing new code for themselves; it is also easy to write interpreters for new languages in LISP.

LISP is a functional language: compute by evaluating nested functional expressions. (Pure) programs have no side-effects, so they are modular. Simple semantics (i.e., it's easy to tell what a program does) allows for powerful program development environments.

Evaluating functional expressions

Read-eval-print: LISP reads the expressions you type, evaluates them and prints the value.

Expressions are as follows:

Atoms can have a value associated with them, e.g. X might have the value 5.

Numbers (which are also atoms) evaluate to themselves.

Functional expressions are delimited by matching parens:
(fn arg1 ... argn) applies fn to the arguments as follows:

Evaluate each argument in turn, then
Apply the definition of fn to the results. Note the difference from Scheme: in Scheme, every position including the fn position is evaluated; in LISP, symbols can have a function definition pointer distinct from the value pointer.
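A minimal sketch of the separate function and value cells mentioned above. The local variable named list is chosen only for illustration; it does not disturb the standard function of the same name.

```lisp
;; The symbol LIST already has a function definition (the standard
;; list-building function). Here it is also given a local value.
(let ((list '(1 2 3)))
  (list list (length list)))   ; => ((1 2 3) 3)

;; fboundp asks whether a symbol has a function definition.
(fboundp 'list)                ; => true
```

In Scheme the same let would hide the function, because there is only one binding per name.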

Sometimes we want to pass an argument directly, without evaluation. To do this we need an identity function, which doesn't evaluate ITS argument. QUOTE serves this purpose.

>  (QUOTE A)
A
>  'A
A
(+ '4 '4) returns 8, but (+ '(+ 2 2) '(+ 1 3)) returns what?

(+ 2 2) is just a piece of list structure. If numbers are made by addition etc, how is list structure made from atoms?
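To make the point concrete, here is a short sketch of the difference between evaluating an expression and quoting it. The function eval is standard Common Lisp but has not been discussed in these notes; it is used here only to show that the quoted list is still a valid program.

```lisp
(+ 2 2)              ; => 4, the expression is evaluated
'(+ 2 2)             ; => (+ 2 2), the expression itself
(listp '(+ 2 2))     ; => t, it is a list...
(length '(+ 2 2))    ; => 3, ...of three elements
(eval '(+ 2 2))      ; => 4, eval treats the list as an expression
```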

Operations on list structure

List structure is made by putting symbols together. The function that puts things together is CONS.
> (cons (cons 'a 'b) 'c)
((a . b) . c)
These are s-expressions: an atom is an s-expression, and two s-expressions CONSed together make an s-expression. The parts of s-expressions are accessed using car and cdr:

(car (cons 'a 'b)) is a; (cdr (cons 'a 'b)) is b.

Lists, of which the functional expressions used above are examples, are special kinds of s-expressions whose rightmost element is the special atom nil. They have a special printed representation:

(cons 'a (cons 'b nil))
(a b)

It's a list if CDRing down it eventually returns nil. nil is the empty list.

Lists can be constructed using list as well as cons. The functions first and rest are the same as car and cdr but more readable:

> (setq l (list 'a 'b '(c d)))
(a b (c d))
> (first l)
a
> (rest l)
(b (c d))
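A few more cases, as a sketch, showing how cons and list relate and when the dotted-pair notation appears. Numbers are used as elements so that nothing needs quoting.

```lisp
;; The same list built three ways.
(cons 1 (cons 2 (cons 3 nil)))   ; => (1 2 3)
(list 1 2 3)                     ; => (1 2 3)
'(1 2 3)                         ; => (1 2 3)

;; cons adds one element to the front; the original list is unchanged.
(let ((tail (list 2 3)))
  (list (cons 1 tail) tail))     ; => ((1 2 3) (2 3))

;; A cons whose cdr is not a list prints as a dotted pair.
(cons 1 2)                       ; => (1 . 2)
(cdr (cons 1 2))                 ; => 2

;; Nested access: the first element of the third element.
(first (third (list 1 2 (list 3 4))))   ; => 3
```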

Defining functions, conditionals and temporary variables

To associate a function definition with a symbol (not the same as giving the symbol a value), use defun. The following code defines a function 1/ which calculates 1/x.
(defun 1/ (x &optional (checkp nil))  ;comments can follow semicolons like this
  (if (and checkp (zerop x))
     nil        ;checking requested and x is zero: return nil rather than divide
     (/ 1 x)))
Note the use of optional arguments. Here, checkp does not have to be provided in the calling expression, and if not then it defaults to nil.
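As a further sketch of optional arguments, here is a hypothetical function greet (not part of the AIMA code) whose second parameter has a non-nil default.

```lisp
;; GREET is a made-up example. GREETING defaults to "Hello"
;; when the caller supplies only one argument.
(defun greet (name &optional (greeting "Hello"))
  (format nil "~A, ~A!" greeting name))

(greet "Ann")             ; => "Hello, Ann!"
(greet "Ann" "Goodbye")   ; => "Goodbye, Ann!"
```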

"Conditional branching" used in imperative languages is replaced in LISP by conditional evaluation. The if-expression is evaluated just like any other, but the returned value depends on whether the value of the first argument to if is nil or not. (Truth values in LISP are nil for false, anything else counts as true. t is used as a readable default symbol for true. Both t and nil evaluate to themselves, like numbers.)

Complex test expressions can be formed using and, or, and not.
and returns a non-null value if all its arguments are non-null.
or returns a non-null value if any of its arguments are non-null.
not returns a non-null value if its argument is nil.
The arguments of and/or are evaluated sequentially and evaluation terminates as soon as a null/non-null value is found. For example,

(and (numberp x) (not (zerop x)) (/ 1 x))
never signals an error; it returns nil if x is not a number or is zero; otherwise it returns 1/x.
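The following sketch shows which value and and or actually return, and that evaluation really does stop early. The variable names are arbitrary.

```lisp
;; and returns the value of its last argument when all are non-nil.
(and 1 2 3)              ; => 3
(and 1 nil 3)            ; => nil

;; or returns the first non-nil value it finds.
(or nil 2 3)             ; => 2
(or nil nil)             ; => nil

;; Evaluation stops at (not (zerop x)), so the division is never attempted.
(let ((x 0))
  (and (numberp x) (not (zerop x)) (/ 1 x)))   ; => nil

;; A common idiom: or supplies a default for a missing value.
(let ((name nil))
  (or name "anonymous"))   ; => "anonymous"
```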

When you want to return one of several different values depending on several different conditions, use cond (see example below).

Sometimes, the same expression will be used several times in the same function definition. To simplify the code, and save time, one should define a temporary variable to stand for the value of the expression:

(defun age-group (person)
  (let ((n (age person)))
    (cond ((< n 2) 'baby)
          ((< n 18) 'child)
          ((< n 120) 'adult)
          (t 'dead))))
For temporary variables that are defined at the beginning of the function, as above, one can also use &aux variables in the parameter list:
(defun age-group (person &aux (n (age person)))
  (cond ((< n 2) 'baby)
        ((< n 18) 'child)
        ((< n 120) 'adult)
        (t 'dead)))
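Here is another sketch of the same let-plus-cond pattern that can be run as it stands. The function root-kind and its result symbols are hypothetical; it classifies the roots of a*x^2 + b*x + c = 0 by computing the discriminant once and testing it several times.

```lisp
;; ROOT-KIND is a made-up example, not part of the AIMA code.
(defun root-kind (a b c)
  (let ((d (- (* b b) (* 4 a c))))   ; discriminant, computed once
    (cond ((> d 0) 'two-real)
          ((= d 0) 'one-real)
          (t 'no-real))))

(root-kind 1 -3 2)   ; d = 1   => two-real
(root-kind 1 2 1)    ; d = 0   => one-real
(root-kind 1 0 1)    ; d = -4  => no-real
```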


Recursion

The simplest way to get repetitive execution in LISP is to use recursion, wherein one uses the function being defined in the definition of the function itself. The key to thinking clearly about this is the recursion relation that holds for the problem at hand.

For example:
The length of a list is one more than the length of its cdr
The number of atoms in a tree is the sum of the numbers in the left and right-hand sides
The number of digits in an integer is one more than the number of digits in the integer part of one-tenth of the integer.

The other thing to take care of is the cases where the recursion relation is false. For example, an empty list doesn't have a cdr; a tree that is just an atom doesn't have left and right hand sides.

(defun count-atoms (x)  ;; returns number of atoms in list structure x
  (if (atom x)
     1      ;; base case: an atom (this includes nil) counts as one
     (+ (count-atoms (car x)) (count-atoms (cdr x)))))
Although recursion is often elegant, deeply nested recursion takes a lot of space in some cases, so we also use mapping and iterative constructs.
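The other two recursion relations listed earlier can be sketched the same way. The names my-length and count-digits are made up for this example (length itself is already a standard function), and count-digits assumes a non-negative integer.

```lisp
;; Length of a list: one more than the length of its cdr.
;; Base case: the empty list has length zero.
(defun my-length (l)
  (if (null l)
      0
      (+ 1 (my-length (cdr l)))))

;; Digits in a non-negative integer: one more than the digits in the
;; integer part of n/10. Base case: a number below 10 has one digit.
(defun count-digits (n)
  (if (< n 10)
      1
      (+ 1 (count-digits (floor n 10)))))

(my-length '(a b (c d)))   ; => 3
(count-digits 2005)        ; => 4
(count-digits 7)           ; => 1
```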

Mapping and Iteration

In effect, mapping is a way of constructing a big operator out of a little one. The little one works on objects; the big one works on lists of those objects:
> (mapcar #'1/ '(1 2 3))
(1 1/2 1/3)
For now, you can think of the #' as a special kind of quote used on functions. This is different from Scheme, where you would simply use the function symbol. If you write functions that take functions as arguments, you need to be careful in writing the expressions that use the function parameter (unlike SCHEME, where the thing in functional position is evaluated like anything else). For example, if we wanted to write mapcar:
(defun mapcar (f l)
  (if (null l)
    nil
    (cons (funcall f (car l)) (mapcar f (cdr l)))))
funcall takes a function and some arguments to apply the function to. apply takes a function and a list of arguments. For example, if you want to find the sum of a list of numbers, use
> (apply #'+ '(1 2 3 4 5))
15
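A short sketch contrasting the two, plus a hypothetical higher-order function twice that uses funcall on its function parameter.

```lisp
(funcall #'+ 1 2 3)         ; arguments given individually => 6
(apply #'+ '(1 2 3))        ; arguments given as a list    => 6
(apply #'max 0 '(4 9 2))    ; leading arguments, then a list => 9

;; TWICE is a made-up example: apply f to x, then f to the result.
(defun twice (f x)
  (funcall f (funcall f x)))

(twice #'1+ 5)              ; => 7
(twice #'rest '(1 2 3 4))   ; => (3 4)
```

Writing (f (f x)) in the body of twice would not work, because Lisp would look for a function named f rather than use the value of the parameter f.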
Sometimes, one needs to map over a list using a function that doesn't have a name. For this, and other occasions demanding dynamically-created functions, we use the special lambda-expression:
> (mapcar  #'(lambda (x) (* x x)) '(1 2 3))
(1 4 9)
Once again, note the use of #', unlike Scheme.

mapc is like mapcar, but doesn't gather up the results of the operations into a list.
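Two details worth seeing in a sketch: mapcar accepts several lists at once, and mapc is used purely for side effects.

```lisp
;; mapcar over two lists in parallel; it stops at the end of the shorter one.
(mapcar #'+ '(1 2 3) '(10 20 30))     ; => (11 22 33)
(mapcar #'list '(1 2 3) '(4 5))       ; => ((1 4) (2 5))

;; mapc calls the function for effect only; here it accumulates a total.
(let ((total 0))
  (mapc #'(lambda (x) (setq total (+ total x))) '(1 2 3 4))
  total)                              ; => 10
```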

Then there are the repetitive tasks for which one just can't come up with a nice clean way to say it using recursion or mapping. For this, one uses the general loop construct.

loop is far too complicated to explain in full and you'll only forget it anyway, so look it up in the book (this is what I do). Generally, the AIMA code sticks to some very simple cases. The following expression executes "body" (which may be any number of expressions) once for each value of i from 2 to 10.

(loop for i from 2 to 10 do ...body...)
You can also iterate over the elements of a list:
(loop for x in '(1 2 4 8 16) do ...body...)
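A few simple cases, as a sketch. Besides do, the standard loop keywords collect, sum, when and maximize cover most everyday needs; they are not discussed in these notes, so check the book before relying on anything fancier.

```lisp
;; do runs the body for its side effects.
(let ((total 0))
  (loop for i from 2 to 10 do (incf total i))
  total)                                          ; => 54

;; collect gathers the values into a list, much like mapcar.
(loop for i from 1 to 5 collect (* i i))          ; => (1 4 9 16 25)

;; sum and maximize accumulate a single number.
(loop for x in '(1 2 4 8 16) sum x)               ; => 31
(loop for x in '(3 1 4 1 5) maximize x)           ; => 5

;; when filters which elements are processed.
(loop for x in '(1 2 4 8 16) when (> x 3) collect x)   ; => (4 8 16)
```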


Equality

Equality between things other than numbers (use = for that) is tricky. You are bound to run into bugs caused by using the wrong equality test at some point. There are at least five kinds of equality:

1) Identity: two things are eq if they are the same thing, i.e., if the two arguments to eq evaluate to the same address pointer in the computer. Symbols are always eq to themselves: (eq 'x 'x) is true. However, (eq x x), (eq 412 412), and (eq '(1 2) '(1 2)) may or may not be true because it's up to the compiler whether to re-use the same storage for numeric values, strings, list constants, etc.

2) Eqlity: eql is like eq except that character and number arguments are compared by value rather than address. Hence (eql x x) is always true. In most Lisp functions that use an equality test (such as member), the default test is eql.

3) List-structural equality: two things are equal if they have the same "list structure"; more or less, if they look the same when printed. (equal (list 1 2) (list 1 2)) is true; (eq (list 1 2) (list 1 2)) is false, since each call to list creates a new (and therefore different) piece of list structure. However, vectors (except strings), arrays, and structures are not equal unless they are eq.

4) Structural equality: equalp is like equal but also compares (recursively) the contents of vectors, arrays, and structures.

5) Numerical equality: two numbers (even of different types) can be tested for equality using =.

In testing membership of an object in a list, for example, it is important to distinguish the types of equality: (member x l) tries to find something in l that is eql to x; if you wanted to find something equal, as for example in deciding whether or not the list (1 (2 3) 4) contains (2 3), then use a keyword argument: (member x l :test #'equal). This asks if there is a member of l that is equal to x. If you want to find out if some member of l has a square root that is numerically equal to x, you can use a :key:

(member x l :test #'= :key #'sqrt)
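The distinctions above can be checked directly at the prompt. This sketch uses freshly built lists and vectors so that the results do not depend on how the compiler stores constants.

```lisp
(eq 'x 'x)                            ; => t
(eql 412 412)                         ; => t
(eql 2 2.0)                           ; => nil, different numeric types
(= 2 2.0)                             ; => t, numerical equality
(eq (list 1 2) (list 1 2))            ; => nil, two separate structures
(equal (list 1 2) (list 1 2))         ; => t
(equal "abc" "abc")                   ; => t, strings are compared by content
(equal (vector 1 2) (vector 1 2))     ; => nil
(equalp (vector 1 2) (vector 1 2))    ; => t

;; member returns the tail of the list starting at the match, or nil.
(member (list 2 3) '(1 (2 3) 4))                 ; => nil, default test is eql
(member (list 2 3) '(1 (2 3) 4) :test #'equal)   ; => ((2 3) 4)
(member 3 '(1 4 9 16) :test #'= :key #'sqrt)     ; => (9 16)
```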


Binding

So how do atoms get their values anyway? We've seen binding for atoms as parameters and in let and lambda expressions. Yes, there is such a thing as a global variable.

(setq x expr) sets the value of the atom x to the value of the expression. Thus (setq a b) causes a and b to have the same value; i.e., (eql a b) becomes true. Note that setq doesn't evaluate its first argument; set does.

More generally, you can use setf to cause a place to have a new value. The value cell of an atom is one kind of place; there are places in the cells of an array; in the fields of a defstruct (see below); in the cons cells of a list structure, and so on. More or less anything that you can access with a lisp expression can be changed using a call to setf. For example:

>  (setq x '(1 (2 3) 4))
(1 (2 3) 4)
>  (setf (caadr x) 'ding)
ding
>  x
(1 (ding 3) 4)
Warning: if you really want to use a global variable, then for efficiency you should declare it as such using defvar, instead of just doing a setq to initialize it. This will let the compiler know what kind of thing it is. Also, the standard is to use asterisks around the name like *this*.
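A small sketch putting defvar, setq and setf together. The variable *counter* is hypothetical. The list is built with list rather than written as a quoted constant, because the consequences of modifying literal data are undefined in Common Lisp.

```lisp
(defvar *counter* 0)          ; hypothetical global variable
(setq *counter* 10)
(incf *counter*)              ; => 11

(let ((x (list 1 (list 2 3) 4)))
  (setf (first x) 100)             ; the car of a cons cell is a place
  (setf (second (second x)) 300)   ; so is any element reachable by accessors
  x)                               ; => (100 (2 300) 4)
```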

Object-oriented programming

CLOS, the Common Lisp Object System, is considered the Rolls-Royce of object systems. Because some lisps have CLOS and some don't, the AIMA code uses the generally available defstruct function rather than the CLOS-specific defclass function. (The main difference in practice is that defstruct does not allow multiple inheritance.) We also use defmethod to define generic functions; this works with both defstruct and defclass.

defstruct defines an object class (or data type if you like) with the given field names, and automatically creates the associated access and constructor functions. Thus to define a data type for storing information about prisoners, we could use

(defstruct prisoner
  name
  number
  crime
  term)
which will automatically define the constructor function make-prisoner. We can use this to construct instances of the type:
(defvar manson (make-prisoner :name "charles manson" :crime 'mayhem :term 'life))
Note that in the call to make-prisoner, the field names have colons.

The access functions prisoner-name, prisoner-number etc. will be created when the defstruct is executed. The value of a data field is changed using setf, e.g.,

(setf (prisoner-number manson) 4424275684)

Methods -- that is, functions applicable to this particular class of objects -- can be defined for a class as follows:

(defmethod up-term ((p prisoner) extra-years)
  (unless (eq (prisoner-term p) 'life)
    (incf (prisoner-term p) extra-years))
  (prisoner-term p))
A method is invoked just like an ordinary function: (up-term manson 25). We say that up-term is a generic function. Inheritance relationships between types are asserted using the :include keyword:
(defstruct (lifer (:include prisoner (term 'life))))
When a generic function is called, the method used is the most specific that matches the arguments. For example, we can define an up-term method for lifers:
(defmethod up-term ((p lifer) extra-years)
  (format t "~%Prisoner ~A is already serving a life term!" p)
  (prisoner-term p))
With this additional definition, there is no need to test for lifers in the method for general prisoners, because the lifer-specific method will be used for all lifers.
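The same ideas in a separate, self-contained sketch. The types shape, circle and rectangle and the generic functions area and describe-shape are all hypothetical, chosen only to show :include inheritance and most-specific-method dispatch side by side.

```lisp
;; Hypothetical types, for illustration only.
(defstruct shape name)
(defstruct (circle (:include shape)) radius)
(defstruct (rectangle (:include shape)) width height)

;; Fallback method for any shape, then more specific ones.
(defmethod area ((s shape)) 0)
(defmethod area ((c circle))
  (* pi (circle-radius c) (circle-radius c)))
(defmethod area ((r rectangle))
  (* (rectangle-width r) (rectangle-height r)))

;; Defined once on shape; it picks up the right area method for subtypes.
(defmethod describe-shape ((s shape))
  (format nil "~A with area ~A" (shape-name s) (area s)))

(area (make-rectangle :name "door" :width 2 :height 3))   ; => 6
(area (make-shape :name "blob"))                          ; => 0
(shape-name (make-circle :name "coin" :radius 1))         ; => "coin"
(describe-shape (make-rectangle :name "door" :width 2 :height 3))
;; => "door with area 6"
```

Note that shape-name works on a circle: accessors of the included type apply to the including type as well.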

Vectors and Arrays

Vectors are like lists, but have constant-time access to elements and are usually of fixed size. For example, #(a b c) is a three-element vector. Elements are accessed using elt, which uses zero-indexing (i.e., the first element is element zero):
> (elt #(a b c) 0)
a
Many functions used for lists, such as length, also apply to vectors. Lists, vectors, and strings are all subtypes of the sequence data type, for which many standard functions are defined.
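As a sketch, here are a few standard sequence functions applied to lists, vectors and strings alike.

```lisp
(length '(1 2 3))                 ; => 3
(length #(1 2 3))                 ; => 3
(length "abc")                    ; => 3

(elt '(10 20 30) 2)               ; => 30
(elt "abc" 0)                     ; => #\a

(reverse #(1 2 3))                ; => #(3 2 1)
(position 20 #(10 20 30))         ; => 1

;; concatenate takes the result type as its first argument.
(concatenate 'list #(1 2) '(3))   ; => (1 2 3)
```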

Vectors are also one-dimensional arrays. Arrays can also be multidimensional. For example, you make a 3x4 array by calling (make-array '(3 4)). The access function is aref. Thus to set the first element of a two-dimensional array *table*:

(setf (aref *table* 0 0) 4)
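A runnable sketch of the same operations, using a local variable table in place of the global *table*. The :initial-element keyword is standard; without it the initial contents of the array are not specified.

```lisp
(let ((table (make-array '(3 4) :initial-element 0)))
  (setf (aref table 0 0) 4)      ; first row, first column
  (setf (aref table 2 3) 7)      ; last row, last column
  (list (array-dimensions table)
        (aref table 0 0)
        (aref table 2 3)
        (aref table 1 1)))       ; => ((3 4) 4 7 0)

;; A vector is a one-dimensional array, so aref works on it too.
(let ((v (vector 10 20 30)))
  (setf (aref v 1) 99)
  (list (length v) (elt v 0) (aref v 1)))   ; => (3 10 99)
```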

Using Common Lisp

You run ACL by executing the command
on the Solaris servers (but consult the official instructional LISP help page for updates, etc.) You may wish to use split-screen emacs with one half running ACL in a shell.

:help gives general help at the top level or in a break. :zoom shows the current stack state in a break.

You can type defuns directly to the lisp prompt (not recommended), or (load "filename.lisp") to run interpreted, or (compile-file "filename.lisp") to compile a file. Compilation produces a file called something like filename.fasl. You can load this immediately after compiling it by typing (load *). (The "*" refers to the result of the previous expression, which, in the case of a compile-file, is the name of the compiled fasl file.) The compiler can spot many errors in your code that are not detected when you load the lisp file. Also, compiled code runs much faster than interpreted code; if your program seems slow, you may have forgotten to compile it (or the supporting AIMA code).

:help ld and :help cf are also useful. For debugging, :help zo and :help loc.

:exit returns to the shell.