Getting started with Lisp

This is a very brief introduction, intended just to help you over the "acclimatization" period that new Lisp programmers usually go through...especially if they've been programming in languages that resemble C. Contents

Why Lisp?

If it's so great, why doesn't everyone use it?

Which Lisp?

Development in Lisp

Interaction with Lisp (starting and exiting from Lisp, executing statements, loading code from a file,...).

Lisp statements

Comments and blank space

Literal values (numbers, strings, boolean values) and names

Symbols (how Lisp stores information about "things")

Operators (functions, macros, special forms) and lists used as operator calls

Collection data types (sequences, lists, arrays and vectors, hashtables,...)

Shortcuts

Common code patterns

Selecting alternatives (conditionals)

Acting on all elements in a list

Combining list elements into one result

Recursion

Loops

Assigning values to variables

Procedural code

Define your own operators

Defining functions

Anonymous ("lambda") functions -- operations that return functions

Personalized functions -- "currying"

Restarting a function in the middle -- "continuations"

Debugging

Why Lisp?

Lisp has been around since 1958 but still has a fiercely loyal following -- why?

It's simple in form -- it doesn't have a lot of overbearing and tricky syntax. If you program in C or C++, you'll likely spend a fair amount of time just getting syntax errors cleaned up. A common frustration is that forms of statements that "make sense" are not allowed due to some obscure little limitation of C / C++. Once you've gotten past syntax errors, there'll be those statements that are legal but just don't do what they look like they ought to do. In Lisp, what you see is what you get.

It doesn't need its hand held -- there is almost no junk that needs to be included merely to help the compiler along. (Why do C, C++, Java programs need a "main"? It's because that tells the compiler where to start executing when you run the program. But it shouldn't need that -- why can't it just start at the first executable statement?) You don't need to tell Lisp what type a variable is -- it'll know by what you store in the variable. You can mix any types in a collection.

It's like a fancy calculator -- you can try out individual statements at the Lisp prompt, and gradually build up to your entire program. If you construct your programs by progressively testing pieces and combining them, you may never need to do any debugging.

But primarily, it has features other "high level" languages only dream of. Most of these have to do with creating functions on the fly.

If it's so great, why doesn't everyone use it?

Beats me. Ok, ok, here's what's wrong with it... Note that some of these "wrong" things are features that Lisp fans greatly approve of. In some cases, they really shouldn't...

It's a "functional" language, not a "procedural" language. In a functional language, your program will build functions to perform services and hand those out to other parts of the program that need those services.

Also, you'll often construct bigger statements by nesting them -- you put a statement that produces a value right in the spot where you need the value. (You can do this in many procedural languages as well -- it's just not the common style.) But people are in the habit of thinking of programs as a set of instructions -- "Now do this...now do that...then do this other thing..." That's how procedural languages work, so it's a path of less resistance to choose a procedural language.

Lisp favors "applicative" style -- you have a collection of things, and you "apply" an operation to the whole lot of them. This eliminates almost all use of loops. It has many built-in operators for doing this type of thing, which makes code very compact. This goes right along with functional programming, because it's very common to create little functions right inside a statement that applies the function to a collection.

But once you get used to the style, and begin doing things that aren't even possible in "high-level" procedural languages, these things won't seem like a problem any more.

(A note: If you write in assembly language, which can be regarded as procedural, you can do whatever you want, including construct machine instructions for new functions and write them into memory -- then you can use the address of the function in memory to call it.)

There are lots of different Lisps, and the differences aren't necessarily trivial. ANSI Common Lisp was formed to try to get everyone to agree to one standard. But the standards committee didn't want to offend anyone, so rather then picking a compact subset of the various Lisp dialects, with just one way to do each operation, they threw in all the variants. So Common Lisp is big, and has many built-in operations that are only slightly different. Everyone has their favorites, usually not the same, so reading other people's code sometimes requires a reference manual. Scheme is partly a rebellion against the size of Common Lisp -- it intends to be a minimal subset.

Sometimes this plethora of similar operations is considered a good thing: Before implementing an operation yourself, look for a built-in that is close enough to use directly or that you can adjust slightly by wrapping it in a simple function -- there was probably someone who wanted the same thing as you do, and threw it into the standard...

Besides all these near-duplicate operators, the folks contributing to Lisp also had their favorite little "shortcuts" -- expressions that don't fit the overall grammar or philosophy of the language, so the parser (and anyone coding in Lisp) has to treat them as special cases. (I'll point these out below.) Compiler purists hate these things -- they call them "syntactic sugar" because they contain nothing you need, and may make you sick. Because they violate the simple form of Lisp expressions, these shortcuts make the language more confusing. (In my experience, shortcuts -- not the functional philosophy -- are the primary source of confusion for new Lisp users.) Whenever you run into some expression, often with odd special characters stuck on the front of it, where you wonder, isn't that inconsistent with something else I was told about Lisp? that expression is probably a shortcut.

Parser technology wasn't as advanced when Lisp was first written, so there are a few operations that include hints to the parser, where the parser really should be able to figure out the statement without the hint. (The main one of these is the operator "funcall" that executes a function stored in a variable.) Scheme fixes these.

Common Lisp didn't include many operators for dealing with the outside world, so each implementation of Lisp invented its own. There were later attempts to standardize GUIs and other missing features, but these weren't universally adopted. So code you write using one Lisp implementation that uses these sorts of operations may need to be changed if you have to switch to a different implementation. Documentation for Common Lisp won't talk about these extensions. Interaction with Lisp at its command prompt may also include non-standard commands. Finally, not all Lisps implement all of Common Lisp. (All of these things happen with C / C++ as well -- much less so with Java.)

Some people just have a thing against parentheses. (You'll hear "Lisp" translated as "Lots of Irritating Silly Parentheses" instead of "List Processor".) They'd rather have curly brackets and semi-colons and backslashes and commas...as well as parentheses.

Lisp "insiders" use weird terminology like "car" and "cdr" and "cons". This is a valid gripe -- it's cliquish and off-putting, and there are often more meaningful substitutes available. In defense, though, can C or Unix (or even Windows) geeks really complain? Count up the number of jargon terms, especially terms that look like random strings, or words that have common meanings but been subverted, and you'll find way lots more in these other domains.

What really derailed common acceptance of Lisp, though, was history -- it was the use of C in Unix, then much open-source software, that caused C to be widely used and taught...and most schools only bother to teach one language, so that's what people get comfortable with, and when they're in a position to decide what language to use for a project, what do you think they'll pick?

Which Lisp?

For this class, you may use any implementation of Common Lisp. Two are recommended. One is Allegro Common Lisp for Windows, Web Edition. See Franz Inc.'s web site for more information. The other is a free, open-source version of Common Lisp called CLISP.

In previous quarters, we used Gnu Common LIsp (gcl). However, it is more difficult to install.

There are several commercial versions of Lisp that include development environments and other convenience features: Harlequin (Xanalsys), and Digitool.

Development in Lisp.

Here's a suggestion for how to work on a Lisp application:

Start a file for each major aspect of the application:

little utility functions that you might reuse
the "driver" for the overall type of problem
the code to use the driver to solve this particular problem

Work mostly from the bottom up -- write for yourself a collection of little tools that you use to build bigger tools. Put the parts that you've already finished and tested into these files.

For each of your files, make another file just for tests of the code in that file. Automate the tests by having Lisp check the returned values from your tests.

Try out new parts at the Lisp prompt (and maybe put them in a scratch file for safety). When you're sure a new piece is working ok, put it in its proper file. Then exit from Lisp, restart it, read in your files again, and re-run your tests. Starting up Lisp again gets rid of all the definitions and variable assignments you made in that session -- it makes sure you aren't accidentally relying on something that's not in your files.

Interaction with Lisp.

Here's where we begin using Lisp. (First, you may need to read about logging in to the machines we're using. Come back here after you're logged in...)

Start clisp by typing "clisp". You'll see the copyright information, then a prompt:

% clisp

  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo

  I I I I I I I      8     8   8           8     8     o  8    8

  I  \ `+' /  I      8         8           8     8        8    8

   \  `-+-'  /       8         8           8      ooooo   8oooo

    `-__|__-'        8         8           8           8  8

        |            8     o   8           8     o     8  8

  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

Copyright (c) Bruno Haible, Michael Stoll 1992, 1993

Copyright (c) Bruno Haible, Marcus Daniels 1994-1997

Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998

Copyright (c) Bruno Haible, Sam Steingold 1999-2002

[1]>

You're now talking to the Lisp interpreter. Each time you type something, followed by a return, it executes three Lisp functions -- "read", "eval", and "print" -- with your text as an argument. In Lisp syntax (as we'll see later), it's doing something like this: (print (eval (read xxx))) where the stuff you typed is xxx. (You could use those same functions, but probably won't have to.)

The main point is that it takes whatever you type, and immediately evaluates it and shows the result. For instance:

[1]> (+ 1 2 3 4 5)

If you make a mistake, clisp will dump you into its debugger. Most of the time, the mistake is obvious and you just want to get right back out -- type "abort" to get back out. (Note there are no ()s around this command -- it's a clisp command rather than a Lisp operation.) If you make a mistake while in the debugger, it'll start up a fresh debugger for you, so it doesn't disturb what you were doing in the other debugger. You'll need to "abort" out of as many levels of debugger as you got into. Here's an example:

[2]> xxx

*** - EVAL: variable XXX has no value

1. Break [3]> yyy

*** - EVAL: variable YYY has no value

2. Break [4]> abort

1. Break [3]> abort

[5]>

To exit from clisp, type "(quit)". (This one is a Lisp command... Don't ask me, I just work here.)

[5]> (quit)

Bye.

You'll notice that when you're typing in a statement that contains parentheses, and you get to a close parenthesis, clisp will momentarily "blink" the cursor over to the matching opening parenthesis -- watch for this to make sure your () are matched up the way you expect. (It may not try to blink the cursor if the opening parenthesis is on a different line.)

You can put code in a file, and ask Lisp to read and eval the whole thing (it assumes you don't want to "print" everything) using the "load" operator. If everything in the file evaluated successfully, the load statement itself evaluates to "true" -- if there was anything wrong, it starts the debugger. Say there's code in a file called xxx.lsp -- to load it, do:

[9]> (load "xxx.lsp")

;; Loading file xxx.lsp ...

;; Loading of file xxx.lsp is finished.

You can have one file load another (like C #include files) -- just put a load statement in the file.

Lisp statements

Lisp code is made up of:

Comments and blanks
Literal values (numbers, strings, boolean values) and names
Symbols (how Lisp refers to "things")
Operators (functions, macros, special forms) and lists used as operator calls
Collection data types (sequences, lists, arrays and vectors, hashtables,...)
Shortcuts

These are introduced briefly in the following sections.

Comments and blank space

Lisp uses spaces, tabs, and new lines (collectively referred to as whitespace or blanks) to separate values and names of things. You'll need to put blanks between items in a list (see below), but you don't need blanks between a parenthesis and anything else. Shortcuts (also see below) use some other special characters that are prefixed to names -- put these right next to the name and treat the whole unit like a name (i.e. put a space between it and other names or values). Here are some examples of spacing that Lisp won't fuss about:

(+ 1 2 3)

(quote (1 2 3))

(fred

sam

     joe)

The last few lines there are equivalent to:

(fred (sam joe))

You can include comments (please!). A comment starts with a semi-colon and runs to the end of the line (except if the semi-colon is inside a quoted string). Lisp treats this as though the line ended just before the semi-colon. For instance:

; Here's a comment with no code in front of it.

(list a b    ; Here's a comment after some code, in the middle of a list.

  c d)       ; More of the list.

Literal values (numbers, strings, boolean values) and names

Lisp has the usual sorts of literals -- numbers (integers and floating point) and strings -- and the usual sorts of names for variables and operators. But there are a few odd things...

Numbers:

For numbers, the odd things are:

It doesn't care how long you make integers -- if you want really big numbers, it will do multiple-precision arithmetic.
You can put underscores or carets inside numbers (in case you want to separate up really long numbers for ease of reading) and Lisp will ignore them.
To avoid rounding off, you can use ratios instead of decimal fractions -- ratios are written as two integers with / between them.

Floating point numbers are indicated either by a decimal point with something to the right of it, or by including an exponent on the end of a number. Exponents are set off from the rest of the number by a letter (s for little floats, f for the next size up, d for something bigger, l for the biggest, and e for whatever the "usual" size is -- case doesn't matter). Lisp specifies minimum sizes for each of these, but the actual form of the numbers depends on the machine. Unless you need to worry about memory use, you may as well use the biggest floats.

You can use a radix other than decimal by putting # and a radix specifier in front of the number. Radix specifiers are B for binary, 0 (the letter) for octal, X for hexadecimal, and nnR for an arbitrary radix nn. You might use binary if you want a compact representation of a bit vector, where each bit could represent a boolean value.

Here are some examples of numbers. Integers:

123_456^789

10.

-5

+5

#B1101_0100_1111

Floats:

10.0

1F5

1e-5

48.2L22

Ratios:

27/13

-2/3

Strings:

Text strings are enclosed in double quotes. You'll mostly use these for composing and printing out messages or comparing against user input. You can put line breaks inside strings. Examples:

"This is a string."

"This string

has a line break in it."

As in most languages, a string is also a data type, and is regarded as a collection of characters -- you can "apply" functions to the characters in a string in much the same way as for any other collection.

Names:

Here, I mean names for operators and variables. The general rule is, if it's not something else, it's ok to use as a name. There are some sequences of characters that might be ambiguous -- they almost look like numbers or strings or shortcuts (see below). Lisp won't let you use them as names -- if it complains about something you try to use as a name, get rid of any special characters that Lisp is using for other purposes (e.g. avoid trying to use any of ()#/\'",;:@~ in names, and be careful about using a .+- in places where the name might look like a number.) Also avoid using names of built-in operators for your own operators.

Lisp isn't case sensitive unless you beat it over the head. You should probably stick to the same case whenever you refer to one of your own names, to avoid confusing C programmers who might think you mean two different things...

Examples:

X1

ALongDescriptiveName

another_long_name

*global-variables-traditionally-have-asterisks-around-them*

Boolean values:

"False" is represented by the special value called NIL (case doesn't matter). There are a few other ways of saying NIL -- the most prominent is () because Lisp also uses NIL to mean an empty list. (Note that () is a "shortcut" -- an empty list should really be either (list) or (quote ()). Other forms of NIL are 'NIL and '(), both of which are also shortcuts.)

Anything that's not NIL is considered true. By tradition, the letter t is used as a literal to mean "true". Lisp lets you get away with not quoting the t if it's in some place where a boolean value is appropriate. This is another shortcut -- normally if you put a bare t in there (and hadn't assigned anything to it as a variable), Lisp would fuss. Note this implies you probably shouldn't use the letter t for a variable name...unless you're sure you'll never assign NIL to it!

Symbols

For each "thing" in your program (data, operator definitions), Lisp has a "symbol" the holds information about the thing. The information includes the thing's name (if it has one), a pointer to where Lisp has the thing itself (if there are any contents), the thing's type (e.g. what sort of data or operator it is). It also has a place for a "property list" where any other information can be stored as keywords and associated values. You can add your own information to the property list.

You can have a symbol that has only a name and no contents (which is what Lisp will make for you if you say (quote A) or 'A and haven't yet defined A). And you can have a symbol with no name (e.g. that refers to an anonymous function).

Operators (functions, macros, special forms) and lists used as operator calls

Operators (not lists!) are the heart and soul of Lisp. Everything you do in Lisp is a really an operator call (including some that don't look like they are, because of a shortcut). The reason that lists appear all over the place in Lisp code is that the language uses lists to represent operator calls.

A list is written as a sequence of expressions enclosed in parentheses and separated by blank space. An operator call is a list that has the operator name first followed by its arguments. If Lisp sees a ( and it's not quoted (or not an argument to a special form -- see below), it will try to interpret it as an operator call. So if you type in something like (1 2 3), Lisp will complain that 1 is not a function. What you probably want there is (list 1 2 3) -- an operator call that produces a list with 1, 2, and 3 in it.

There are three types of operators:

Functions are pieces of code that are handed some input values and (usually) evaluate to one value. When Lisp processes a function call, it first evaluates all the arguments, then passes the resulting values to the function. We'll mostly be defining functions.

A macro is a template for generating code, given values to insert. When you execute code containing a macro call, Lisp will fill in the template to get a piece of executable code, then will evaluate the code. Macros are commonly used to replace longer and more convoluted expressions with simpler and clearer ones. Although we may not need to define any of our own macros, lots of Lisp built-in operators that look like functions are actually macros. For instance, (incr X), which adds one to whatever's in X, is a macro that turns into (setq X (+ X 1)).

A "special form" or "special function" is a lot like a function except Lisp doesn't evaluate all the arguments first -- it only bothers with them if the special form actually needs to use them. For instance, "if" is a special form -- it only evaluates the argument that's selected by its test. And, even if they are used, arguments in special forms won't necessarily get evaluated the same way they would if you typed them at the Lisp prompt. For example, the "cond" special form takes as arguments a series of pairs (lists with two elements) where the first element is expected to provide a boolean value, and the second is an executable statement. If you typed the same pair at the command prompt, Lisp would complain that the first element wasn't a function. In general, the special form gets to say what Lisp should do with each of its arguments.

Collection data types (sequences, lists, arrays and vectors, hashtables,...)

Lisp has a number of collection types with different internal storage. Often you can use one in place of another, or use one type to make something that acts like another, but there are times when a particular type will be most efficient (for speed and memory size) and will make the code most understandable. Among the collection types are:

Sequences:

A sequence is a generic collection that has a linear order -- calling something a sequence doesn't say anything about how the items in the sequence are stored in memory. A string is a sequence of characters. A linked list (called just a list -- see below) is a sequence.

Lists:

These are linked lists. Each element in a list is either NIL or it's a thing that holds two pointers -- the first points to the contents of that element, and the second points to the next element in the list. The list ends where some element points to NIL as the next element. But a list can also point to some element inside itself -- this is a list with a cycle in it. Be Very Careful if you're deliberately using circular lists -- those functions that do something to every element in a list will very likely not stop if you use them on a circular list. A list can also have NIL as the contents of an element -- that doesn't terminate the list.

A list can contain other lists as the contents of elements. Here's an example of a list with some nested lists:

(list "xxx" (list NIL 'a 'b) (list 1 (list 2) 3)))

A list is a sequence, so all functions for sequences can be used on lists, but there are more functions that are just for lists.

(For historical reasons, having to do with register names on an early machine on which Lisp was implemented, the first pointer is called the "car" and the second the "cdr". One doesn't need to use the "car" and "cdr" functions to get the first element of a list and the rest of the list -- one can use the equivalent and much more meaningful "first" and "rest" functions. The car and cdr functions have buddies and cousins representing various nestings of car and cdr, used for selecting out elements deep in nested lists -- there are no "first" and "rest" equivalents. But one might argue that these buddies are, themselves, confusing, and should be avoided. You'll also hear lists referred to as "conses", after the "cons" function, used to put things into the two slots in a list element.)

Arrays and vectors:

Arrays (and vectors, which are one-dimensional arrays) permit indexing by number directly to a desired element (unlike a list, where you'd have to follow pointers to get to the element you want). They are (very likely) stored compactly in memory -- like a FORTRAN array, not a C "array".

If you want speed and compactness, don't use nested lists when you can use an array.

Hashtables:

A hashtable stores pairs of "keys" and values associated with them -- you store a key and value together, then later can use the key to look up the value that you associated with that key. Hashtables are very fast -- they compute an array index based on the key. If the "hash function" used to compute the index is well designed, and the hash table is big enough, two keys will rarely get the same index. Only if two keys hash to the same index will Lisp need to search for the key, and then it only needs to look through the few keys at that same index.

You can use anything as a key -- names, lists, functions, other hashtables...

If you need to store and retrieve information by name, you'll probably want a hashtable.

Shortcuts

You're going to see these all over the place, so you may as well get used to them. And they do make the code a tiny bit shorter. Just keep in mind that these mask the underlying everything-is-a-list structure of Lisp code. Every one of these can be replaced by a form that exposes the true structure. I'm only listing here the ones you're most likely to see. If you come across something that has some special characters stuck on the front (and it's not a number or a formal argument in a macro definition), grab your favorite exhaustive Lisp reference and see if it's a shortcut... Here are examples of the common shortcuts:

What is it?	The "real" expression.	A shortcut.
Quoted symbol name	(quote xxx)	'xxx
List of literals	(list 1 2 3)	'(1 2 3)
Both of the above	(list (quote xxx) (quote yyy))	'(xxx yyy) or (list 'xxx 'yyy)
Nested lists	(list 1 (list 2 3) 4)	'(1 (2 3) 4)
Quoted function name	(function +)	#'+

Originally posted by Pat Tressel. Edited by S. Tanimoto, January 5, 2004.