Lisp programming language |
Lisp is a .
The name Lisp derives from List Processing . Linked lists are one of Lisp languages major data structures, and identical basic list operations work in all Lisp programming language dialects. Other common features in Lisp dialects include dynamic typing, functional programming support, and the ability to manipulate source code as data.
Lisp languages also have an instantly-recognizable appearance. Program code is written using the same syntax as lists – the parenthesized S-expression syntax. Every sub-expression in a program (or data structure) is set off with parentheses. This makes Lisp languages easy to parser, and also makes it simple to do Metaprogramming (programming) – creating programs which write other programs. This is a major reason for its great popularity in the 70s and 80s, because artificial intelligence programmers believed that Lisp would lend itself naturally to self-propagating programs.
Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older. Like Fortran, Lisp has changed a great deal since its early days, and a number of dialects have existed over its history. Today, the most widely-known general-purpose Lisp dialects for programming are Common Lisp and Scheme programming language.
=History=
Information Processing Language was the first AI language, from 1955 or 1956, and already included many of the concepts, such as list-processing and recursion, which came to be used in Lisp.
Lisp was invented by John McCarthy (computer scientist) in 1958 while he was at Massachusetts Institute of Technology. McCarthy published its design in a paper in Communications of the ACM in 1960, entitled Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I . (Part II was never published.) He showed that with a few simple operators and a notation for functions, one can build a whole programming language.
McCarthy s original notation used bracketed M-expression externally. These were quickly abandoned in favor of the S-expressions which he originally proposed as an internal representation. As an example, the M-expression car[cons[A,B]] is equivalent to the S-expression (car (cons A B)).
Lisp was originally implemented by : and ) for the operations that return the first item in a list and the rest of the list respectively.
The first complete Lisp compiler, written in Lisp, was implemented in 1962 by Tim Hart and Mike Levin. ([ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-039.pdf AI Memo 39], 767 kB PDF.) This compiler introduced the Lisp model of incremental compilation, in which compiled and interpreted functions can intermix freely. The language used in Hart and Levin s memo is much closer to modern Lisp style than McCarthy s earlier code.
Since its inception, Lisp was closely connected with the artificial intelligence research community, especially on PDP-10 systems. Lisp was used as the implementation of the programming language Planner programming language that was the foundation for the famous AI system SHRDLU. In the 1970s, as AI research spawned commercial offshoots, the performance of existing Lisp systems became a growing issue. Partly because of garbage collection (computer science) and partly because of its representation of internal structures, Lisp became difficult to run on the memory-limited stock hardware of the day. This led to the creation of LISP machine: dedicated hardware for running Lisp environments and programs. Along with modern compiler construction techniques, today s gigantic computer capacities (by the standards of the 1970s) have made this specialization unnecessary and quite efficient Lisp environments now exist.
During the 1980s and 1990s, a great effort was made to unify the numerous Lisp dialects (most notably, Interlisp, MacLisp, ZetaLisp, and Franz Lisp) into a single language. The new language, Common Lisp, was essentially a superset of the dialects it replaced. In 1994, ANSI published the Common Lisp standard, ANSI X3.226-1994 Information Technology Programming Language Common Lisp. By this time the world market for Lisp was much smaller than in its heyday.
Having declined somewhat in the 1990s, Lisp has experienced a regrowth of interest since 2000, partly due to the writings of Paul Graham. Most new activity is focused around open source implementations of Common Lisp, and includes the development of new portable libraries and applications.
The language is amongst the oldest programming languages still in use as of the time of writing in 2005. Algol programming language, Fortran and COBOL are of a similar vintage, and Fortran and COBOL are also still being used.
The now-ubiquitous if-then-else structure, now taken for granted as an essential element of any programming language, was invented by McCarthy for use in Lisp, where it saw its first appearance in a more general form (the cond structure). It was inherited by Algol, which popularized it.
As with Smalltalk, a major benefit of Lisp is rapid prototyping of applications. For example, the first graphical IMAP client was written in Interlisp. A comparable program, written in Objective C, took much longer to write.
However, Lisp did not evolve as successfully as the programming language C programming language, which evolved into enormously popular languages including C plus plus, Java programming language, and C Sharp.
See also [http://citeseer.ist.psu.edu/steele93evolution.html The Evolution of Lisp], a paper written by Guy L. Steele, Jr. and Richard P. Gabriel.
=Syntax and Semantics=
: Note: This article s examples are written in Common Lisp (though most are also valid Scheme).
Lisp is an expression-oriented language. Unlike most other languages, no distinction is made between expressions and statements ; all code and data are written as expressions. When an expression is evaluated , it produces a value (or list of values), which then can be embedded into other expressions.
McCarthy s 1958 paper introduced two types of syntax: S-expressions (Symbolic Expressions), which are also called sexp s, and M-expressions (Meta Expressions), which express functions of S-expressions. M-expressions never found favour, and almost all Lisps today use S-expressions to manipulate both code and data.
The heavy use of parentheses in S-expressions has been criticized – one joke acronym for Lisp is Lots of Irritating Superfluous Parentheses [http://www.catb.org/~esr/jargon/html/L/LISP.html] – but the S-expression syntax is also responsible for much of Lisp s power: the syntax is extremely regular, which facilitates manipulation by computer.
The reliance on expressions gives the language great flexibility. Because Lisp systems, which enables extension of the language almost without limit.
A Lisp list is written with its elements separated by whitespace, and surrounded by parentheses. For example,
:(1 2 foo)
is a list whose elements are three Lisp atoms, the values 1, 2, and foo. These values are implicitly typed: They are respectively two integers and a string of characters, and do not have to be declared as such. Note that foo is quoted in this example; the quoting prevents the atom from being evaluated .
The empty list () is also represented as the special atom nil. This is the only entity in Lisp which is both an atom and a list.
Expressions are written as lists, using prefix notation. The first element in the list is the name of a form , i.e., a function, operator, macro, or special operator (see below.) The remainder of the list are the arguments. For example, the function list returns its arguments as a list, so the expression
:(list 1 2 foo)
evaluates to the list (1 2 foo). If any of the arguments are expressions, they are recursively evaluated before the enclosing expression is evaluated. For example,
:(list 1 2 (list 3 4))
evaluates to the list (1 2 (3 4)). Note that the third argument is a list; lists can be nested.
Arithmetic operators are treated similarly. The expression
:(+ 1 2 3 4)
evaluates to 10. The equivalent under infix notation would be 1 + 2 + 3 + 4 .
Special operators (sometimes called special forms by older users) provide Lisp s control structure. For example, the special operator if takes three arguments. If the first argument is non-nil, it evaluates to the second argument; otherwise, it evaluates to the third argument. Thus, the expression
:(if nil :::(list 1 2 foo ) :::(list 3 4 bar ))
evaluates to (3 4 bar ). (Of course, this would be more useful if a non-trivial expression had been substituted in place of nil!)
==Lambda expressions==
Another special operator, lambda, is used to bind variables to values which are evaluated within an expression. This form is also used to create functions. The arguments to lambda are a list of arguments, and the expression or expressions that the function evaluates to (the return value is the value of the last expression to be evaluated). The expression
:(lambda (arg) (+ arg 1))
is an expression which, when applied, takes one argument, bound to arg and returns the number one greater than that argument. Lambda expressions are treated no differently to named functions; they are invoked the same way. Therefore, the expression
: 5)
evaluates to 6.
==Conses and lists==
A Lisp list is a singly-linked list. Each cell of this list is called a cons (or sometimes a pair , particularly in Scheme, because it contains two pointers), and is composed of two Pointers, called the car and cdr respectively. These are equivalent to the data and next fields discussed in the article linked list .
Of the many data structures that can be built out of singly-linked lists, one of the most basic is called a proper list . A proper list is either the special nil (empty list) symbol, or a cons in which the car points to a datum (which may be another cons structure, such as a list), and the cdr points to another proper list.
If a given cons is taken to be the head of a linked list, then its car points to the first element of the list, and its cdr points to the rest of the list. For this reason, the car and cdr functions are also called first and rest when referring to conses which are part of a linked list (rather than, say, a tree).
Thus, a Lisp list is not an atomic object, as an instance of a container pattern in C++ or Java would be. A list is nothing more than an aggregate of linked conses. A variable which refers to a given list is simply a pointer to the first cons in the list. Traversal of a list can be done by cdring down the list; that is, taking successive cdrs to visit each cons of the list; or by using any of a number of higher-order functions to map a function over a list.
Parenthesized S-expressions represent linked list structure. There are several ways to represent the same list as an S-expression. A cons can be written in dotted-pair notation as (a . b), where a is the car and b the cdr. A longer proper list might be written (a . (b . (c . (d . nil)))) in dotted-pair notation. This is conventionally abbreviated as (a b c d) in list notation . An improper list may be written in a combination of the two – as (a b c . d) for the list of three conses whose last cdr is d (i.e., the list (a . (b . (c . d))) in fully-specified form).
Because conses and lists are so universal in Lisp systems, it is a common misconception that they are Lisp s only data structure. In fact, all but the most simplistic Lisps have other data structures – such as vectors (arrays), hash tables, structures, and so forth.
===Shared structure===
Lisp lists, being simple linked lists, can share structure with one another. That is to say, two lists can have the same tail , or final sequence of conses. For instance, after the execution of the following Common Lisp code:
:(setq foo (list a b c)) :(setq bar (cons x (cdr foo)))
the lists foo and bar are (a b c) and (x b c) respectively. However, the tail (b c) is the same structure in both lists.
In many languages, the usual way to place the same data in two different structures is to copy it. Sharing structure rather than copying can give a dramatic performance improvement. However, this technique can interact in undesired ways with functions that alter lists passed to them as arguments. Altering one list, such as by replacing the c with a goose, will affect the other:
:(setf (third foo) goose)
This changes foo to (a b goose), but also changes bar to (x b goose) – a possibly unexpected result. This can be a source of bugs, and functions which alter their arguments are documented as destructive for this very reason.
Aficionados of functional programming avoid destructive functions. In the Scheme dialect, which favors the functional style, the names of destructive functions are marked with a cautionary exclamation point, or bang — such as set-car! (read set car bang ), which replaces the car of a cons. In the Common Lisp dialect, destructive functions are commonplace; the equivalent of set-car! is named rplaca for replace car. This function is rarely seen however as Common Lisp includes a special facility, setf, to make it easier to define and use destructive functions. A frequent style in Common Lisp is to write code functionally (without destructive calls) when prototyping, then to add destructive calls as an optimization where it is safe to do so.
==Self-evaluating forms and quoting==
Lisp evaluates expressions which are entered by the user. Symbols and lists evaluate to some other (usually, simpler) expression – for instance, a variable evaluates to its value; (+ 2 3) evaluates to 5. However, most other forms evaluate to themselves . They are parsed by the read function, but are left alone by eval. Numbers and strings are this way: if you enter 5 into Lisp, you just get back the same 5.
Any expression can also be marked to prevent it from being evaluated (as is necessary for symbols and lists). This is the role of the quote special operator, or its abbreviation (a single quotation mark). For instance, usually if you enter the symbol foo you will get back the value of that variable (or an error, if there is no such variable). If you wish to refer to the symbol itself , you enter (quote foo) or, usually, foo.
More complex forms of quoting are used with macros. For instance, both Common Lisp and Scheme support the backquote or quasiquote operator, entered with the ` character. This is almost the same as the plain quote, except it allows variables to be interpolated into a quoted list with the comma and comma-at operators. If the variable snue has the value (bar baz) then `(foo ,snue) evaluates to (foo (bar baz)), while `(foo ,@snue) evaluates to (foo bar baz). The backquote or quasiquote is most frequently used in defining macro expansions.
Self-evaluating forms and quoted forms are Lisp s equivalent of literals. However, they are not necessarily constants. In some Lisp dialects it is possible to modify the values of literals in program code. For instance, if a quoted form is used in the body of a function, and is changed as a side-effect, that function s behavior may differ on subsequent iterations. This is usually a bug, and is undefined behavior in some dialects. When behavior like this is intentional, using a closure (computer science) is the explicit way to do it.
Lisp s formalization of quotation has been noted by Douglas Hofstadter and others as an example of the philosophy idea of self-reference.
==Scoping and closures==
A major split in the modern Lisp family is between dynamic scoping and lexical scoping. The latter makes use of closure (computer science) whilst the former is simpler and does not. Today, Scheme and Common Lisp both make use of lexical scoping by default; while the more primitive Lisp systems used as embedded languages in Emacs and AutoCAD use dynamic scoping.
==List structure of program code==
A fundamental distinction between Lisp and other languages is that in Lisp, program code is not simply text. Parenthesized S-expressions, as depicted above, are the printed representation of Lisp code, but as soon as these are entered into a Lisp system they are translated by the parser (called the read function) into linked list and tree structures in memory.
Lisp macros operate on these structures. Because Lisp code has the same structure as lists, macros can be built with any of the list-processing functions in the language. In short, anything that Lisp can do to a data structure, Lisp macros can do to code. In contrast, in most other languages the parser s output is purely internal to the language implementation and cannot be manipulated by the programmer. Macros in C programming language, for instance, operate on the level of the preprocessor , before the parser is invoked, and cannot re-structure the program code in the way Lisp macros can.
In simplistic Lisp implementations, this list structure is directly interpreter (computing) to run the program; a function is literally a piece of list structure which is traversed by the interpreter in executing it. However, most actual Lisp systems (including all conforming Common Lisp systems) also include a compiler. The compiler translates list structure into machine code or (rarely) bytecode for execution.
==Evaluation and the Read-Eval-Print Loop==
Lisp languages are frequently used with an interactive command line, which may be combined with an integrated development environment. The user types in expressions at the command line, or directs the IDE to transmit them to the Lisp system. Lisp reads the entered expressions, evaluates them, and prints the result. For this reason, the Lisp command line is called a read-eval-print loop , or REPL .
The basic operation of the REPL is as follows. This is a simplistic description which omits many elements of a real Lisp, such as quoting and macros.
The read function accepts textual S-expressions as input, and parses them into list structure. For instance, if you type the string (+ 1 2) at the prompt, read translates this into a linked list with three elements – the symbol +, the number 1, and the number 2. It so happens that this list is also a valid piece of Lisp code; that is, it can be evaluated. This is because the car of the list names a function – the addition operation.
The eval function evaluates list structure, returning some other piece of structure as a result. Evaluation does not have to mean interpretation; some Lisp systems compile every expression to native machine code. It is simple, however, to describe evaluation as interpretation: To evaluate a list whose car names a function, eval first evaluates each of the arguments given in its cdr, then applies the function to the arguments. In this case, the function is addition, and applying it to the argument list (1 2) yields the answer 3. This is the result of the evaluation. Evaluation is performed in applicative-order evaluation.
It is the job of the print function to represent output to the user. For a simple result such as 3 this is trivial. An expression which evaluated to a piece of list structure would require that print traverse the list and print it out as an S-expression.
To implement a Lisp REPL, it is necessary only to implement these three functions and an infinite-loop function. (Naturally, the implementation of eval will be complicated, since it must also implement all the primitive functions like car and + and special operators like if.) This done, a basic REPL itself is but a single line of code: (loop (print (eval (read)))).
== Control structures ==
Lisp originally had very few control structures, but many more were added during the language s evolution. (Lisp s original conditional operator, cond, is the precursor to later if-then-else structures.)
Programmers in the Scheme dialect often express loops using tail recursion. Scheme s commonality in academic computer science has led some students to believe that tail recursion is the only, or the most common, way to write iterations in Lisp. Nothing could be further from the case. All frequently-seen Lisp dialects have imperative-style iteration constructs, from Scheme s straightforward do loop to Common Lisp s complex loop expressions.
Most Lisp control structures are special operators , equivalent to other languages syntactic keywords. Expressions using these operators have the same surface appearance as function calls, but differ in that the arguments are not necessarily evaluated -- or, in the case of an iteration expression, may be evaluated more than once.
Both Common Lisp and Scheme have operators for non-local control flow. The differences in these operators are some of the deepest differences between the two dialects. Scheme supports re-entrant Continuations using the call/cc procedure, which allows a program to save (and later restore) a particular place in execution. Common Lisp does not support re-entrant continuations, but does support several ways of handling escape continuations.
Frequently, the same algorithm can be expressed in Lisp in either an imperative or a functional style. As noted above, Scheme tends to favor the functional style, using tail recursion and continuations to express control flow. However, imperative style is still quite possible. The style preferred by many Common Lisp programmers may seem more familiar to programmers used to structured languages such as C, while that preferred by Schemers more closely resembles pure-functional languages such as ML and Haskell.
Because of Lisp s early heritage in list processing, it has a wide array of higher-order functions relating to iteration over sequences. In many cases where an explicit loop would be needed in other languages (like a for loop in C) in Lisp the same task can be accomplished with a higher-order function. (The same is true of many functional programming languages.)
A good example is a function which in Scheme is called map and in Common Lisp is called mapcar. Given a function and one or more lists, mapcar applies the function successively to the lists elements in order, collecting the results in a new list:
:(mapcar # + (1 2 3 4 5) (10 20 30 40 50))
This applies the + function to each corresponding pair of list elements, yielding the result (11 22 33 44 55).
=Examples =
Here are examples of Common Lisp code. While unlike Lisp programs used in industry, they are similar to Lisp as taught in computer science courses.
As the reader may have noticed from the above discussion, Lisp syntax lends itself naturally to recursion. Mathematical problems such as the enumeration of recursively-defined sets are simple to express in this notation.
Evaluate a number s factorial:
:(defun factorial (n) :::(if (|
|