Emacs Mini Manual (PART 2) - LISP PRIMER: WHY PARENTHESES MATTER

Introduction - and why you should learn Lisp
Syntax error
- Unbalanced parentheses:
- Mini-language syntax error:
Semantic error
Lisp Machine
Conclusion

Introduction - and why you should learn Lisp

In this section, you can try out code by M-x ielm, then paste code into the Emacs Lisp interpreter. It's important that you understand the core ideas. After you finish the sub-sections of this section, we will play with Emacs Lisp code for customizing Emacs, and it's really fun to see your Emacs "evolves" gradually.

Before we start

Please be aware that although I intentionally emphasize the importance of parentheses, but I don't mean Lisp is all about parentheses. I just want to clear beginner's confusion with parentheses.

In this chapter, I will introduce to you why language like Lisp exists and what sets it from other languages, along with basic syntax and semantic. For deeper learning into Emacs Lisp, many tutorials exist and you can search easily on the internet. But before all of that, I want you to understand what and why Lisp differs from others, and why should you use Lisp; otherwise, it's just learning a different ways of doing things in a different language and won't motivate you to take Lisp serious because the mentality of "just another programming language".

A bit of history

Lisp has a long history. Lisp was designed by computer scientist John McCarthy. Lisp first appeared in 1958 and Lisp is still with us up to this day with various dialects: Common Lisp, newLisp, Emacs Lisp, Racket, Clojure… Lisp is short for (LIS)t (P)rocessing.

You can read the history of Lisp until 1979 of Lisp, written by John McCarthy, the creator of Lisp: History of Lisp.

Another good read about Lisp history, written by Paul Graham: Lisp History.

Success Stories.

Basic syntax and semantic

Lisp syntax is inherently simple. At its core, this is all that required to understand any Lisp:

Anything is a list, it's an atom. Atoms are the same as primitives in other languages: components that cannot be broken down further into smaller pieces. This is data.
Anything is not an atom, it's a list that contains atoms. List is just anything contained in a pair of parentheses. List can be used either as code for processing something or as data to be processed.
Data:
- Atomic-data, or Atom: It is the same as primitives in other languages, meaning the most basic construct and indivisible. Atoms include following basic types, you should try in IELM:
  - Number: integer like 1, 2, 3… or floating point numbers like 1.5, 3.14…
  - String: strings like "a", "string", "this is a string"… Is also an atom. String is immutable.
  - NIL or (): NIL means null in other languages. () means empty list, which is equivalent to NIL. When using with a true or false test, usch as an if statement, NIL or the empty list () means the value false; all other non-empty values in Lisp are true.
  - Symbol: A symbol is an object with a name. You can think of symbol as pointer or reference in other language. Each symbol has 4 components:
    - Print name: symbol's name as string
    - Value: value of symbol as variable
    - Function: symbol's function definition. A symbol can be both a variable and a function
    - Property-list: list of key-value pairs. You don't need to care much at this stage.
    Example: the variable buffer-file-name is both a variable and a function. You can check by using symbol-value and symbol-function functions:
```
(symbol-value 'buffer-file-name) ;; check the value of buffer-file-name; if you try it in IELM,
;; the value is NIL because the buffer is not associated with any file

(symbol-function 'buffer-file-name) ;; return the function buffer-file-name; 
;; IELM should display something like this: #<subr buffer-file-name>
```
- Non-atomic: There's a reason Lisp is called (LIS)t (P)rocessing. In other languages, list is just another data type; but in Lisp, because of Lisp syntax, list is really special. Basically, this is a list:
```
’(...)
```
  A list can hold data, both atomic and non-atomic. That means, a list can hold a smaller list inside it along with atomic data. List elements are separated by whitespace.
Code: If you want to run code to be executed, here is the syntax:
```
(...)
```
What is the difference between code and data? That's right, a single quote ’. Without the quote ’, we are creating a list that holds code for computer to execute. With the quote ’, we are creating a list that holds data, similar to List data structure in Java or C++ to hold data.

Examples:

’(1 2 3 4) ;;is a list that holds 4 numbers from 1 to 4

’("aa" "bb" "cc" "dd") ;;is a list that holds 4 strings: "aa", "bb", "cc", "dd"

 '() ;; an empty list, which is also an atom

’(if a b c) ;;  a list that hold 4 symbol: if, a, b and c

 (if a b c) ;; is an if expression that executes if..then..else
           ;; logic in common programming languages. Expressions like if are
           ;; called *special form*

’(+ 1 2 3) ;; is a list consists of 4 elements: +, 1, 2 and 3

 (+ 1 2 3)  ;; is a function call to function "+" (yes, "+" is a function)

Both code and data in Lisp can be represented using the same format: a pair of parentheses with items in it: (...); and it is called a list. This peculiar property is called homoiconicity, and languages that have this property is called homoiconic languages. It makes Lisp so powerful: code can be data, data can be code. It is a reason why Lisp contains a lot of parentheses.

Both code and data are represented as a list underlying, but to distinguish between list that holds data and list that holds code, list that holds data is referred simply as list; while list that holds code is Lisp form. But remember, code and data are lists, and because of the single representation for both code and data, list is more special in Lisp than in other languages.

It's worth to repeat again: ’(...) for creating data and (...) for creating code; you hold things in ’(...) and you process things in (...). Easy to remember, right?

You may think: "Cool, so what difference can homoiconity make?" Let's look at an example; this is typical if..then..else:

if (condition) {
    ...statements...
} else {
    ...statements...
}

How do you change its syntax? For example, you prefer Python if..then..else syntax, how can we change C if..then..else to Python if...then...else and write our customized version in C:

if condition:
    ...statements...
else:
    ...statements...

The answer is, it's impossible, even with C macro. With Lisp, this is entirely possible, except one minor thing: the code must be treated as data, meaning the entire Python if construct above must be enclosed within a Lisp form like this:

'(if condition:
    ...statements
  else:
    ...statements...)

Lisp still has syntax, but minimal: a pair of parentheses, with things in in it: (...), along with the syntax for primitives. For that reason, it can adapt to any type of syntax programmers can imagine. Notice the single quote ’, signalling that the entire form is data, and need to be processed to create appropriate code when feed into some processing function.

Now you see why Lisp code has a lot of parentheses. This is how homoiconicity differs. Without being able to treat code as data, you cannot bend the language to your own will (well, unless you implement your own language from scratch). Because Lisp's minimal syntax, you can create your own language for expressing your own ideas. Using your own language means you can use your own terms, your own rules, to write your solutions instead of someone imposes a particular style of language on you, tell you how to do it even if you prefer another style. This is why Lisp is so expressive: minimal syntax and follow the will of programmer.

Lisp forms are classified into 3 types:

Function form: Function form is the most common form. Function form is equivalent to a function call in other languages. If the first element in the list is a function that exists, that function will be called along with its arguments. The remaining elements in the list are function arguments. All arguments are evaluated before the function is called.
Example:

The list (+ 1 (+ 2 3) (* 3 4) (/ 4 2)) is a function call to function +. Nearly everything in Lisp is a function, even arithmetic operators like +, -, *, /. Before the outer most list is processed, the inner ones will be processed first. (+ 2 3) becomes 5, (* 3 4) becomes 12, (/ 4 2) becomes 2; all these three values will then replace its list in the original function call to make it become: (+ 1 5 12 2), and finally function + is called to produce the final result 20.
Special form: Special form has special evaluation rules or special syntax or both. For example, this is if..then..else in Lisp:
```
(if condition  ;; condition is a valid Lisp form
    ...do something if true...
  ...do something if false...)
```
Let's consider the behaviour of if, not just in Lisp but in any language: if condition (a valid Lisp form) is true, then do something, else do something if false. For this reason, if cannot be a function call because condition, true and false are all evaluated and passed into if, while we want first check condition, then depend on the outcome of condition, we select a true or false branch.

Most forms in Lisp are functions, except special cases such as if, and, or… that cannot follow the evaluation rule of a function. They need their own rules that do not exist in other forms. That's why they are special.
Macro form: Macro form is a function, but different: When you call a macro, the macro function generated regular Lisp code; the generated code then is executed. Macro is what makes Lisp so special: it allows Lisp to have any syntax anyone wishes for. The Python syntax enclosed in a Lisp form you saw earlier is an example. But now, instead of having to quote, you won't have to with a macro form. Instead of writing like this:
```
'(if condition:
     ...statements...
  else:
     ...statements...)
```

You can remove the quote ’ and treat your Python syntax as part of Lisp:

(if condition:
     ...statements...
 else:
     ...statements...)

The Python code above is a macro form. Upon calling, the macro will first transform to a valid Lisp form:

(if condition
     ...statements...
     ...statements...)

Then the transformed code is executed. You can have C for loop, Python if, Java class…mix up in Lisp if you want. Thanks to the minimal Lisp syntax, Lisp macro is able to do all of this. Without it, you cannot bend Lisp to your needs.

In reality, ’(...) is just a syntactic sugar for special form (quote ...). In the end, aside from the atoms, Lisp only has one syntax: a pair of parentheses and items in it. With Lisp syntax, many things are easy to do in Lisp, such as generating code as data and execute it later, both in compile time and runtime. In the end, aside from the primitives, the only thing that exists in Lisp is a pair of parentheses, with things in in it: (...). This is the only syntax, along with the semantics that depends on context: a function form, a special form or a macro form; the first item in a form is going to get executed. That's all you need to remember for using any Lisp.

If you are still not convinced with the parentheses, perhaps seasoned Lispers can:

Lisp has too many parentheses…
The above article is inspired by this Usenet post, which is worth reading.

Beyond parentheses

You may ask, can you to create syntax without parentheses in Lisp? Of course, you can create entirely new syntax to extend Lisp without being enclosed inside the parentheses of Lisp, using reader macro. The difference between reader macro and macro:

A reader macro transforms raw text into valid Lisp objects. Reader macro is a special type of macro that allows you to transform non-Lisp code into Lisp code.
A regular macro transforms Lisp's list into valid Lisp code.

For example, you can remove the parentheses with the Python if..then..else using a reader macro and use a non-parentheses Python if..then..else validly in your program. Using a regular macro, you have to keep the parentheses to make it a valid Lisp object: a list of symbols, then that list will be transformed at compile time. Lisp macro is advanced topic, and should really master the basics before getting to it.

Syntax error

Lisp syntax is simple: it's just a pair of parentheses, with things in in it: (...). If you encounter syntax errors, it belongs to these two cases:

Unbalanced parentheses:

Do you miss an opening or closing parentheses, or do you insert unnecessary parentheses? Incorrect usage of parentheses is the only syntax error you get when writing Lisp program. In other languages, you have to remember many syntax rules. For example, to write a for in Tcl, you have to write like this to make it valid

for {set i 0} {$i < $n} {incr i} {
    ...do something...
}

I kept forgetting all the times when I first used it because I get used to C style for loop. In Tcl, to use some variables, you have to put a dollar sign $ before the variable names. Howver, in some context, you must not insert dollar before:

array set balloon {color red}
array get balloon

balloon is an array variable, but to use it you must not insert dollar sign before. It's annoying to remember trivial details like this.

Mini-language syntax error:

If you create a mini language, then you must follow its syntax rules. In this case, you get syntax errors like regular languages if you code is not correct according to syntax rules. However, if you are a beginner, you won't have to worry about macro and mini-languages at this stage.

Semantic error

You might wonder, parentheses cannot be the only source of errors. What would happen when incorrect number of arguments passed into a function? Or non-existent variables, incorrect variable types, array index out of range…? These errors are called semantic errors. It has nothing to do with how statements are constructed.

For example, this is syntax error:

#include <stdio.h>

int main (int argc, const char* argv[]) {
    if argc == 1 { exit(1) }
    printf("Hello world")
}

In the above example, I made two syntax errors:

the condition in if statement is not surrounded by a pair of parentheses. if statement in C requires this generic form:

if (expression) {
    ...statements separated by semicolon...
}

missed a semicolon ; at the end of printf statement.

In contrast, this is semantic error:

void add(int a, int b) {
    return a + b;
}

void main(int argc, const char* argv[]) {
    int a = 1;

    add(a);
    add(a,b);
}

The calls to add are syntactically correct, but used incorrectly: the first call to add requires one more argument; the second call to add contains non-existent variable.

As in other languages, Lisp treats these errors as semantic errors, since syntax errors in Lisp have only to do with parentheses.

Lisp Machine

It would be a mistake when mention about history of Lisp without mention about the Lisp Machine, a computing system that is built to run Lisp natively. In a Lisp Machine, the Operating System, device drivers and applications are written using a single language: Lisp. Such a thing is possible because the computer has a built-in hardware garbage collector, as opposed to the software implementations in garbage collected languages today.

A Brief History of Lisp Machines

Why Lisp? Everyone "knows" that lisp was the language of choice for Artificial Intelligence research, but a big part of AI research is about paradigms for representing knowledge, expressing algorithms, man-machine communication, and machine-to-machine communication: In short, how to use computers in general. Lisp, as the default AI language, was also an important research vehicle for new computer languages, networking, display technology and so on.

Why Lisp Machines? The standard platform for Lisp before Lisp machines was a timeshared PDP-10, but it was well known that one Lisp program could turn a timeshared KL-10 into unusable sludge for everyone else. It became technically feasible to build cheaper hardware that would run lisp better than on timeshared computers. The technological push was definitely from the top down; to run big, resource hungry lisp programs more cheaply. Lisp machines were not "personal" out of some desire make life pleasant for programmers, but simply because lisp would use 100% of whatever resources it had available. All code on these systems was written in Lisp simply because that was the easiest and most cost effective way to provide an operating system on this new hardware.

Why two different kinds? Quite a few groups with different goals were building high priced, high powered workstations at about the same time. All were capitalizing on Moore's law and the emerging consensus that bitmapped displays, windows, mice, and networks were effective paradigms. The C/Unix community spawned Sun, Apollo, and Silicon Graphics. The Pascal Community spawned the PERQ. There were two major branches in the Lisp family tree, Interlisp and Maclisp, so it should be no surprise that there were two main family branches in Lisp machines.

Today, all this hardware and software are commercially extinct, but many features that were commercialized by lispms are present in every PC.

Futher resources:

Lisp OS: Genera: The OS is written entirely in Lisp, both the Operating System and the high-level applications.

The Lisp Machine Software Development Environment

Symbolic Lisp Machine Museum

Symbolics Lisp Machine Museum provided by Ralf Möller

Kalman Reti, the Last Symbolics Developer, Speaks of Lisp Machines

Secrets of the Symbolics Console: Part 1

Secrets of the Symbolics Console: Part 2

A few things I know about LISP Machines

MIT's CADR machine

Conclusion

You won't find any language with such a minimal syntax and uniformity, yet so expressive, since you can choose any language syntax that you want to solve your problems in. Some languages also have homoiconic property, but instead of using just a pair of parentheses, they use more complex syntax constructs. Some languages are simple (still not as much as Lisp), but are not homoiconic. The only syntax you write in Lisp, again, just a pair of parentheses, with things in in it: (...). Because of syntax like this, Lisp requires you to careful match the parentheses. Or you can let Emacs does it for you.

Learning any language has something in common:

Learn syntax and semantic.
Learn idiomatic ways of using the language.
Learn commonly used libraries.
Learn common development tools used with the language.

We already covered the first. I will show you how to use common functions for configuration, and setup a programming environment for any Lisp in the next chapter.

Back to Table of Contents