Some sugared Lisps
Introduction
Alternatives to the classic S-expression syntax are as old as Lisp itself. John McCarthy's original papers distinguish between M-expressions, which were intended as the form in which Lisp would be written and communicated, and S-expressions, which were just for input to the original interpreter. The S-expression format quickly became dominant [1] and is now, to all intents and purposes, Lisp syntax.
There are three main areas where this Lisp syntax differs from that of conventional algebraic languages: function calls, operator expressions and control structures.
A function call in Lisp places the function name inside the parentheses, as in (F X), rather than before them, as in F(X). Operator expressions are written as function calls, (+ X Y), instead of the infix X+Y. Finally, control structures tend to minimize keywords and rely on S-expression structure. Contrast the Lisp:
(IF (= X 1) (* X 2) (+ X 2))
with:
IF X=1 THEN X*2 ELSE X+2 ENDIF
In Lisp terminology, IF is called a 'special-form', since it looks syntactically like a function call but is not actually evaluated as a function would be [2].
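The difference can be sketched with a toy evaluator. The following is a minimal illustration in Python, not any real Lisp implementation: S-expressions are modeled as nested lists, and the names evaluate and FUNCTIONS are hypothetical, chosen only for this example.

```python
import operator

# Assumed, illustrative table of ordinary functions (not from any real Lisp).
FUNCTIONS = {
    "+": operator.add,
    "*": operator.mul,
    "/": operator.truediv,
    "=": operator.eq,
}


def evaluate(expr, env):
    """Evaluate a nested-list S-expression using the variable bindings in env."""
    if isinstance(expr, str):       # a symbol: look up its value
        return env[expr]
    if not isinstance(expr, list):  # a literal such as a number
        return expr
    head, *args = expr
    if head == "IF":                # special form: only one leg is evaluated
        test, then_leg, else_leg = args
        chosen = then_leg if evaluate(test, env) else else_leg
        return evaluate(chosen, env)
    # Ordinary function call: every argument is evaluated before the call.
    values = [evaluate(arg, env) for arg in args]
    return FUNCTIONS[head](*values)


# (IF (= X 1) (* X 2) (+ X 2))
program = ["IF", ["=", "X", 1], ["*", "X", 2], ["+", "X", 2]]
print(evaluate(program, {"X": 1}))  # 2
print(evaluate(program, {"X": 5}))  # 7

# (IF (= X 0) 0 (/ 1 X)): the division is never evaluated when X is 0.
safe = ["IF", ["=", "X", 0], 0, ["/", 1, "X"]]
print(evaluate(safe, {"X": 0}))     # 0
```

If IF were an ordinary function in this sketch, the last example would evaluate (/ 1 X) before the test was consulted and fail with a division by zero.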
For a recent project, we needed to add a more friendly syntax to a program written in Lisp. A review of the prior art in this area seemed a good start. The Gabriel and Steele paper on Lisp history mentions a number of approaches. What is clear is that few of them ever really got used, whereas we wanted languages that had been validated by some serious use. We chose three: MuSimp, used to write a computer algebra system [3], and Genie [4] and Skill [5], both used as the scripting languages for electronic design packages.
MuSimp
Here is the factorial function in MuSimp:
FUNCTION FAC(N), WHEN N = 0, 1 EXIT, N * FAC(N-1) ENDFUN
From this example we can see many of the language features. Function calls are written conventionally, as are operators. Syntactic structures are marked by keywords, normally in opening and closing pairs. With the exception of the commas used as statement separators, this code looks very conventional.
Conditions and loops are slightly unusual, probably because of the desire to minimize the number of keywords and the code size. WHEN ... EXIT combines a test with an exit from the surrounding lexical block, which in the above example is the function itself. An infinite loop is provided by LOOP ... ENDLOOP, which can be exited with WHEN ... EXIT, and a regular block by BLOCK ... ENDBLOCK.
S-expression syntax can be used for data following the quote (') operator as in Lisp.
Output is normally pretty-printed in the input syntax. For example, see the following dialog:
? LIST('1, '2, '3);
@: 1 (2, 3)
? LIST('+, '1, '2);
@: 1 + 2
Using & instead of ; at the end of the expression will print the result in S-expression form:
? LIST('1, '2, '3)&
@: (1 2 3)
? LIST('+, '1, '2)&
@: (+ 1 2)
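The two output forms can be mimicked with a pair of small printers. This is only a sketch in Python under stated assumptions, not MuSimp's actual printer: lists are nested Python lists, the operator set INFIX is assumed for illustration, and operator precedence is ignored. The names to_sexpr and to_sugared are hypothetical.

```python
# Assumed set of binary operators printed in infix form (illustrative only).
INFIX = {"+", "-", "*", "/", "="}


def to_sexpr(expr):
    """Print a nested list in S-expression form, as after the & terminator."""
    if isinstance(expr, list):
        return "(" + " ".join(to_sexpr(item) for item in expr) + ")"
    return str(expr)


def to_sugared(expr):
    """Print a nested list in call/infix form, as after the ; terminator."""
    if not isinstance(expr, list):
        return str(expr)
    if not expr:
        return "()"
    head, *args = expr
    if isinstance(head, str) and head in INFIX and len(args) == 2:
        # A binary operator at the head is printed between its operands.
        return f"{to_sugared(args[0])} {head} {to_sugared(args[1])}"
    # Anything else is printed as the head applied to the remaining items.
    return f"{to_sugared(head)} ({', '.join(to_sugared(a) for a in args)})"


for data in ([1, 2, 3], ["+", 1, 2]):
    print(to_sugared(data), "|", to_sexpr(data))
# 1 (2, 3) | (1 2 3)
# 1 + 2 | (+ 1 2)
```

The sketch reproduces the dialog above: a list is always printed as if it were code, so (1 2 3) appears as 1 applied to the arguments 2 and 3.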
Genie
Here is the factorial function in Genie:
func fac(int n) {
    if (@n == 0) {
        1
    } else {
        @n * (fac @n-1)
    }
}
This looks a bit like a cross between C and a Unix shell script. To get the value of a variable, its name is preceded by @. Otherwise the name is taken to be a Lisp symbol or atom. Variables are typed, which is unusual in Lisp. Algebraic operator syntax is provided, but function calls are written as in Lisp, with the function name inside the parentheses. However, the parentheses are often optional, as we shall see next.
A top level function call may be written without the parentheses thus:
fac 10
The parentheses are supplied by the parser. A function of no arguments may be called with just its name; this is one reason why the value of a variable requires a prefix (@). A semi-colon may be used to separate multiple function calls on the same line. So:
fac 5; fac 6; fac 7
would be translated into the internal Lisp form:
(fac 5) (fac 6) (fac 7)
Braces may be used as an alternative to parentheses (in the example above they mark out the control structures), and ' may be used, as in Lisp, to quote data. So the following:
'{
name john; age 27
city boston
}
would return:
((name john) (age 27) (city boston))
We see that a new-line is equivalent to the semi-colon in closing the implicit parentheses.
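The implicit-parentheses rule can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Genie's actual parser: it handles only flat sequences of symbols, ignores nested braces, operators and the @ prefix, and keeps every token as a string. The function name implicit_parens is hypothetical.

```python
import re


def implicit_parens(text):
    """Split text at semi-colons and new-lines; wrap each non-empty piece in a list."""
    forms = []
    for piece in re.split(r"[;\n]", text):
        tokens = piece.split()
        if tokens:  # blank lines and empty pieces produce no form
            forms.append(tokens)
    return forms


# The body of the quoted brace form shown above.
body = """
name john; age 27
city boston
"""
print(implicit_parens(body))
# [['name', 'john'], ['age', '27'], ['city', 'boston']]

print(implicit_parens("fac 5; fac 6; fac 7"))
# [['fac', '5'], ['fac', '6'], ['fac', '7']]
```

Each inner list corresponds to one parenthesized form, so the first result matches ((name john) (age 27) (city boston)).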
Output is normally written in S-expression form. There is a pretty-printer that prints in the implicit-parentheses syntax.
Skill
Here is the factorial function in Skill:
procedure( fac(n)
    if( (n==0)
        then 1
        else n*fac(n-1)))
We can see the use of conventional function call and operator syntax. Control structures are somewhat unusual in using both keywords and a function-like arrangement. The trailing keywords are contained within the parentheses of the special-form.
According to the published references, S-expression form is legal input as well, with simple heuristics being used to distinguish the two forms. Examination of examples suggests that the placement of an open parenthesis immediately after a symbol indicates the conventional function call syntax. Thus f( g(x)) and f((g x)) are equivalent, but f(g (x)) would actually indicate f( g( x())).
Discussion
It's instructive to compare the approaches taken by each of these languages. The single feature common to all is the support for conventional infix operator notation. MuSimp differs from the two CAD extension languages in not allowing S-expression input except where explicitly indicated with the quote operator. Genie and Skill are, in a sense, true super-sets of Lisp, allowing either S-expression or sugared syntax to be used interchangeably. However, they do this in quite different ways. Skill supports conventional function call notation and uses heuristics to distinguish these from S-expressions. Genie, on the other hand, maintains Lisp's function call syntax but introduces syntax to avoid many of the parentheses.
In my opinion, it's difficult to express a preference, since each of these languages reflects legitimate design decisions. One could argue that if the goal is a sugared Lisp, as opposed to a completely new language, the super-set approach of Genie and Skill is more appropriate. However, this does result in syntaxes that are less conventional than the one MuSimp achieves. It also introduces the issue of having to explain and document both alternatives. I also feel that the heuristic approach of Skill is more fragile syntactically than that of Genie.
Notes
[1] John Allen's book, Anatomy of Lisp, published in 1978, still contains large amounts of code written in M-expression style.
[2] A function call will evaluate all its arguments before the function is entered. IF will only evaluate one leg depending on the truth value of the test.
[4] Genie was originally developed by Silicon Compiler Systems for their Genesil product. After various mergers, it is now owned by Mentor Graphics. An early reference is: Cheng, E.K., Mazor, S., The Genesil Silicon Compiler, in Gajski, D.D., Silicon Compilers, Addison-Wesley, 1988, pp 361-405.
[5] Skill was originally developed by SDA Systems, later Cadence. Barnes, T., Skill: A CAD System Extension Language, Proceedings 27th ACM/IEEE Design Automation Conference (1990), pp 266-271.