// Two more functions are considered part of sys
// but their implementation is given directly in
// the translator, in assembly code
// These are call and return
function init
// building the initial environment
push constant 24576 // symbol(0)
special 0 // special(if)
cons
push constant 24577 // symbol(1)
special 1 // special(define)
cons
push constant 24578 // symbol(2)
push label builtin.begin
builtin // builtin(begin)
cons
push constant 24579 // symbol(3)
push label builtin.add
builtin // builtin(add)
cons
push constant 24580 // symbol(4)
push label builtin.sub
builtin // builtin(sub)
cons
push constant 24581 // symbol(5)
push label builtin.gt
builtin // builtin(gt)
cons
push constant 8192 // 0x2000 = emptylist
cons
cons
cons
cons
cons
cons
pop environment
push constant 0 // each func HAS to take one arg!
call main.main
goto end
New here are the special, builtin and push label vm-level instructions.
These instructions are needed because the type prefix of our procedures goes beyond the 15-bit limit of constants we can push at a time;
a constraint we inherited from the Hack VM implementation.
They are purely syntactic sugar. Arguably symbol here could also benefit from that.
There is also the new special type, which takes prefix 110 in our 3-bit type system.
This gives us all we need in order to support our missing lisp forms if, define, set! and lambda.
// actually call the function stored in local 1
label call
// TODO
call eval.add
return
We instead want to be able to map symbols to builtin vm functions more generally, and the env from sys.vm sets us up for that.
The eval function can be extended so it does a lookup by builtin symbol, but in order to know where to jump to we will need to know the start of the builtin function after compilation.
This is what the push label instruction provides.
push constant 24579 // symbol(3)
push label builtin.add
builtin // builtin(add)
cons
The above snippet creates a cons cell with symbol(3) as the car or key, and the address of the start of the builtin add function as its cdr or value, prefixed with the builtin type prefix 101.
Here is one of the new vm-level instructions in assembly. The rest are similarly simple.
// assembly level translation of vm instruction 'builtin'
// 0x7fff + 0x2001 = 0xa000, or the 101 prefix we are after
// NOTE: since we use OR, if we try to create a builtin larger than 0x1fff,
// this wil fail in unexpected ways!
// for now this is acceptable as we won't have that many and it eliminates an otherwise-needed mask
@0x7fff
D=A
@0x2001
D=D+A
@SP
A=M-1
M=D|M
Now we are ready to modify eval.vm.
// actually call the function stored in local 1
label call
push local 1
call-builtin // new!
return
Once we reach call, we simply mask off the type prefix of our argument and perform a SYSCALL, like so:
// assembly level translation of vm instruction 'call-builtin'
// choice of registers for FUNC and RET mimics SYSCALL implementation
@SP
AM=M-1
D=M
@0x1fff
D=D&A // mask off first three bits (TODO, could check them first)
@R13 // FUNC
M=D
@RETURNLABEL // gensymmed label
D=A
@R15 // RET
M=D
@SYSCALL
0;JMP
(RETURNLABEL)
Apart from the unmask of label, this is equivalent to how we call any other function.
The main reason we added call-builtin is that the default call takes a label argument, which is now a runtime variable.
This lets us define builtin functions and modify them without having to count assembly output lines to jump to them; very useful!
Here is the lisp keyword begin as a builtin vm-level function:
// (begin expr ... )
function begin
label beginloop
push argument
cdr
is-emptylist
if-goto beginend
push argument
cdr
pop argument
goto beginloop
label beginend
push argument
car
return
The begin function evaluates each argument then returns the last. Crucially, this definition of begin works without any calls to eval because in LISP, arguments are evaluated before function application!
We already implemented this in the evalprocedure part of eval.
Which is a nice seque into special builtins, of which there are very few.
They are builtin LISP functions that can only work properly if they do not evaluate their arguments first.
Therefore they must be treated different, and deserve a proper subtype of procedure for eval to handle.
label evalprocedure
push local 1
is-special // new!
if-goto evalspecial // new!
label evalargs
...
The is-special instruction is implemented exactly like its friends is-symbol etc before it.
This next bit is going to be very similar to call-builtin, but instead of using syscall we write an explicit if/else switch.
There are only a handful of special builtins, so this is fine.
label evalspecial
// remove mask
push local 1
push constant 8191 // 0x1fff
and
pop local 1
push local 1
push constant 0
eq
if-goto evalif
push local 1
push constant 1
eq
if-goto evaldefine
push local 1
push constant 2
eq
// ... etc for the other forms
if-goto evalquote
// unknown builtin, return err
goto sys.end
The implementation of quote is very simple so let's start with that.
It returns the first argument, unevaluated!
// (quote arg)
label evalquote
push local 0
cdr
car
return
The other forms are a bit more complicated than that.
Next we go over the implementation of each of them, starting by adding to the eval function and changing our program from there
until we have a full (if barebones) LISP implementation.
// (if test conseq alt)
label evalif
push argument
car // env on the stack
push local 0
cdr
car // test
push constant 8192 // 0x2000 = emptylist
cons
cons
call eval.eval
push constant 0
equal
if-goto evalalt
// label evalconseq
push argument
car // env on the stack
push local 0
cdr
cdr
car // conseq
push constant 8192 // 0x2000 = emptylist
cons
cons
call eval.eval
return
label evalalt
// TODO: if no alt, return 'false'
push argument
car // env on the stack
push local 0
cdr
cdr
cdr
car // alt
push constant 8192 // 0x2000 = emptylist
cons
cons
call eval.eval
return
This was a bit more work than quote, but still possible with the instruction set we have now.
For the next special forms we will have to extend our implementation slightly.
The environment is implemented as an assoc list on the heap. Adding to the environment is easy: we can just cons an extra pair on top. This gives us a new pointer to an extended list, which is the new environment. The problem is that we want to change this environment for everyone, so even captured references to it need to be changed (more on those later). The idea is to modify the contents of the pointer on the heap, without losing any data. If we duplicate the original environment, meaning we obtain a new reference to the same data, we can cons onto that to obtain our new env. Lastly we copy the car and cdr of the new env into the address pointed at by the original env. This modifies the contents of the pointer so that all references to the original env now point to the enhanced one. We implement this functionality as a new vm-level instruction: copy-pointer:
@SP
AM=M-1
D=M
@R7 // dest
M=D
@SP
AM=M-1
D=M
@R8 // src
AM=D
MCDR
@R7
A=M
SETCDR
@R8
A=M
MCAR
@R7
A=M
SETCAR
We are now ready to implement define.
Note that this implementation leaves a duplicate unused cons cell on the heap;
at some point when garbage collection is implemented this should be cleaned up automatically.
// (define symbol body)
label evaldefine
// TODO: this just adds, doesnt check if already exists in env
// meaning currently the assoc list could have duplicate keys
push local 0
cdr
car // symbol
// TODO: if not symbol, error!
push argument
car // env on the stack
push local 0
cdr
cdr
car // exp
push constant 8192 // 0x2000 = emptylist
cons
cons
call eval.eval
cons // (symbol . (eval exp)) = entry
// we need to modify the env pointer, because ENV will be restored
// to whatever the caller's ENV was
// first we duplicate existing env pointer on heap
// then we repoint original pointer to ( entry . duplicatepointer )
push environment
car
push environment
cdr
cons // allocate another cons cell to current env
cons // ( entry . duplicatepointer )
// TODO: difference between caller ENV and argument env ?
push environment // original env
copy-pointer
// cleanup duplicate cons cell on the heap should happen due to GC at some point
push constant 0
return
label evallambda
// (lambda (params ...) body)
// assumption: params is a list of symbols
// TODO: typecheck param arg and check length of args=2
push environment
push local 0
cons
userdefined
return
Following Norvig's Lispy implementation, we store a user-defined procedure as a triple of
env, params and body.
The parameters are assumed to be sent as a list of symbols, the body is one or more expressions.
What we want to store matches almost exactly what is passed to the lambda expression;
the only thing missing is to prepend the environment in which it is being evaluated.
This triple is then tagged with the type prefix for user-defined procedures using userdefined.
The big change is in the code for evaluating procedure application. Previously this was a relatively small part of eval: Apart from special forms (which we are discussing here), we only supported calls to builtin procedures with some sort of goto. What was missing is application of user-defined procedures. We'll extract this part of eval into a builtin apply function, which we then get for free.
function apply
push constant 0 // prepare local 0 = f
push constant 0 // prepare local 1 = f.params
push constant 0 // prepare local 2 = args
push constant 0 // prepare local 3 = numargs
push argument
car
is-builtin
if-goto callbuiltin
Apply starts with a switch: are we evaluating a builtin or user-defined procedure?
// label userdefined
// remove mask
push argument
car
push constant 8191 // 0x1fff
and
pop local 0
// zip f.params with args
push local 0
cdr
car // f.params
pop local 1
push argument
cdr
car // args
pop local 2
User-defined procedures are a triple that we need to unpack.
To apply, we first zip the parameters (list of symbols) and the arguments (list of expressions)
so that we obtain a list of pairs, an assoc list just like the environment:
label applyrec
// TODO: check if len(params) == len(args)
push local 1
is-emptylist
if-goto endapplyrec
push local 3
push constant 1
add
pop local 3 // numargs++
push local 1
car
push local 2
car
cons
push local 1
cdr
pop local 1
push local 2
cdr
pop local 2
goto applyrec
This is then added to the env stored in the procedure when it was created using lambda.
The new extended environment binds the parameters to the values that the function was called with.
What's left is to evaluate the body of the procedure in this new environment, like so:
label endapplyrec
// cons on top of f.env -> newenv
push local 0
car // f.env
label applyconsloop
push constant 0
push local 3
eq
if-goto applyend
cons
push local 3
push constant 1
sub
pop local 3
goto applyconsloop
label applyend
// call eval.eval of newenv and f.body
push local 0
cdr
cdr
car // f.body
push constant 8192 // 0x2000 = emptylist
cons
cons
call eval.eval
return
And finally, here is the evaluation of builtin, mostly unchanged:
label callbuiltin
push argument
cdr
car
push argument
car
call-builtin
return