// Two more functions are considered part of sys // but their implementation is given directly in // the translator, in assembly code // These are call and return function init // building the initial environment push constant 24576 // symbol(0) special 0 // special(if) cons push constant 24577 // symbol(1) special 1 // special(define) cons push constant 24578 // symbol(2) push label builtin.begin builtin // builtin(begin) cons push constant 24579 // symbol(3) push label builtin.add builtin // builtin(add) cons push constant 24580 // symbol(4) push label builtin.sub builtin // builtin(sub) cons push constant 24581 // symbol(5) push label builtin.gt builtin // builtin(gt) cons push constant 8192 // 0x2000 = emptylist cons cons cons cons cons cons pop environment push constant 0 // each func HAS to take one arg! call main.main goto endNew here are the special, builtin and push label vm-level instructions. These instructions are needed because the type prefix of our procedures goes beyond the 15-bit limit of constants we can push at a time; a constraint we inherited from the Hack VM implementation. They are purely syntactic sugar. Arguably symbol here could also benefit from that. There is also the new special type, which takes prefix 110 in our 3-bit type system. This gives us all we need in order to support our missing lisp forms if, define, set! and lambda.
// actually call the function stored in local 1 label call // TODO call eval.add returnWe instead want to be able to map symbols to builtin vm functions more generally, and the env from sys.vm sets us up for that. The eval function can be extended so it does a lookup by builtin symbol, but in order to know where to jump to we will need to know the start of the builtin function after compilation. This is what the push label instruction provides.
push constant 24579 // symbol(3) push label builtin.add builtin // builtin(add) consThe above snippet creates a cons cell with symbol(3) as the car or key, and the address of the start of the builtin add function as its cdr or value, prefixed with the builtin type prefix 101.
Here is one of the new vm-level instructions in assembly. The rest are similarly simple.
// assembly level translation of vm instruction 'builtin' // 0x7fff + 0x2001 = 0xa000, or the 101 prefix we are after // NOTE: since we use OR, if we try to create a builtin larger than 0x1fff, // this wil fail in unexpected ways! // for now this is acceptable as we won't have that many and it eliminates an otherwise-needed mask @0x7fff D=A @0x2001 D=D+A @SP A=M-1 M=D|MNow we are ready to modify eval.vm.
// actually call the function stored in local 1 label call push local 1 call-builtin // new! returnOnce we reach call, we simply mask off the type prefix of our argument and perform a SYSCALL, like so:
// assembly level translation of vm instruction 'call-builtin' // choice of registers for FUNC and RET mimics SYSCALL implementation @SP AM=M-1 D=M @0x1fff D=D&A // mask off first three bits (TODO, could check them first) @R13 // FUNC M=D @RETURNLABEL // gensymmed label D=A @R15 // RET M=D @SYSCALL 0;JMP (RETURNLABEL)Apart from the unmask of label, this is equivalent to how we call any other function. The main reason we added call-builtin is that the default call takes a label argument, which is now a runtime variable. This lets us define builtin functions and modify them without having to count assembly output lines to jump to them; very useful! Here is the lisp keyword begin as a builtin vm-level function:
// (begin expr ... ) function begin label beginloop push argument cdr is-emptylist if-goto beginend push argument cdr pop argument goto beginloop label beginend push argument car returnThe begin function evaluates each argument then returns the last. Crucially, this definition of begin works without any calls to eval because in LISP, arguments are evaluated before function application! We already implemented this in the evalprocedure part of eval. Which is a nice seque into special builtins, of which there are very few. They are builtin LISP functions that can only work properly if they do not evaluate their arguments first. Therefore they must be treated different, and deserve a proper subtype of procedure for eval to handle.
label evalprocedure push local 1 is-special // new! if-goto evalspecial // new! label evalargs ...The is-special instruction is implemented exactly like its friends is-symbol etc before it. This next bit is going to be very similar to call-builtin, but instead of using syscall we write an explicit if/else switch. There are only a handful of special builtins, so this is fine.
label evalspecial // remove mask push local 1 push constant 8191 // 0x1fff and pop local 1 push local 1 push constant 0 eq if-goto evalif push local 1 push constant 1 eq if-goto evaldefine push local 1 push constant 2 eq // ... etc for the other forms if-goto evalquote // unknown builtin, return err goto sys.endThe implementation of quote is very simple so let's start with that. It returns the first argument, unevaluated!
// (quote arg) label evalquote push local 0 cdr car returnThe other forms are a bit more complicated than that. Next we go over the implementation of each of them, starting by adding to the eval function and changing our program from there until we have a full (if barebones) LISP implementation.
// (if test conseq alt) label evalif push argument car // env on the stack push local 0 cdr car // test push constant 8192 // 0x2000 = emptylist cons cons call eval.eval push constant 0 equal if-goto evalalt // label evalconseq push argument car // env on the stack push local 0 cdr cdr car // conseq push constant 8192 // 0x2000 = emptylist cons cons call eval.eval return label evalalt // TODO: if no alt, return 'false' push argument car // env on the stack push local 0 cdr cdr cdr car // alt push constant 8192 // 0x2000 = emptylist cons cons call eval.eval returnThis was a bit more work than quote, but still possible with the instruction set we have now. For the next special forms we will have to extend our implementation slightly.
The environment is implemented as an assoc list on the heap. Adding to the environment is easy: we can just cons an extra pair on top. This gives us a new pointer to an extended list, which is the new environment. The problem is that we want to change this environment for everyone, so even captured references to it need to be changed (more on those later). The idea is to modify the contents of the pointer on the heap, without losing any data. If we duplicate the original environment, meaning we obtain a new reference to the same data, we can cons onto that to obtain our new env. Lastly we copy the car and cdr of the new env into the address pointed at by the original env. This modifies the contents of the pointer so that all references to the original env now point to the enhanced one. We implement this functionality as a new vm-level instruction: copy-pointer:
@SP AM=M-1 D=M @R7 // dest M=D @SP AM=M-1 D=M @R8 // src AM=D MCDR @R7 A=M SETCDR @R8 A=M MCAR @R7 A=M SETCARWe are now ready to implement define. Note that this implementation leaves a duplicate unused cons cell on the heap; at some point when garbage collection is implemented this should be cleaned up automatically.
// (define symbol body) label evaldefine // TODO: this just adds, doesnt check if already exists in env // meaning currently the assoc list could have duplicate keys push local 0 cdr car // symbol // TODO: if not symbol, error! push argument car // env on the stack push local 0 cdr cdr car // exp push constant 8192 // 0x2000 = emptylist cons cons call eval.eval cons // (symbol . (eval exp)) = entry // we need to modify the env pointer, because ENV will be restored // to whatever the caller's ENV was // first we duplicate existing env pointer on heap // then we repoint original pointer to ( entry . duplicatepointer ) push environment car push environment cdr cons // allocate another cons cell to current env cons // ( entry . duplicatepointer ) // TODO: difference between caller ENV and argument env ? push environment // original env copy-pointer // cleanup duplicate cons cell on the heap should happen due to GC at some point push constant 0 return
label evallambda // (lambda (params ...) body) // assumption: params is a list of symbols // TODO: typecheck param arg and check length of args=2 push environment push local 0 cons userdefined returnFollowing Norvig's Lispy implementation, we store a user-defined procedure as a triple of env, params and body. The parameters are assumed to be sent as a list of symbols, the body is one or more expressions. What we want to store matches almost exactly what is passed to the lambda expression; the only thing missing is to prepend the environment in which it is being evaluated. This triple is then tagged with the type prefix for user-defined procedures using userdefined.
The big change is in the code for evaluating procedure application. Previously this was a relatively small part of eval: Apart from special forms (which we are discussing here), we only supported calls to builtin procedures with some sort of goto. What was missing is application of user-defined procedures. We'll extract this part of eval into a builtin apply function, which we then get for free.
function apply push constant 0 // prepare local 0 = f push constant 0 // prepare local 1 = f.params push constant 0 // prepare local 2 = args push constant 0 // prepare local 3 = numargs push argument car is-builtin if-goto callbuiltinApply starts with a switch: are we evaluating a builtin or user-defined procedure?
// label userdefined // remove mask push argument car push constant 8191 // 0x1fff and pop local 0 // zip f.params with args push local 0 cdr car // f.params pop local 1 push argument cdr car // args pop local 2User-defined procedures are a triple that we need to unpack. To apply, we first zip the parameters (list of symbols) and the arguments (list of expressions) so that we obtain a list of pairs, an assoc list just like the environment:
label applyrec // TODO: check if len(params) == len(args) push local 1 is-emptylist if-goto endapplyrec push local 3 push constant 1 add pop local 3 // numargs++ push local 1 car push local 2 car cons push local 1 cdr pop local 1 push local 2 cdr pop local 2 goto applyrecThis is then added to the env stored in the procedure when it was created using lambda. The new extended environment binds the parameters to the values that the function was called with. What's left is to evaluate the body of the procedure in this new environment, like so:
label endapplyrec // cons on top of f.env -> newenv push local 0 car // f.env label applyconsloop push constant 0 push local 3 eq if-goto applyend cons push local 3 push constant 1 sub pop local 3 goto applyconsloop label applyend // call eval.eval of newenv and f.body push local 0 cdr cdr car // f.body push constant 8192 // 0x2000 = emptylist cons cons call eval.eval returnAnd finally, here is the evaluation of builtin, mostly unchanged:
label callbuiltin push argument cdr car push argument car call-builtin return