8

Write your own compiler - Station #3: the emitter

 3 years ago
source link: https://blog.klipse.tech/javascript/2017/02/08/tiny-compiler-emitter.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Write your own compiler - Station #3: the emitter

Feb 8, 2017 • Yehonathan Sharvit

The plan

Our journey is made of 4 stations - each of them depending on the previous ones:

  1. The tokenizer (aka “Lexical Analysis”): converting an input code - in LISP syntax - into an array of tokens.
  2. The parser (aka “Syntactic Analysis”): transforming an array of tokens into an Abstract Syntax Tree (AST).
  3. The emitter (aka “Code Generation”): string-ifying an AST into C-like code.
  4. The compiler (aka “You made it”): combining all the pieces together.

(The interactive code snippets are powered by a tool of mine named KLIPSE.)

Code generation

Now that we have our AST, it’s not so hard to transform the AST into code (if you are not afraid of recursions). This is the purpose of the emitter.

We are going to have an emitter per node type.

emitter

Simple node emitters

First, Let’s write the non-recursive emitters.

For NumberLiteral nodes - we emit the value as-is:

xxxxxxxxxx
emitNumber = node => node.value
[Function emitNumber]

Let’s see how it works:

xxxxxxxxxx
emitNumber({
  "type": "number",
  "value": "2",
})
xxxxxxxxxx
"2"

For StringLiteral nodes - we wrap the node value with quotes:

xxxxxxxxxx
emitString = node =>  `"${node.value}"`
xxxxxxxxxx
[Function emitString]

Let’s see how it works:

xxxxxxxxxx
emitString({
  "type": "string",
  "value": "Hello World!",
})
xxxxxxxxxx
"\"Hello World!\""

Recursive node emitters

And now the recursive emitters.

For the whole program - we emit the code of each expression appending a ; and joining the expressions with \n:

xxxxxxxxxx
emitProgram = node =>  node.body.map(exp => emitter(exp) + ";").join('\n');
xxxxxxxxxx
[Function emitProgram]

For an expression - this is where we deal with the facts that in C:

  1. the function name comes before the parenthesis
  2. the function arguments are comma separated
xxxxxxxxxx
emitExpression = node =>
`${node.name}(${node.params.map(emitter).join(', ')})`
xxxxxxxxxx
[Function emitExpression]

General node emitter

Combining all of it in the emitter - dispatching according to node type:

xxxxxxxxxx
emitter = node => {
  switch (node.type) {
    case 'Program': return emitProgram(node); 
    case 'CallExpression': return emitExpression(node);
    case 'NumberLiteral': return emitNumber(node);
    case 'StringLiteral': return emitString(node); 
    default:
      throw new TypeError(node.type);
  }
}
xxxxxxxxxx
[Function emitter]

Let’s see how our emitter works with a simple expression:

xxxxxxxxxx
emitter({
  "type": "CallExpression",
  "name": "subtract",
  "params": [
    {
      "type": "NumberLiteral",
      "value": "4"
    },
    {
      "type": "NumberLiteral",
      "value": "2"
    }
  ]
})
xxxxxxxxxx
"subtract(4, 2)"

And now, let’s emit the code from the AST generated by our parser in the previous article:

xxxxxxxxxx
ast = {
  "type": "Program",
  "body": [
    {
      "type": "CallExpression",
      "name": "print",
      "params": [
        {
          "type": "StringLiteral",
          "value": "Hello"
        },
        {
          "type": "NumberLiteral",
          "value": "2"
        }
      ]
    },
    {
      "type": "CallExpression",
      "name": "add",
      "params": [
        {
          "type": "NumberLiteral",
          "value": "2"
        },
        {
          "type": "CallExpression",
          "name": "subtract",
          "params": [
            {
              "type": "NumberLiteral",
              "value": "4"
            },
            {
              "type": "NumberLiteral",
              "value": "2"
            }
          ]
        }
      ]
    }
  ]
}
emitter(ast)
xxxxxxxxxx
"print(\"Hello\", 2);
add(2, subtract(4, 2));"

Now, everything is set up for the last station:The compiler. No need to rest, you immediatly run towards it.

If you enjoy this kind of interactive articles would you consider a (small) donation💸 on Patreon or at least giving a star⭐ for the Klispe repo on Github?

to stay up-to-date with the coolest interactive articles around the world.

Discover more cool interactive articles about javascript, clojure[script], python, ruby, scheme, c++ and even brainfuck!

Give Klipse a Github star to express how much you appreciate Code Interactivity.

Subscribe to the Klipse newsletter:

Feel free to email me [email protected] for getting practical tips and tricks in writing your first interactive blog post.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK