
Inside Microvium Closures

 2 years ago
source link: https://coder-mike.com/blog/2022/08/08/inside-microvium-closures/

TL;DR: Support for closures in Microvium sets it apart from other JS engines of a similar size. Closures simplify state machines and enable functional-style code. Closures in snapshots are a new way of sharing compile-time state with the runtime program. This post goes through some examples and design details.

What is a closure?

MDN has already done a great job of explaining what a closure is, so I’m just going to borrow their explanation:

A closure is the combination of a function bundled together (enclosed) with references to its surrounding state (the lexical environment). In other words, a closure gives you access to an outer function’s scope from an inner function. In JavaScript, closures are created every time a function is created, at function creation time.

Here is a simple example:

function makeCounter() {
  let x = 0;
  function incCounter() {
    x++;
    return x;
  }
  return incCounter;
}

const myCounter1 = makeCounter();
const myCounter2 = makeCounter();

console.log(myCounter1()); // 1
console.log(myCounter1()); // 2
console.log(myCounter1()); // 3

// myCounter2 is an independent counter
console.log(myCounter2()); // 1
console.log(myCounter2()); // 2
console.log(myCounter2()); // 3

In the above example, the function named incCounter is a closure because it closes over variable x in its outer lexical scope. A closure is just a function, nested in another function, which accesses variables in the outer function.

Microvium also supports the arrow function syntax, so the following example has the same output but is more concise:

const makeCounter = x => () => ++x;
const myCounter = makeCounter(0);
console.log(myCounter()); // 1
console.log(myCounter()); // 2
console.log(myCounter()); // 3

For more detail, take a look at the above-mentioned MDN article. The rest of this post will assume that you know what a closure is.

Why are closures useful?

Closures in snapshots

Let’s say that we want a script that exports two functions: one that prints “hello” and the other that prints “world”. Without closures, we could implement this in Microvium as follows:

vmExport(0, printHello);
vmExport(1, printWorld);

function printHello() {
  console.log('hello');
}

function printWorld() {
  console.log('world');
}

(Recall that vmExport is a function that the script calls to export a function to the host).

In this example, printHello and printWorld are each functions that take no arguments and will print the corresponding string to the console [1].

With the introduction of closures, we could factor out the commonality between printHello and printWorld and just have a printX that can print either one:

const printHello = makePrinter('hello');
const printWorld = makePrinter('world');
vmExport(0, printHello);
vmExport(1, printWorld);

function makePrinter(thingToPrint) {
  return printX;
  function printX() {
    console.log(thingToPrint);
  }
}

This refactors the code so that console.log only appears once but is shared by both printHello and printWorld. For the simple case of console.log this doesn’t add much benefit, but you can imagine cases where printX is a lot more complicated and so this refactoring may be beneficial.

Another thing to note in this example is that makePrinter is called at compile time since it’s in the top-level code. The resulting closures printHello and printWorld, instantiated at compile time, are carried to runtime via the snapshot, along with their state (the value of thingToPrint). The closures feature here plays nicely with snapshotting as a new way to share compile-time state with the runtime.
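To make the compile-time/runtime split a little more concrete, here is a minimal sketch in which a lookup table is built by top-level code and then captured by an exported closure. The names makeLookup and squareOf are hypothetical, and the vmExport defined at the top is only a stand-in so that the snippet also runs under Node.js; in a real Microvium script you would remove that stub and rely on the built-in vmExport.

```javascript
// Stand-in for Microvium's vmExport so this sketch runs under Node.js.
// (Assumption: delete this stub when compiling with Microvium.)
const exportsById = {};
function vmExport(id, fn) {
  exportsById[id] = fn;
}

function makeLookup(size) {
  // This loop runs once, when the top-level code calls makeLookup
  // (which is compile time in Microvium)
  const table = [];
  for (let i = 0; i < size; i++) {
    table[i] = i * i;
  }
  // The returned closure keeps `table` reachable
  return index => table[index];
}

const squareOf = makeLookup(8);
vmExport(0, squareOf);
```

If the snapshot behaves as described above, the host would only ever pay for the table lookup, since the loop that fills the table has already finished by the time the snapshot is taken.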

Depending on the style of code you’re familiar with, we can also write the same example more concisely as:

const makePrinter = s => () => console.log(s);
vmExport(0, makePrinter('hello'));
vmExport(1, makePrinter('world'));

If you’re not comfortable with the idea of functions returning other functions, here’s another variant of the example that does the same thing:

function exportPrinter(id, textToPrint) {
  vmExport(id, () => console.log(textToPrint));  
}
exportPrinter(0, 'hello');
exportPrinter(1, 'world');

We could also get the list of things to export from a dynamic source, such as an array (or even something read from a file at compile time using fs.readFileSync):

const printers = [
  { id: 0, textToPrint: 'hello' },
  { id: 1, textToPrint: 'world' },
];
// OR:
// printers = JSON.parse(fs.readFileSync('printers.json', 'utf8'));

for (let i = 0; i < printers.length; i++) {
  const id = printers[i].id;
  const textToPrint = printers[i].textToPrint; 
  vmExport(id, () => console.log(textToPrint));  
}

Side note: the above example also demonstrates the usefulness of vmExport being a normal function, rather than some special syntax. Think about your favorite language or engine and how you would implement the above in that language. You can’t define an extern void foo() {} inside a for-loop in C, or public static void foo() {} inside a for-loop in C#. The only solution in these environments to the objective of exporting a programmatically-defined set of functions would be to use a code generator and the result would be much more complicated.

Closures for state machines

Closures make a finite state machine much easier to implement, and the result performs better. Consider the following two-state state machine:

The above state machine has 2 states, which I’ve called stateA and stateB. When event 1 is received while in stateA, the machine will transition to stateB, but if any other event (e.g. event 2) is received while in stateA, the machine will not transition to stateB. See Wikipedia for a more detailed description of FSMs.

The ability to use closures allows us to implement this state machine using a function for each state. In the following example, stateA is a function that receives events by its event parameter. We “make” stateA only when we need it, by calling enterStateA(). The example includes an eventCount as part of state A to show how states can have their own variables that are persisted across multiple events.

function enterStateA() {
  console.log('Transitioned to State A!');
  let eventCount = 0; // Some internal state only used by stateA

  // A function that handles events while we're in stateA
  function stateA(event) {
    if (event === 1) {
      currentState = enterStateB();
    } else {
      eventCount++;
      console.log(`Received ${eventCount} events while in state A`);
    }
  }

  return stateA;
}

function enterStateB() {
  console.log('Transitioned to State B!');
  return event => {
    if (event === 2) {
      currentState = enterStateA();
    }
  }
}

// We'll start in stateA
let currentState = enterStateA();

// Every time an event is received, we send it to the current state
const processEvent = event => currentState(event);

// Allow the host firmware to send events to the state machine
vmExport(0, processEvent);

// Some example events:
processEvent(5); // Received 1 events while in state A
processEvent(5); // Received 2 events while in state A
processEvent(5); // Received 3 events while in state A
processEvent(1); // Transitioned to State B!
processEvent(1); // (no output: state B ignores event 1)
processEvent(2); // Transitioned to State A!
processEvent(2); // Received 1 events while in state A

In the above example, state A has the eventCount counter which is part of its closure. When the system transitions to state B, the counter can be garbage collected. This might not be very useful when only considering a single counter variable, but the pattern generalizes nicely to systems that have more expensive states that may hold buffers and other resources.
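As a sketch of how the pattern might extend to a state that owns a larger resource, the following variant gives a "receiving" state its own buffer. The state names, the 'start' and 'end' events, and the completedLengths array are all made up for illustration.

```javascript
const completedLengths = []; // Length of each finished message

function enterIdle() {
  return event => {
    if (event === 'start') currentState = enterReceiving();
  };
}

function enterReceiving() {
  const buffer = []; // Reachable only through the closure returned below
  return event => {
    if (event === 'end') {
      completedLengths.push(buffer.length);
      currentState = enterIdle(); // After this, `buffer` is unreachable
    } else {
      buffer.push(event);
    }
  };
}

let currentState = enterIdle();
const processEvent = event => currentState(event);

processEvent('start');
processEvent(10);
processEvent(20);
processEvent('end'); // completedLengths is now [2]
```

Because nothing outside the receiving state refers to buffer, leaving that state is enough to make the buffer eligible for collection; no explicit cleanup code is needed.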

Once you understand closures and higher-order functions, this is a very natural way to represent state machines.

Closures under the hood

Let’s go back to the simple “counter” example for illustration:

function makeCounter() {
  let x = 0;
  function incCounter() {
    return ++x;
  }
  return incCounter;
}

const myCounter = makeCounter();
console.log(myCounter()); // 1
console.log(myCounter()); // 2
console.log(myCounter()); // 3

The runtime memory layout for this example is as follows:

The global variable myCounter is a 2-byte slot, as are all variables in Microvium. The slot contains a pointer to the closure, which is an immutable tuple containing a reference to the function code (incCounter, in this example) and the enclosing lexical environment which in Microvium is called the scope.

The closure and scope are allocated on the garbage-collected heap — if and when the closure is no longer reachable, it will be freed.

When the closure myCounter is called, the VM sees that the callee is a closure and sets a special scope register in the VM to the closure’s scope before running the target bytecode. The bytecode can then interact with the scope variables through special machine instructions that leverage the scope register.
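The calling sequence described above can be modeled in plain JavaScript. This is only a conceptual sketch, not Microvium's actual implementation: the names scopeRegister, incCounterCode, makeCounterModel and callClosure are hypothetical, the scope is modeled as an ordinary array, and restoring the register after the call is an assumption made for the sake of the model.

```javascript
// Conceptual model only: plain JavaScript standing in for VM internals.
let scopeRegister = null; // Models the VM's scope register

// Models the bytecode of incCounter: it reads and writes slot 0 of
// whichever scope the register currently points at.
function incCounterCode() {
  scopeRegister[0] = scopeRegister[0] + 1;
  return scopeRegister[0];
}

// Models makeCounter: allocate a scope holding `x`, then pair it with
// the function code to form an immutable (function, scope) tuple.
function makeCounterModel() {
  const scope = [0]; // Slot 0 holds x
  return Object.freeze({ target: incCounterCode, scope: scope });
}

// Models what happens when the callee turns out to be a closure
function callClosure(closure) {
  const savedScope = scopeRegister;
  scopeRegister = closure.scope; // Point the register at the callee's scope
  const result = closure.target();
  scopeRegister = savedScope; // Assumption: the caller's scope is restored
  return result;
}

const myCounter = makeCounterModel();
console.log(callClosure(myCounter)); // 1
console.log(callClosure(myCounter)); // 2
```

Note that incCounterCode never names its scope: it always goes through the register. That implicit "current scope" is what lets the variable access instructions stay small.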

For a look into how variables are accessed by closure code, see my post on Closure Variable Indexing.

More efficient than objects

Closures are much more efficient than objects in Microvium:

  • Every closure variable is one word (2 bytes) while an object property is 2 words (4 bytes) because a property is stored as a key-value pair. Variables aren’t stored with their name because the static analysis can determine an index for them.
  • Closure variables can be accessed with single-byte instructions (a 4-bit opcode and 4-bit literal variable index) whereas typical object properties take at least 5 bytes of bytecode instructions to access. This is in part because the “current” closure scope is implicit, while there is no such thing as the “current” object (the object needs to be specified explicitly), and also because object property access requires specifying the property key which is a reference to a string.
  • All the object key strings take up memory in the bytecode image.
  • Most closure variables are accessed in O(1) time, so adding more variables does not slow down access, whereas looking up an object property takes O(n) time in the number of properties on the object.
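For comparison, here is the same small piece of state written both ways, followed by a rough size estimate based on the per-slot figures in the list above. The function names are hypothetical, and the estimate counts only the variable or property slots; it deliberately ignores allocation headers, key strings and any other bookkeeping, so treat it as an illustration rather than a measurement.

```javascript
// Closure style: count and step live in the closure scope
function makeClosureCounter(step) {
  let count = 0;
  return () => {
    count += step;
    return count;
  };
}

// Object style: the same state stored as properties
function makeObjectCounter(step) {
  return { count: 0, step: step };
}
function incObjectCounter(counter) {
  counter.count += counter.step;
  return counter.count;
}

// Slot-only size estimate (assumption: headers and key strings excluded)
const WORD_BYTES = 2;
const variableCount = 2; // count and step
const closureBytes = variableCount * 1 * WORD_BYTES; // 1 word per variable
const objectBytes = variableCount * 2 * WORD_BYTES; // key + value per property
console.log(closureBytes, objectBytes); // 4 8
```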

Conclusion

Closures are useful and form the backbone of a lot of JavaScript programming. I’ve talked before about how closures are useful for callback-style asynchronous code, and in this post, I’ve also shown how they are useful for modeling state machines and how they make certain kinds of refactoring possible.

Other tiny JavaScript engines such as Elk and mJS don’t support closures, so this feature sets Microvium apart from the crowd [2].

The closures feature has been the single most complicated and time-consuming feature to implement in Microvium because the static analysis requires multiple passes and completely changed how the compiler front-end works. It’s really not as simple as one would first imagine. Consider, as just one example, that the let i binding in for (let i = 0 ...) has multiple instances of i relative to the containing function, and that on each iteration the previous value is copied to the new instance.
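The per-iteration behavior of let i is standard JavaScript and is easy to observe from the outside. In this small sketch (the getters array is just an illustrative name), each closure captures the instance of i that belonged to its own iteration:

```javascript
const getters = [];
for (let i = 0; i < 3; i++) {
  // Each iteration gets a fresh `i`, so each closure captures its own copy
  getters.push(() => i);
}

console.log(getters[0]()); // 0
console.log(getters[1]()); // 1
console.log(getters[2]()); // 2
```

If all three closures shared a single i, every call would return 3, so the compiler has to work out that the loop body needs a new scope instance on every pass.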

But after a long journey, I can say that I’m happy with the result.


  1. This example assumes that you’ve added `console.log` to the global scope 

  2. Larger engines such as Espruino and XS support closures along with many other features, but come at a serious size premium. 

