jtenner

Posted on Jan 14, 2019 • Updated on Jun 8, 2019

Optimizing CanvasRenderingContext2D Function Calls Using AssemblyScript

#canvas #assemblyscript #webassembly #typescript

Edit: The AssemblyScript compiler has since changed radically. It now has better garbage collection support and a different memory model, so this software no longer compiles and would need to be rewritten from scratch. The concepts behind it still apply and are valid. Also, there are no planned breaking changes to how it will work in the future. Please read on!

This article is a deep dive into how I implemented a version of the CanvasRenderingContext2D prototype in AssemblyScript.

jtenner / canvas-as

A small canvas rendering framework powered by AssemblyScript 🎉🎉🎉

About

The canvas-as project was an experiment that led to the development of a heavily behavior-tested version called as2d. Since this project is untested, and likely slower, I have deprecated it in favor of as2d.

Thank you for your understanding. I have left this repo here for archival reasons.

The canvas-as framework is a fun project that utilizes a lot of the hard work provided by the AssemblyScript project, which actually turned out to be quite performant! It avoids using repeated function calls over the wasm bridge, and compiles a set of canvas instructions to make drawing onto a canvas much easier from within AssemblyScript.

Goal

To provide a performant framework for utilizing the CanvasRenderingContext2D prototype in the browser, while following the JavaScript specification as closely as possible, and deviating where it…

The canvas-as project is completely experimental, and I don't know if this library should be used in production. However, I wanted to share some of the positive benefits I've managed to surface from using AssemblyScript.

Anyone who knows me is aware that I am really passionate about using canvas, and I set out to master its intricacies so I could teach others how to use it. Some of my personal failures in this venture include making a JavaScript library called e2d, which ultimately failed.

The idea of e2d was to create a declarative canvas view layer for developers to pick up and use like React. Some of the code (including a JSX syntax) looked like this:

e2d.render(
  // translate the contents by [100, 100], then fill some text
  <translate x={100} y={100}>
    <fillText text="Hello world!" />
  </translate>,
  ctx,
);

Sometimes, when an idea looks shiny and useful, it's hard to pull yourself back and stop pursuing it when it doesn't work in reality. This particular idea was bad because of how memory intensive it was. Returning an object with every function call tends to cause too many heap allocations. It was a very valuable lesson about garbage collection that I won't ever forget.

Once I had gained the courage to try something new again, I decided to pick up AssemblyScript and make some brand new mistakes!

Thus, canvas-as was born.

Starting from the Ground Up

I set out to abstract the CanvasRenderingContext2D prototype using AssemblyScript. After reading some of the AssemblyScript documentation, I chose to write some JavaScript linked functions because it was the obvious thing to do. After hours of work, I hooked everything up and provided WebAssembly with a bunch of callbacks into JavaScript.

// Inside AssemblyScript
@external("ctx", "getFillStyle")
declare function getFillStyle(id: i32): string;

@external("ctx", "setFillStyle")
declare function setFillStyle(id: i32, value: string): void;

export class CanvasRenderingContext2D {
  protected id: i32;
  public get fillStyle(): string {
    return getFillStyle(this.id);
  }

  public set fillStyle(value: string): void {
    setFillStyle(this.id, value);
  }
}

// and inside javascript
function getFillStyle(id: number): number {
  var ctx: CanvasRenderingContext2D = contexts.get(id);
  return wasm.newString(ctx.fillStyle);
}

function setFillStyle(id: number, value: number): void {
  var ctx: CanvasRenderingContext2D = contexts.get(id);
  ctx.fillStyle = wasm.getString(value);
}

After testing and reading more about the WebAssembly engine, I came to the conclusion that calling into JavaScript for every property access was going to cause too many jumps between WebAssembly and JavaScript. I decided that it would be faster to cache the value in a private property.

// Link to a setFillStyle function
@external("ctx", "setFillStyle")
declare function setFillStyle(id: i32, value: string): void;

// Call this function when the property is set; reads never leave wasm
export class CanvasRenderingContext2D {
  protected id: i32 = 0;
  private _fillStyle: string = "#000"; // this is the initial value

  public get fillStyle(): string {
    return this._fillStyle;
  }

  public set fillStyle(value: string): void {
    this._fillStyle = value;
    setFillStyle(this.id, value);
  }
}

The problem here is that we do not parse the input and determine its CSS color value as the HTML canvas specification requires, so the getter returns whatever string was assigned instead of the serialized color. Despite this obvious flaw, it performs mostly as expected, as long as the developer doesn't make mistakes in generating color values.

I came to the logical conclusion that there had to be a better solution for communicating with the WebAssembly host, so I set out to create something slightly different. What if it was possible to queue a set of canvas instructions and call them in succession without leaving a single linked JavaScript function call?

I created a Serializer<T> class to see what it would look like.

// in AssemblyScript
@external("ctx", "render")
declare function render(id: i32, data: Float64Array): void;

class Serializer<T> {
  // protected, because the context subclass resets the index and sends the data
  protected index: i32 = 0;
  protected data: Float64Array = new Float64Array(8000);

  @inline
  protected write_one(instruction: T, value: f64): void {
    var next: i32 = this.index + 3; // calculate the next instruction pointer
    this.write(<f64>instruction);
    this.write(<f64>next);
    this.write(value);
  }

  @inline
  protected write_zero(instruction: T): void {
    var next: i32 = this.index + 2; // calculate the next instruction pointer
    this.write(<f64>instruction);
    this.write(<f64>next);
  }

  @inline
  protected write(value: f64): void {
    unchecked(this.data[this.index] = value);
    ++this.index;
  }
}

class CanvasRenderingContext2D extends Serializer<CanvasInstruction> {
  protected id: i32 = 0; // this refers to the context id in javascript
  private _fillStyle: string = "#000";

  // fillStyle implementation
  public get fillStyle(): string { return this._fillStyle; }
  public set fillStyle(value: string): void {
    this._fillStyle = value;
    this.write_one(
      CanvasInstruction.FillStyle,
      // the changetype macro function converts reference types into pointers
      <f64>changetype<usize>(value),
    );
  }

  // required for telling the browser to draw
  public commit(): void {
    this.write_zero(CanvasInstruction.Commit);
    this.index = 0; // reset the writer
    render(this.id, this.data);
  }
}

Now the JavaScript host can replay the whole queue of function calls inside a single linked function.

// in JavaScript

function render(id: number, dataPointer: number): void {
  // Get the canvas context from a `Map<number, CanvasRenderingContext2D>`
  var ctx: CanvasRenderingContext2D = contexts.get(id);
  // use the performant AssemblyScript loader functions to create a TypedArray
  var data: Float64Array = wasm.getArray(Float64Array, dataPointer);

  var i: number = 0;
  while (i < data.length) {
    if (data[i] === CanvasInstruction.Commit) break;

    switch (data[i]) {
      case CanvasInstruction.FillStyle:
        // data[i + 2] is a string pointer (a `send_string` call could be used instead)
        ctx.fillStyle = wasm.getString(data[i + 2]);
        break;
      // ... (other instructions elided)
    }
    i = data[i + 1]; // the next index was already calculated in WebAssembly
  }
}

This loop avoids many of the heap allocations that normal canvas development typically incurs. However, there are a few bottlenecks: the loop and the switch statement add a bit of work for each instruction, and each call to getString() performs many heap allocations just to obtain a JS string reference.
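The round trip described above can be sketched end to end in plain TypeScript so that it runs outside of a wasm module. This is a minimal sketch, not the canvas-as source: `Op`, `InstructionWriter`, `replay`, and the `DrawTarget` interface are hypothetical names, and only numeric operands are used so that strings never need to be decoded.

```typescript
// Hypothetical opcodes; canvas-as defines its own CanvasInstruction values.
const Op = { Commit: 0, FillRect: 1, Translate: 2 } as const;

// Minimal stand-in for the parts of a 2D context this sketch drives.
interface DrawTarget {
  fillRect(x: number, y: number, w: number, h: number): void;
  translate(x: number, y: number): void;
}

class InstructionWriter {
  private index = 0;
  readonly data = new Float64Array(8000);

  // Layout per instruction: [opcode, index of next instruction, ...operands]
  write(op: number, ...operands: number[]): void {
    const next = this.index + 2 + operands.length;
    if (next > this.data.length) throw new RangeError("instruction buffer full");
    this.data[this.index] = op;
    this.data[this.index + 1] = next;
    this.data.set(operands, this.index + 2);
    this.index = next;
  }

  // Terminate the batch and rewind so the buffer is reused for the next one.
  commit(): Float64Array {
    this.write(Op.Commit);
    this.index = 0;
    return this.data;
  }
}

// Host side: a single call replays the whole batch onto the target.
function replay(data: Float64Array, target: DrawTarget): number {
  let i = 0;
  let executed = 0;
  while (i < data.length && data[i] !== Op.Commit) {
    switch (data[i]) {
      case Op.FillRect:
        target.fillRect(data[i + 2], data[i + 3], data[i + 4], data[i + 5]);
        break;
      case Op.Translate:
        target.translate(data[i + 2], data[i + 3]);
        break;
    }
    i = data[i + 1]; // jump using the precomputed next index
    executed++;
  }
  return executed; // number of instructions replayed
}
```

Because every instruction stores the index of the next one, the reader never needs to know how many operands an opcode takes in order to skip it.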

I decided to venture down the rabbit hole once more to see if there were some further optimizations that could be made.

Optimizing CanvasRenderingContext2D Further

It was time to cut some corners, starting with learning, through trial and error, how the context object works. For instance, when performing any kind of fill() operation, it's possible to avoid writing multiple fillStyle values in succession if the implementation is clever.

// For example, the following "red" is effectively ignored by the browser.
ctx.fillStyle = "red";
ctx.fillStyle = "green";
ctx.fillRect(100.0, 100.0, 200.0, 200.0);

This kind of optimization can occur when the fillRect() function is called. This is a great moment to check and see if the fillStyle was changed since the last fill() operation. If it was changed, we can emit a fillStyle assignment operation, effectively skipping the "red" string in the previous example.

You might be thinking that this kind of optimization is pretty useless, but checking whether the value changed is actually pretty cheap. Remember that every time the fillStyle property is set, the browser parses the string and produces a CSS color. Why ask the browser to parse a color if the fillStyle isn't even going to be used? Reading the property back shows the parsed, serialized result:

ctx.fillStyle = "black";
var result: string = ctx.fillStyle;
assert(result === "#000000"); // the browser returns the serialized color
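The deferred fillStyle idea described above can be sketched in TypeScript as follows. This is a minimal sketch under stated assumptions: `DeferredFillStyle` and `FillTarget` are hypothetical names, the host context is assumed to start out with the same "#000" value, and styles are compared as raw strings, so two different spellings of the same color still count as a change.

```typescript
// Minimal stand-in for the host context; only what this sketch needs.
interface FillTarget {
  fillStyle: string;
  fillRect(x: number, y: number, w: number, h: number): void;
}

class DeferredFillStyle {
  private requested: string = "#000"; // last value the caller assigned
  private emitted: string = "#000";   // last value actually sent to the host
  private target: FillTarget;

  constructor(target: FillTarget) {
    this.target = target;
  }

  // Assignments are only recorded; nothing reaches the host yet.
  set fillStyle(value: string) {
    this.requested = value;
  }

  get fillStyle(): string {
    return this.requested;
  }

  fillRect(x: number, y: number, w: number, h: number): void {
    // Emit the style only if it differs from what the host already has.
    if (this.requested !== this.emitted) {
      this.target.fillStyle = this.requested;
      this.emitted = this.requested;
    }
    this.target.fillRect(x, y, w, h);
  }
}
```

With this wrapper, assigning "red" and then "green" before a fillRect() sends only "green" to the host, and a second fillRect() with the same style sends no assignment at all.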

Another example of canvas optimization is reducing path operations.

ctx.beginPath();
ctx.rect(1, 1, 2, 2);
ctx.beginPath(); // overwrite the current path and start over
ctx.rect(100, 100, 200, 200);
ctx.fill();

The browser effectively discards the first two calls in this set of pathing instructions. Why tell the browser to start a new path if it's just going to be thrown away? Instead, we can check whether any path instructions are still queued up to be written when a new path begins, drop them, and emit the following (equivalent) instructions.

ctx.beginPath();
ctx.rect(100, 100, 200, 200);
ctx.fill();

As it turns out, this works really well.
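One way to picture the path reduction is a small buffer that holds path calls until something actually draws them. The following is a minimal TypeScript sketch, not the canvas-as implementation: `PathBuffer` and `PathCall` are hypothetical names, and the `emitted` array stands in for the instructions that would be sent to the host.

```typescript
type PathCall = { name: string; args: number[] };

class PathBuffer {
  private pending: PathCall[] = [];
  // Stands in for the calls that would actually be sent to the host.
  readonly emitted: string[] = [];

  beginPath(): void {
    // Anything still pending was never drawn, so it can be dropped.
    this.pending.length = 0;
    this.pending.push({ name: "beginPath", args: [] });
  }

  rect(x: number, y: number, w: number, h: number): void {
    this.pending.push({ name: "rect", args: [x, y, w, h] });
  }

  fill(): void {
    this.flush();
    this.emitted.push("fill()");
  }

  // Send the pending path calls only when a draw operation needs them.
  private flush(): void {
    for (const call of this.pending) {
      this.emitted.push(call.name + "(" + call.args.join(", ") + ")");
    }
    this.pending.length = 0;
  }
}
```

Path calls that were already flushed stay on the host, so a later rect() followed by fill() still extends the existing path, just as it would on a real context.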

Another good example of optimization lies in calculation of canvas transforms.

ctx.translate(100.0, 100.0);
ctx.rotate(Math.PI);
ctx.moveTo(50.0, 50.0);
ctx.lineTo(0.0, 0.0);
ctx.stroke();

What optimization can occur here? Take a look at the currentTransform property for reference.

// canvas-as stores the matrix values pre-calculated
var matrix = ctx.currentTransform;
// properties a-f are the same values that `setTransform` needs
var values: Array<f64> = [matrix.a, matrix.b, ...];

Since we have already calculated what the transform needs to be, we can emit a single setTransform(a, b, c, d, e, f) operation instead.

ctx.setTransform(-1, 1.2246467991473532e-16, -1.2246467991473532e-16, -1, 100, 100);
ctx.moveTo(50.0, 50.0);
ctx.lineTo(0.0, 0.0);
ctx.stroke();

As a result, the translate() and rotate() function calls were combined together into a single setTransform() operation that was calculated when the moveTo function was called. This is because each path operation depends on the current transform value anyway. In fact, using translate and rotate will cause the browser to perform those calculations yet again(!), so the gains provided by this optimization combine well.
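The matrix bookkeeping behind this can be sketched in a few lines of TypeScript. This is a minimal sketch, assuming the standard canvas convention that translate() and rotate() post-multiply the current matrix; `TransformTracker` and its `flush()` method are hypothetical names rather than the canvas-as API.

```typescript
class TransformTracker {
  // Same a-f layout that setTransform(a, b, c, d, e, f) expects.
  private a = 1;
  private b = 0;
  private c = 0;
  private d = 1;
  private e = 0;
  private f = 0;
  private dirty = false;

  translate(x: number, y: number): void {
    this.e += this.a * x + this.c * y;
    this.f += this.b * x + this.d * y;
    this.dirty = true;
  }

  rotate(angle: number): void {
    const cos = Math.cos(angle);
    const sin = Math.sin(angle);
    const a = this.a, b = this.b, c = this.c, d = this.d;
    this.a = a * cos + c * sin;
    this.b = b * cos + d * sin;
    this.c = c * cos - a * sin;
    this.d = d * cos - b * sin;
    this.dirty = true;
  }

  // Call before any operation that depends on the transform.
  // Returns the six setTransform arguments, or null if nothing changed.
  flush(): number[] | null {
    if (!this.dirty) return null;
    this.dirty = false;
    return [this.a, this.b, this.c, this.d, this.e, this.f];
  }
}
```

Calling translate(100, 100) and rotate(Math.PI) on a fresh tracker and then flush() yields, up to floating-point rounding, the six values shown in the setTransform() call above. A second flush() with no intervening change returns null, so no instruction needs to be written.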

Finally, one last (very important) optimization involves removing save() and restore() function calls. Each save() call causes the browser to copy the whole state of the context onto the heap. Instead, why not implement a pre-allocated stack of state values, and change a single pointer each time save() or restore() is called?

Most web developers are familiar with the idea of a virtual DOM that mirrors the state of an HTML document. In the same way, canvas-as mirrors the state of the canvas context, and only performs operations that are absolutely necessary.

Ultimately, we can now avoid those pesky heap allocations altogether!

Take the following example.

ctx.save();
ctx.translate(x, y);
ctx.rotate(r);
ctx.drawImage(img, 0, 0);
ctx.restore();
ctx.fillRect(100, 100, 100, 100);

Combining the transforms and removing the save() and restore() calls results in the following operations.

ctx.setTransform(a, b, c, d, e, f);
ctx.drawImage(img, 0, 0);
ctx.setTransform(1, 0, 0, 1, 0, 0);
ctx.fillRect(100, 100, 100, 100);
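The virtual save() and restore() stack described above can be sketched as a flat, preallocated buffer with a depth pointer. This is a minimal TypeScript sketch that tracks only the transform; `VirtualStateStack`, `STATE_SIZE`, and `MAX_DEPTH` are hypothetical names, and the depth limit of 256 is an assumption made for the example.

```typescript
const STATE_SIZE = 6;  // a, b, c, d, e, f
const MAX_DEPTH = 256; // assumed limit for this sketch

class VirtualStateStack {
  // One flat, preallocated buffer: slot n holds the transform at depth n.
  private stack = new Float64Array(STATE_SIZE * MAX_DEPTH);
  private depth = 0;

  constructor() {
    this.stack.set([1, 0, 0, 1, 0, 0], 0); // identity matrix at depth 0
  }

  save(): void {
    if (this.depth + 1 >= MAX_DEPTH) throw new RangeError("state stack overflow");
    const from = this.depth * STATE_SIZE;
    // Copy the current slot into the next one; no new objects are created.
    this.stack.copyWithin(from + STATE_SIZE, from, from + STATE_SIZE);
    this.depth++;
  }

  restore(): void {
    // Restoring is just moving the pointer back; an empty stack is a no-op.
    if (this.depth > 0) this.depth--;
  }

  translate(x: number, y: number): void {
    const o = this.depth * STATE_SIZE;
    const s = this.stack;
    s[o + 4] += s[o] * x + s[o + 2] * y;
    s[o + 5] += s[o + 1] * x + s[o + 3] * y;
  }

  // The values a renderer would pass to setTransform before drawing.
  current(): number[] {
    const o = this.depth * STATE_SIZE;
    return Array.from(this.stack.subarray(o, o + STATE_SIZE));
  }
}
```

After save(), translate(10, 20), and restore(), current() reports the identity matrix again, which corresponds to the second setTransform() call in the listing above.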

One final problem with this solution involves the clip() function: it's impossible to restore a clipping region without real save() and restore() function calls. The canvas-as optimized renderer actually had to break specification to implement this feature.

ctx.save(true); // This is a hard save! Not a virtual one.
ctx.rect(100, 100, 100, 100);
ctx.clip();
// do some drawing
.
.
.
ctx.restore(); // implied hard restore

Now, with all these examples of optimizations out of the way, we can finally perform a stress test to see what the results of this experiment are.

Testing the API

This is the stress test that I used to measure canvas-as performance.

class Star {
  public x: f64 = Math.random() * 800.0;
  public y: f64 = Math.random() * 600.0;
}
var pi2: f64 = Math.PI * 2.0;
var stars: Star[] = new Array<Star>(0);
for (let i = 0; i < 1000; i++) stars.push(new Star());

function frame(): void {
  // turn the screen black
  ctx.fillStyle = "black";
  ctx.fillRect(0.0, 0.0, 800.0, 600.0);

  // draw each star
  var star: Star;
  for (let i = 0; i < 1000; i++) {
    star = stars[i];
    star.y += 1;
    if (star.y > 600.0) star.y -= 600;
    ctx.fillStyle = "white";
    ctx.save();
    ctx.translate(star.x, star.y);
    ctx.beginPath();
    ctx.arc(0.0, 0.0, 1.0, 0.0, pi2);
    ctx.fill();
    ctx.restore();
    // in AssemblyScript, uncomment the next line to batch calls to js
    // if (i % 50 == 0) ctx.commit();
  }
  // in AssemblyScript, uncomment the next line
  // ctx.commit();
}

The only thing to mention here is that Firefox seems to slow to a crawl when running this frame function within a requestAnimationFrame loop. It looks like it has something to do with the CanvasRenderingContext2D prototype, but I wasn't able to dig into it further to find the problem. Instead, all my measurements were averaged over about 20 seconds' worth of frame times each in Google Chrome and Opera.

Anyway, here is the data I collected using devtools in Opera and Chrome on this contrived stress test.

Garbage Collection Rate (less is better)
optimized-as: 2.4/s
as (without optimization): 3.6/s
js: 1.7/s

Total Heap Memory Usage Range (smaller range is better)
optimized-as: 4.5 MB to 5.4 MB (0.9 MB difference)
as (without optimization): 4.5 MB to 8.7 MB (4.2 MB difference)
js: 4.2 MB to 6.8 MB (2.6 MB difference)

Frame Rate (more is better)
optimized-as: 57.6 fps
as (without optimization): 52.4 fps
js: 56.55 fps

The results are bittersweet. The optimized version removed a large amount of garbage collection overhead, but it increased how often the browser asks for a collection (2.4/s versus 1.7/s for plain JavaScript). The heap usage range, however, shrank considerably (0.9 MB versus 2.6 MB), so the garbage collector had less memory to collect overall.

Unsurprisingly, CPU execution time became very consistent when running the WebAssembly versions, and the optimized version resulted in (barely) less CPU usage all around.

Conclusion

AssemblyScript barely manages to outperform pure JavaScript by implementing an algorithm it otherwise might not have been able to use during a requestAnimationFrame loop. Using a virtual stack clearly has benefits in a memory-managed environment, and results in more consistent code execution.

However, this implementation comes with its own set of drawbacks.

The first glaring issue that canvas-as doesn't address is determining where the proper API abstraction should occur. Is it fastest to abstract the drawing API, or is it faster to implement something else that simply uses the canvas? That is why canvas-as is more of a philosophical experiment than an actual concrete API.

It also has no unit tests whatsoever. I don't know if my library performs as intended, but this can be alleviated by starting from scratch and doing proper test-driven development. This will be done with jest, a proper canvas mock, and a lot of free time!

Lastly, it's very important to note that canvas-as is a work in progress, and the example I used was contrived to prove a point about an experimental web technology built on a language that hasn't fully matured yet.

Questions?

Feel free to leave questions below! I want to engage the community on my project to discuss how things can be done, and I look forward to meeting you all on dev.to.

Thank you for reading my article.

-Josh

Top comments (7)

Joe Pea • May 21 '19

What would you conclude from this? Is it worth the effort to use canvas-as? Or should we just stick to JS (and keep in mind those performance tricks you mentioned like not repeating fills, etc)?

jtenner • May 22 '19 • Edited

So far, what I've honestly seen is very little love for CanvasRenderingContext2D. Most of us who use it aren't making fully fledged 2d games. (some of us are, but I don't think they are majority)

One of the major takeaways was that minimizing draw calls, as you state, was the most important thing to optimize. It's not just draw calls either. Setting fillStyle and strokeStyle values call into the color parser. Preventing the need to parse colors can add up to a relatively big deal.

One thing is for certain. If you need Javascript, then it's probably important to just let v8 and Firefox compile your Javascript. I mean this sincerely and honestly as an AssemblyScript evangelist***.

In the case of, say, an engine, it might be more wise to use a compiled to web assembly solution. When assemblyscript gets reference counting and garbage collection, things are going to be very very different! But for now? I think regular js is a perfectly valid solution.

Testing is probably the only way to get the "best" solution. If you can minimize cpu usage and avoid calling into and out from wasm often, it might be worth it. Reduced garbage collection time is always worth looking into.

Edit: please check out github.com/as2d/as2d and use github.com/jtenner/as-pect for your testing if you decide to jump down the rabbit hole to see how far it goes.

Joe Pea • May 24 '19 • Edited

Nice projects!

I'm thinking to make an AS WebGL experiment.

So, it seems like the trade off isn't huge, at least for that use case (f.e. it isn't doing lots of number crunching, just organizing/optimizing calls to the canvas).

But, you mentioned that size of GC invocations was reduced, and the invocations happened more often. Does this at least reduce the "jank", the periodic pauses, so that the app performs more consistently over time, thus user experience can be improved? If so, maybe that alone is worth it. Thoughts on that?

jtenner • May 24 '19

Yes that was my exact conclusion. Web Assembly is only going to get better, and soon, when multivalue returns and reftypes are supported, the bridge between Javascript and web assembly will be very short to cross.

As for web assembly using webgl, the linked functions you use will require a large amount of Javascript glue, just like as2d. Please feel free to delve in and get your hands dirty. I am interested to see how far the rabbit hole goes.

Joe Pea • May 24 '19 • Edited

I fear how much time the rabbit hole will take to explore, but if the results are worth it...

I'm guessing having a scene graph 🌲 structure (thinking something like Three.js) and doing all the matrix updates in Wasm would show some gain.

Then on the JS side it would need to get the list of objects/commands to render from wasm, which could be in a typed array in some format.

jtenner • May 24 '19

If you decide to venture, I'll join you on your quest! Good luck.

Joe Pea • May 25 '19

Cool. I think I'm going to take Babylon.js, which is already in TypeScript, and start by porting a minimal set of features over in order to get cubes rendering on screen as a PoC.

Mostly I think it'll be converting numbers to i* types, and editing the renderer so that it connects with the JS glue to a canvas.
