(This is a technical counterpart to my post on using Wuffs to decode GIFs in the browser. If you'd just like to see fast GIF decoding, go here!)
Converting C/C++ to Web Assembly (WASM) isn't easy. The de-facto toolchain as of July 2018 is Emscripten, an amazing but challenging set of tools which does the work for us. When Emscripten runs on your code, it generates a few things:
- the compiled .wasm file itself, containing your code plus a runtime
- boilerplate .js alongside that
- (optional) a .html harness for running the program
For an average build, the .js file (which is what you include to run WASM) can be 100k or so. Aside from the obvious bloat, why is this a bad thing?
- it adds a singleton Module variable to your global scope, which is not suitable for libraries, only for monolithic apps (the docs even call this out)
- calling JS back from WASM needs EM_ASM macros, which call global (!) JS methods: poor for modularizing code, and it hides JS idioms like .bind(this)
- it performs its own window.fetch for the .wasm, which is not ideal in Node or if you want to contain everything in a single file
- and, somewhat subjectively, it's quite hard to read/parse/modify.
To Be Fair
The boilerplate Emscripten generates isn't unreasonable. The Web Assembly format is actually quite simple, so even the concepts of having a 'heap', or dealing with strings or complex data types, are something compilers need to provide themselves. There are no inbuilt concepts here, and Emscripten goes a long way to get you moving fast.
But everything it does could also be unwieldy. Here, I'm going to document it, so you can instantiate the .wasm yourself. But beware! This doesn't provide any of the EM_.. magic, like binding, exposing C++ objects in JavaScript, etc. This sugar is sometimes great, but it comes at a cost. And if you're wrapping simple C (or even C++, which you could wrap) then you don't need it.
First Steps
Let's build a simple C library that reads a PNG file, fills a passed struct with metadata, and calls JavaScript back via an extern method. This is a totally artificial demo which shows off some challenging tasks:
- passing arbitrary sized data into WASM, as well as using malloc and free
- dealing with C structs
- calling back to JS
- returning a string error description.
Save the C source to disk as png.c. To build, we can run emcc like this (note that we must include -O1, to remove some needless methods):
emcc -O1 -s WASM=1 -s EXPORTED_FUNCTIONS="['_parse_png','_malloc','_free']" png.c -o png.js
Great! We now have png.wasm and png.js. For me, the JS is 62k, and the WASM is 10k. Actually, setting -O3 will bring this down quite a lot, but let's keep going with this for now.
Default, Minimal Harness
If you include png.js in an HTML file and load it from a local web server, you'll get access to the global Module var, which has a bunch of properties, including _parse_png. But for the reasons above, we don't want to use this boilerplate: it's too prescriptive. So here's the actual minimal JS needed:
const memoryPages = 256;
const memory = new WebAssembly.Memory({initial: memoryPages, maximum: memoryPages});
const stackTop = 2672;  // WARNING: Different per-program.
const env = {
  DYNAMICTOP_PTR: stackTop,
  STACKTOP: stackTop + 16,
  STACK_MAX: 1024 * 1024 * 5,
  abort() { throw new Error('abort'); },
  abortOnCannotGrowMemory() { throw new Error('abortOnCannotGrowMemory'); },
  enlargeMemory() { throw new Error('enlargeMemory'); },
  getTotalMemory() { return memory.buffer.byteLength; },
  ___setErrNo(v) { throw new Error('errno'); },
  _emscripten_memcpy_big(dst, src, num) {
    // memory never grows here (initial == maximum), so a fresh view is always valid
    const view = new Uint8Array(memory.buffer);
    view.set(view.subarray(src, src + num), dst);
    return dst;
  },
  _chunk_callback() { /* our callback method */ },
  memory: memory,
};
// tell Emscripten's malloc where to start
(new Uint32Array(memory.buffer))[env.DYNAMICTOP_PTR >> 2] = env.STACK_MAX;
Promise.resolve(true).then(async () => {
  // use async so we can await for Promises to finish
  const buffer = await (await self.fetch('png.wasm')).arrayBuffer();
  const module = await WebAssembly.instantiate(buffer, {env});
  // save exports on window so you can debug them
  const exports = module.instance.exports;
  console.info('got exports', exports);
  return window._exports = exports;
  // ... but if you were writing a library, you'd continue using the Promise
}).catch((err) => console.error('oh no!', err));
Great. So, let's start with the environment being passed to Web Assembly. It's not as simple as in some "pure" WASM examples, because Emscripten's runtime expects a lot from us. The methods on the object are pretty self-explanatory:
- abort, abortOnCannotGrowMemory, ___setErrNo are to deal with failures
- enlargeMemory isn't implemented: when our code runs out of memory, it will crash
- getTotalMemory does what it says on the tin
- and _emscripten_memcpy_big implements the C method memcpy().
Your code might need more, depending on whether it calls other C methods. In many cases, like you see here, we can just throw an Error: they're often unexpected conditions that we don't really have to deal with.
But what about the properties above that, those ones with the magic numbers? They all revolve around memory, and if you don't have them correct, your WASM will probably fail. Let's learn about them.
Memory in Web Assembly
(Figure: Web Assembly has an inbuilt stack, at top; we can also provide it memory, at bottom.)
Web Assembly has two types of memory. First, its inbuilt stack, used for local variables in methods. This isn't accessible by your JS. Secondly, the WebAssembly.Memory object, which most of these magic numbers refer to, and which can be used by the runtime (in this case Emscripten) in literally any way it wants.
memoryPages: Emscripten's fixed memory
In our boilerplate code, this variable is used to create a WebAssembly.Memory of this many 'pages'. By default, with no flags, Emscripten requests 16mb. Each page is 64k (65,536 bytes), so that's 256 pages. (You can request more with e.g., -s TOTAL_MEMORY=32mb)
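To make the page arithmetic concrete, here's a minimal sketch that converts a byte count into pages. The names WASM_PAGE_SIZE and pagesForBytes are my own, made up for illustration; they're not something Emscripten gives you.

```javascript
// One WebAssembly page is always 64k (65,536 bytes).
const WASM_PAGE_SIZE = 64 * 1024;

// Hypothetical helper: how many whole pages are needed to hold `bytes`.
function pagesForBytes(bytes) {
  return Math.ceil(bytes / WASM_PAGE_SIZE);
}

// Emscripten's default of 16mb works out to the 256 pages used in our boilerplate.
const defaultPages = pagesForBytes(16 * 1024 * 1024);
const memoryForDefault = new WebAssembly.Memory({initial: defaultPages, maximum: defaultPages});
```

If you build with a different TOTAL_MEMORY, the same helper should give you the number to pass as both initial and maximum.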
But it turns out our WASM file actually knows what it requires. You can load your WASM file in wasm2wat: look for a line like the following.
(import "env" "memory" (memory $env.memory 256 256))
Why this value is in the WASM file, yet I also need to specify it, I'll never know. Although if you know, please email me.
STACKTOP: Where the "stack" begins
This variable is a misnomer. As I mentioned above, Web Assembly has its own internal stack for its own calls, one that is not exposed in any memory we pass into the object. It has an intentionally ambiguous size, and it's only used for variables local to functions, and even then, only when they're one of the four primitive Web Assembly types (i32, i64, f32, f64).
So, what is the "stack" (quotes intentional), then? Well, it's generated by Emscripten, which uses it for allocating larger types (e.g. structs, char x[10] = "I'm long\n";). Emscripten also adds stack-related methods to its exports (e.g. exports.stackAlloc). How does it use this? Well, a decompiled method looks a bit like:
(func $_test_stack (export "_test_stack") (type $t5) (param $p0 i32) (param $p1 i32) (result i32)
  (local $l0 i32) (local $l1 i32) (local $l2 i32) (local $l3 i32)
  (set_local $l1          ;; save stackTop
    (get_global $g4))
  (set_global $g4         ;; modify stackTop while in this function
    (i32.add
      (get_global $g4)
      (i32.const 32)))    ;; we want 32 bytes of stack
  ;;
  ;; ... rest of function removed
  ;;
  (set_global $g4         ;; restore old stackTop
    (get_local $l1))
  (get_local $p0))        ;; return value
The stackTop variable is stored in $g4, so we store it for ourselves into $l1, modify it to add 32 bytes, and then continue. We restore it at the end. During the method, we now have free rein over these 32 bytes.
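If WAT isn't your thing, the same save/bump/restore pattern can be sketched in plain JS. This is only an emulation to show the idea, assuming a "stack" that grows upwards like in the decompiled method above. The stackTop variable here is a local stand-in for $g4, and withStackFrame is a made-up name.

```javascript
// Stand-in for the $g4 global: 2672 + 16, as in our boilerplate.
let stackTop = 2672 + 16;

// Hypothetical helper: reserve `size` bytes of "stack" while fn runs.
function withStackFrame(size, fn) {
  const saved = stackTop;  // save stackTop
  stackTop += size;        // modify stackTop while in this function
  try {
    // fn receives the address of its frame: [saved, saved + size) is its to use
    return fn(saved);
  } finally {
    stackTop = saved;      // restore old stackTop
  }
}

// Nested calls get adjacent frames, and everything unwinds afterwards.
const frames = withStackFrame(32, (outer) => withStackFrame(32, (inner) => [outer, inner]));
console.info('frames at', frames, 'stackTop is back to', stackTop);
```

Nothing here checks an upper limit, which mirrors what the generated code does by default.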
But why 2672 (plus 16)?
The value of stackTop in my example, 2672, is the value that was built into Emscripten's JS harness (you can find it by searching for STATICTOP = STATIC_BASE +, where STATIC_BASE is 1024, so just add the numbers together).
(We add 16 to 2672, but ignore this for now. I'll explain below under DYNAMICTOP_PTR.)
This is a number that's bigger than the constants in the Web Assembly program (a), plus room for Emscripten's malloc() implementation to work (b), storing fixed information seemingly about what's currently been allocated.
a. Your program will have a fixed size of constants, and you can discover this by looking at the wasm2wat output again: at the very bottom of the file, we see our data section. Look for the last line (although in our example, we only have one).
(data (i32.const 1024) "\89PNG\0d\0a\1a\0ainvalid header\00chunk too large\00IHDR\00IHDR has wrong size\00couldn't malloc for copy\00header not found"))
This says that at 1024, we have a little over 100 bytes of data. (The last string isn't explicitly null-terminated, because all memory starts off zeroed.)
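If you'd rather not count by hand, here's a rough sketch that measures a data string copied out of the wasm2wat output. It assumes the string only contains ASCII plus \hh hex escapes (true for the line above, not for WAT in general), and watDataByteLength is a name I've made up.

```javascript
// Hypothetical helper: byte length of a WAT string literal.
// Assumes the literal only contains ASCII plus \hh hex escapes.
function watDataByteLength(literal) {
  // every \hh escape is a single byte, so collapse it to a single character
  return literal.replace(/\\[0-9a-fA-F]{2}/g, '?').length;
}

// String.raw keeps the backslashes exactly as they appear in the .wat file.
const dataOffset = 1024;
const dataLiteral = String.raw`\89PNG\0d\0a\1a\0ainvalid header\00chunk too large\00IHDR\00IHDR has wrong size\00couldn't malloc for copy\00header not found`;
const dataLength = watDataByteLength(dataLiteral);
const dataEnd = dataOffset + dataLength;
console.info('constants occupy', dataOffset, 'up to', dataEnd);
```

For our demo this puts the end of the constants well below 2672, and the gap is what malloc() gets for its bookkeeping.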
b. How much space does malloc() require? This is unclear, and seemingly an implementation detail of Emscripten. The values it uses are actually inlined into the generated WASM code, so there's no way to configure it.
Unfortunately, the best way to find out our magic constant is to look into the generated .js, as above. If you want to wing it though, I'd suggest looking at your constants, and maybe adding ~10k of buffer just to be safe.
STACK_MAX: Where the "stack" ends
This tells Emscripten where to stop allocating larger types on the "stack". It's a fixed number controlled by a compile-time flag, -s TOTAL_STACK=....
However, this isn't actually checked unless you enable assertions, passing -s ASSERTIONS=1 to your compile. Without this, you can blow the stack and allocate into the heap.
DYNAMICTOP_PTR: Where the heap lives
Emscripten's implementation of malloc will begin its allocation at the value pointed to by DYNAMICTOP_PTR. If you look at our boilerplate, we set a value:
// tell Emscripten's malloc where to start
(new Uint32Array(memory.buffer))[env.DYNAMICTOP_PTR >> 2] = env.STACK_MAX;
This sets, at DYNAMICTOP_PTR (we divide it by four, as we're indexing 32-bit ints), the ending point of Emscripten's "stack". Everything after here will be used for malloc.
Emscripten puts this value in its generated .js code, at the top of the stack. If you were wondering why we set DYNAMICTOP_PTR to stackTop, but then STACKTOP to stackTop + 16, this is why: we just give it 16 bytes to play with (which is obviously more than the four it needs for a 32-bit number).
Notably, this value is read at instantiation time and used as the 'bottom' of the heap. So you must set its value before you call WebAssembly.instantiate. Emscripten's runtime will set the value later, so you can examine the value at DYNAMICTOP_PTR to see how much the heap has grown (aka how much your program has malloced).
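Examining that value might look something like the sketch below. The helper names are my own, and the demo runs against a stand-alone WebAssembly.Memory with made-up writes, rather than a real instantiated module. As far as I can tell, the number tracks how far the heap has been extended, so treat it as an upper bound on live allocations and not an exact count.

```javascript
// Hypothetical helper: read the current top of the heap out of memory.
function readDynamicTop(memory, dynamicTopPtr) {
  // re-create the view on every call, in case memory.buffer has been replaced
  return new Uint32Array(memory.buffer)[dynamicTopPtr >> 2];
}

// Hypothetical helper: how far the heap has grown since instantiation,
// assuming it started at heapBase (STACK_MAX in our boilerplate).
function heapBytesUsed(memory, dynamicTopPtr, heapBase) {
  return readDynamicTop(memory, dynamicTopPtr) - heapBase;
}

// Demo with a stand-alone memory: no .wasm is involved here.
const demoMemory = new WebAssembly.Memory({initial: 1, maximum: 1});
const demoTopPtr = 2672;
const demoHeapBase = 1024 * 1024 * 5;
// this is the write we do before WebAssembly.instantiate
new Uint32Array(demoMemory.buffer)[demoTopPtr >> 2] = demoHeapBase;
const usedBefore = heapBytesUsed(demoMemory, demoTopPtr, demoHeapBase);
// pretend the runtime has since extended the heap by 4k
new Uint32Array(demoMemory.buffer)[demoTopPtr >> 2] = demoHeapBase + 4096;
const usedAfter = heapBytesUsed(demoMemory, demoTopPtr, demoHeapBase);
console.info('heap grew from', usedBefore, 'to', usedAfter, 'bytes');
```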
Function Table
Some builds require you to pass what's known as a table import. There are some great posts about what this is, but effectively it's a safe way to specify function pointers, in a way that wouldn't be possible if they shared the WebAssembly.Memory object.
Unfortunately, if any of the code you use or import uses function pointers, even just as an implementation detail, you'll need to provide a table (one quick way to demonstrate this is to add a call to sprintf()).
If so, update the env to add something like:
const env = {
  // ... rest
  table: new WebAssembly.Table({initial: 2, maximum: 2, element: 'anyfunc'}),
  tableBase: 0,
}
You'll need to modify this until WASM is happy. Again, this information is in the compiled WASM format, so it's not clear why we need to specify it, but here we are!
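One way to take some of the guesswork out is to ask the browser what a module wants to import, via WebAssembly.Module.imports(). It won't tell you the sizes, but it does tell you whether a table (or anything else) is expected at all. In this sketch, demoBytes is a tiny hand-assembled module standing in for png.wasm, and importsByKind is a made-up helper.

```javascript
// A tiny hand-assembled module that does nothing except import
// env.memory (1 page) and env.table (2 elements). It stands in for png.wasm.
const demoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,  // "\0asm", version 1
  0x02, 0x1f, 0x02,                                // import section, 31 bytes, 2 imports
  0x03, 0x65, 0x6e, 0x76,                          // "env"
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79,        // "memory"
  0x02, 0x01, 0x01, 0x01,                          // kind: memory, min 1, max 1
  0x03, 0x65, 0x6e, 0x76,                          // "env"
  0x05, 0x74, 0x61, 0x62, 0x6c, 0x65,              // "table"
  0x01, 0x70, 0x01, 0x02, 0x02,                    // kind: table of anyfunc, min 2, max 2
]);

// Hypothetical helper: group the names of a module's imports by kind.
function importsByKind(bytes) {
  const out = {};
  const compiled = new WebAssembly.Module(bytes);
  for (const {module, name, kind} of WebAssembly.Module.imports(compiled)) {
    (out[kind] = out[kind] || []).push(`${module}.${name}`);
  }
  return out;
}

const demoImports = importsByKind(demoBytes);
console.info(demoImports);
```

For a real file you'd more likely use the async WebAssembly.compile(), as some browsers limit how large a module can be compiled synchronously on the main thread.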
We've digressed enormously, to explain the requirements of setting up Web Assembly with Emscripten. But what about actually using our method? Well, let's now do it.
That code we compiled has two interesting methods on its exports: _parse_png, and _malloc. Let's actually use them in JS. We can add an extra .then to our setup:
.then((exports) => {
  const b64PNG = 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAACklEQVR4nGMAAQAABQABDQottAAAAABJRU5ErkJggg==';
  const buf = Uint8Array.from(atob(b64PNG), (c) => c.charCodeAt(0));
  const at = exports._malloc(buf.length);
  const view = new Uint8ClampedArray(memory.buffer);
  view.set(buf, at);
  const infoAt = exports._malloc(4 * 3);  // alloc space for info_t, 3x4-byte int
  const outStringAt = exports._parse_png(at, buf.length, infoAt);
  console.info('got output:', readString(outStringAt));
})
And update our chunk_callback method from the environment:
_chunk_callback(type, len, dataAt) {
  console.info('chunk', type, len, dataAt);
},
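The snippet above never calls free, even though we exported it. Here's one possible way to pair the two up: a sketch where withBuffer is a made-up helper, and the fakeExports object is a stand-in so this runs without png.wasm (the real _malloc and _free come from the module's exports).

```javascript
// Hypothetical helper: copy `bytes` into WASM memory, run fn(ptr, length),
// and always free the allocation afterwards.
function withBuffer(exports, memory, bytes, fn) {
  const at = exports._malloc(bytes.length);
  if (!at) {
    throw new Error('malloc failed');
  }
  try {
    new Uint8Array(memory.buffer).set(bytes, at);
    return fn(at, bytes.length);
  } finally {
    exports._free(at);
  }
}

// Stand-in exports: a bump "allocator" that only records what was freed.
const fakeMemory = new WebAssembly.Memory({initial: 1, maximum: 1});
const freed = [];
let nextFree = 1024;
const fakeExports = {
  _malloc(size) { const at = nextFree; nextFree += size; return at; },
  _free(at) { freed.push(at); },
};

// Copy in the first four bytes of a PNG header and read the first one back.
const pngMagic = Uint8Array.of(0x89, 0x50, 0x4e, 0x47);
const firstByte = withBuffer(fakeExports, fakeMemory, pngMagic,
    (at, len) => new Uint8Array(fakeMemory.buffer)[at]);
```

With the real exports, the callback would be where you call exports._parse_png(at, len, infoAt). Because the free happens in a finally block, a thrown Error (say, from abort) doesn't leak the buffer.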
Strings
The _parse_png method returns the address of a char * from the consts section of the WASM program if an error occurs. We need a readString method to pull that out (finding the first NULL byte and converting it to a JS string) using TextDecoder:
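function readString(start) {
  // subarray gives us a view, so we don't copy all of memory just to read a string
  const view = (new Uint8Array(memory.buffer)).subarray(start);
  let end = 0;
  while (view[end]) { ++end; }  // stop at the NULL byte, without including it
  return (new TextDecoder()).decode(view.subarray(0, end));
}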
If we try it out but muck with the base64-encoded image, we'll now see an error logged, containing a string.
Of course, if we do it right, you'll see the chunks in the image logged instead (and no string, as we return NULL).
Structs
In the sample code above, we allocate space for the info_t struct defined in the program. This is probably the most awkward part of not using Emscripten's generated JS code. Because our struct has three int values, we know that it takes up 12 bytes of memory, so we can malloc that space:
const infoAt = exports._malloc(4 * 3); // alloc space for info_t, 3x4-byte int
To read our individual values, we need to load them using a view:
const infoAt = exports._malloc(4 * 3);  // alloc space for info_t, 3x4-byte int
const outStringAt = exports._parse_png(at, buf.length, infoAt);
console.info('got output:', readString(outStringAt));
// .. add this code
// subarray takes (begin, end) as indexes into the 32-bit view, so end is begin + 3
const infoView = (new Uint32Array(memory.buffer)).subarray(infoAt >> 2, (infoAt >> 2) + 3);
console.info('count', infoView[0]);   // contains 'chunks'
console.info('width', infoView[1]);   // contains 'w'
console.info('height', infoView[2]);  // contains 'h'
The three values are in order as defined in the C struct. And any native types (int32 or int64) will be aligned on 4 or 8 byte boundaries. Although as an aside, using int64 with JavaScript is a challenge: the default JS number type can't accurately represent it (as it's a 64-bit floating point number). Try not to pass these across the JS/WASM boundary.
Information about structs is the sort of thing that Emscripten can generate for us at compile-time: helpers that literally convert memory to a nice, friendly JS structure containing our three values with names. But for most libraries, you're going to need this only a few times, so it's not infeasible to write manually.
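Written by hand, such a helper could be as small as the sketch below. readInfo is a made-up name, and the field names come from the comments in the snippet above (chunks, w, h). I'm assuming all three are plain 4-byte ints with no padding between them. The demo fakes what _parse_png would write, using a stand-alone memory and made-up values.

```javascript
// Hypothetical helper: read info_t (three 4-byte ints) into a plain object.
function readInfo(memory, infoAt) {
  // WASM memory is little-endian, hence the `true` flag on each read
  const view = new DataView(memory.buffer, infoAt, 4 * 3);
  return {
    chunks: view.getInt32(0, true),
    w: view.getInt32(4, true),
    h: view.getInt32(8, true),
  };
}

// Demo against a stand-alone memory, faking what _parse_png would write.
const structMemory = new WebAssembly.Memory({initial: 1, maximum: 1});
const fakeInfoAt = 64;
const writer = new DataView(structMemory.buffer);
writer.setInt32(fakeInfoAt + 0, 3, true);  // chunks
writer.setInt32(fakeInfoAt + 4, 1, true);  // w
writer.setInt32(fakeInfoAt + 8, 1, true);  // h
const info = readInfo(structMemory, fakeInfoAt);
console.info('info', info);
```

DataView has the small advantage that the address doesn't need to be shifted or aligned, at the cost of spelling out each offset.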
Further Thoughts
This post came out of building fastgif as a small, fast library for decoding GIFs. I wanted to use Emscripten, but its generated code was too bloated and didn't work well to provide a compartmentalized ES6 module.
Fixed WASM
The work you do to hand-write your JS around a compiled bit of WASM is 'fixed'. The WASM has no external dependencies: Emscripten is something you use to build, and the code it generates doesn't change over time.
In fastgif, we have a hand-written env object that we pass to WebAssembly.instantiate, just like the one we detailed above. Once it works, though, it doesn't need to be changed, because WASM is so limited and the core language doesn't change over time.
Promise-based
It's worth calling out that any library that uses Web Assembly must be based on a Promise or some async work, because fundamentally WebAssembly.instantiate returns a Promise.
Fastgif solves this by making the API itself Promise-based, depending first on the instantiation before doing any further work. By accepting that all our APIs are async, it could also allow us to work more freely with other, dependent APIs.
decode(buffer) {
  return this._exports.then((exports) => {
    const buf = new Uint8Array(buffer);
    const at = exports._malloc(buf.length);
    // more stuff
    return result;
  });
}
Size + Shipping
By dropping the JS generated by Emscripten, we can reduce the size of code you ship to clients. That's been one of the main themes of this post. Fastgif also ships with just one JS file, by encoding the WASM code itself in base64 and decoding it at creation time.
Without the WASM code itself, the JS wrapper for fastgif is about 4.3k (uncompressed). With the WASM code bundled as base64, it's ~44k (uncompressed) or ~20k (compressed), and if we encoded in say, base128 or base192, the size could be even smaller.
This is a bit smaller than the default Emscripten output, which totals ~70k for our demo program, above. And our approach of including the code in the JS, even though base64 adds overhead, means that we only have to do one network request to fetch our library, and bundlers can be happier as they're not trying to include this random related .fetch-ed resource.
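Decoding at creation time doesn't take much. This is a sketch of the general idea and not fastgif's actual code: decodeBase64 and instantiateFromBase64 are made-up names, and the base64 string here is just the smallest valid module (the 8-byte header), standing in for real compiled code.

```javascript
// Hypothetical helper: turn a base64 string back into bytes.
// atob is a global in browsers and in recent versions of Node.
function decodeBase64(b64) {
  return Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
}

// Hypothetical helper: instantiate WASM that was shipped inside the JS file.
function instantiateFromBase64(b64, imports) {
  return WebAssembly.instantiate(decodeBase64(b64), imports)
      .then((result) => result.instance);
}

// The smallest valid module (just the header) stands in for real code.
const EMPTY_WASM_BASE64 = 'AGFzbQEAAAA=';
const ready = instantiateFromBase64(EMPTY_WASM_BASE64, {env: {}});
ready.then((instance) => console.info('exports', Object.keys(instance.exports)));
```

In a real library, the second argument would be the hand-written env from earlier, and ready is the Promise that the rest of the API waits on.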
Fin
If you've got this far, I'm proud of you! This has been a very long post about some of Emscripten's internals as of July 2018.
Why did I write this post? Emscripten is a wonderful tool, but it has a long history (for asm.js), and isn't perfect. I think it errs too much on the side of "magic", and many posts rave about how it's so easy to EM_ASM_ or use binding-fu, but this all comes at a cost, and can introduce huge amounts of inadvertent overhead: think copying huge memory buffers around because we're trying to make them immutable or easily exposed.
Every language that is being compiled to Web Assembly needs a runtime, whether it be Go, or Rust, or C/C++ as we have here. I don't believe that we'll ever really be able to directly import Web Assembly via ES2015 modules, at least not without changes on the JS side. But it behooves us to write the smallest one we possibly can.
Top comments (4)
Wonderful article, though I have a quick question.
Why do you think this? Personally I think it should have been worked on immediately after the MVP, and I think this is a really important piece of functionality missing from WASM.
I don't think it's unsolvable but as I mentioned, not without changes on the JS side. I suppose the key thing I wanted to say is "directly": the challenge is that the current import statement just isn't expressive enough, because in 99% of cases we need to provide an environment to the .wasm code itself.
And it's challenging because some extension like this makes no sense:
I've invented with to provide the language support here, but regardless: with ES2015 modules, import is evaluated before all code, so env is just undefined on the import line.
Very basic WASM could be supported, because it doesn't need imports, but as soon as you need to provide memory, or any of the calls or variables I described above, all bets are off.
There might also be some solution I've not thought of yet :)
Hi Sam!
Really great article, and really long.
Thanks for that!
Now I see that wasm isn't a silver bullet.
But anyway looks promising.
Are there already other runtimes?
I'm not sure of any other runtimes for compiling C/C++ to WASM, no, sorry. The complaints I have are repeated on the bug tracker (possibly in more places too).
So I suspect there might be some community projects because of the issues I'm raising. Let me know if you find any!
(Of course, there are also runtimes/compilers for Go, Rust, etc. I've not researched them enormously. I believe Go is much nicer to work with because its JS boilerplate is just a common bit of code you include, it's not built per-binary, which means it's much, much easier to understand and work with.)