WebAssembly Module
The simplest WebAssembly module is 00 61 73 6d 01 00 00 00.
The first four bytes, 00 61 73 6d, are the magic header, which spells \0asm in ASCII and identifies the file as a WebAssembly binary. The asm part echoes asm.js, the predecessor of WebAssembly.
The next four bytes, 01 00 00 00, are the version, stored as a little-endian 32-bit integer. WebAssembly is currently at version 1.
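The header check described above can be sketched in a few lines of Python. This is only an illustration: the function name parse_header and the variable simplest_module are made up for this example and are not part of any WebAssembly tooling.

```python
import struct

WASM_MAGIC = b"\x00asm"  # the bytes 00 61 73 6d


def parse_header(module_bytes: bytes) -> int:
    """Return the version number if the bytes start with a valid header."""
    if len(module_bytes) < 8:
        raise ValueError("a module needs at least 8 header bytes")
    if module_bytes[:4] != WASM_MAGIC:
        raise ValueError("missing \\0asm magic header")
    # The version is a little-endian unsigned 32-bit integer.
    (version,) = struct.unpack("<I", module_bytes[4:8])
    return version


# Hypothetical name for the eight-byte module discussed above.
simplest_module = bytes.fromhex("0061736d01000000")
print(parse_header(simplest_module))  # prints 1
```

Anything that does not begin with the magic bytes is rejected before the version is even read.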
Every WebAssembly module starts with this mandatory header, which is followed by these sections:
- Type
- Function
- Code
- Start
- Table
- Memory
- Global
- Import
- Export
- Data
All of the sections above are optional; only the magic header and the version are required.
Upon receiving the WebAssembly module, the JavaScript engine decodes and validates it.
The validated module is then compiled and instantiated. During the instantiation phase, the JavaScript engine produces an instance. The instance is a record that holds all the accessible state of the module, organised as a tuple of sections and their contents.
How a WebAssembly module is constructed
The WebAssembly module is split into sections. Each section contains a sequence of instructions or statements.
Header Information (MagicHeader Version)
- function [function definitions]
- import [import functions]
- export [export functions]
Each section has a unique ID. The WebAssembly module uses this ID to identify the respective section.
Header Information (MagicHeader Version)
- (function section id) [function definitions]
- (import section id) [import functions]
- (export section id) [export functions]
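As a rough illustration of the ID-to-section mapping, the table below lists the section IDs defined by version 1 of the binary format. Only 0x01, 0x03, 0x0A and 8 are named in this article; the remaining IDs come from the specification, and the helper section_name is a hypothetical name used only for this sketch.

```python
# Section IDs from version 1 of the WebAssembly binary format.
SECTION_IDS = {
    0: "custom",
    1: "type",
    2: "import",
    3: "function",
    4: "table",
    5: "memory",
    6: "global",
    7: "export",
    8: "start",
    9: "element",
    10: "code",
    11: "data",
}


def section_name(section_id: int) -> str:
    """Map a section ID byte to a readable name (hypothetical helper)."""
    return SECTION_IDS.get(section_id, "unknown")


print(section_name(0x03))  # function
print(section_name(0x0A))  # code
```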
For example, the function section consists of a list of function definitions.
Header Information
- function [add, subtract, multiply, divide]
Inside the module, a function is called by its list index. To call the add function, the module refers to the function at index 0 of the function section.
Section format
A WebAssembly module contains a set of sections. In the binary format, each section has the following structure:
<section id> <u32 section size> <Actual content of the section>
The first byte of every section is its unique section id. It is followed by an unsigned 32-bit integer (LEB128-encoded in the binary format) that gives the section's size in bytes. Since it is a u32 integer, the maximum size of any section is 2^32 - 1 bytes, approximately 4 GiB.
The remaining bytes are the content of the section. For most sections, the <Actual content of the section> is a vector.
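The section layout can be walked with a short loop. The following is a minimal sketch, assuming the u32 size is stored in unsigned LEB128 as in the binary format; the names read_u32_leb128 and iter_sections, and the sample bytes, are invented for this illustration.

```python
def read_u32_leb128(data: bytes, offset: int):
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # high bit clear: this was the last byte
            return result, offset
        shift += 7


def iter_sections(module_bytes: bytes):
    """Yield (section_id, size, content) for every section in a module."""
    offset = 8  # skip the magic header and the version
    while offset < len(module_bytes):
        section_id = module_bytes[offset]
        size, offset = read_u32_leb128(module_bytes, offset + 1)
        content = module_bytes[offset:offset + size]
        yield section_id, size, content
        offset += size


# Hypothetical sample: header, then a type, a function and a code section.
sample = bytes.fromhex(
    "0061736d01000000"  # magic header + version
    "010401600000"      # type section: one function type, no params/results
    "03020100"          # function section: one function using type 0
    "0a040102000b"      # code section: one empty body
)
for section_id, size, content in iter_sections(sample):
    print(section_id, size, content.hex(" "))
```

Because every section carries its own size, a reader can skip sections it does not understand without decoding their content.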
Function
The function section holds a list of functions. It has the following format:
0x03 <section size> vector<function>[]
The unique section id of the function section is 0x03. It is followed by a u32 integer that denotes the size of the function section. vector<function>[] holds the list of functions.
Instead of using function names, the WebAssembly module uses the index of a function to call it. This optimises the binary size.
Every function in the vector<function> is defined as follows:
<type signature> <locals> <body>
The <type signature> specifies the function signature, i.e., the types of the parameters and of the return value.
WebAssembly is size optimised. All the type signatures used in the module are defined in the type section, described below. The function stores only an index into the type section here.
The <locals> is a vector of values that are scoped inside the function. The locals are appended to the parameters that we pass to the function.
The <body> is a list of expressions. When evaluated, the expressions should result in the function's return type.
Note that the expressions here are not always pure. The globals of a WebAssembly module can be mutable, and the shared memory is mutable too.
To call a function, use call <function index> (represented by the opcode 0x10). The arguments are type validated against the type signature. The arguments of the function are then concatenated with the locals, whose types are declared in the function.
The result of the function body is then validated against the result type defined in the type section.
The spec specifies that the locals and body fields are encoded separately, in the code section. The function section holds only the type indices, and the code entry at a given position belongs to the function at the same position.
The order of the type and function sections matters. While hacking on the raw bytecode, take care to preserve this order. See the code section below.
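Since the locals and bodies live in the code section, the content of the function section in the binary boils down to a vector of type indices. Here is a small sketch of decoding it, assuming LEB128-encoded integers; decode_function_section and the sample bytes are hypothetical.

```python
def read_u32_leb128(data: bytes, offset: int):
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, offset
        shift += 7


def decode_function_section(content: bytes) -> list:
    """Return the type index of every function declared in the section."""
    count, offset = read_u32_leb128(content, 0)  # vector length comes first
    type_indices = []
    for _ in range(count):
        type_index, offset = read_u32_leb128(content, offset)
        type_indices.append(type_index)
    return type_indices


# Hypothetical section content: three functions; the first two share type 0.
content = bytes([0x03, 0x00, 0x00, 0x01])
for func_index, type_index in enumerate(decode_function_section(content)):
    print(f"function {func_index} uses type {type_index}")
```

The position of an entry in this vector is the index that the rest of the module uses to refer to the function.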
Type
A WebAssembly module with one or more functions starts with a type section.
Everything is strictly typed in WebAssembly. Every function must have a type signature attached to it.
To keep the binary small, the WebAssembly module stores a vector of type signatures, and the function section refers to them by index.
The type section is of the following format:
0x01 <section size> vector<type>[]
The unique section id of the type section is 0x01. It is followed by the section size and by vector<type>[], which holds the list of types.
Every type in the vector<type> is defined as follows:
0x60 [vec-for-parameter-type] [vec-for-return-type]
The 0x60 byte marks the entry as a function type. It is followed by the vector of parameter types and the vector of return types.
The binary format also defines encodings for value, result, memory, table, and global types, which are used in other sections of the module.
A value type is one of f64, f32, i64, i32, that is, the number types. Inside the WebAssembly module, they are represented by 0x7C, 0x7D, 0x7E, and 0x7F respectively.
Note: The type information might change in the future when WebAssembly starts to support other types.
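To make the type encoding concrete, here is a hedged sketch that encodes the signature (i32, i32) -> i32 and wraps it in a type section. It assumes vectors are length-prefixed and integers are LEB128-encoded, as in the binary format; every function name below is made up for this example.

```python
VALUE_TYPES = {"f64": 0x7C, "f32": 0x7D, "i64": 0x7E, "i32": 0x7F}
FUNC_TYPE_TAG = 0x60
TYPE_SECTION_ID = 0x01


def encode_u32_leb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_vector(items) -> bytes:
    """A vector is its element count followed by the encoded elements."""
    items = list(items)
    return encode_u32_leb128(len(items)) + b"".join(items)


def encode_func_type(params, results) -> bytes:
    """0x60, then the parameter type vector, then the return type vector."""
    return (
        bytes([FUNC_TYPE_TAG])
        + encode_vector(bytes([VALUE_TYPES[p]]) for p in params)
        + encode_vector(bytes([VALUE_TYPES[r]]) for r in results)
    )


def encode_type_section(func_types) -> bytes:
    content = encode_vector(func_types)
    return bytes([TYPE_SECTION_ID]) + encode_u32_leb128(len(content)) + content


add_type = encode_func_type(["i32", "i32"], ["i32"])
print(add_type.hex(" "))                         # 60 02 7f 7f 01 7f
print(encode_type_section([add_type]).hex(" "))  # 01 07 01 60 02 7f 7f 01 7f
```

Two functions with the same signature would both point at index 0 of this vector, which is where the size saving comes from.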
Code
The code section holds a list of code entries. Each code entry is a pair of the value types of the locals and Vector<expressions>[].
The code section has the following format:
0x0A <section size> Vector<code>[]
Every code in the Vector<code> is defined as follows:
<code size> <actual code>
The <actual code> has the following format:
vector<locals>[] <expressions>
The vector<locals>[] declares the locals scoped inside the function. At run time they are indexed after the function's parameters, so the two together form one list. The <expressions> evaluates to the return type.
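A code entry for an add function can be sketched as follows. This assumes the encoding from the binary format, where the locals vector is stored as (count, value type) groups and the body ends with the end opcode. The opcodes 0x20 (local.get), 0x6A (i32.add) and 0x0B (end) come from the specification; the helper names are hypothetical.

```python
CODE_SECTION_ID = 0x0A
I32 = 0x7F
LOCAL_GET, I32_ADD, END = 0x20, 0x6A, 0x0B


def encode_u32_leb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_code_entry(local_groups, body: bytes) -> bytes:
    """Encode <code size> <locals> <expressions>.

    local_groups is a list of (count, value_type) pairs.
    """
    locals_vec = encode_u32_leb128(len(local_groups))
    for count, value_type in local_groups:
        locals_vec += encode_u32_leb128(count) + bytes([value_type])
    code = locals_vec + body
    return encode_u32_leb128(len(code)) + code


def encode_code_section(entries) -> bytes:
    content = encode_u32_leb128(len(entries)) + b"".join(entries)
    return bytes([CODE_SECTION_ID]) + encode_u32_leb128(len(content)) + content


# Body of add: push parameter 0, push parameter 1, add them, end.
add_body = bytes([LOCAL_GET, 0, LOCAL_GET, 1, I32_ADD, END])
add_entry = encode_code_entry([], add_body)
print(encode_code_section([add_entry]).hex(" "))
# 0a 09 01 07 00 20 00 20 01 6a 0b
```

The parameters are read with local.get 0 and local.get 1 even though the locals vector is empty, because parameters occupy the first local indices.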
Start
The start section points to a function that is called as soon as the WebAssembly module is instantiated.
The start function is similar to other functions, except that it takes no parameters and returns no value. It runs during instantiation, after the tables and memories have been initialised, so the module's exports are not yet reachable from outside while it executes.
The start section of a WebAssembly module holds a function index (the index of the function inside the function section).
The section id of the start section is 8. When decoded, the start section represents the start component of the module.
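Because the start section carries nothing but a function index, its encoding is tiny. A minimal sketch, assuming the usual <section id> <size> <content> layout with LEB128 integers; encode_start_section is a hypothetical helper name.

```python
START_SECTION_ID = 8


def encode_u32_leb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_start_section(function_index: int) -> bytes:
    """Encode a start section that points at the given function index."""
    content = encode_u32_leb128(function_index)
    return bytes([START_SECTION_ID]) + encode_u32_leb128(len(content)) + content


print(encode_start_section(0).hex(" "))  # 08 01 00
```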
At the moment, Webpack does not support the start section. The start section is rewritten into a normal function call, which the bundler itself invokes when the JavaScript is initialised.
Import section
- contains the vector of imported functions.
Export section
- contains the vector of exported functions.
If you have enjoyed the post, then you might like my book on Rust and WebAssembly. Check them out here