Have you ever considered enhancing your Rust program's capabilities with
another language? WebAssembly (wasm) is a perfect candidate to script
behavior and add capabilities without the hassle. Starting with
WebAssembly is not as difficult as it might seem. This article will
explore how to embed WebAssembly code in a Rust application.
What is WebAssembly?
According to MDN:
WebAssembly is a new type of code that can be run in modern web
browsers — it is a low-level assembly-like language with a compact
binary format that runs with near-native performance and provides
languages such as C/C++, C# and Rust with a compilation target so that
they can run on the web. It is also designed to run alongside
JavaScript, allowing both to work together. –
https://developer.mozilla.org/en-US/docs/webassembly
wasm has been picking up steam since browsers first shipped it in March 2017. While
initially designed for the browser, people are actively working on
building runtimes and environments on all sorts of platforms. This post
focuses on using wasm as a secure runtime to script application
behavior.
Implement a Custom Programming Language?
Sometimes, we might need the user to script certain behaviors in our
application. While use cases vary, the first question that comes up is:
Do I need to build my own Domain-Specific Language (DSL)?
The inner geek might answer this with a definite YES! It's a great
challenge, though rolling a custom language has some serious drawbacks:
- Users need to spend extra time learning it
- There's little to no support on pages such as StackOverflow
- Additional confusion if the DSL imitates syntax from an existing language
As the first two bullet points show, a lot of effort goes into
supporting a custom language. While it grants you many liberties when
implementing use cases and features, the developer experience often
suffers. Before discussing why WebAssembly is the solution, let's look
at a real-world example of embedding another language.
CouchDB: Embedding JavaScript
Apache CouchDB belongs to the family of
NoSQL databases. It is a document store with a strong focus on
replication and reliability. One of the most significant differences
between CouchDB and a relational database (besides the absence of tables
and schemas) is how you query data. Relational databases allow their
users to execute arbitrary and dynamic queries via SQL. Each SQL query
may look completely different from the previous one. This flexibility
matters when you explore a dataset interactively, but it matters much
less in a web context. Additionally, defining an index for a specific
table is optional. Most developers will define indices to boost
performance, but the database does not require it.
CouchDB, on the other hand, does not allow developers to run dynamic
queries on the fly. This restriction might seem a bit odd at first.
However, once you realize that the database queries of a typical web
application are static, it does not matter as much anymore. A web
developer defines a query while implementing a use case. Once defined,
this query stays the same, no matter where the code runs (development,
staging, or production).
CouchDB also does not use SQL or a custom language. Instead, it
leverages a language most web developers are already familiar with:
JavaScript. The main database engine is written in Erlang, but users
specify their queries in JavaScript. This design decision follows from
how CouchDB works: it requires users to build an index by default. Users
query a specific index when they want to fetch data. To construct an
index, CouchDB runs user-defined JavaScript code. This code comes in the
form of a map function that gets called for each document in the
database. The map function decides if a document should be listed in the
index.
// Example of a map function in CouchDB
// Source: https://docs.couchdb.org/en/stable/ddocs/ddocs.html#view-functions
function (doc) {
    if (doc.type === 'post' && doc.tags && Array.isArray(doc.tags)) {
        doc.tags.forEach(function (tag) {
            emit(tag.toLowerCase(), 1);
        });
    }
}
The advantage here is clear: The user can define complex logic to build
their indices in a well-documented and understood programming language.
The barrier to getting started is much lower compared to a database that
uses a custom language.
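To make the mechanism concrete, here is a minimal Rust sketch of how a host could build such an index by calling a user-supplied map function once per document. It is not CouchDB's implementation: the Doc struct, build_index, and tags_of_posts are hypothetical names introduced for illustration, and a real view index stores more than a list of values per key.

```rust
use std::collections::BTreeMap;

/// Hypothetical document type. CouchDB documents are JSON; this struct only
/// models the two fields that the JavaScript map function inspects.
pub struct Doc {
    pub doc_type: String,
    pub tags: Vec<String>,
}

/// Builds an index by calling `map` once per document. The `emit` callback
/// handed to `map` collects key/value rows, similar in spirit to `emit` in
/// the JavaScript example.
pub fn build_index<F>(docs: &[Doc], map: F) -> BTreeMap<String, Vec<i64>>
where
    F: Fn(&Doc, &mut dyn FnMut(String, i64)),
{
    let mut index: BTreeMap<String, Vec<i64>> = BTreeMap::new();
    for doc in docs {
        // Each emitted row is appended to the list stored under its key.
        let mut emit = |key: String, value: i64| {
            index.entry(key).or_default().push(value);
        };
        map(doc, &mut emit);
    }
    index
}

/// Rust port of the map function: emit one row per tag of every post.
pub fn tags_of_posts(doc: &Doc, emit: &mut dyn FnMut(String, i64)) {
    if doc.doc_type == "post" {
        for tag in &doc.tags {
            emit(tag.to_lowercase(), 1);
        }
    }
}
```

Calling build_index(&docs, tags_of_posts) returns a sorted map from each lowercased tag to the values emitted for it. Documents whose type is not "post" contribute nothing, which is how the map function decides what ends up in the index.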
WebAssembly is the Answer
Embedding JavaScript has been a well-accepted solution for a long time.
The main reason is that JavaScript does not have a substantial standard
library. It lacks many features other languages ship out of the box,
such as any form of I/O. This lack of features is primarily due to the
fact that JavaScript running in the browser does not require filesystem
access (it's a security feature).
The disadvantage: JavaScript runtimes are complex and are designed for
long-running programs. Also, we might not want to rely on JavaScript in
the first place but on other languages.
That's where WebAssembly becomes an alternative. First of all, its
footprint is much smaller. It neither comes with a standard library nor
allows access to the outside world by default. From a developer's point
of view, the main benefit is that we can compile almost every currently
popular programming language to WebAssembly. If one user prefers to
write Python while another wants to stick to JavaScript, we can support
both. While we will look at some wasm code later in the tutorial, it is
mainly a compilation target: developers do not usually write wasm by
hand, and the wasm runtime does not care what language you write your
code in.
Our advantage: An improved developer experience, as developers can
leverage the language they're most comfortable with and compile down to
WebAssembly at the end of the day.
wasm is the perfect candidate as an embedded runtime to allow users to
script application behavior.
How to Embed WebAssembly in a Rust Application
In this tutorial, we rely on wasmtime. wasmtime is a standalone runtime
for WebAssembly.
Let's start by creating a new Rust project:
$ cargo new rustwasm
$ cd rustwasm
Next, let's install a few crates we'll need:
$ cargo add anyhow
$ cargo add wasmtime
Once the crates are installed, we add the code to our main.rs file
(explanation below):
// Original code from https://github.com/bytecodealliance/wasmtime/blob/main/examples/hello.rs
// Adapted for brevity
use anyhow::Result;
use wasmtime::*;

fn main() -> Result<()> {
    println!("Compiling module...");
    let engine = Engine::default();
    let module = Module::from_file(&engine, "hello.wat")?; //(1)

    println!("Initializing...");
    let mut store = Store::new(&engine, ()); //(2)

    println!("Creating callback...");
    let hello_func = Func::wrap(&mut store, |_caller: Caller<'_, ()>| {
        println!("Calling back...");
    }); //(3)

    println!("Instantiating module...");
    let imports = [hello_func.into()];
    let instance = Instance::new(&mut store, &module, &imports)?;

    println!("Extracting export...");
    let run = instance.get_typed_func::<(), ()>(&mut store, "run")?; //(4)

    println!("Calling export...");
    run.call(&mut store, ())?; //(5)

    println!("Done.");
    Ok(())
}
Step 1:
We start by loading a module from disk. In this case, we're loading
hello.wat. By default, you would distribute WebAssembly code in binary
form. For our purposes, however, we rely on the textual representation
(wat).
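As a small illustration of the difference between the two representations: a WebAssembly binary module starts with the four magic bytes "\0asm" followed by a four-byte version field, which is 1 for the current core format. The sketch below uses that header to tell a binary module apart from anything else, such as a .wat text file. The function name is_wasm_binary is hypothetical, and the check looks only at the header; it does not validate the rest of the module.

```rust
/// Magic bytes that open every WebAssembly binary module: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6D];
/// Version field of the core binary format (1, stored little-endian).
const WASM_VERSION: [u8; 4] = [0x01, 0x00, 0x00, 0x00];

/// Returns true if `bytes` begins with the header of a core WebAssembly
/// binary module. Text-format input (wat) fails this check because it
/// starts with ordinary characters such as '('.
pub fn is_wasm_binary(bytes: &[u8]) -> bool {
    bytes.len() >= 8 && bytes[..4] == WASM_MAGIC && bytes[4..8] == WASM_VERSION
}
```

A host could run such a check on the bytes read with std::fs::read before deciding how to treat an uploaded file.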
Step 2:
When we run wasm code, we may also want to share some state between the
host and the wasm code. A Store holds this context. In our case, we
don't need to share anything right now. Therefore, we pass the unit
value () as the second argument to Store::new.
Step 3:
In this tutorial, we want to make a Rust function available to the wasm
code so that the wasm code can call it. We wrap a Rust closure with
Func::wrap. This function does not take any arguments and does not
return anything; it just prints out "Calling back..." when invoked.
Step 4:
wasm code by default does not have a main function; we treat it as a
library. In our case, we expect a run function to be present, which
we'll use as our entry point.
Before we can invoke the function, we need to retrieve it first. We use
get_typed_func to specify its type signature as well. If we find it in
the binary, we can invoke it.
Step 5:
Now that we have located the function, let's call it.
Inspecting WebAssembly Code
With our Rust program in place, let's have a look at the wasm code we
want to load:
(module
    (func $hello (import "" "hello"))
    (func (export "run") (call $hello))
)
The wasm text format uses so-called S-expressions.
S-expressions are a very old and very simple textual format for
representing trees, and thus we can think of a module as a tree of
nodes that describe the module's structure and its code. Unlike the
Abstract Syntax Tree of a programming language, though, WebAssembly's
tree is pretty flat, mostly consisting of lists of instructions. –
https://developer.mozilla.org/en-US/docs/WebAssembly/Understanding_the_text_format
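To show what "a tree of nodes" means in practice, here is a minimal S-expression reader in Rust. It is a sketch for illustration only, not a wat parser: it knows nothing about WebAssembly semantics, does not handle comments or string escapes, and the names Sexpr, tokenize, parse_tokens, and parse are all hypothetical.

```rust
/// A node in the tree: either a single token or a parenthesized list.
#[derive(Debug, PartialEq)]
pub enum Sexpr {
    Atom(String),
    List(Vec<Sexpr>),
}

/// Splits the input into "(", ")", quoted strings, and bare atoms.
fn tokenize(src: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            '(' | ')' => {
                tokens.push(c.to_string());
                chars.next();
            }
            '"' => {
                // Keep the quotes so that the empty string "" stays visible.
                let mut lit = String::from('"');
                chars.next();
                for ch in chars.by_ref() {
                    lit.push(ch);
                    if ch == '"' {
                        break;
                    }
                }
                tokens.push(lit);
            }
            c if c.is_whitespace() => {
                chars.next();
            }
            _ => {
                let mut atom = String::new();
                while let Some(&ch) = chars.peek() {
                    if ch.is_whitespace() || ch == '(' || ch == ')' {
                        break;
                    }
                    atom.push(ch);
                    chars.next();
                }
                tokens.push(atom);
            }
        }
    }
    tokens
}

/// Recursively turns tokens into a tree, advancing `pos` as it goes.
fn parse_tokens(tokens: &[String], pos: &mut usize) -> Option<Sexpr> {
    let token = tokens.get(*pos)?;
    *pos += 1;
    match token.as_str() {
        "(" => {
            let mut items = Vec::new();
            // Running out of tokens before ")" yields None.
            while tokens.get(*pos)?.as_str() != ")" {
                items.push(parse_tokens(tokens, pos)?);
            }
            *pos += 1; // consume ")"
            Some(Sexpr::List(items))
        }
        ")" => None, // closing parenthesis without a matching "("
        atom => Some(Sexpr::Atom(atom.to_string())),
    }
}

/// Parses exactly one S-expression; returns None on malformed input.
pub fn parse(src: &str) -> Option<Sexpr> {
    let tokens = tokenize(src);
    let mut pos = 0;
    let tree = parse_tokens(&tokens, &mut pos)?;
    if pos == tokens.len() { Some(tree) } else { None }
}
```

Parsing the import line of the module above yields a list whose items are the atoms func and $hello followed by a nested list for the import clause, which matches the "pretty flat" shape the quote describes.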
This small program performs three tasks:
- It imports a function with the name hello from its host environment and binds it to the local name $hello: (func $hello (import "" "hello"))
- It defines a function run, which it also exports.
- In run, it calls the imported $hello function.
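The three tasks can be mirrored in a toy model that uses only the Rust standard library. This is not how wasmtime works internally and none of these types exist in wasmtime: ToyModule, ToyInstance, HostFn, instantiate, and hello are hypothetical names. The sketch only illustrates the idea that a module names its imports as (module, name) pairs and can call nothing the host did not hand over.

```rust
use std::collections::HashMap;

/// Host functions append to a log instead of printing, so their effect
/// can be observed by the caller.
pub type HostFn = fn(&mut Vec<String>);

/// Toy stand-in for a module: the imports it declares, and the body of its
/// exported `run` function as a list of indices into `imports`.
pub struct ToyModule {
    pub imports: Vec<(&'static str, &'static str)>, // (module, name) pairs
    pub run_body: Vec<usize>,
}

/// A module whose imports have all been resolved to host functions.
pub struct ToyInstance {
    resolved: Vec<HostFn>,
    run_body: Vec<usize>,
}

/// Resolves every declared import by its (module, name) pair. Fails if the
/// host did not provide one of them.
pub fn instantiate(
    module: &ToyModule,
    host: &HashMap<(&'static str, &'static str), HostFn>,
) -> Result<ToyInstance, String> {
    let mut resolved = Vec::new();
    for key in &module.imports {
        match host.get(key) {
            Some(f) => resolved.push(*f),
            None => return Err(format!("unknown import {:?} {:?}", key.0, key.1)),
        }
    }
    Ok(ToyInstance { resolved, run_body: module.run_body.clone() })
}

impl ToyInstance {
    /// Executes the exported `run` function: each step calls one import.
    pub fn run(&self, log: &mut Vec<String>) {
        for &idx in &self.run_body {
            let f = self.resolved[idx];
            f(log);
        }
    }
}

/// Host function registered under the empty module name and the name "hello".
pub fn hello(log: &mut Vec<String>) {
    log.push("Calling back...".to_string());
}
```

Registering hello under ("", "hello") and instantiating a ToyModule with that single import and run_body set to vec![0] reproduces the call chain of hello.wat. Instantiating against an empty host map returns an error, since the import cannot be resolved.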
As mentioned earlier, .wat (same for the binary representation) is not
something we usually write by hand but is generated by a compiler.
Run everything
With all components in place, let's run it:
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.62s
Running `target/debug/rustwasm`
Compiling module...
Initializing...
Creating callback...
Instantiating module...
Extracting export...
Calling export...
Calling back...
Done.
Find the full source code on GitHub.