I’ve been working recently on a benchmark, to try and see how to get the most performance from the Dark backend. I’ve reimplemented the core of Dark in various languages and web frameworks, notably in OCaml, F# and Rust.
As a reminder, Dark is a language and platform for building backend web services and APIs. Implementation-wise, it’s basically an interpreter hooked up to a web server and a DB. The language is a statically typed, functional, immutable, garbage-collected language.
Dark users can write arbitrary code that runs on our server, including making HTTP calls to slow and badly-behaving 3rd-party web servers. This means we need to efficiently support both computation and IO on the server. This benchmark is meant to measure that.
FizzBuzz is a well-known interview question that may or may not have been appropriate in a bygone era, but that remains in existence today due to interviews that have not moved on. It asks you to list the numbers from 1 to 100, printing “fizz” for those divisible by 3, “buzz” for those divisible by 5, and “fizzbuzz” for those divisible by both.
FizzBoom is the same exercise, except instead of printing “fizzbuzz” when a number is divisible by both 3 and 5, you instead make an HTTP call to a local server which takes 1 second to respond. Here’s what this looks like in Dark:
let range = List::range 1 100
List.map range \i ->
  if i % 15 == 0
  then HttpClient::get "http://localhost:8000/delay/1"
  else if i % 3 == 0
  then "Fizz"
  else if i % 5 == 0
  then "Buzz"
  else toString i
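For readers who don't know Dark, here is a rough Python sketch of the same logic. It is not one of the benchmark implementations: the function name `fizzboom` and the injected `fetch` parameter are made up for illustration, and it assumes `List::range 1 100` includes both endpoints.

```python
from typing import Callable, List

DELAY_URL = "http://localhost:8000/delay/1"  # URL taken from the Dark code above


def fizzboom(fetch: Callable[[str], str], url: str = DELAY_URL) -> List[str]:
    """Return the FizzBoom sequence for 1..100 (assumed inclusive).

    `fetch` is a hypothetical stand-in for the HTTP client: it takes a URL
    and returns the response body as a string.
    """
    out = []
    for i in range(1, 101):
        if i % 15 == 0:
            # Divisible by both 3 and 5: call the slow server instead of
            # producing "FizzBuzz".
            out.append(fetch(url))
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out


# Usage with a stub instead of a real HTTP call:
stubbed = fizzboom(lambda _url: "boom")
```

There are six multiples of 15 between 1 and 100, so each FizzBoom request makes six calls to the slow server. If those calls were made one after another, a single request would spend at least six seconds waiting on IO.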
Typically, someone builds benchmarks and then releases them to the wild. Invariably, they’ve overlooked something, and hordes of language advocates decry how it’s unfair. To avoid this unfairness, I’d like to create an opportunity for language advocates to improve the benchmarks before I release them. I’ll discuss the goals of this below, but feel free to jump right to the issues if you prefer.
The benchmark measures two numbers, calculated using wrk:
- requests per second for HTTP calls to /fizzbuzz, which returns FizzBuzz as JSON
- requests per second for HTTP calls to /fizzboom, which returns FizzBoom as JSON.
Benchmark name                  Req/s
---------------------------------------------------------
fsharp-giraffe:                 24962.95
fsharp-giraffe-async:           19476.78
fsharp-suave-async:             1147.71
fsharp-suave-partial-async:     Skipping broken benchmark
ocaml-httpaf:                   14034.62
ocaml-httpaf-lwt:               14158.74
rust-hyper:                     15985.69
rust-hyper-async:               Skipping broken benchmark
However, the benchmark also shows that we’re not doing async right on any platform: FizzBoom languishes at around 1 req/s or less on almost every implementation. Obviously, this is because the code I wrote doesn’t work, and is not actually a reflection on the languages and frameworks.
Benchmark name                  Req/s
--------------------------------------------------------
fsharp-giraffe:                 1.00
fsharp-giraffe-async:           8.99
fsharp-suave-async:             1.00
fsharp-suave-partial-async:     Skipping broken benchmark
ocaml-httpaf:                   0.10
ocaml-httpaf-lwt:               0.10
rust-hyper:                     Invalid fizzboom output
rust-hyper-async:               Skipping broken benchmark
If you’re worried about your language doing well in the benchmark, or are simply looking to help, there are a number of things you can do:
- optimize your language’s web server: I may have used a poorly performing web server, configured it poorly, or hooked things up poorly
- fix your language’s async benchmark: when making an HTTP call to a 3rd-party web server, the server should free the CPU to handle other requests while the IO is running. I don’t have that working correctly in any language yet (unsure why this isn’t trivial on all platforms, but there we are)
- optimize your language’s build performance: fix the build settings so that the code is optimized as well as the compiler can manage.
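To make the async point concrete, here is a minimal Python `asyncio` sketch of the behaviour I’m after. It isn’t taken from any of the benchmark implementations: `SlowUpstream`, `handle_request`, `serve_concurrently` and `run_demo` are hypothetical names, and the slow third-party server is simulated with `asyncio.sleep` (using a short delay rather than a full second) instead of a real HTTP call.

```python
import asyncio

DELAY_SECONDS = 0.01  # stand-in for the 1-second delay of the real server


class SlowUpstream:
    """Simulated slow third-party server that records overlapping calls."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak_in_flight = 0

    async def get(self) -> str:
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        try:
            # Awaiting hands control back to the event loop, so other
            # handlers can make progress while this one waits on "IO".
            await asyncio.sleep(DELAY_SECONDS)
            return "boom"
        finally:
            self.in_flight -= 1


async def handle_request(upstream: SlowUpstream) -> str:
    """One incoming request that needs a single slow upstream call."""
    return await upstream.get()


async def serve_concurrently(n: int):
    """Handle n requests at once on a single thread."""
    upstream = SlowUpstream()
    results = await asyncio.gather(*(handle_request(upstream) for _ in range(n)))
    return list(results), upstream.peak_in_flight


def run_demo(n: int = 10):
    return asyncio.run(serve_concurrently(n))
```

If the handlers blocked instead of awaiting, `peak_in_flight` would never rise above 1 and throughput would be capped by the upstream delay. With working async, all `n` calls are in flight at the same time, which is what each framework’s async variant should achieve with real HTTP calls.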
I’m also interested, but very cautiously, in improving the implementation of the interpreters. The prime directive for the interpreters is that they’re easy to modify and extend: Dark is a language undergoing a lot of change, so rewriting the interpreters with a JIT, or assembly, or peephole optimizations, is not something I’m interested in. But small implementation changes that have big wins are valuable, whether they’re applicable to all languages or show off specific features of your favorite language. All the same, I’m going to be quite conservative on this: I don’t want to turn this into a game where we write non-idiomatic code that no one would want to maintain just to squeeze out a win.