At some point, every software engineer will find themselves in a situation where they need to benchmark system performance and test the limits of what a given system can handle. This is a common problem in software engineering, and even more so in the applications that are well suited for Elixir.
Finding bottlenecks early on in an application can save a lot of time, money, and effort in the long run, and give developers confidence in the upper limit of a system.
In this post, we will introduce a tool called Benchee to benchmark parts of an Elixir application. We will also show you how to integrate Benchee with your automated test suite.
By the end of the article, you'll understand Benchee's core functionality and be able to use it to measure your application's performance.
Let's get going!
Why Do I Need to Benchmark My Elixir Application?
Benchmarking, in a nutshell, is the process of measuring the performance of a system under specific loads or conditions. As part of the benchmarking process, you will be able to identify potential bottlenecks in your system and areas to improve.
For example, we can use benchmarking tools to answer questions like:
- Can a system handle ten times the load of normal traffic?
- Can the system run on a smaller infrastructure to handle the same load?
- How long does it take to process 10, 100, 1,000, or 10,000 requests? Does the processing time scale linearly with the number of requests?
Answering these questions alone can help your team avoid costly mistakes and proactively identify areas for improvement, avoiding downtime and unhappy users.
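Before reaching for a dedicated tool, the last question can be roughly approximated with the standard library alone. The sketch below is only an illustration: process_request is a hypothetical stand-in for real work, and a single :timer.tc/1 call per size gives a noisy number, without the warmup and repeated sampling that a benchmarking tool adds.

```elixir
# Hypothetical stand-in for handling one request.
process_request = fn id -> Enum.sum(1..1_000) + id end

for count <- [10, 100, 1_000, 10_000] do
  # :timer.tc/1 returns {elapsed_microseconds, result}
  {micros, _results} = :timer.tc(fn -> Enum.map(1..count, process_request) end)

  IO.puts("#{count} requests: #{micros} μs (#{Float.round(micros / count, 2)} μs per request)")
end
```

If the time per request stays roughly constant as the count grows, processing time scales close to linearly.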
Benchmarking doesn't have to be expensive or time-consuming; it can be simple to get the right tools in place and make them part of an application's natural development life cycle.
What Is Benchee?
This is where Benchee comes in. Benchee is a tool that you can use to benchmark parts of an Elixir application. It is versatile and extensible, with more than a few plugins to enhance its functionality.
Prerequisites
Elixir Environment
To follow along, you will need Elixir installed locally. The easiest way to do so is to follow the official Elixir installation instructions, which cover several options:
- Local installation on Linux, Windows, and macOS
- Dockerized versions of Elixir
- Package manager version setups
I recommend a local installation for the best results.
Setting Up Our Elixir Application
For this article's purposes, we will set up a simple Elixir application that can calculate the Fibonacci sequence.
Start by creating a new application with Mix:
mix new fibonacci_benchmarking
As output, you will see the following:
* creating README.md
* creating .formatter.exs
* creating .gitignore
* creating mix.exs
* creating lib
* creating lib/fibonacci_benchmarking.ex
* creating test
* creating test/test_helper.exs
* creating test/fibonacci_benchmarking_test.exs
Your Mix project was created successfully.
You can use "mix" to compile it, test it, and more:
cd fibonacci_benchmarking
mix test
Run "mix help" for more commands.
Next, in your favorite editor, replace the contents of the lib/fibonacci_benchmarking.ex file with the following code:
defmodule FibonacciBenchmarking do
  # Returns the Fibonacci numbers F(0)..F(number) as a list.
  def list(number), do: Enum.map(0..number, &fibonacci/1)

  def fibonacci(0), do: 0
  def fibonacci(1), do: 1
  def fibonacci(n), do: fibonacci(0, 1, n - 2)

  # Tail-recursive helper: carries the two previous values and counts down.
  def fibonacci(_, prv, -1), do: prv

  def fibonacci(prvprv, prv, n) do
    next = prv + prvprv
    fibonacci(prv, next, n - 1)
  end
end
Note: The original code can be found on rosettacode.org.
Go to the fibonacci_benchmarking directory and run the following commands:
mix deps.get
iex -S mix
Once inside the Elixir shell (IEx), you can run this:
iex(1)> FibonacciBenchmarking.list(10)
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
If you see the above output, you have successfully set up your application, and are ready to proceed with Benchee.
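One thing to watch for: the test file generated by mix new (test/fibonacci_benchmarking_test.exs) still calls FibonacciBenchmarking.hello/0, which no longer exists after the edit above, so running mix test would fail. A minimal sketch of a replacement for that file, assuming you want to keep a quick correctness check alongside the benchmarks:

```elixir
defmodule FibonacciBenchmarkingTest do
  use ExUnit.Case

  test "list/1 returns the sequence from F(0) up to F(number)" do
    assert FibonacciBenchmarking.list(10) == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
  end

  test "fibonacci/1 handles the base cases" do
    assert FibonacciBenchmarking.fibonacci(0) == 0
    assert FibonacciBenchmarking.fibonacci(1) == 1
    assert FibonacciBenchmarking.fibonacci(2) == 1
  end
end
```

Deleting the generated file works just as well if you only care about the benchmarks.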
Implement Benchmarking on an Elixir Application
First, we will need to install Benchee. Start by adding the following to your mix.exs file:
defp deps do
  [
    # :test is included because we will later run Benchee through mix test
    {:benchee, "~> 1.0", only: [:dev, :test]}
  ]
end
Next, run the following command:
mix deps.get
We can validate that Benchee is installed by starting the Elixir shell with iex -S mix and running the following snippet:
Benchee.run(%{
  "10_seq" => fn -> FibonacciBenchmarking.list(10) end
})
Benchee might take a second or two to warm up, but on completion, you should see the following output:
Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
Number of Available Cores: 8
Available memory: 62.76 GB
Elixir 1.13.3
Erlang 24.2.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 7 s
Benchmarking 10_seq ...
Name ips average deviation median 99th %
10_seq 973.04 K 1.03 μs ±2735.15% 0.77 μs 1.60 μs
The code above makes a call to FibonacciBenchmarking.list(10), and Benchee measures the time it takes to execute the function.
Let's take a moment to understand the output of Benchee. By default, Benchee will output the following information:
- ips stands for iterations per second. This number represents how many times a given function can be executed in a second. Higher is better.
- average is the average time it takes to execute the function. Lower is better.
- deviation is the standard deviation of the results, shown as a percentage of the average. This is a measure of how much the results deviate from the average.
- median is the middle value of the results.
- 99th % is the 99th percentile: 99% of all the measured values are less than this value.
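To make these columns concrete, here is a small sketch that derives the same kind of numbers from a handful of made-up run times in nanoseconds. StatsSketch is a hypothetical module written for this illustration; it is not Benchee's implementation, and it uses a simple nearest-rank percentile, which may differ slightly from the method Benchee uses.

```elixir
defmodule StatsSketch do
  # Hypothetical helper, not part of Benchee. Expects run times in nanoseconds.
  def summarize(samples) when length(samples) > 1 do
    sorted = Enum.sort(samples)
    count = length(sorted)
    average = Enum.sum(sorted) / count

    # Sample standard deviation (divides by n - 1).
    variance =
      sorted
      |> Enum.map(fn x -> (x - average) * (x - average) end)
      |> Enum.sum()
      |> Kernel./(count - 1)

    %{
      # Iterations per second: one second (in ns) divided by the average run time.
      ips: 1.0e9 / average,
      average: average,
      deviation_percent: :math.sqrt(variance) / average * 100,
      median: median(sorted, count),
      p99: nearest_rank(sorted, count, 99)
    }
  end

  defp median(sorted, count) do
    mid = div(count, 2)

    if rem(count, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end

  # Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
  defp nearest_rank(sorted, count, p) do
    rank = ceil(p / 100 * count)
    Enum.at(sorted, max(rank - 1, 0))
  end
end

# Four fast runs and one slow outlier.
IO.inspect(StatsSketch.summarize([600, 700, 800, 900, 50_000]))
```

With these samples, the median is 800 ns while the average is 10,600 ns, and the single outlier pushes the deviation above 100% of the average. That is the same effect behind the very large deviation percentages in the output above, and it is why the median is often the more telling number for very fast functions.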
While running Benchee in this fashion can be useful for ad-hoc benchmarks, a much better method is to include Benchee as part of our unit tests.
Automate Benchee for Elixir and Run Tests
By default, all Elixir and Phoenix applications have a test directory and use ExUnit to run tests. Our goal is to get Benchee running as part of our test suite and test a different implementation of the Fibonacci sequence.
Start by creating a new file called test/benchee_unit_test.exs, and copy the following code into it:
defmodule BencheeUnitTest do
  use ExUnit.Case

  @tag :benchmark
  test "benchmark fibonacci list generation" do
    # Capture Benchee's output so we can run assertions on it
    output =
      Benchee.run(%{
        "case_10_numbers" => fn ->
          FibonacciBenchmarking.list(10)
        end
      })

    results = Enum.at(output.scenarios, 0)
    # Benchee reports run times in nanoseconds, so this budget is 50 ms
    assert results.run_time_data.statistics.average <= 50_000_000
  end
end
Go ahead and run mix test on the console. Validate that the output looks like the following:
Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
Number of Available Cores: 8
Available memory: 62.76 GB
Elixir 1.13.3
Erlang 24.2.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 7 s
Benchmarking case_10_numbers ...
Name ips average deviation median 99th %
case_10_numbers 1.31 M 765.77 ns ±3552.51% 599 ns 1165 ns
.
Finished in 10.7 seconds (0.00s async, 10.7s sync)
1 test, 0 failures
Randomized with seed 612867
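The assertion in the test reads one field from the suite that Benchee.run returns. To see what else is available, here is a minimal sketch written as a standalone script (saved outside the Mix project and run with elixir, since Mix.install/1 cannot be used inside a project). The benchmarked function and the shortened warmup and time values are placeholders chosen to keep the run brief.

```elixir
# Standalone script; Mix.install/1 fetches Benchee (requires Elixir 1.12+).
Mix.install([{:benchee, "~> 1.0"}])

suite =
  Benchee.run(
    %{"double_1_to_1000" => fn -> Enum.map(1..1_000, &(&1 * 2)) end},
    warmup: 0.5,
    time: 1
  )

for scenario <- suite.scenarios do
  stats = scenario.run_time_data.statistics

  # All run-time statistics are expressed in nanoseconds.
  IO.puts("scenario: #{scenario.name}")
  IO.puts("average:  #{stats.average} ns (#{stats.average / 1_000_000} ms)")
  IO.puts("median:   #{stats.median} ns")
  IO.puts("99th %:   #{stats.percentiles[99]} ns")
  IO.puts("ips:      #{stats.ips}")
  IO.puts("samples:  #{stats.sample_size}")
end
```

Because the values are in nanoseconds, the 50_000_000 threshold in the test corresponds to an average of 50 ms.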
So far, we have integrated Benchee into our test suite and added the first test to validate one of the test cases. Let's add the second test case to compare. Update the test function to:
defmodule BencheeUnitTest do
  use ExUnit.Case

  @tag :benchmark
  test "benchmark fibonacci list generation" do
    # Capture Benchee's output so we can run assertions on it
    output =
      Benchee.run(%{
        "case_10_numbers" => fn ->
          FibonacciBenchmarking.list(10)
        end,
        "case_1000_numbers" => fn ->
          FibonacciBenchmarking.list(1000)
        end
      })

    results = Enum.at(output.scenarios, 0)
    # Benchee reports run times in nanoseconds, so this budget is 50 ms
    assert results.run_time_data.statistics.average <= 50_000_000
  end
end
Just like we did before, we can run the test suite with mix test, and validate that the output looks like the following:
Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
Number of Available Cores: 8
Available memory: 62.76 GB
Elixir 1.13.3
Erlang 24.2.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 14 s
Benchmarking case_1000_numbers ...
Benchmarking case_10_numbers ...
Name ips average deviation median 99th %
case_10_numbers 1.33 M 0.00075 ms ±3419.36% 0.00059 ms 0.00109 ms
case_1000_numbers 0.00010 M 10.21 ms ±11.54% 9.83 ms 15.29 ms
Comparison:
case_10_numbers 1.33 M
case_1000_numbers 0.00010 M - 13598.03x slower +10.21 ms
.
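Note that the assertion only inspects the scenario at index 0, so the other scenario is never checked, and I would not rely on the order of the scenarios list to decide which one that is. A sketch of a more explicit variant looks each scenario up by name and gives it its own budget. The module name and the budget values here are hypothetical and should be tuned to your hardware.

```elixir
defmodule BencheePerScenarioTest do
  use ExUnit.Case

  # Hypothetical budgets in nanoseconds; tune them for your own hardware.
  @budgets %{
    "case_10_numbers" => 1_000_000,
    "case_1000_numbers" => 50_000_000
  }

  @tag :benchmark
  test "each scenario stays within its own budget" do
    output =
      Benchee.run(%{
        "case_10_numbers" => fn -> FibonacciBenchmarking.list(10) end,
        "case_1000_numbers" => fn -> FibonacciBenchmarking.list(1000) end
      })

    for {name, budget} <- @budgets do
      # Find the scenario by the name used as the key in the map above
      scenario = Enum.find(output.scenarios, &(&1.name == name))
      average = scenario.run_time_data.statistics.average

      assert average <= budget,
             "#{name} averaged #{average} ns, budget is #{budget} ns"
    end
  end
end
```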
Our second test scenario compares the performance of generating 10 and 1,000 numbers; however, hard-coding one scenario per input size is not a very practical way to test with multiple inputs. Instead, we can take advantage of the inputs option of Benchee.run and provide a map of named inputs that are passed to each scenario.
Go ahead and open the test/benchee_unit_test.exs file and replace the contents with this code:
defmodule BencheeUnitTest do
  use ExUnit.Case

  @tag :benchmark
  test "benchmark fibonacci list generation" do
    # Capture Benchee's output so we can run assertions on it
    output =
      Benchee.run(
        %{
          "generate_list" => fn input ->
            FibonacciBenchmarking.list(input)
          end
        },
        inputs: %{
          "case_10" => 10,
          "case_100" => 100,
          "case_1000" => 1000,
          "case_10000" => 10_000,
          "case_100000" => 100_000
        }
      )

    results = Enum.at(output.scenarios, 0)
    # Benchee reports run times in nanoseconds, so this budget is 50 ms
    assert results.run_time_data.statistics.average <= 50_000_000
  end
end
In this new version of the code, we have generalized our generate_list scenario to accept a map of named inputs, and we can now run the test suite with mix test.
However, because of the size of our last input, you will get a message like this:
❯ mix test
Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
Number of Available Cores: 8
Available memory: 62.76 GB
Elixir 1.13.3
Erlang 24.2.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: case_10, case_100, case_1000, case_10000, case_100000
Estimated total run time: 35 s
Benchmarking generate_list with input case_10 ...
Benchmarking generate_list with input case_100 ...
Benchmarking generate_list with input case_1000 ...
Benchmarking generate_list with input case_10000 ...
Benchmarking generate_list with input case_100000 ...
1) test benchmark fibonacci list generation (BencheeUnitTest)
test/benchee_unit_test.exs:6
** (ExUnit.TimeoutError) test timed out after 60000ms. You can change the timeout:
1. per test by setting "@tag timeout: x" (accepts :infinity)
2. per test module by setting "@moduletag timeout: x" (accepts :infinity)
3. globally via "ExUnit.start(timeout: x)" configuration
4. by running "mix test --timeout x" which sets timeout
5. or by running "mix test --trace" which sets timeout to infinity
(useful when using IEx.pry/0)
where "x" is the timeout given as integer in milliseconds (defaults to 60_000).
code: output = Benchee.run(%{
stacktrace:
(elixir 1.13.3) lib/task.ex:794: Task.await/2
(elixir 1.13.3) lib/enum.ex:1593: Enum."-map/2-lists^map/1-0-"/2
(benchee 1.1.0) lib/benchee/benchmark/runner.ex:77: Benchee.Benchmark.Runner.parallel_benchmark/2
(elixir 1.13.3) lib/enum.ex:1593: Enum."-map/2-lists^map/1-0-"/2
(elixir 1.13.3) lib/enum.ex:1593: Enum."-map/2-lists^map/1-0-"/2
(benchee 1.1.0) lib/benchee/benchmark.ex:103: Benchee.Benchmark.collect/3
(benchee 1.1.0) lib/benchee.ex:48: Benchee.run/2
test/benchee_unit_test.exs:8: (test)
(ex_unit 1.13.3) lib/ex_unit/runner.ex:500: ExUnit.Runner.exec_test/1
(stdlib 3.17) timer.erl:166: :timer.tc/1
(ex_unit 1.13.3) lib/ex_unit/runner.ex:451: anonymous fn/4 in ExUnit.Runner.spawn_test_monitor/4
Finished in 60.0 seconds (0.00s async, 60.0s sync)
1 test, 1 failure
Randomized with seed 567196
As it happens, we hit ExUnit's default test timeout of 60 seconds. Fortunately, the error message includes a couple of suggestions on how to solve this problem. For now, update the test suite with this code:
defmodule BencheeUnitTest do
  use ExUnit.Case

  @tag :benchmark
  # Benchmarks can run longer than ExUnit's default 60-second timeout
  @tag timeout: :infinity
  test "benchmark fibonacci list generation" do
    # Capture Benchee's output so we can run assertions on it
    output =
      Benchee.run(
        %{
          "generate_list" => fn input ->
            FibonacciBenchmarking.list(input)
          end
        },
        inputs: %{
          "case_10" => 10,
          "case_100" => 100,
          "case_1000" => 1000,
          "case_10000" => 10_000,
          "case_100000" => 100_000
        }
      )

    results = Enum.at(output.scenarios, 0)
    # Benchee reports run times in nanoseconds, so this budget is 50 ms
    assert results.run_time_data.statistics.average <= 50_000_000
  end
end
Note: Depending on your system, running that last scenario will take a while; feel free to remove it to continue with the tutorial.
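The "Estimated total run time" lines in the outputs above follow from simple arithmetic: every job runs once per input, and each run spends the warmup time plus the measurement time (memory and reduction measurements are left at their default of 0 here). The sketch below reproduces that estimate; estimate_seconds is a hypothetical helper for illustration, not a Benchee function.

```elixir
# Hypothetical helper, not part of Benchee.
# jobs: number of benchmarked functions; inputs: number of named inputs (0 if none).
estimate_seconds = fn jobs, inputs, warmup, time ->
  jobs * max(inputs, 1) * (warmup + time)
end

IO.puts(estimate_seconds.(1, 0, 2, 5)) # one job, no inputs → 7
IO.puts(estimate_seconds.(2, 0, 2, 5)) # two jobs, no inputs → 14
IO.puts(estimate_seconds.(1, 5, 2, 5)) # one job, five inputs → 35
```

Passing smaller warmup: and time: values to Benchee.run shortens the suite accordingly. As far as I can tell, each scenario still has to complete at least one full call, so an input as slow as case_100000 will take a long time regardless of the time budget.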
Now that we have a baseline of our Fibonacci sequence generator's performance, a common and useful exercise is to compare the performance of different implementations of the same algorithm. In this case, we have an alternative implementation of the Fibonacci sequence generator based on Stream.unfold, which derives each number from the previous two instead of computing every number from scratch.
Start by updating the lib/fibonacci_benchmarking.ex file with the following code:
defmodule FibonacciBenchmarking do
  def list(number), do: Enum.map(0..number, &fibonacci/1)

  # Builds the sequence lazily, deriving each number from the previous two.
  # Takes number + 1 elements so the result matches list/1, which covers 0..number.
  def list_alternate(number) do
    {0, 1}
    |> Stream.unfold(fn {a, b} -> {a, {b, a + b}} end)
    |> Enum.take(number + 1)
  end

  def fibonacci(0), do: 0
  def fibonacci(1), do: 1
  def fibonacci(n), do: fibonacci(0, 1, n - 2)

  def fibonacci(_, prv, -1), do: prv

  def fibonacci(prvprv, prv, n) do
    next = prv + prvprv
    fibonacci(prv, next, n - 1)
  end
end
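Before comparing speed, it is worth confirming that both functions produce the same numbers, since a benchmark between implementations that disagree tells us little. A quick check, meant to be run inside iex -S mix after the project recompiles, could look like the sketch below. It compares only the first n elements, so it does not depend on exactly how many elements each function returns.

```elixir
# Run inside `iex -S mix` from the project directory.
n = 20

expected = FibonacciBenchmarking.list(n) |> Enum.take(n)
actual = FibonacciBenchmarking.list_alternate(n) |> Enum.take(n)

IO.inspect(actual == expected, label: "implementations agree on the first #{n} numbers")
```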
Following that, we will update the test/benchee_unit_test.exs file to account for both implementations:
defmodule BencheeUnitTest do
  use ExUnit.Case

  @tag :benchmark
  @tag timeout: :infinity
  test "benchmark fibonacci list generation" do
    # Capture Benchee's output so we can run assertions on it
    output =
      Benchee.run(
        %{
          "generate_list_enum" => fn input ->
            FibonacciBenchmarking.list(input)
          end,
          "generate_list_stream" => fn input ->
            FibonacciBenchmarking.list_alternate(input)
          end
        },
        inputs: %{
          "case_10" => 10,
          "case_100" => 100,
          "case_1000" => 1000,
          "case_10000" => 10_000
        }
      )

    results = Enum.at(output.scenarios, 0)
    # Benchee reports run times in nanoseconds, so this budget is 50 ms
    assert results.run_time_data.statistics.average <= 50_000_000
  end
end
The updated test case will run two scenarios side by side for each of the prescribed inputs, letting us compare their overall performance. Go ahead and run mix test to see the results.
Compiling 1 file (.ex)
Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
Number of Available Cores: 8
Available memory: 62.76 GB
Elixir 1.13.3
Erlang 24.2.2
Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: case_10, case_100, case_1000, case_10000
Estimated total run time: 56 s
Benchmarking generate_list_enum with input case_10 ...
Benchmarking generate_list_enum with input case_100 ...
Benchmarking generate_list_enum with input case_1000 ...
Benchmarking generate_list_enum with input case_10000 ...
Benchmarking generate_list_stream with input case_10 ...
Benchmarking generate_list_stream with input case_100 ...
Benchmarking generate_list_stream with input case_1000 ...
Benchmarking generate_list_stream with input case_10000 ...
##### With input case_10 #####
Name ips average deviation median 99th %
generate_list_stream 1.66 M 601.29 ns ±3858.34% 441 ns 955 ns
generate_list_enum 1.35 M 742.83 ns ±3201.26% 610 ns 1164 ns
Comparison:
generate_list_stream 1.66 M
generate_list_enum 1.35 M - 1.24x slower +141.55 ns
##### With input case_100 #####
Name ips average deviation median 99th %
generate_list_stream 258.49 K 3.87 μs ±512.17% 3.29 μs 8.04 μs
generate_list_enum 31.71 K 31.54 μs ±26.84% 30.57 μs 41.17 μs
Comparison:
generate_list_stream 258.49 K
generate_list_enum 31.71 K - 8.15x slower +27.67 μs
##### With input case_1000 #####
Name ips average deviation median 99th %
generate_list_stream 17.67 K 0.0566 ms ±14.14% 0.0550 ms 0.0988 ms
generate_list_enum 0.102 K 9.84 ms ±9.92% 9.60 ms 14.14 ms
Comparison:
generate_list_stream 17.67 K
generate_list_enum 0.102 K - 173.90x slower +9.78 ms
##### With input case_10000 #####
Name ips average deviation median 99th %
generate_list_stream 501.16 0.00200 s ±33.05% 0.00190 s 0.00370 s
generate_list_enum 0.27 3.75 s ±12.05% 3.75 s 4.06 s
Comparison:
generate_list_stream 501.16
generate_list_enum 0.27 - 1876.93x slower +3.74 s
.
Finished in 65.3 seconds (0.00s async, 65.3s sync)
1 test, 0 failures
Randomized with seed 839542
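As you can see, our list function's Enum implementation is much slower than the Stream implementation, especially when the input size is larger. Most of that gap comes from the algorithm rather than from Enum versus Stream: list/1 computes every Fibonacci number from scratch, while list_alternate/1 derives each number from the previous two. Comparing the performance of the two implementations is valuable in understanding the trade-offs and will help you develop more performant applications.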
When adding benchmarking tests to parts of your automated testing, consider the potential drawbacks, such as the increased time it takes to run the tests. In this case, the benchmarking tests are tagged with :benchmark and can be excluded from the default test suite. This allows us to run the benchmarking tests separately from the unit tests and only when we need to.
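The tag by itself does not exclude anything; ExUnit has to be told to skip it. One way to do that, sketched here as a change to the test/test_helper.exs file that mix new generated:

```elixir
# test/test_helper.exs
# Skip tests tagged :benchmark unless they are explicitly requested.
ExUnit.start(exclude: [:benchmark])
```

With this in place, mix test runs only the regular tests, mix test --include benchmark runs everything, and mix test --only benchmark runs just the benchmarks.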
A much better approach is to use a CI/CD service such as GitHub Actions and run the benchmarking tests as part of the pull request validation process. This way, we get the results without having to run the tests locally.
Improving Benchee Reporting
Now, while seeing results on the console can be useful for a quick glance, the console is not the most convenient way to share results with your team. Benchee provides a number of different ways to export your results to a file.
For this example, we will use benchee_html to generate an HTML report with our benchmarking test results. To do this, we will add the benchee_html dependency to our mix.exs file:
defp deps do
  [
    ...
    {:benchee_html, "~> 1.0", only: [:dev, :test]}
  ]
end
Next, we will update the test/benchee_unit_test.exs file to generate the HTML report:
defmodule BencheeUnitTest do
  use ExUnit.Case

  @tag :benchmark
  @tag timeout: :infinity
  test "benchmark fibonacci list generation" do
    # Capture Benchee's output so we can run assertions on it
    output =
      Benchee.run(
        %{
          "generate_list_enum" => fn input ->
            FibonacciBenchmarking.list(input)
          end,
          "generate_list_stream" => fn input ->
            FibonacciBenchmarking.list_alternate(input)
          end
        },
        inputs: %{
          "case_10" => 10,
          "case_100" => 100,
          "case_1000" => 1000,
          "case_10000" => 10_000
        },
        # Write the HTML report and keep the console output
        formatters: [
          Benchee.Formatters.HTML,
          Benchee.Formatters.Console
        ]
      )

    results = Enum.at(output.scenarios, 0)
    # Benchee reports run times in nanoseconds, so this budget is 50 ms
    assert results.run_time_data.statistics.average <= 50_000_000
  end
end
Let's go ahead and run the tests again:
mix test
On completion, the HTML report should open in your browser. It provides a much more detailed view of the benchmarking results than the console output and allows us to share results with our team easily.
In addition to the HTML report, Benchee also supports exporting results to other formats, such as JSON and CSV, through additional formatter plugins. Exporting results to a file is a great way to integrate them with automation, such as CI/CD pipelines.
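For automated runs, you probably do not want a browser window to open, and you may want the report written to a known location. A minimal sketch, written as a standalone script run with elixir outside the Mix project; the output path and the benchmarked function are assumptions chosen for illustration:

```elixir
# Standalone script; Mix.install/1 requires Elixir 1.12+.
Mix.install([{:benchee, "~> 1.0"}, {:benchee_html, "~> 1.0"}])

Benchee.run(
  %{"double_1_to_1000" => fn -> Enum.map(1..1_000, &(&1 * 2)) end},
  warmup: 0.5,
  time: 1,
  formatters: [
    Benchee.Formatters.Console,
    # The path is an assumption; use whatever location your CI job collects as an artifact.
    {Benchee.Formatters.HTML, file: "benchmarks/output/report.html", auto_open: false}
  ]
)
```

The same tuple form, a formatter module followed by its options, can be used in the formatters list of the test shown earlier.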
Monitoring Your Elixir App in Production
Benchee can help you discover potential performance bottlenecks, but what about how fast things really are in your production app?
To be able to discover new and existing bottlenecks, and solve bugs and other issues your users may face, you need to use an APM. AppSignal has been supporting Elixir developers for years and seamlessly integrates with your app. Bonus: We're the only APM that ships stroopwafels to new users 😎
Wrapping Up and Next Steps
In this tutorial, we discovered how to benchmark Elixir applications with the Benchee library.
We also learned how to compare the performance of different implementations of the same algorithm.
Yet we have only scratched the surface of Benchee's capabilities. As a next step, I highly encourage you to explore the available Benchee configuration options and visualization plugins.
Happy coding!