Suppose we have the same task written in different programming languages. The question is: which language is faster? To find out, we will write a simple program and implement the same code in NodeJS, Deno, Python3, Golang, Rust and C. Then we will benchmark each implementation and compare the time each language took to complete the task.
Let’s get started! For our test we will code an application that reads a large file, searches for a specific string, and outputs the index of that string once it is found.
The task written in JavaScript looks like this, and the same code will be written in the other languages:
To ensure each programming language does exactly the same task, we have the following restrictions:
Constraints:
The buffer size in all languages is 65536 bytes, which is the default for NodeJS.
The file that all programming languages search must be the same; in this case it is a 10GB binary file filled with random characters.
The string to be searched for is “--boundary--”, which is at the very end of the file and won’t be changed.
For the benchmarks, each language performed the task 100 times.
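To make the constraints concrete, here is a minimal Python sketch of the task under the rules above: read the file in 65536-byte chunks and return the byte offset of the first match. It is an illustration, not the code from the repository. The function name find_in_file and the overlap handling (carrying the last len(needle) - 1 bytes into the next chunk so that a match split across two reads is still found) are assumptions made for this sketch.

```python
CHUNK_SIZE = 65536        # buffer size from the constraints above
NEEDLE = b"--boundary--"  # search string from the constraints above


def find_in_file(path, needle=NEEDLE, chunk_size=CHUNK_SIZE):
    """Return the byte offset of the first occurrence of needle, or -1.

    Hypothetical helper for illustration only.
    """
    overlap = len(needle) - 1
    tail = b""    # trailing bytes of the previous window
    consumed = 0  # bytes read from the file before the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return -1
            # Prepend the tail so a match straddling two chunks is visible.
            window = tail + chunk
            pos = window.find(needle)
            if pos != -1:
                # The window starts len(tail) bytes before this chunk.
                return consumed - len(tail) + pos
            consumed += len(chunk)
            tail = window[-overlap:] if overlap > 0 else b""
```

Calling find_in_file("some-large-file.bin") would return the offset of the boundary string, or -1 if it is absent. Without the overlap, a needle that happens to be split by a chunk boundary would be missed, which is an easy way for two implementations to silently stop doing the same task.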
In this first case, I tested the same application on NodeJS, Deno, Python3, Go, Rust and C.
I measured the benchmarks with a Linux program called “Hyperfine”, which is dedicated to this purpose. I ran all the benchmarks on a fresh install of Manjaro with an AMD Ryzen 9 5950X (32) @ 3.400GHz CPU and no other programs running in the background, to keep the results consistent between runs.
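For readers who want a feel for what such a measurement involves, the sketch below is a rough Python stand-in, not Hyperfine itself: it runs a command a number of times and reports the mean and standard deviation of the wall-clock time. The function name time_command is hypothetical, and it does none of the warmup or outlier handling a dedicated tool provides.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=100):
    """Run cmd `runs` times; return (mean, stdev) of wall-clock seconds.

    Hypothetical helper for illustration only.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a crash is not
        # mistaken for a fast run.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev
```

A low mean corresponds to "faster" and a low standard deviation to "more stable" in the results that follow.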
Results (lower and more stable is better)
Deno: The first thing we notice is that Deno is the slowest and most unstable of them all. I was expecting this, and I believe it is one of the reasons Deno never took off and is still little used compared to alternatives like NodeJS. Even though it is written in Rust, it was very slow and unstable on this task.
StreamSearch NPM package: Applications like “Busboy”, made for Express/NodeJS, and others that depend on it, like “Multer”, rely on StreamSearch, which searches a stream using the Boyer-Moore-Horspool algorithm. In this particular case the algorithm made the task slower than the rest.
Rust: I hear all the time about how fast Rust is, but on this particular task it was not particularly good and was very unstable compared to the rest, not to mention that the code was hard to write due to its complexity.
Node: The level of optimization Node has received, its large amount of documentation, and its friendliness to new users make it the most used. Even though it was not blazingly fast, it completed the task and was very easy to write.
Go: In this particular case Go was superior to Rust, and was much easier to write, well… compared to Rust.
Python 3: This result was a big surprise for me, but the level of optimization Python has received helps explain why it is one of the most used programming languages in the world. The new version 3.11 claims to be even faster; I will have to test it.
C: Being the fastest of all was not a surprise. C is known for being extremely fast, but it is also hard to write, and if you ask me I wouldn’t recommend migrating to C just to gain a few seconds of performance, unless you are a big company, of course.
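Since the StreamSearch entry above mentions the Boyer-Moore-Horspool algorithm, here is a minimal Python sketch of that technique for readers who have not met it. This is not StreamSearch's code; the function name horspool_find is hypothetical, and a pure-Python loop like this is meant to show the idea, not to compete with the natively implemented bytes.find.

```python
def horspool_find(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first occurrence of needle, or -1.

    Hypothetical helper for illustration only.
    """
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Bad-character table: for each byte in the needle (except its last
    # position), the distance from its rightmost occurrence to the end.
    # Bytes missing from the table allow a full shift of m.
    shift = {b: m - 1 - i for i, b in enumerate(needle[:-1])}
    pos = 0
    while pos <= n - m:
        if haystack[pos:pos + m] == needle:
            return pos
        # Shift based on the haystack byte aligned with the needle's end.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1
```

The appeal of the algorithm is that a mismatch can skip up to len(needle) bytes at once. With a 12-byte needle and random data, how much that helps compared with a tuned native search is something to measure, not assume.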
Additional: Testing different NodeJS versions.
All versions performed about the same, but the latest version, 18, was notably slower than the rest, so it may not be a good idea to upgrade to the latest version of Node, at least not yet.
Final Ranking:
Final thoughts:
If I were to migrate an application to improve performance, I would definitely consider Python 3.10.4 as one of my options, because it is easy compared to Node or Go, simple to write and implement, and its documentation is available everywhere. The errors in the console are not friendly at all and don’t tell you much about the error itself, compared to Rust, for example, whose console errors tell you exactly what’s going on. Even so, Python may be the best and fastest option, at least for now.
If you want to check the code I wrote to perform these tests, please see my repository: github.com/FredySandoval/benchmark-indexof
Top comments (14)
Whenever Rust performs worse than an interpreted language, my alarm bells ring :) It usually plays within the same league as C/C++.
Your Rust implementation doesn't stop on the first match but always continues to the end of the file in order to find all matches. Also the sliding window method isn't very fast, as you can see in the comments: stackoverflow.com/questions/359015...
I suggested an alternative implementation here which performs twice as fast on my machine: github.com/FredySandoval/benchmark...
Maybe you could do an update on your numbers?
There are a lot more optimizations one could do, but I guess that's true for the other languages as well.
OTOH it would have been very nice if Rust's stdlib had an idiomatic method for finding subslices. There's a crate for it: docs.rs/subslice/0.2.3/subslice/tr... but I don't know how it performs.
Are you not really also testing the implementation of finding a string in a block, rather than reading? I'm presuming some of the "odd" timings may be down to how this is implemented.
For example in Rust you are using match and the haystack thing, rather than find - not enough of a Rust programmer to comment on which is better, but there does at least seem to be a choice of method.
Yes, I also think the bad Rust results may come from the find_subsequence impl in Rust.
Indeed, it's creating a slice iterator that will generate an iterator over 65536 arrays then launch a find algorithm over the iterator.
But maybe the rust issue comes from elsewhere. We should measure.
You can find more extensive language benchmarking around the internet, for example,
benchmarksgame-team.pages.debian.n...
programming-language-benchmarks.ve...
The second site provides summary plots, which rank some languages running the GZip algorithm: C, Chapel, Julia, Rust, C++, C#, FORTRAN, Pascal, F#, Go, Java, OCaml, Haskell, Swift, JavaScript.
I will definitely take a look at those sites, thanks.
Interesting comparison. Keep in mind though that it mostly just compares time to read from a file. For other tasks, you will find rather different results. For example, Python's "surprising" 2nd place performance, only slightly behind C, isn't actually that surprising for this task. I would expect the Python case to be only marginally behind C as you saw, and ahead of the others. Why? Almost all of the cost of this task is in the file i/o, and in this case in the calls to Python's f.read, which the Python interpreter implements natively in C (link to source code). So the small time difference between the C and Python versions comes from the rest of the benchmark task, which is little compared to the overhead of the file i/o portion.
Your example demonstrates that Python can be very fast if the task at hand spends a lot of time in functions that the Python library implements natively.
If you really want to compare speed of languages, you need to use a variety of benchmark tasks. Each language may excel in different areas. So you want to look at performance over a diverse set of tasks.
I agree, thanks for the comment.
I would not have expected this outcome. How can it be that Python is faster than Go and Rust?! This seems ridiculous. I had several tasks using Python to iterate over much smaller files and it took much longer than 2.x seconds. My machine also has similar specs.
Python is definitely slower on other tasks, but it is receiving a lot of improvements at byte level, and the beta version claims to be 3 times faster. Thanks for the comment.
The C implementation only looks for "--bou", not "--boundary--"; this might have a positive impact on the performance.
I just found the memmem crate for Rust ( docs.rs/memmem/latest/memmem/ ) which improves the performance even more (7.47 s down to 2.05 s) and the readability improves significantly.
Thanks, you're right. I will update the results.
Why not try it with PHP?
I will definitely give it a try.
I would rather go with Node, which is faster than any Python.