rhymes

Posted on Nov 5, 2018

An example of why performance matters (with Python and Rust)

#rust #performance

Long intro

My first real post on dev.to, in September 2017, was the following:

Adventures in TrailDB with millions of rows, Python and Go

rhymes ・ Sep 22 '17

#traildb #python #go

I was trying to extract information from around 60 GB of CSV files corresponding to 139 million events. I started with Python to see how it behaved. The experiment was sparked by my frustration at Redshift and because I wanted to play with TrailDB, a library to query event series. My tests were non-scientifical but, after switching to Go (by copy and pasting code because I didn't really know the language back then), I was able to setup the DB with a speedup of 2.6 times than in Python and to query the data 2.54 times faster.

The topic of speed and performance is dear to probably everyone on this website,even if speed and performance can be relative to a context (see the concept of "fast enough"). You can see this topic permeate conversations on dev.to around the slowness of the web, memory occupation of browsers and desktop apps and other related topics. A couple of nice examples with long and interesting discussions attached:

The Web I Want

Chris James ・ Aug 20 '18

#web #html #performance #a11y

by @quii

Does your website really need to be larger than Windows 95?

tux0r ・ Sep 23 '18

#discuss #coding #technology

by @tux0r

Why you're here reading this

Nobody would argue against cost effective speed improvements, and this brings me to the gist of this post. An article titled Parsing logs 230x faster with Rust by André Arko (lead developer of Ruby's Bundler) caught my attention.

I've been aware of Rust's speed since... well that and its advantages around memory management is what everyone talks about when they talk about Rust :-D

I've since switched to two Rust based tools that I use everyday on the command line: bat instead of cat and especially ripgrep instead of grep and ack. The speed improvement is noticeable (thanks @dmfay for the tip) with the naked eye!

Back to the article. Arko wanted to query Bundler's treasure trove of 500 GB of logs per day to extract useful information about the community. Each log file contains millions of events in JSON (BTW: use structured logs if you can, JSON or key-value, you'll thank me later). Currently those files are sitting compressed in a S3 bucket for a few dollars per month.

Hosted logging solutions were too expensive so he tried to see if he could cook something up.

The first attempt was in Ruby and it took an insane 16 hours for a day's worth of data. Nope.

The second attempt was in Python using AWS Glue and the full power of Amazon's servers. He went down to 3 hours with an average of 36 minutes per each log file (out of 500) using 100 parallel workers for 1000 dollars per month. Nope.

The third attempt was in Rust. He initially went down to 3 minutes per file, then to 60 seconds per file. After fiddling with it more and receiving feedback from readers, he managed to parse a single file in 8 seconds (!!).

The fourth attempt was in Rust again and he used parallelization. It was 3.3x faster than the sequential attempt. That's how he got to the 230x multiplication factor in the title.

A few notes on the comparisons

If you read closely you'll notice the following:

the first attempt shouldn't probably be mentioned in the post because it collects less data than the others (and we don't know how much less)
the first attempt in Rust amounts to 8.33 hours if run sequentially, more than 30 times faster than the experiment with Python and Glue
the last "sequential" experiment in Rust amounts to a little more than 1 hour for the entire set of 500 GB which is a huge speedup

Deploy time

The last thing André Arko talks about is how he managed to deploy the Rust script so that it can work on the production logs stored on AWS. This part made me laugh:

I discovered rust-aws-lambda, a crate that lets your Rust program run on AWS Lambda by pretending to be a Go binary

Another wonder of distributing an app as a binary :D

On AWS Lambda the speedup he got was 78 times the initial Python example, not bad!

He did some calculations and it was safely in the free tier for AWS Lambda.

So he went from 1000$ a month to 0 a month, by rewriting a script with Rust.

I checked the repository of the script and people are already suggesting ways to make it even faster 😂

Stuff to think about if you made it this far

Performance can save you a lot of money
Knowing (or being willing to learn) more than one language is a good idea
Rust is definitely worth looking at for this kind of parsing
Sometimes better is better than good enough

Top comments (15)

Massimo Artizzu • Nov 6 '18

So he went from 1000$ a month to 0 a month, by rewriting a script with Rust.

This phrase was used when this article was tweeted. And I thought: "Wait, does it mean he was fired?!" 🤣

Guney Ozsan • Nov 7 '18

I just logged in to like this XD

rhymes • Nov 6 '18

hahaha like the whole concept of automating yourself out of your job :D

Alain Van Hout • Nov 5 '18

Excellent write-up :) It's nice to explorative programming leading to such cost-effective solutions.

I do have a small quibble with the title, which you already touch upon in your second paragraph: it should be 'performance can matter'. Because under the more general statement also fall things like pointlessly dense one-liners and writing scripts to save 10 seconds of typing per day.

rhymes • Nov 5 '18

Because under the more general statement also fall things like pointlessly dense one-liners and writing scripts to save 10 seconds of typing per day.

Ah ah true :-) Saving 10 seconds is probably a stretch too far in the performance category but I see what you mean.

I have to honest: I'm not totally against dense one liners if they actually increase performance (after careful benchmarking), provided that they are well documented and hopefully isolated from the rest of the code.

Alain Van Hout • Nov 5 '18

That's where 'fast enough' comes in: if it takes you a minute to read the documentation, another minute or two to mentally parse the code and 15 minutes of carefully checking your changes, and perhaps 2 days debugging to notice what is actually wrong, then what point is a 2 ms speedup for something that likely has no gain from that speedup? (Which is far too common an occurrence)

Or more concisely: too many people have wasted too many hours due someone who felt a need to be clever.

There are of course plenty of situations where performance is really needed. And there it's worth investing the time to do benchmarking and properly document the code. But most of the time, I'd be grateful if code were maintainable rather than clever.

rhymes • Nov 5 '18

I totally agree with you, only a comment:

That's where 'fast enough' comes in: if it takes you a minute to read the documentation, another minute or two to mentally parse the code and 15 minutes of carefully checking your changes, and perhaps 2 days debugging to notice what is actually wrong, then what point is a 2 ms speedup for something that likely has no gain from that speedup? (Which is far too common an occurrence)

Well yes, if the speed up of the oneliner is 2ms then no, it's definitely not worth it. The only upside in your case is that you might have gain a better knowledge of the system but that's due to the 2 days of debugging, not due to the oneliner :D

Alain Van Hout • Nov 5 '18

Indeed :-)

Dustin King • Nov 6 '18

This reminds me of a talk where, if I recall, a Python reporting process was sped up by first improving the algorithm, then compiling with Cython:

Rishabh Gupta • Nov 6 '18

Maybe this experiment can benefit if compiled in cython 🤔

rhymes • Nov 6 '18

Maybe, it really depends on the code

smuschel • Nov 5 '18 • Edited

One thing that would fit your 'stuff to think about...' list nicely is 'always choose the right tool for the job'. But then again I think everybody knows that one by now