In my earlier experience, my Flask apps were pretty simple and everything worked just the way I had built it from scratch. I loved it, but then the inevitable happened: my application got really slow and I HAD to do something about it. In this post I’ll tell the story of how I hunted down the bottleneck in my Flask app and fixed the problem, and I’ll share some amazing tools I used along the way.
So I have a Flask application and a MySQL database that contains a lot of super-nested one-to-many objects. And by ‘super-nested’ I’m not exaggerating: a single object made up of 3000 rows from five different tables is a common case. At the beginning everything was fine, but at some point processing a request started to take 3-5 or even 10 seconds! Oh.My.God. This was not something I wanted to happen, so I started thinking about what might be causing the problem.
At first, I naively went through my code base and tried to guess where the bottleneck might be. The answer seemed obvious: I’m using the marshmallow library to serialise and validate data before inserting it into the database and after selecting it. This is it, right? Googling “marshmallow + slow” returned a few articles that confirmed my suspicions: it’s the library. One of the articles I found was by some guys from Lyft, who said: “We’re using marshmallow and it’s so slow, so we created toasted-marshmallow, which is 15x faster”. Amazing! I thought. This is what I need. At that point my requests took an average of three seconds. So I updated my code to use toasted-marshmallow and got my red ribbon and scissors ready. Sent a request… bam. Six seconds. That was impressive.
I got pretty upset because I thought I would have to rewrite half of my app’s logic. And then a colleague asked me whether I had tried using a profiler. Yes, at this point I have to admit I didn’t know profilers existed. I’m happy I do know now, though!
What is a code profiler? Long story short, it’s a tool for dynamic code analysis that helps you detect performance problems, also known as bottlenecks, in your program. The profiler gathers various metrics while your program runs, such as how often each function is called and how long it takes, and based on this information you can decide where to focus your optimisation effort.
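To make that a bit more concrete, here is a minimal sketch using cProfile from Python’s standard library. The functions slow_sum, handler and profile_call are made up for illustration and have nothing to do with my actual app; the point is only to show what kind of report a profiler gives you.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Deliberately wasteful: builds a whole list just to add it up.
    return sum([i * i for i in range(n)])


def handler():
    # Hypothetical stand-in for a request handler that calls the slow helper.
    return [slow_sum(50_000) for _ in range(20)]


def profile_call(func):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)

    # Render the five entries with the highest cumulative time into a string.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(handler)
print(report)
```

In the printed table, the rows with the largest cumulative time are the first places worth looking at.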
There are a lot of profiling tools for Python code, and some of them, like profile and cProfile, ship with the standard library. Since I’m talking about a Flask application, let’s see what the world has specifically for it. There is a beautiful lib called flask-profiler, which has a web interface with some cool features such as route and date filters. But Werkzeug, the WSGI library Flask is built on, also comes with its own profiler. It looked awesomely easy to use, so it was the first, and the last, one I tried.
To use the built-in profiler you’ll need to add only two lines of code to your project (wrapping app.wsgi_app rather than app itself keeps app.run() working):
from werkzeug.middleware.profiler import ProfilerMiddleware
app.wsgi_app = ProfilerMiddleware(app.wsgi_app)
You can also configure it: for example, specify profile_dir to save the profile data to disk, or set restrictions to limit the stats you want to see.
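Here is a rough sketch of what such a configuration could look like. The create_profiled_app factory, the /ping route and the "profiles" directory name are placeholders I made up for the example; restrictions, sort_by and profile_dir are parameters of Werkzeug’s ProfilerMiddleware.

```python
import os

from flask import Flask
from werkzeug.middleware.profiler import ProfilerMiddleware


def create_profiled_app(profile_dir="profiles"):
    # Hypothetical factory and route, only here to make the sketch runnable.
    app = Flask(__name__)

    @app.route("/ping")
    def ping():
        return "pong"

    # Create the dump directory up front; as far as I know the middleware
    # does not create it for you.
    os.makedirs(profile_dir, exist_ok=True)

    # Wrap the WSGI callable, not the Flask object, so app.run() keeps working.
    app.wsgi_app = ProfilerMiddleware(
        app.wsgi_app,
        restrictions=(30,),       # print only the first 30 rows of the sorted stats
        sort_by=("cumulative",),  # order the printed stats by cumulative time
        profile_dir=profile_dir,  # also dump one .prof file per request here
    )
    return app
```

After that you would call create_profiled_app().run() as usual, and every request should leave a .prof file in the chosen directory.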
After adding the two lines before the Flask app.run() call and running the program, a summary for each request will be printed to stdout.
Sometimes this short summary can give you an idea of what’s slowing your program down. But usually it’s more interesting to see the detailed analysis, which is written to the chosen profile directory as *.prof files.
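If you just want a quick look at those dumps without any extra tools, the standard pstats module can read them. Below is a small sketch under my own assumptions: summarize_profiles and the profiles_demo directory are hypothetical names, and the demo at the bottom writes a sample dump with cProfile only so that there is something to read.

```python
import cProfile
import glob
import io
import os
import pstats


def summarize_profiles(profile_dir, top=10):
    """Merge every *.prof dump in profile_dir and return a text report."""
    paths = sorted(glob.glob(os.path.join(profile_dir, "*.prof")))
    if not paths:
        return "no .prof files found in " + profile_dir

    buffer = io.StringIO()
    # Passing several files to Stats merges them into one set of statistics.
    stats = pstats.Stats(*paths, stream=buffer)
    stats.strip_dirs().sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


# Demo only: write one sample dump so the function has something to read.
os.makedirs("profiles_demo", exist_ok=True)
cProfile.run(
    "sorted(range(100000), key=lambda x: -x)",
    os.path.join("profiles_demo", "sample.prof"),
)
print(summarize_profiles("profiles_demo"))
```

Pointing summarize_profiles at the directory you passed as profile_dir should give a combined view over all the requests recorded so far.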
There are a few tools to visualise the profile dumps. Some of them provide a full GUI for navigating your profiling results (RunSnakeRun), while others render the analysis as a graph (gprof2dot).
I settled on snakeviz, which is a browser-based visualiser. It is easily installed with pip install snakeviz, and then simply run with snakeviz profile_dir. The result looks something like this, and you can dive into each of the visual parts to see it in more detail, which in my opinion is super cool and handy.
After analysing the visual representation of the profiling results, it turned out that most of my performance problems came from the interaction between the service and the database. Once I knew exactly which lines of code took an inexcusable amount of time to execute, I was able to solve my problems either by rewriting the queries or by improving the code itself. Improving performance by optimising database interaction is a topic for a whole new article, so I will probably write about my experience with it later.
To sum up: of course, automatic profilers are not perfect and they won’t show you 100% of the mistakes in your code. Sometimes it’s the simple ‘staring at your code’ method that actually works really well. But at least with the help of profilers you can find the weak spots, which in most cases is more than enough.
Thank you for reading and I hope it was at least somewhat useful. :)