Code optimization and refactoring are crucial for enhancing the efficiency and speed of software. Share your experience of a specific instance where you had to tackle performance issues in code. What steps did you follow to improve it? We're interested to know the outcome of your efforts and any lessons learned along the way.
Top comments (12)
I have 3 performance tips:
It's almost always one form or another of caching (assuming it isn't a bug). One of my earliest examples of this was in the 80s, when I precalculated the results of trig functions and stored them in an array. I could then do the trig calculations my game needed for the cost of an array lookup.
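A minimal sketch of that lookup-table idea, written in JavaScript purely for illustration (the original would have been in whatever the 80s machine offered). The table size of one entry per whole degree and the names `fastSin`, `fastCos` and `wrapDegrees` are assumptions, not details from the comment:

```javascript
// Precompute sine values once, then answer every lookup from the table.
// TABLE_SIZE and all function names here are illustrative assumptions.
const TABLE_SIZE = 360; // one entry per whole degree

const SIN_TABLE = new Float64Array(TABLE_SIZE);
for (let deg = 0; deg < TABLE_SIZE; deg++) {
  SIN_TABLE[deg] = Math.sin((deg * Math.PI) / 180);
}

// Normalize any integer degree value (including negatives) into 0..359.
function wrapDegrees(deg) {
  return ((deg % TABLE_SIZE) + TABLE_SIZE) % TABLE_SIZE;
}

// Integer degrees only: a fractional angle would need rounding or interpolation.
function fastSin(deg) {
  return SIN_TABLE[wrapDegrees(deg)];
}

function fastCos(deg) {
  // cos(x) = sin(x + 90 degrees), so one table serves both functions.
  return SIN_TABLE[wrapDegrees(deg + 90)];
}
```

The trade-off is precision for speed: the table only knows whole degrees, which is often enough for sprite movement but not for anything that needs fine angles.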
About 10 years ago, on earlier Android devices, I worked on image processing apps and a couple of games. Floating point calculations were super slow on Android devices back then, which hurt image processing and video game math in particular. So step one was to convert those floating point calculations to integer calculations, which already gave a pretty decent performance boost. Step two was to rewrite those routines in C and call them through the Java Native Interface.
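To make step one concrete, here is a small sketch of replacing floating point with integer math, using a grayscale conversion as a stand-in for the image processing routines (the comment does not say which routines were involved). It is written in JavaScript only for illustration; the function names and the choice of weights scaled by 256 are assumptions:

```javascript
// Integer-only luma approximation. The weights 77, 150 and 29 sum to 256,
// so dividing by 256 becomes a right shift by 8 bits.
function grayFixed(r, g, b) {
  return (77 * r + 150 * g + 29 * b) >> 8;
}

// Floating-point reference using the usual fractional weights that the
// integer weights above approximate.
function grayFloat(r, g, b) {
  return Math.floor(0.299 * r + 0.587 * g + 0.114 * b);
}

// Convert an RGBA byte buffer to grayscale in place, using only integer math.
function toGrayscale(pixels) {
  for (let i = 0; i < pixels.length; i += 4) {
    const y = grayFixed(pixels[i], pixels[i + 1], pixels[i + 2]);
    pixels[i] = pixels[i + 1] = pixels[i + 2] = y; // alpha (i + 3) is untouched
  }
  return pixels;
}
```

This shows the technique rather than a speed-up you should expect today: modern JavaScript engines and modern phones handle floating point well, so the gain described above belongs to that older hardware.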
The next problem was garbage collector activity, which would stall the game, especially animations. So the optimization trick was to recycle all objects, arrays etc., so that nothing would ever be garbage-collected while the game was running. If the game had entities such as enemies or projectiles, I would keep a so-called "pool" for each entity type, retrieve objects from it when needed, and put them back when done.
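A pool like that could look roughly like the sketch below. It is in JavaScript for illustration (the original was Android/Java), and `createPool`, `acquire`, `release` and the projectile fields are all hypothetical names:

```javascript
// Minimal object pool: every instance is allocated up front and recycled.
// All names here (createPool, acquire, release, ...) are hypothetical.
function createPool(size, create, reset) {
  const free = [];
  for (let i = 0; i < size; i++) free.push(create());

  return {
    // Take an object out of the pool, or null if the pool is exhausted.
    // Returning null instead of allocating is a deliberate choice here:
    // nothing new is created while the game loop is running.
    acquire() {
      return free.length > 0 ? free.pop() : null;
    },
    // Reset the object's state and make it available again.
    release(obj) {
      reset(obj);
      free.push(obj);
    },
    // Number of objects currently waiting in the pool.
    available() {
      return free.length;
    },
  };
}

// Example: a pool of two projectiles, created once at startup.
const projectiles = createPool(
  2,
  () => ({ x: 0, y: 0, active: false }),
  (p) => { p.x = 0; p.y = 0; p.active = false; }
);
```

The pool size has to be chosen for the worst case you are willing to support; running out means skipping a spawn rather than triggering an allocation.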
Pools are great for this sort of thing! I hear games built with MonoGame use similar approaches, never allocating in the game loop. In C I just use static arrays for this kind of thing, which saves a bunch of work on memory management too. In C++ you can use custom allocators that'll automate this behaviour for you.
A while back, working on a SaaS platform, I implemented something I named "request caching" at the time: caching requests instead of the results. Much later I found out this practice is called "query batching", and that it worked just like the Dataloader library from Facebook, except that my solution worked with higher-order functions and could integrate seamlessly with transactions and other contexts like pagination. By adding this to the project's own ORM, the entire app got a performance boost.
By the way, this was at a time when Node apps were written with callbacks, not even with promises.
Also, tcacher still has some advantages over Dataloader, though it could be brought up to date for the age of ES modules.
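For readers who have not met query batching, here is a rough callback-style sketch of the general idea. It is not the API of tcacher or of Dataloader; `createBatcher`, `load` and the `batchFn(keys, callback)` contract are hypothetical and only show how many single-key requests can collapse into one call:

```javascript
// Hypothetical sketch: collect keys requested close together and resolve
// them all with a single call to batchFn(keys, callback).
function createBatcher(batchFn, schedule = queueMicrotask) {
  let queue = []; // pending { key, cb } entries for the next batch

  function flush() {
    const batch = queue;
    queue = [];
    const keys = batch.map((entry) => entry.key);
    // batchFn must call back with results in the same order as keys.
    batchFn(keys, (err, results) => {
      batch.forEach((entry, i) => {
        entry.cb(err, err ? undefined : results[i]);
      });
    });
  }

  return function load(key, cb) {
    queue.push({ key, cb });
    // Schedule exactly one flush per batch, when its first key arrives.
    if (queue.length === 1) schedule(flush);
  };
}
```

Every `load(key, cb)` made before the scheduled flush runs ends up in the same batch, so a `batchFn` backed by something like a single `WHERE id IN (...)` query replaces many one-row queries. The `schedule` parameter is there so the flush timing can be swapped out, for example for testing.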
I was part of a development team that were developing Microsoft ISAPI extensions in C++ for a web application.
Ironically, I've not done much performance tuning/refactoring in the 30+ years I've been developing or maintaining software (except perhaps for caching, as Shai Almog says ... caching!).
But my most memorable and lauded work has often been quite the opposite.
That is, actively slowing things down, or at least discarding performance as a criterion, in order to pursue competing goals (generally maintainability: the cost or even the ease of maintenance).
I'm not the only person to have landed on a project that was a black box because no one, since its implementation, has wanted to touch it. Anyone who looked at it saw a house of cards, a mysterium of complexity, and fled. The risks of making changes and the costs of a complete replacement were both judged too high ... Just leave it be; if it ain't broke, don't fix it.
But then a rewrite is budgeted, mainly because new hardware is bought, with new peripherals, firmware, OS etc. And so this black box needs porting, which equates to a rewrite.
The job begins with a deep analysis of the thing to be rewritten: teasing it apart and building internal documentation of the old system, an internal spec, all the things lost to time in this legacy system. Then comes the rewrite, but this time the main goal is often not to land here again: to have software that can be maintained and that endures staff turnover. With that goal eclipsing performance, and with a new generation of hardware providing enormous performance gains, a lot of the complicated and difficult-to-describe optimisations in the old software on the old hardware are tested against a simpler implementation on the new hardware, scrutinising performance, accuracy and precision (I have always worked in the engineering and science realms). If the new implementation is not significantly slower than the old, or if on the new hardware it is still faster, then with its clarity of code, internal documentation and interface specifications, it is the winner.
The result: performance optimisations removed in favour of simpler code that can be maintained into the future and evolve incrementally unlike the black box that just burst from its bubble.
As for actual examples, there's two that come to mind:
MinBy query checking squared distances. The JIT generated really tight vectorized code for this and it was super fast.
I once managed a 1000x speed up from some code. The language had two similar types: lists and arrays.
Lists had some features not available to arrays but were basically wrappers around strings. So doing any operation like sorting them involved a lot of string manipulation.
One feature was taking over a minute to run, and it was due to this list processing. Everything inside it could be done with arrays, so I tried changing it, converting to and from lists on either side of the system. I was hoping it would run in less than thirty seconds, but it came in at fractions of a second.
I guess the moral is: know your data types and how they are handled internally.
Yes.
When I joined my recent company, I was given the task of improving the performance of a socket server (Socket.io + Node.js).
At that time we could only handle 2K concurrent users; beyond that, our EC2 instance would go down due to high CPU usage.
We were making an API call whenever a user connected to the socket server. Making lots of parallel API calls resulted in high CPU usage, as each HTTP call needs its own connection setup (the TCP three-way handshake), which is CPU intensive.
So instead of making API calls, we decided to run the DB queries directly on the socket server, eliminating the REST call.
Then we started queuing requests in a local in-memory queue and processing multiple users' requests in a single batched DB query (one DB query per 500 requests).
Then we used Redis to horizontally scale these socket servers.
Now we are handling 35K concurrent users, and we have tested our service to handle 100K concurrent connections per EC2 instance (1 CPU, 2 GB RAM).
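The in-memory queue step could be sketched roughly as below. Only the batch size of 500 comes from the comment; `createBatchQueue`, `enqueue`, `flush` and the shape of `runBatchQuery` (taking user ids and calling back with a `Map` of rows keyed by user id) are hypothetical stand-ins for whatever the real server and database layer look like:

```javascript
// Hypothetical sketch: buffer per-user requests in memory and resolve them
// with one batched query once `batchSize` requests have piled up.
function createBatchQueue(runBatchQuery, batchSize = 500) {
  let pending = []; // queued { userId, cb } jobs

  function flush() {
    if (pending.length === 0) return;
    const batch = pending;
    pending = [];
    const userIds = batch.map((job) => job.userId);
    // One query for the whole batch; rowsById is assumed to be a Map
    // keyed by user id.
    runBatchQuery(userIds, (err, rowsById) => {
      for (const job of batch) {
        job.cb(err, err ? undefined : rowsById.get(job.userId));
      }
    });
  }

  function enqueue(userId, cb) {
    pending.push({ userId, cb });
    if (pending.length >= batchSize) flush();
  }

  return { enqueue, flush };
}
```

As an assumption beyond what the comment states, a real server would probably also call `flush()` on a short timer, so that a partly filled batch does not wait indefinitely during quiet periods.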
This is oddly specific, but for some reason Array.from is faster than a spread operator on strings in JavaScript.
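If you want to check this on your own runtime, a rough timing harness might look like the following. It assumes Node.js (for `process.hrtime.bigint`); the sample string, the iteration count and the `timeIt` helper are arbitrary choices, and micro-benchmarks like this vary a lot between engines and versions, so treat the numbers as a hint rather than proof:

```javascript
// Time a function over many iterations; returns elapsed milliseconds.
function timeIt(fn, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  return Number(process.hrtime.bigint() - start) / 1e6;
}

const sample = "performance".repeat(100);

// Both forms iterate the string by code point and build an array of characters.
const viaFrom = () => Array.from(sample);
const viaSpread = () => [...sample];

console.log("Array.from:", timeIt(viaFrom, 2000).toFixed(1), "ms");
console.log("spread:    ", timeIt(viaSpread, 2000).toFixed(1), "ms");
```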