Hafez

Software Performance – A Pragmatic Guide

A code example:

let i = 0;

const stopAt = Date.now() + 1000;

while (Date.now() < stopAt) {
  i++;
}

console.log(i); // ~8,312,450

Running the script above on my machine outputs a number around 8 million.

This means my machine can read the current date, compare it to a point in time, and increment a variable 8 million times in a single second.

How to tackle performance for your application

We've seen developers spend endless hours optimizing their code to squeeze out every last bit of performance, and we've seen developers not care about performance at all as long as the code worked.

To understand the best way to tackle performance, we have to understand why we write software in the first place.

We write software to help people.

The more we write software, the more tools, technologies and methodologies we get introduced to, the easier it is to forget the sole purpose of the software we build: to help people.

With that in mind, any change we introduce to our software has to achieve one of two goals:

  1. A change that makes our software more helpful for people.
  2. A change that enables us to make the software more helpful for people.

Writing code that runs fast serves the people who use our software. But to make our code run fast, we often have to add clever tricks, and clever tricks make our code more difficult to work with.

So we need to find a balance between speed and maintainability. That balance varies a lot depending on our application, who we serve, how fast our code needs to run, and, most importantly in my opinion, how often our code runs.

For example, the way developers handle performance at Google is completely different from how we should handle performance for an MVP for a startup.

  • When a developer at Google writes a line of code, it has the potential to run billions of times in a matter of days.
  • When we write a line of code for a startup, that line may run only a few times a day, or never at all.

So at a startup, it may make sense to overcome performance issues by paying an extra $10/month for a better machine to run our code, but doing the same would cost Google a lot more. For Google, spending the time and effort optimizing the code is cheaper than upgrading hundreds of thousands of machines.
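To make that trade-off concrete, here's a back-of-envelope sketch. Every number in it (hourly rate, hours spent, fleet size) is a made-up assumption for illustration, not a real figure from Google or anywhere else:

```javascript
// Back-of-envelope comparison. All numbers are made-up assumptions.
const betterMachineCostPerMonth = 10;   // the "just buy a bigger box" option
const engineerHourlyCost = 100;         // rough fully-loaded hourly rate
const hoursSpentOptimizing = 40;        // say, a week of tuning

// Startup: one machine, one year of the upgrade.
const startupYearlyCost = betterMachineCostPerMonth * 12; // $120

// One-off engineering effort to optimize instead.
const optimizationCost = engineerHourlyCost * hoursSpentOptimizing; // $4,000

// Fleet scale: the same $10/month upgrade across 100,000 machines.
const fleetMonthlyCost = betterMachineCostPerMonth * 100000; // $1,000,000/month

console.log({ startupYearlyCost, optimizationCost, fleetMonthlyCost });
// For the startup, the hardware is far cheaper than the engineering time;
// at fleet scale, the engineering time is far cheaper than the hardware.
```

The exact figures don't matter; what matters is that hardware cost scales with fleet size while the optimization effort is paid once.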

Optimize where it matters

Back in the day, developers used to argue a lot about the following question:

let i = 0;

// Which one is faster? i++? ++i?
i++;
++i;

Which one is faster, i++ or ++i?
Here's the answer: it does not matter.

It's really easy to lose focus and try to optimize bits of code that already run in microseconds. A typical machine can perform that operation dozens of times in less than a millionth of a second.

So a change that optimizes bits like that works against one of the two main goals of any code change, keeping the code easy to work with, and the difference will be a few microseconds at best.

A typical program spends most of its time waiting on I/O. So instead of trying to shave a loop from 2 microseconds down to 1 microsecond, that optimization energy is much better spent optimizing a database query, or grouping HTTP requests together.
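A small sketch makes the gap visible. The 100 ms delay below is a stand-in I picked to simulate a database or HTTP round trip, not a measurement of any real system:

```javascript
// A stand-in for a network round trip; the ~100 ms delay is an
// assumption for illustration, not a real measurement.
const fakeQuery = () => new Promise(resolve => setTimeout(resolve, 100));

// CPU work: sum a million numbers.
const start = Date.now();
let sum = 0;
for (let i = 0; i < 1000000; i++) sum += i;
const cpuMs = Date.now() - start;

// I/O work: a single simulated round trip.
const ioStart = Date.now();
fakeQuery().then(() => {
  const ioMs = Date.now() - ioStart;
  // The one simulated query dwarfs the million-iteration loop.
  console.log({ cpuMs, ioMs });
});
```

On my assumptions, a million additions finish in a few milliseconds while a single round trip costs ~100 ms, which is why the I/O side is where optimization effort pays off.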

Equal effort !== Equal reward

We may have the following JavaScript code that runs two database queries. The second query only starts after the first one has finished, even though there is no dependency between them:

const user = await User.findOne({ id: 1 });
const orders = await Order.find({ userId: 1 });

If we run the two queries in parallel instead of waiting for the first to finish, we can save ~100 milliseconds:

const [user, orders] = await Promise.all([
  User.findOne({ id: 1 }),
  Order.find({ userId: 1 })
]);

With just a couple of lines of code, we saved 100 milliseconds.

Notice how we're putting in the same effort, modifying a couple of lines of code, but the first change saves a few microseconds while the second saves 100 milliseconds (100,000x the first change).

Another example would be executing a database query in a for loop:

const usersIds = [1, 2, 3, 4, 5, 6, 7];
for (const userId of usersIds) {
  const user = await User.findOne({ id: userId });
  // do something with the user
}

This code takes an array of user IDs and, for each ID, queries the database to fetch that user. In our example that's 7 calls, resulting in 7 network round trips, which is very expensive.

If we change the code to fetch all the users with one query, then find each matching ID in memory, we save 6 network round trips, potentially 600 milliseconds:

const usersIds = [1, 2, 3, 4, 5, 6, 7];
const users = await User.find({ id: { $in: usersIds } });

for (const userId of usersIds) {
  const user = users.find(user => user.id === userId);
  // do something with the user
}

Now, when we look at this code, it's really common to notice that we loop over the array again inside the top-level loop, causing O(n^2) time complexity, and to want to convert the array to a hash map, bringing the top-level loop down to O(n):

const usersIds = [1, 2, 3, 4, 5, 6, 7];
const users = await User.find({ id: { $in: usersIds } });

const usersMap = {};
for (const user of users) {
  usersMap[user.id] = user;
}

for (const userId of usersIds) {
  const user = usersMap[userId];
  // do something with the user
}

But look at the facts: checking a single array item takes around 1 microsecond (a millionth of a second), so if our usersIds array has 10 items on average, the O(n^2) version takes about 10 × 10 = 100 microseconds, or 0.1 milliseconds.

There is room for improvement, but I'd rather spend my time improving performance where it's worth it. Notice also that we're writing more code to squeeze out those last bits of machine performance, which makes the code harder for developers to understand, costing us their expensive time. It's just not worth it; one could even argue that it degrades the code.
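If you want to check that claim yourself, here's a quick (and unscientific) micro-benchmark of the two in-memory lookups. The user objects are fake data I made up for illustration:

```javascript
// Ten fake users; the shape is made up for illustration.
const users = Array.from({ length: 10 }, (_, i) => ({ id: i, name: `user${i}` }));
const usersIds = users.map(u => u.id);

// O(n^2): array.find inside the loop.
let start = process.hrtime.bigint();
for (const userId of usersIds) {
  users.find(u => u.id === userId);
}
const findNanos = process.hrtime.bigint() - start;

// O(n): build a hash map once, then look up by key.
const usersMap = {};
for (const user of users) usersMap[user.id] = user;

start = process.hrtime.bigint();
for (const userId of usersIds) {
  usersMap[userId];
}
const mapNanos = process.hrtime.bigint() - start;

// Both runs finish in microseconds; at n = 10 the difference is noise.
console.log({ findNanos, mapNanos });
```

(`process.hrtime.bigint()` is Node-specific; in a browser you'd reach for `performance.now()` instead.) The point isn't that the hash map is never worth it, only that at this scale neither version will ever show up in a profile.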

Conclusion

  • We write software to help people. As long as people are happy with the software, and we can deliver it in a timely manner, we're doing a good job.
  • Just as important as helping people is enabling ourselves to keep helping them.
  • Most software performance bottlenecks lie in I/O: reading from an external source (HTTP/database) or from disk. Focus your energy on optimizing those.
  • When a software engineer at Google writes a line of code, it will run billions of times more often than the average developer's code. The average developer therefore shouldn't hold their code to the same standards.

Should we all follow the standards set by big companies? Or should we come up with our own?
Let me know what you think in the comments below.

About me

I'm a software engineer who's obsessed with automating things and helping people out by using technology.

Top comments (7)

James dengel

I feel what this pragmatic guide is missing is the reason for optimising code.

  1. No one will run code that does not do what we require it to do. So the right result from the code is priority number 1.
  2. Does it give the right result on a timescale in which the function gives benefit? Reloading a webpage: < 1 second. A weather forecast for tomorrow: < a few hours.

Keep the task in mind always :)

It’s an effective bar for optimisation

Hafez

Good point!

I'm not sure what you mean by "Weather forecast tomorrow < a few hours", though.

James dengel

Well if you want to predict the weather tomorrow but it’s going to take you till tomorrow to predict it, then it’s pretty useless.

Venkatesh KL
  1. Does it work as per our needs?
  2. Should it be better than it is now?
  3. If it should be better, how much?

I think asking the above questions would make life much easier.

I've seen people argue over timestamp with or without timezone when that value was never used in real-time, except for debugging purposes.

So I would support you n-times on this. Outcome-based improvements are the way to go; optimisations that meet a goal are sufficient. It doesn't need to work at Google's scale, it just needs to work at the scale of your user base and organization.
Great words. Cheers 👍

Uzlopak

To be honest, I always benchmark my code for the fastest implementation and check for V8 deoptimizations. Not because it is a waste of effort, but because there is already too much bad code on npm, which feeds the assumption that JavaScript is a piss-poor performance language. It can do better, but people always implement the "more convenient" or "more readable" solution, neglecting the fact that it needs to be fast and use as little RAM as possible to improve the end-user experience. The end user gives a damn about how readable your code is. If it is slow, it is slow.

Mahmoud Faragallah

Thank you for your article, Hafez. It helped me.

Hafez

More than happy to help