
The way programmers think about performance is too narrow

Rickard Sundén ・ 4 min read

Whenever I read a blog post, comment or website about the performance of some code, a programming language or some framework I'm usually annoyed.

Discussions are usually based around some performance benchmark with little to no analysis.

Performance benchmarks compare the raw speed, either in iterations per second or in total duration, of some code written in X amount of languages. Some may even show you some memory usage comparisons between implementations.

These discussions are often focused on relative performance. This framework is 5 times faster than that framework. This language can process 2 times as many items per second as that language.

They can also be expressed in absolute terms: this JS framework takes 120 milliseconds to update 1,000 DOM elements; this other JS framework takes 150 milliseconds to do the same thing. Who are these people updating 1,000 DOM elements at the same time?

From the perspective of a programmer obsessed with speed, when you look at these comparisons, it's pretty obvious what your decision should be: pick the fastest algorithm, language or framework.

I feel like this is a trap that many new programmers fall for. Maybe some more experienced developers do too.

I remember searching for what language I should learn 4 years ago when I was starting to learn how to program. I kept reading about some languages being REALLY slow and some being SUPER fast.

I remember thinking, why would I ever choose a slow language? I don't want to program in a slow language. That sounds really bad. I felt shame for liking Ruby.

But what do these measurements even mean in real life? What do people mean when they say something in programming is super slow?

Take Ruby, for instance. If we iterate over a 500 item array and do something with each item, it will probably be much slower than doing the same thing in a language like Rust. For the sake of argument, let's say that this specific Ruby implementation is 50 times slower than the Rust implementation. What would that look like in a real life situation?

Let's say it takes Ruby 2 milliseconds to iterate over that 500 item array.

Do I really care about the computer taking 50 times longer? It's 2 milliseconds. What if we're replacing a manual process of entering 500 customer orders a day? Maybe we had a dedicated person doing what our code does now. We're down from 8 hours a day to 2 ms, that is insane!

Let's extend that argument to 1 million items, with heavier work per item. Say it would take Ruby roughly 30 minutes when Rust could have done it in less than 1 minute. In the world of computers, that's an eternity.
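Timings like these are easy to sanity-check yourself. A minimal sketch in Ruby using the standard library's Benchmark module (the exact numbers will vary by machine and by what "do something" actually is, so treat any result as illustrative, not as a real benchmark):

```ruby
require 'benchmark'

items = Array.new(500) { rand(1000) }

# Measure wall-clock time for one pass over the array.
# The cost of the per-item block dominates the result.
elapsed = Benchmark.realtime do
  items.each { |n| Math.sqrt(n) }
end

puts "500 items took #{(elapsed * 1000).round(3)} ms"
```

For anything beyond a rough feel, a tool like the benchmark-ips gem (iterations per second, with warmup) gives more trustworthy numbers than a single `realtime` call.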

But then you have to ask yourself, do I really care if it takes my program 30 minutes to execute versus 1 minute? Someone on the internet definitely does, but do I?

Well it depends. What does it depend on?

Your needs, your customer's needs, and your business' needs.

Your time is valuable. If the Ruby code took you 10 minutes to write and the Rust version took you 3 hours to write, and you only need to use that code once, you've probably wasted a lot of time.

What if this code needs to run 20 times a day, would it still be acceptable to use Ruby?

The answer would seem obvious: Ruby would need a total of 10 hours every day to run, while the Rust implementation would take 20 minutes. Just on that first day, you will have saved A LOT of time.
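The back-of-the-envelope math here can be written out explicitly (the per-run numbers are the hypothetical figures from the example above, not measurements):

```ruby
runs_per_day = 20
ruby_minutes = 30   # assumed minutes per run in Ruby (from the example)
rust_minutes = 1    # assumed minutes per run in Rust (from the example)

ruby_daily = runs_per_day * ruby_minutes   # total Ruby machine time per day
rust_daily = runs_per_day * rust_minutes   # total Rust machine time per day

puts "Ruby: #{ruby_daily / 60.0} hours/day, Rust: #{rust_daily} minutes/day"
```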

But then again, is some random computer's processing time more valuable than your own? Servers are cheap, people's time isn't.

Here are some questions you could ask yourself in order to make a decision:

  • How important is this code to me or my business?

  • Could I find another solution that doesn't need to iterate over a million items? Different data structure, or algorithm?

  • Could I execute this program in parallel? More servers?

  • What if the code needs to change? Do I need to spend another 5 hours changing it?

  • Do we care if it costs extra to run my program? Does it mean anything in real dollar terms? 10 cents, 10 dollars, 20,000 dollars?

  • How fast do I need results from the program? Real time, once a day, every five minutes?

  • If my business grows, will I need to run this 100, 50,000, 5 million times a day?

  • How difficult is this code to write? Will I have to write 50 lines or 1,000 lines? What about the quality of code? Am I proficient in this language?

  • What about running tests? Can I test with 2 items? Do I have to test with 1 million items?

  • Does a faster language give me any additional benefits besides speed?

  • Will my company or I be able to hire other people to work on this code? Are they expensive?

  • Can I switch to a faster implementation or language later?

  • Did I even have fun writing this program?

After answering some of these questions, you may decide that going for the faster implementation, language or framework is justified. But usually, faster comes with a tradeoff: more development time, more complexity and harder-to-read code.

As programmers, we deal with things in the abstract a lot. But there are times when we shouldn't. Try to think about performance in real terms. Does it really matter if my solution is 50 times slower? Will it actually cost me more money? Do I really need to spend extra time making this faster? It's up to you!

Discussion

Jorge Castro

The main problem with performance is that, once performance is lost, it's sometimes impossible to get it back. We can add more machines (and toss more money at the problem), but that only works up to a point. Once we hit the peak of a single machine, we need to add more machines, and that means changing the code and adding more moving parts to the system (moving parts that could each break something).

For example, let's say the next pseudo-code:

products=query("select * from products");
show(products);

This code could work fine with a small set of data: 10 products, yes; 100, yes; 1,000, well, yes too. But what about 20k? If we try to show them on the screen (web), it could crash the browser or slow it to a crawl.

Now, what about 1 million products? It will slow down the whole system, including the database.

Then:

products=query("select * from products").paging("1 to 20");
show(products);

This change makes a big difference. If the pagination is done in the database, the system can scale to 1 million products in a snap, without adding new servers or infrastructure, and it's just a single line of code. But sometimes this simple fix requires rebuilding the core of the system (i.e. practically most of the code), which shows that we failed to evaluate the system up front.
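Jorge's pseudo-code maps naturally onto SQL's LIMIT/OFFSET. A hedged Ruby sketch of the same idea, using an in-memory array to stand in for the products table (a real application would push the limit and offset into the database query itself, which is the whole point):

```ruby
# Stand-in for the products table; in a real system this data
# would live in the database and never be loaded whole.
products = (1..1_000_000).map { |id| { id: id, name: "Product #{id}" } }

# Roughly equivalent to:
#   SELECT * FROM products LIMIT per_page OFFSET (page_number - 1) * per_page
def page(rows, page_number, per_page = 20)
  offset = (page_number - 1) * per_page
  rows[offset, per_page] || []
end

first_page = page(products, 1)
puts first_page.first[:name]
```

The design point is that only `per_page` rows ever travel from the database to the screen, regardless of how large the table grows.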

If my business grows, will I need to run this 100, 50,000, 5 million times a day?

That is exactly the main point of software architecture: evaluating size and impact BEFORE we start coding. We must estimate a maximum size, the point at which we hit the ceiling and need to invest in more machines and new software, or build a new version.

For example, SAP. One of my customers runs both SAP and a legacy ERP.

He runs SAP on high-performance servers, and SAP is dog-slow. Internal users complain about it a lot.

Meanwhile, the legacy ERP runs on a modest configuration. It's ugly at best, but it's blazing fast. Some internal users have even said they want to switch back to the old system.

Rickard Sundén (Author)

Hey Jorge, thanks for sharing! Totally agree that it's a judgement call you'll have to make before coding!