Raphael Jambalos

Posted on Jan 2, 2021 • Edited on Apr 11, 2021

Basics of Load Testing

#testing #devops #jmeter #tutorial

This is the first post of the Create Realistic Load Tests with JMeter series. This series will help you understand basic load testing concepts and create realistic load testing scenarios with JMeter best practices.

Load Testing Basics - we are here
Load Testing Best Practices - coming soon
Create your first load test case in JMeter - coming soon
Realistic Load Testing with JMeter - coming soon

In this first post, we learn about the basics of load testing and how we can use realistic load tests to optimize our system.

Why load test?

In the Software Development Lifecycle, load testing is often considered a luxury. Before we test how much traffic the application can carry, we have to make it work first. But after the deployment, fixes and new features come up, and load testing is pushed further down the "laundry list."

When an incident occurs, the team scrambles to throw money at the problem (by scaling up servers). If the incident is major enough, we'd see some devs diverted to the task of optimization: adding DB indexes, fixing inefficient algorithms (for example, O(n^2) loops), and so on.

But wouldn't it be nice to know what kinds of problems an increased load would bring even before you deploy to production?

Enter Load Testing

Load testing tests how a particular application or system performs under a given load. The load is measured by the number of requests per second that we send to the app. To create a realistic test, we create test cases that reflect common user behaviors in the application. As an example, the workflow below reflects a simple one-product checkout workflow for an eCommerce startup.

The user signs in - /api/users/signin
The user gets redirected to the home page - /
Clicks on a product - /product/8888
Adds that product to their cart - /cart/add
Clicks the view cart - /cart
Clicks checkout - /checkout
Pays via PayPal - /payments/paypal

Companies usually define multiple test cases to cover different patterns of behavior. Other test cases for the typical eCommerce startup include browsing many products or creating a product review.

Then, we perform the load test with a specific number of virtual users (VU). For example, a load test with 50 VU simulates the behavior of 50 people accessing your website all at the same time. In our one-product checkout workflow example, we simulate 50 people going through each step of the workflow in sequence. Once a user is done with the 7-step sequence, they repeat it over and over until the load test ends (e.g., after 20 minutes). Hence, if the 7 steps take 30 seconds to complete, one VU will have completed the workflow 40 times over the 20-minute period.
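The arithmetic for this example can be sketched in a few lines of Python. This is a back-of-the-envelope estimate, not something JMeter produces. The names WORKFLOW and estimate_load are made up for illustration, and the sketch assumes every VU takes exactly 30 seconds per pass with no ramp-up or think time.

```python
# Hypothetical back-of-the-envelope estimate; not part of JMeter.
WORKFLOW = [
    "/api/users/signin",
    "/",
    "/product/8888",
    "/cart/add",
    "/cart",
    "/checkout",
    "/payments/paypal",
]


def estimate_load(virtual_users, test_seconds, seconds_per_pass, steps):
    """Estimate how much traffic a looping workflow generates."""
    passes_per_vu = test_seconds // seconds_per_pass  # completed passes only
    total_requests = virtual_users * passes_per_vu * len(steps)
    return {
        "passes_per_vu": passes_per_vu,
        "total_requests": total_requests,
        "requests_per_second": total_requests / test_seconds,
    }


estimate = estimate_load(
    virtual_users=50,
    test_seconds=20 * 60,
    seconds_per_pass=30,
    steps=WORKFLOW,
)
print(estimate)
```

With the numbers from the example, each VU completes 40 passes, and the 50 VUs together send 14,000 requests, or roughly 11.7 requests per second.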

Metrics

After the load test, we are presented with 4 types of metrics that measure how our website responded to the load test.

Response Time - The time it takes to respond to a request
Throughput - The number of transactions per minute
Error Rate - The percentage of total requests that resulted in an error
Scalability - measures whether or not the infrastructure scales in response to an increase in traffic
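
To make the first three definitions concrete, here is a minimal sketch that computes them from a list of request records. The Sample record and the summarize helper are hypothetical names for this example, not JMeter output. Scalability is left out because it comes from watching the infrastructure, not from the request log.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One request made during the test (hypothetical record shape)."""
    elapsed_ms: int
    success: bool


def summarize(samples, test_seconds):
    """Compute average response time, throughput, and error rate."""
    if not samples:
        raise ValueError("need at least one sample")
    total = len(samples)
    errors = sum(1 for s in samples if not s.success)
    return {
        "avg_response_ms": sum(s.elapsed_ms for s in samples) / total,
        "throughput_per_min": total / (test_seconds / 60),
        "error_rate_pct": 100 * errors / total,
    }


samples = [
    Sample(elapsed_ms=120, success=True),
    Sample(elapsed_ms=180, success=True),
    Sample(elapsed_ms=900, success=False),
    Sample(elapsed_ms=200, success=True),
]
summary = summarize(samples, test_seconds=30)
print(summary)
```

For these four made-up samples over 30 seconds, the average response time is 350 ms, the throughput is 8 requests per minute, and the error rate is 25%.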

The higher the number of VUs in our load test, the worse these metrics tend to get. But some metrics will degrade more than others, and that will point you toward your next fix. After applying the fix, rerun the load test to measure the improvement. At this point, you may see another issue that needs to be addressed. To get the most value from load testing, you may need to repeat this process several times.

I have the load test results. So what now? 🤷🏼‍♀️

The job does not end with the results of the load test. We need to act on the results and see where the bottleneck is. The whole point of doing the load test, after all, is knowing which part of the system you can optimize so you can increase the amount of load your system can accommodate.

Solution 1: Throw more money at it 💰💰💰

The easiest way to increase your system's capacity is to add more servers or choose a more powerful server. We look at the performance metrics of each component of the system (e.g., servers, database, caching) and see which component has its CPU or memory utilization spike during the load test. That's the component we upgrade or add more servers to.

But this is just a stopgap measure. I've worked with customers who just kept increasing the size of their database instance. Eventually, we reached 24xlarge (the biggest DB size in AWS) and we couldn't upgrade anymore, so we had to get more creative.

While increasing compute capacity serves as a short-term solution, it comes at a steep price (literally - the customer's AWS bill blew up). Another cost is that by not addressing the issues causing the problems in the first place, more technical debt is pushed further down the line.

Solution 2: Tweak your app / web server's configuration

When we develop, we often use the default application server that comes packaged with the framework we are using. For Rails before version 5, that was WEBrick.

Most of us are aware that we should use an application server more suited to production when we deploy. In Rails, the Puma app server is often used (and has been the default since Rails 5). While this is a big upgrade from WEBrick, Puma will perform better if we take the time to learn how to tweak its configuration.

Some app servers have configuration that has to be adjusted based on the number of CPU cores in the machine.
Some have a thread count limit that defines how many sessions each machine can accommodate, and so on.

Taking the time to study your app server's configuration really goes a long way to ensure your configuration enables your app server to take full advantage of the expensive servers that you put it on.
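
As a rough illustration of those two kinds of settings, here is a sketch in Python (not a real app server configuration file). The one-worker-per-core rule, the threads_per_worker default of 5, and the function name are all assumptions for the example. The right values depend on your app server and workload, so check its documentation.

```python
import os


def suggest_app_server_settings(cpu_cores=None, threads_per_worker=5):
    """Derive illustrative worker/thread settings from the CPU core count.

    Assumes one worker process per core, a common starting point rather
    than a rule from any specific app server.
    """
    if cpu_cores is None:
        cpu_cores = os.cpu_count() or 1  # cpu_count() may return None
    workers = max(1, cpu_cores)
    return {
        "workers": workers,
        "threads_per_worker": threads_per_worker,
        # Upper bound on requests the machine can handle at the same time.
        "max_concurrent_requests": workers * threads_per_worker,
    }


settings = suggest_app_server_settings(cpu_cores=4)
print(settings)
```

On a 4-core machine, this sketch suggests 4 workers with 5 threads each, for at most 20 requests in flight at once. If a load test with 50 VUs queues up behind that limit, the thread and worker settings are one of the first things to revisit.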

Some web frameworks (e.g., Rails) are even commonly deployed with a web server like Apache or Nginx running in front of an app server such as Unicorn. This gives us two sets of configurations to tweak and optimize.

Solution 3: Fix the application 🛠⚙️🛠

JMeter produces rich HTML-based reports that show the four metrics on a per-endpoint basis.

The screencap of the report below shows the total number of requests sent per endpoint (labeled "Samples"). The KO column shows how many of them resulted in an error. The report also shows several response time statistics for each endpoint. We will cover them in the next post.

With the per-endpoint view, we narrow down which endpoint performs the worst across the 4 metrics. Once that's identified, we review the code and trace how the request is served from start to finish. I usually look for blocks of code that are computationally expensive and try to fix that. Then, I rerun the load test to see if there's any improvement. I repeat the process until I'm satisfied with the result.
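
The same per-endpoint comparison can also be done on the raw results. The sketch below assumes JMeter was set up to save results as CSV with a header row, and it uses only the label, elapsed (milliseconds), and success columns. The sample rows and the rank_endpoints helper are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; a real results file has many more columns and rows.
SAMPLE_CSV = """label,elapsed,success
/cart/add,120,true
/cart/add,140,true
/checkout,900,false
/checkout,1100,true
/,40,true
"""


def rank_endpoints(csv_file):
    """Return per-endpoint stats, slowest average response time first."""
    elapsed = defaultdict(list)
    errors = defaultdict(int)
    for row in csv.DictReader(csv_file):
        elapsed[row["label"]].append(int(row["elapsed"]))
        if row["success"].strip().lower() != "true":
            errors[row["label"]] += 1

    stats = []
    for label, times in elapsed.items():
        stats.append({
            "endpoint": label,
            "samples": len(times),
            "avg_ms": sum(times) / len(times),
            "error_pct": 100 * errors[label] / len(times),
        })
    return sorted(stats, key=lambda s: s["avg_ms"], reverse=True)


ranking = rank_endpoints(io.StringIO(SAMPLE_CSV))
for entry in ranking:
    print(entry)
```

In this made-up data, /checkout comes out on top with the slowest average response time and half of its requests failing, so that is the endpoint whose code we would trace first. To use it on a real file, pass an opened file object instead of the io.StringIO wrapper.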

As you may have guessed, this isn't really the best way to do this. Looking at each code block to see if it's computationally expensive is not the same as actually knowing if it is. There are so many other factors hidden from us that, at best, this is just intelligent guesswork.

Use Application Profilers

To address this, companies install an Application Profiler in their systems. These profilers sample a small percentage of the requests served by each endpoint and compute how long each call in the stack trace takes. With this data, the developer will be more certain which parts need to be optimized.
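
To show the idea of sampling, here is a toy sketch that times only a fraction of the calls to a function. It is not how New Relic or Datadog work internally. The sampled_timer decorator, the timings store, and the two example functions are all hypothetical.

```python
import random
import time
from collections import defaultdict
from functools import wraps

# function name -> list of elapsed seconds for the sampled calls
timings = defaultdict(list)


def sampled_timer(sample_rate, rng=random):
    """Time roughly `sample_rate` (0.0 to 1.0) of the calls to a function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if rng.random() >= sample_rate:
                return func(*args, **kwargs)  # not sampled: run without timing
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                timings[func.__name__].append(time.perf_counter() - start)
        return wrapper
    return decorator


@sampled_timer(sample_rate=1.0)  # sample everything so the demo is repeatable
def load_cart(user_id):
    return {"user_id": user_id, "items": [8888]}


@sampled_timer(sample_rate=0.0)  # never sampled
def render_home():
    return "<html></html>"


for user_id in range(10):
    load_cart(user_id)
    render_home()

for name, samples in timings.items():
    print(name, len(samples), sum(samples) / len(samples))
```

In production, a rate such as 0.01 would keep the timing overhead low while still collecting enough data over thousands of requests. A real profiler goes further and records every call in the stack trace, not just the decorated function.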

For profilers, I have worked with New Relic and Datadog. They have a free tier option that you can use to identify high-level problematic sections of your application. To get lower-level, per-function sampling, you will need a paid plan.

Solution 4: Create database indexes

Another angle to look at is the database queries. The latest RDS databases in AWS have a feature called Performance Insights. It allows engineers to see which SQL queries take the most time.

With this knowledge, we can re-examine how SQL calls are made in the application. The most common solution is to examine the database schema and see if we have created the appropriate database index for the heaviest queries.
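
Here is a small sketch of checking whether a query uses an index. It uses SQLite (from Python's standard library) rather than RDS so it can run anywhere. The orders table, its columns, and the index name are made up, and the exact wording of the query plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(n % 100, n * 1.5) for n in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE user_id = ?"


def query_plan(connection, sql, params):
    """Return SQLite's description of how it would run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


plan_before = query_plan(conn, QUERY, (42,))
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_after = query_plan(conn, QUERY, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists, the plan reports a scan of the whole table. Afterwards, it reports a search that uses idx_orders_user_id. Most relational databases offer a similar EXPLAIN command for confirming that a heavy query actually benefits from a new index.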

Be deliberate in adding indexes

Database indexes make read queries faster by maintaining a separate, sorted structure built from specific columns, so the database can find matching rows without scanning the whole table. However, this comes at some cost. Every time we write new data or modify existing data, the database has to do extra work to update the indexes.

If we keep on blindly adding new database indexes, we will end up with an excessive number of indexes. The indexes will weigh down the database writes, and it will be harder for us to optimize.

That's all!

In short, load testing allows us to see problems that will only surface when the system is under sustained load. It gives us the opportunity to simulate real-world behavior in a safe environment, allowing us to be one step ahead of costly bugs.

Special thanks to Allen, my editor, for helping this post become more coherent.

I'm happy to take your comments/feedback on this post. Just comment below, or message me!

Top comments (3)

Anton Moldovan • Feb 21 '24

Nice write-up.
Regarding the load tests tool, I suggest considering NBomber, a .NET tool for load testing. It's a modern and flexible .NET load-testing framework for Pull and Push scenarios, designed to test any system regardless of a protocol (HTTP/WebSockets/AMQP, etc) or a semantic model (Pull/Push).

Raphael Jambalos • Mar 3 '24

Thanks Anton! Will definitely check NBomber out

Siva Krishna • Jun 25 '21

Thank you, Raphael, for coming up with an elaborate article on the basics of load testing. This is really helpful for beginners and for those who wish to pursue a career in testing. Applications are data-intensive, and a large amount of data is exchanged in order to provide a better user experience. Load testing aids in identifying any issues that may obstruct the application's smooth operation, as well as any bottlenecks that may emerge due to excessive load on the application's software. The application under test (AUT) is evaluated and reported on under a variety of predicted and unexpected loads. End user reaction times, as opposed to business operations, CPU, and memory data, are reported in a way. This allows application/website owners to see how their site performs in a live environment. Here are a few interesting articles on load testing that I found useful and shared them for the benefit of other readers - bit.ly/2T4IdNL
