DEV Community

Arthur Rio

Interview Question: Best Size for a Thread Pool?

Introduction

This week I went through an interview process where I had to answer this question: determine a thread pool size to handle 5000 requests where each request takes 10 milliseconds. This is a common problem in production environments; every day we need to think about how to scale a microservice or how many instances of it to run. I will not discuss here whether simply increasing the number of threads is the best solution, nor will I cover concurrency and parallelism. Today I'll focus on how to solve this question, in case someday I need to come back here 👍🏻

Solution

It's hard to determine the right size for a thread pool. What I can do is estimate a size that answers the question by assuming a scenario that does not exist in practice: a server with UNLIMITED resources, where my only goal is to reach 5000 requests per second (RPS) with a response time of 10 ms. Under those assumptions, I can use the formula below.

Creating a formula based on requests per second

If each request took one second, then to handle 5000 requests in one second our server would need 5000 threads. (Remember, this is just a fictional scenario to make it easy to understand.)

$$\text{Performance (RPS)} = \frac{\text{Threadpool size}}{\text{response time in seconds}}$$

$$\text{Performance (RPS)} = \frac{5000}{1} = 5000 \text{ requests (threads)/s}$$
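As a quick check of the formula above, here is a minimal Python sketch. The function name `throughput_rps` and its millisecond parameter are my own choices for illustration, and the model assumes the same fictional server with unlimited resources.

```python
def throughput_rps(pool_size: int, response_time_ms: int) -> float:
    """Idealized requests per second: pool size divided by response time in seconds."""
    # Dividing by (response_time_ms / 1000) is the same as multiplying by 1000 first.
    return pool_size * 1000 / response_time_ms


if __name__ == "__main__":
    # 5000 threads, each busy for 1 second per request
    print(throughput_rps(5000, 1000))  # 5000.0
```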

However, in our case each response takes only 10 milliseconds, and we need to achieve a performance of 5000 requests per second. Since 1 second = 1000 milliseconds, 10 milliseconds = 0.01 s = 1/100 s.

$$10 \text{ milliseconds} = 0.01 \text{ s} = \frac{1}{100} \text{ seconds}$$

So we just adjust the formula:

$$\text{Performance (RPS)} = \frac{\text{Threadpool size}}{\text{response time in seconds}}$$

$$\text{Threadpool size} = \text{Performance (RPS)} \times \text{response time in seconds}$$

$$\text{Threadpool size} = 5000 \times \frac{1}{100} = 50$$
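The same arithmetic can be written as a small Python sketch. Both function names are hypothetical, and the "lockstep batches" model is an assumption of mine to sanity-check the result: it ignores scheduling, thread creation cost, and every other real-world overhead.

```python
def required_pool_size(target_rps: int, response_time_ms: int) -> int:
    """Threads needed in the idealized model: RPS times response time in seconds, rounded up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_rps * response_time_ms // 1000)


def idealized_makespan_ms(requests: int, pool_size: int, response_time_ms: int) -> int:
    """Time to finish all requests if the threads ran in perfect lockstep batches."""
    batches = -(-requests // pool_size)  # ceiling of requests / pool_size
    return batches * response_time_ms


if __name__ == "__main__":
    size = required_pool_size(5000, 10)
    print(size)                                   # 50
    print(idealized_makespan_ms(5000, size, 10))  # 1000
```

With 50 threads, the 5000 requests split into 100 batches of 10 ms each, which adds up to the one second we were aiming for.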

Conclusion

Remember, when answering this question we should make clear that the right thread pool size can't be determined from response time and target throughput alone. When we think about achieving better throughput, we should first identify all bottlenecks, which could be a database, an API, or even an algorithm. Pay attention to the server's limits, such as whether the workload is CPU-bound, memory-bound, or I/O-bound, as all these limits determine how many threads your system can handle. We should also consider whether our server is shared with other apps or systems. Increasing the number of threads without taking all these factors into consideration may not meet expectations, or may even create a point of failure in the system. And we have not even touched on concurrency and parallelism, an awesome topic related to this one.
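One way to see how far a real machine is from the idealized estimate is to measure it. The sketch below is only an illustration under loud assumptions: `handle_request` and `measure_rps` are hypothetical names, and a request is faked with `time.sleep`, which behaves like pure I/O waiting and says nothing about CPU-bound or memory-bound work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(response_time_s: float) -> None:
    """Stand-in for an I/O-bound request: the thread simply waits."""
    time.sleep(response_time_s)


def measure_rps(pool_size: int, requests: int, response_time_s: float) -> float:
    """Run the fake requests on a thread pool and return the observed requests per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # list() drains the iterator, so every task has finished before the clock stops.
        list(pool.map(handle_request, [response_time_s] * requests))
    elapsed = time.perf_counter() - start
    return requests / elapsed


if __name__ == "__main__":
    # Compare a few pool sizes against the idealized pool_size / response_time figure.
    for size in (10, 50, 200):
        observed = measure_rps(size, 1000, 0.01)
        print(f"pool={size:>3} ideal={size / 0.01:>7.0f} observed={observed:>7.0f}")
```

The observed numbers depend on the machine and on what else is running on it, so treat them as a rough reading of the overhead rather than a benchmark.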

This is just a short answer. In another post, I could bring more details about it and take all the factors mentioned above into consideration, along with other examples.

I hope you enjoyed it. All comments are welcome, if possible with a reference to a book or post, to enrich this post, which is open to editing and improvements!

I will leave some references below.

Resources

Java Concurrency in Practice - Chap. 8

Latency Numbers Programmer Should Know: Crash Course System Design #1 - YouTube


Top comments (2)

José Pablo Ramírez Vargas

The solution is overly simplistic, and if I were the interviewer, I would say the answer is incorrect. The problem here is that creating and managing threads have a cost associated that makes the calculation you show inaccurate. The more threads in the thread pool, the more management overhead. I understand you're trying to introduce the topic, but I think this warning should be in place since day 1.

Since you seem to be hinting you'll write more about the topic, I'll just stop here as you might delve into this further. Overall, I applaud any mathematical solution to problems. Most developers seem to forget mathematics exist.

Arthur Rio

Thank you, José, for your comment and for adding knowledge to this post!
For sure! This approach is very simple, just a quick answer in a fictional scenario. But you're right, and I hope to bring more posts about it as soon as possible and arrive at a full answer!