What is Bottleneck and why do I need it in my coding life?
If you've spent any time working with third-party APIs, you'll have come up against an issue where you make a tonne of calls to an API and it doesn't finish giving you what you want. You might get a helpful error like 429 - Too Many Requests, or something less helpful like ECONNRESET.
Either way, what is happening is that as a consumer of that API you are only allowed to make so many requests in a certain period of time, or the number of concurrent requests you're allowed to make is restricted.
In JavaScript your code might look something like this:
const axios = require('axios');

async function getMyData(data) {
  const axiosConfig = {
    url: 'https://really.important/api',
    method: 'post',
    data
  };
  return axios(axiosConfig);
}
async function getAllResults() {
  const sourceIds = [];
  // Just some code to let us create a big dataset
  const count = 1000000;
  for (let i = 0; i < count; i++) {
    sourceIds.push({
      id: i
    });
  }

  // Map over all the ids and call our pretend API, stashing the promises in a new array
  const allThePromises = sourceIds.map(item => {
    return getMyData(item);
  });

  try {
    const results = await Promise.all(allThePromises);
    console.log(results);
  }
  catch (err) {
    console.log(err);
  }
}
What's going to happen here is that the code will call the API 1,000,000 times as fast as possible, and all requests will take place in a very short space of time (on my MacBook Pro it's < 700ms).
Understandably, some API owners might be a little upset by this as it's creating a heavy load.
What do we need to do?
We need to be able to limit the number of requests we're making, potentially both in terms of the number of API calls in a space of time and in terms of the number of concurrent requests.
I'd encourage you to attempt to roll your own solution as a learning exercise. For example, there is a reasonably simple solution that can get you out of a hole using setInterval. What I think you'll find is that building a reliable solution that limits rate and concurrency is actually trickier than it looks and requires you to build and manage queues. It's even more complicated if you're clustering.
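As a taste of what a DIY approach involves, here's a minimal sketch of a setInterval-based limiter (the createNaiveLimiter name and its queue/drain logic are purely illustrative, not from any library). It caps the rate, but it does nothing about concurrency, clustering, or stopping the interval cleanly:

// A naive DIY rate limiter: drain one queued job on a fixed interval.
function createNaiveLimiter(intervalMs) {
  const queue = [];

  // Every intervalMs, take the next job off the queue and run it
  setInterval(() => {
    const job = queue.shift();
    if (job) {
      job.fn(...job.args).then(job.resolve).catch(job.reject);
    }
  }, intervalMs);

  // Returns a scheduler: calling it queues the work and hands back a promise
  return function schedule(fn, ...args) {
    return new Promise((resolve, reject) => {
      queue.push({ fn, args, resolve, reject });
    });
  };
}

// Usage: at most one call every 200ms
// const schedule = createNaiveLimiter(200);
// const result = await schedule(getMyData, { id: 1 });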
We can instead turn to a gem of a package on NPM - Bottleneck
https://www.npmjs.com/package/bottleneck
The author describes this as:
Bottleneck is a lightweight and zero-dependency Task Scheduler and Rate Limiter for Node.js and the browser.
What you do is create a 'limiter' and use it to wrap the function you want to rate limit. You then simply call the limited version instead.
Our code from earlier becomes:
const axios = require('axios');
const Bottleneck = require('bottleneck');

const limiter = new Bottleneck({
  minTime: 200
});

async function getMyData(data) {
  const axiosConfig = {
    url: 'https://really.important/api',
    method: 'post',
    data
  };
  return axios(axiosConfig);
}

const throttledGetMyData = limiter.wrap(getMyData);

async function getAllResults() {
  const sourceIds = [];
  // Just some code to let us create a big dataset
  const count = 1000000;
  for (let i = 0; i < count; i++) {
    sourceIds.push({
      id: i
    });
  }

  // Map over all the ids and call our pretend API, stashing the promises in a new array
  const allThePromises = sourceIds.map(item => {
    return throttledGetMyData(item);
  });

  try {
    const results = await Promise.all(allThePromises);
    console.log(results);
  }
  catch (err) {
    console.log(err);
  }
}

getAllResults();
As you can see, we've created a limiter with a minTime property. This defines the minimum number of milliseconds that must elapse between requests. We've set it to 200, so we'll make at most 5 requests per second.
We then wrap our function using the limiter and call the wrapped version instead:
const throttledGetMyData = limiter.wrap(getMyData);
...
const allThePromises = sourceIds.map(item => {
  return throttledGetMyData(item);
});
If there's a chance your requests will take longer than the minTime, you can also easily limit the number of concurrent requests by setting up the limiter like this:
const limiter = new Bottleneck({
  minTime: 200,
  maxConcurrent: 1
});
Here we'll ensure that there is only one request submitted at a time.
What else can it do?
There are many options for setting up Bottleneck'ed functions. You can rate limit over a period of time using the reservoir options - e.g. send a maximum of 100 requests every 60 seconds. Or, send an initial batch of requests and then subsequent batches every x seconds.
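For example, a limiter that allows at most 100 requests per 60-second window might be configured like this (a sketch using the reservoir options from the Bottleneck docs; the exact numbers are just for illustration):

const Bottleneck = require('bottleneck');

// Start with 100 'tokens' in the reservoir and top it back up to 100 every 60 seconds.
// Each scheduled job consumes one token; when the reservoir is empty, jobs queue until the next refresh.
const limiter = new Bottleneck({
  reservoir: 100,
  reservoirRefreshAmount: 100,
  reservoirRefreshInterval: 60 * 1000, // should be a multiple of 250ms
  maxConcurrent: 5
});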
The documentation over at NPM is excellent so I advise you to read it to get a full appreciation of the power of this package, and also the gotchas for when things don't behave as you expect.
Wrapping up
If you're ever in need of a highly flexible package that deals with rate limiting your calls to an API, Bottleneck is your friend.
Top comments (19)
I tried the above suggestion to make multiple requests to a remote third-party API to solve the "socket hang up" issue (ECONNRESET) in Express.js (Node.js) with multiple requests, but I'm still getting the error. I will appreciate your help. Thanks
That means the other end of the connection closed it for some reason. Can you share your code?
This is the code I use to simulate the bulk operation
Then this is the one for bottleneck
How you're wrapping the function looks fine. If you call initializeParsingProcess() directly, what happens?
I will get the error below when I call it directly. I used Bottleneck to see whether that could be solved after going through your amazing post here, but I'm still getting the same error below
Ah, I see, I thought you were suggesting the problem was with your usage of Bottleneck. Can you share the code that you use to call the API?
I used Axios to make a POST request to a remote server. This is the code below
After that, I import into the file below to make the call
Can you try with a raw axios instance, i.e. without the KeepAliveAgent?
I have done that. It was after some research that I added KeepAliveAgent to see whether it can be solved but still proved abortive
How about making a single, non-bottlenecked call to the API? Does that work?
Making a single call even with bottleneck works perfectly
Then I'm guessing there's something weird in the way you're building the URLs or making the requests when there are multiple API calls.
To clean things up, if I were you I'd change the code to use map() on the array of cvUrl, returning a promise for each call.
Then await Promise.all() on the result of that map, then do your parsing.
Put console.logs in each iteration to determine exactly what you're sending and wrap in try/catch to see if you can find any more information about what's actually going wrong with the connection.
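Roughly something like this (a rough sketch; I'm guessing at names like cvUrls and throttledParseCv here, so swap in your own array and wrapped function):

// cvUrls and throttledParseCv are hypothetical names - use your own array and bottlenecked function
const allThePromises = cvUrls.map(cvUrl => {
  console.log('Requesting', cvUrl); // log exactly what you're sending
  return throttledParseCv(cvUrl);
});

try {
  const results = await Promise.all(allThePromises);
  // do your parsing on results here
  console.log(`Received ${results.length} responses`);
}
catch (err) {
  console.log('Request failed', err);
}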
This is where I tried it with Promise.all but got the same error. Does the code below look like what you suggested above?
I need your assistance to get this resolved. Thanks
@ross Coundon, I would appreciate your assistance, from the wealth of your experience interacting with several third-party APIs, on how I can make concurrent requests to the server without experiencing the "socket hang up" issue. In the first iteration of the loop shown above, I get results from the third-party API, but on the second iteration there is a delay in the response from the third party and hence the error message below
Hi - I'm not sure what to suggest, are you able to share what the 3rd party API is? Do they provide any documentation/information on acceptable usage, time between requests, number of concurrent requests etc?
They do not have that spelled out in their API documentation. I have sent an email to them to inquire about acceptable usage, the time between requests, and the number of concurrent requests
Great stuff, thanks for bringing bottleneck to the broader audience. Been using it for quite a while now, it's amazing!