Serverless technologies are definitely the new "hotness", but does the reality live up to the hype?
I recently completed a project called Listeri where I went all in on serverless after a decade of writing traditional backends. Here's what I learned:
What does "serverless" mean?
TLDR: Skip this if you are familiar with serverless as a concept
For those of you that aren't familiar with the term, "serverless" doesn't mean there aren't servers. It just means you don't run the servers.
Let's think about a typical backend. You have a server, which is just a computer somewhere. It runs your application in the background and waits for web requests. You pay for the whole server with all of its memory and CPU no matter how many requests come in and out. I understand there are some variations here in more complicated contexts with autoscaling, Docker hackery, and virtualization. Let's just keep it simple for our example. So you've got your computer running somewhere. Now if tomorrow you get featured on Hacker News and your user base gets 100x bigger, you have to figure out how to add more servers to handle that load. If that doesn't happen, you are still stuck paying for a full server when you only need part of it.
Serverless technologies, like AWS Lambda, Netlify Functions, Vercel Functions, etc., turn that on its head. You don't boot up a server. You just give them your code. It stays dormant until someone makes a web request. The serverless provider then boots up your code, runs it, and returns the response. You only pay for the little bit of compute you used. Sounds amazing, right?
What's the catch?
Like every new technology, there's a ton of marketing by the providers and tech zeitgeist telling you how to implement serverless functions, but not a lot about whether you should.
The Dreaded Cold Boot
Problem: Serverless providers can offer such cheap service because they generally don't keep your code running. If no one visits your site or hits that endpoint for a while, they put your code to sleep. The next person to visit will often see a 5-7 second delay in getting a response while your code gets booted back up. This can vary a lot by provider.
Workaround:
Have a Ton of Traffic - This isn't a problem if you have a high-frequency API that's getting hit all day. If you have a smaller site or API, though, it can really ruin the user experience. It will also feel erratic, since the delay only hits the first request; after that the site feels fast again.
Keep Hitting Your Own API to Keep It Alive - Yeah, this is a hack, but even the noble developers over at the Serverless Framework kind of recommend it (see the pinger sketch after this list).
Use AWS Lambda and Fiddle With Provisioned Concurrency - Realizing cold boots are a big problem, Amazon is trying to make them less impactful: AWS Lambda has a Provisioned Concurrency setting that reduces cold boot time, but it doesn't eliminate it entirely. Cloudflare Workers also have an approach to minimizing cold boots, but it comes with other compromises I won't go into here.
Basically, cold boots can be a problem and the right solution isn't clear yet. Everyone is still experimenting with ways to solve it.
Database, Anyone?
Problem: Most major databases aren't yet built to work with serverless technologies, including MongoDB, PostgreSQL, and MySQL. Remember that when your function goes idle, it shuts down. That means it loses its active database connection, which has to be reestablished when the function boots back up. On a normal server, the connection is persistent since your server is always up. This can create a significant performance issue for your app that's hard to track down.
Secondly, most hosts for these traditional databases limit how many connections you can have open at once. Serverless providers scale up the number of copies of your code running on different machines as they see fit, which means you can very easily blow through your connection limit without realizing it. For example, MongoDB Atlas (a MongoDB hosting provider) has a 500-connection limit even on their serverless tier! Connections are also memory intensive for traditional databases, so a pile of them can hurt performance.
Workaround:
Use a Database With High Connection Limits - You can brute force this by running your own database server on a big machine, but now you have to manage a database server! You can also try DigitalOcean's Managed Postgres service, which has PgBouncer (a connection pooler) built in. That gets you up to 5,000 connections per database on DigitalOcean.
Use Amazon DynamoDB or FaunaDB - Neither of these has connection limit issues. DynamoDB can be really hard to query, though, and FaunaDB is not as battle-tested as MongoDB.
Use Amazon Aurora Serverless With the Data API - This is Amazon's clever way of letting you use Postgres and MySQL without a traditional database connection. Instead, you make HTTP API calls to your database (see the sketch after this list). This isn't great for performance, since now you have the overhead of an SSL handshake on every database call. This solution is also very new at the time of writing.
Don't Work Around It - Just hope you never get big enough to have this problem. That's kind of ironic, though, since you presumably want to get big enough to have this problem.
Performance, What?
Problem: In a traditional environment, there is pretty mature tooling for most languages and frameworks to measure performance. That's just not true in serverless yet. You can measure the total response time for a call, but a lot of factors outside of your code can affect the timings. Was it a cold boot? Was it a cold boot plus a delay in connecting to the database? Did your serverless provider glitch and leave you on a box where someone else took all the CPU? Unlike with a normal server, you can't pull up a chart and see that the CPU spiked. Remember, you can't look at the server!
Workaround:
Add a Bunch of Logging and Tooling by Hand - Yeah, that's about all you can do right now.
Conclusion
Serverless technologies work, but the ecosystem isn't mature enough yet in terms of databases, performance management, and reliability to replace traditional server architecture. Even the big players are still experimenting with solutions they might throw away in a few years. If that's fine with you, then go for it. Just do it with eyes open and not clouded by the chatter in the techverse.
Comments
The solution to coldstart is to author small single responsibility cloud functions. This is the best practice advice from AWS on Lambda. Coldstart is directly correlated to function payload size and to avoid it: write smaller functions. We've found sub 5mb will load sub second. Usually 150ms cold. (Aside: pinging/lambda warmers DO NOT fix coldstart. They hide it. If you get 2 concurrent requests you will still coldstart 1 of them. Pinging only keeps 1 Lambda warm.)
Thanks for the feedback. Is this specific to Lambda, or have you found it to be the case on all the major function-as-a-service tools? I found that on Vercel this was always multi-second, even with single-line functions. How difficult is it to keep to the sub-5MB size given that you might use an NPM package that's bigger than that (thinking of some of the Node database clients)?