A common claim from serverless advocates is that it can increase a team's feature velocity and reduce the ops overhead of running an application in the cloud. But if you're an engineering manager, developer or ops engineer who has never run a serverless workload in production, you may find this claim hard to visualise and quantify.
How will serverless increase our team's productivity?
In what ways will it affect the skillsets we need in our engineering team?
To help answer these questions, I've compiled a list of specific concerns and tasks encountered when building and operating server-based applications that no longer apply when using a serverless architecture.
By the end of the article, you should have a clearer picture of how selecting a serverless architecture can potentially reduce the total cost of ownership of a new application within your organisation.
When reading through the list, try to think of a server-based app you are currently or have previously worked on, and ask yourself the following questions about each list item:
- Do I have the knowledge to complete this task myself or is this something a specialist or more experienced member of my team takes care of? If the latter, how much time would it take me to understand what's involved in doing this?
- How much time do I (or another member of my team) currently spend doing this and how frequently does it need to be repeated?
- Even though I can do this task myself, am I confident that I've implemented it correctly using best practices and that I haven't introduced any security holes or performance issues?
- Is this something that we don't currently do on our team (due to time or skill constraints), but that I know that we should be doing, and thus we've taken on extra risk by omitting it?
A note on scope: I focus on the AWS ecosystem so the items listed below primarily relate to the EC2, VPC, RDS, ECS/EKS and ELB services Amazon provides. Nevertheless, most points are generally applicable to other major cloud providers. I've assumed the server-based system is run either in containers or directly on the virtual machine instances.
Ok, let's dive into the list of concerns which go away in a serverless world...
Server Provisioning and Scaling
- Configure AMIs for your VM instances with specific OS version and any required application software
- Set up a VPC and subnets using best-practice security settings
- Configure security groups and identify what ports need to be open on each instance
- Create launch configurations and auto scaling groups for each EC2 instance type
- Configure load balancers and associated health checks
- Set up internet gateways
- Configure route tables
- Configure VPC peering
- Configure RDS cluster with appropriate storage and instance size
- Regularly observe load-related metrics and modify scaling limits or instance resource allocation accordingly
- Repeat most of the above steps for each environment (dev, test, staging, production)
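To make the "observe load-related metrics and modify scaling limits" item more concrete, here is a minimal sketch of the kind of proportional rule a team ends up reasoning about and tuning. It follows the formula documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × metric / target)); the function name, the clamping bounds and the example numbers are illustrative assumptions, and AWS's own scaling policies may behave differently.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Proportional scaling rule, clamped to the group's size limits.

    Hypothetical helper: mirrors the Kubernetes HPA formula
    ceil(current * metric / target). It is not an AWS API.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    raw = math.ceil(current * metric / target)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    # 4 instances averaging 90% CPU, aiming for 60% (assumed numbers).
    print(desired_capacity(4, 90.0, 60.0, min_size=2, max_size=10))
```

Someone still has to choose the target, the minimum and the maximum, and revisit them as traffic patterns change, which is the ongoing work this list item refers to.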
Application Development & Maintenance
- Define your container environment (Dockerfile)
- Configure your container orchestration cluster (ECS, Kubernetes, etc)
- Configure the pods/services/task definitions within your cluster
- Debug container inter-connectivity/service discovery issues
- Write a script to deploy the build artifact (Docker image, zip file) to EC2 instances
- Regularly update base Docker image with latest patches (e.g. to Node.js/Python/Java or whatever language your app uses)
Server Maintenance
- Set up a secure VPN/SSH bastion instance (and keep it patched)
- Manage VPN/SSH access to different servers for authorised engineers
- Manage regular patching of all VM instances (either manually via SSH or automated via script/Systems Manager)
- Be available to promptly deploy emergency patches (e.g. Heartbleed)
- Set up alerts to be notified about emergency patches
- Set up monitoring to watch for low disk space
- Manually expand a volume when it's out of space
- Handle SSL certificate renewal and deployment (if installing keys directly to instances and not just to load balancers where it's managed by AWS)
- Repeat most of the above steps for each environment (dev, test, staging, production)
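As a small illustration of the disk space item in the list above, here is a minimal sketch of a check that a cron job or monitoring agent might run on each instance. It uses only Python's standard library; the 85% threshold and the function names are assumptions for illustration, and a real setup would also need to ship the result to an alerting system.

```python
import shutil


def disk_usage_fraction(path: str = "/") -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def is_low_on_space(used_fraction: float, threshold: float = 0.85) -> bool:
    """True when usage is at or above the alert threshold (assumed: 85%)."""
    return used_fraction >= threshold


if __name__ == "__main__":
    fraction = disk_usage_fraction("/")
    status = "ALERT" if is_low_on_space(fraction) else "ok"
    print(f"{status}: root volume is {fraction:.0%} full")
```

The script itself is short. The ongoing cost is in deploying it to every instance, keeping it running and deciding who responds when it fires.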
Cost Control
- Pay for an EC2/RDS/ElastiCache instance when it's not in use
- Over-provision instances to handle occasional sudden traffic spikes
- Write cron jobs to spin down dev/test environment instances in the evenings and at weekends
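The scheduling logic behind such a cron job can be sketched in a few lines. This is only the decision part, assuming a weekdays-only policy with hypothetical working hours of 08:00 to 19:00 local time; the actual starting and stopping of instances (and handling time zones, public holidays and people working late) is left out.

```python
from datetime import datetime


def should_be_running(now: datetime, start_hour: int = 8, stop_hour: int = 19) -> bool:
    """Return True if dev/test instances should be up at `now` (local time).

    Assumed policy: weekdays only, from start_hour (inclusive)
    to stop_hour (exclusive).
    """
    is_weekday = now.weekday() < 5  # Monday=0 ... Sunday=6
    return is_weekday and start_hour <= now.hour < stop_hour


if __name__ == "__main__":
    # A cron job could evaluate this hourly and start or stop instances to match.
    print(should_be_running(datetime.now()))
```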
📖 Related: How to calculate the billing savings of moving an EC2 app to Lambda.
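For a rough feel of the difference between paying for always-on instances and paying per use, here is a back-of-the-envelope sketch. Every price and workload figure in it is a placeholder assumption rather than a quoted rate, and it ignores free tiers, data transfer, storage and any API gateway or database charges, so check your provider's current pricing before drawing conclusions.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def always_on_monthly_cost(hourly_rate: float, instances: int = 1) -> float:
    """Cost of instances that run (and bill) around the clock."""
    return hourly_rate * HOURS_PER_MONTH * instances


def pay_per_use_monthly_cost(requests: int, avg_duration_s: float, memory_gb: float,
                             price_per_million_requests: float,
                             price_per_gb_second: float) -> float:
    """Cost of a function billed per request and per GB-second of compute."""
    request_cost = requests / 1_000_000 * price_per_million_requests
    compute_cost = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


if __name__ == "__main__":
    # Placeholder inputs: two small instances vs 3M requests/month at 200 ms, 512 MB.
    servers = always_on_monthly_cost(hourly_rate=0.05, instances=2)
    functions = pay_per_use_monthly_cost(
        requests=3_000_000, avg_duration_s=0.2, memory_gb=0.5,
        price_per_million_requests=0.20, price_per_gb_second=0.0000167,
    )
    print(f"always-on: ${servers:.2f}/month, pay-per-use: ${functions:.2f}/month")
```

The always-on figure stays the same when traffic drops to zero, whereas the pay-per-use figure scales with the request count, which is the point the list above is making.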
By now, you're probably shouting "That's great Paul, but what about all the new concerns that serverless brings?" And you would be correct to do so, my healthily skeptical friend!
I do intend to write about this soon, but for now I will point you at this article on Containers vs Serverless from a devops standpoint and also Martin Fowler's section on the drawbacks of operating serverless systems.
That said, and while this is very difficult to measure objectively, I personally believe that, for most development teams building a greenfield production system in the cloud today, the total cost of ownership of a serverless app will be lower than that of a server-based app.
This is especially true if your organisation doesn't have a skilled ops team already in place with availability to help your dev team build out the infrastructure and automation required to provision and maintain your application.
Got any more items that I should add to the list? I'm sure there are plenty of things I've forgotten, so please leave me a comment below and I'll add them in.
💌 If you enjoyed this article, you can sign up to my weekly newsletter on building serverless apps in AWS.
Originally published at winterwindsoftware.com.
Top comments (13)
Hi Paul,
Great article. In my previous workplace, I set up an on-the-fly image resizing job using AWS Lambda.
That said, many of the problems with setting up servers are heavy the first time, with less effort required for later updates. To a newcomer, wrapping your head around the plethora of services AWS provides is daunting. It is a soup of weird names. But once you have a stable setup, it is easy to add/remove servers, whitelist ports and generate SSH keys for new team members. It also pays off in the long run because not all jobs lend themselves to serverless.
Hey Raunak, thanks for sharing your experience. 🙂
It seems like you have taken a very pragmatic approach there 👍
You are absolutely correct to say that not all jobs lend themselves to serverless, and limitations such as cold starts may prove to be deal breakers for certain types of apps.
However, I honestly believe that these types of apps are the exception and not the norm.
In an organisation, you don't need to pick a side: "We only do serverless apps" or "we only do containers/VM-based apps". You can (and should) make the architecture decision on a per-application basis.
Now that I have a breadth of experience in building and running both serverless and container-based apps (on AWS), my default starting point would always be serverless. This is due to the faster feature velocity it enables and the fact that scaling is (almost) completely managed for me. If there is a specific task which seems like it won't work in a Lambda, say, then a container-based solution can be used for it.
Do you have any recommended resources for porting standard CRUD apps to serverless?
You could check out these 2 articles from Yan Cui:
I do find that the conversation around porting existing (brownfield) apps to serverless is much more complex than it is with greenfield apps. Lots of different things to consider before making a decision if it's worth doing.
Thanks a lot for the links. They are very helpful.
You're very welcome
Nice post.
I have to admit that I'm not a big fan of the serverless architecture. Maybe someone here will be able to convince me of it :)
The reason why I'm so careful with serverless is that it is tightly coupled to a particular provider. I'm experienced with Azure, which provides Azure Functions. At the moment, if you decide to use it, you have to use Azure (with small exceptions like using Docker images). I agree that serverless is cheap now, but who knows if next year's pricing won't kill us. In that case, I can't easily migrate my solution to a competing cloud (like AWS), because my code is tied to Azure solutions.
As I mentioned, I can create a Docker image and migrate it to AWS, but in that case, in my opinion, the idea of serverless is lost. An Azure Function hosted in a Docker container can't use all the triggers offered by Azure hosting. That is the price of moving a solution away from the cloud provider :(
You compared IaaS (Infrastructure as a Service) with FaaS (Functions as a Service) in your article. I'm curious how the comparison would look between a PaaS (Platform as a Service) and a FaaS solution. With PaaS, there is less work to do: for example, there is no need to "Configure route tables", "Configure AMIs for your VM instances with specific OS version and any required application software" or "Set up a secure VPN/SSH bastion instance (and keep it patched)". Have you ever compared PaaS vs. FaaS?
I don't want to offend you, so please don't take my comment personally. I'm just looking for someone who can convince me of serverless and explain the phenomenon to me.
Cheers.
Hi Rafal,
Firstly, thanks for your comment. It doesn't offend me at all — it's important to constructively debate these issues 🙂
You are right that serverless architectures (currently) do enforce a lock-in more so than container or VM-based apps/workloads.
I don't think the issue of lock-in with the functions themselves is that big of an issue. By that, I mean that it wouldn't require that much work to refactor the code inside functions from say AWS Lambda to work in Azure Functions as the function signatures are very similar. I wrote about FaaS vendor lock-in here.
However, the main lock-in comes with using the other proprietary services from the same cloud provider that the function calls out to (e.g. S3, DynamoDB, SNS, SQS, etc). Rewriting your app to use other services in the event of migrating to another cloud would be a big job. As a big open-source fan, I would love to see AWS (and the other cloud providers) provide more open-source serverless services, but that's not where we are today.
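To illustrate the distinction between the two kinds of lock-in, here is a minimal sketch of keeping the business logic provider-agnostic behind a thin AWS Lambda entry point. The `resize_request_summary` function and its payload fields are hypothetical; the handler assumes an API Gateway proxy-style event whose `body` is a JSON string.

```python
import json


def resize_request_summary(payload: dict) -> dict:
    """Provider-agnostic business logic (hypothetical example)."""
    width = int(payload["width"])
    height = int(payload["height"])
    return {"pixels": width * height}


def lambda_handler(event, context):
    """Thin AWS Lambda adapter: unpack the event, call the core logic, wrap the result."""
    payload = json.loads(event["body"])
    result = resize_request_summary(payload)
    return {"statusCode": 200, "body": json.dumps(result)}


if __name__ == "__main__":
    fake_event = {"body": json.dumps({"width": 4, "height": 3})}
    print(lambda_handler(fake_event, None))
```

Porting this to another FaaS platform would mean rewriting only the small adapter function. Any calls to services such as S3 or DynamoDB made from inside the core logic are where the larger migration effort would sit.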
Ultimately, it really comes down to how much of a risk this lock-in really is to you or your company.
What are the chances that you might need to switch cloud provider at some point in the future? What reasons would trigger this move?
Many container-based solutions I've seen also make use of these proprietary services anyway so it's not solely a serverless issue (albeit it's more prominent there).
Moreover, do the other benefits (getting your app developed faster, cost savings from a pay-per-use bill and less time spent on traditional ops) outweigh this risk? I personally think that in the majority of cases, they do.
Regarding your PaaS vs FaaS question - by PaaS, are you talking about the likes of Heroku, AWS Elastic Beanstalk and Azure Web Apps?
I haven't written an article doing this comparison.
They do share the benefit of being able to get up and running quickly with less server management but I guess I would say the main difference between using these services vs a serverless (FaaS + BaaS) architecture is that when they scale they tend to get very expensive quickly. You hear lots of stories of companies outgrowing Heroku and then moving to EC2 on AWS. This shouldn't happen with serverless.
Thanks for your reply. I missed your article about FaaS vendor lock-in. I think it hits the point.
As you wrote, it isn't hard to migrate the function body itself. The real problem is the ecosystem supplied by the cloud provider. As you said, it isn't likely that I’ll have to migrate my solution between cloud providers. But I don't like the feeling that migration could be painful.
Thanks again for sharing your opinion and experience. It was a pleasure to read your article and participate in the discussion.
I’m looking forward to the next article from you.
Cheers.
Regarding minimising the pain / risk of vendor lock in, Pulumi looks interesting (haven't used it in anger so YMMV): pulumi.com
Great write-up Paul! I agree that we tend to state serverless makes our lives easier, but you have really highlighted how it makes our lives easier.
Cheers, Kyle.
Enjoying your weekly newsletters by the way 👍
It would be nice to get the same serverless app running on Azure and AWS. That'd be some fantastic disaster recovery.