My latest project, parler.io, has been built from the ground up using serverless microservices. Now that it is up and running, it seems worthwhile to document the current architecture, the advantages I have gained, and some of the pain points I have encountered.
First, a little bit of context on Parler. It is a service that allows you to convert any blog posts from an RSS feed into audio. This audio can then be downloaded, embedded, or published to other platforms.
The tools and languages in play
As of this writing, the architecture consists of five different serverless microservices.
The infrastructure for each microservice is managed via a combination of Terraform and Serverless Framework. This mixture of tooling has worked all right so far, but it does introduce a bit of confusion.
For instance, you might be wondering what is managed by Terraform versus Serverless Framework. It roughly breaks down into these two categories.
- Terraform manages any build, test, or deployment infrastructure that is needed to maintain a given service.
- Serverless manages the infrastructure that the service actually runs on like AWS Lambda functions, API Gateway endpoints, and even SNS topics or SQS queues for event handling.
To be honest, there are times when this feels a bit off, because I could pretty easily move all of the infrastructure out of serverless.yml and represent it in Terraform. The reason I haven't is that I like the simplicity Serverless Framework gives me when it comes to deploying the infrastructure a service runs on as part of my CI/CD pipeline.
Furthermore, I like the simplicity with which I can spin up a new microservice and its supporting infrastructure using Terraform modules. With one Terraform module, I can represent all of the necessary bits for a new microservice. This includes a new Git repository in AWS CodeCommit as well as an entire CI/CD pipeline using AWS CodePipeline and CodeBuild.
Another burning question folks might have: what language(s) are you using for your serverless functions? TypeScript.
Having done a lot of .NET development, I have become a fan of typings, but I also really enjoy the flexibility and tooling around Node. TypeScript was a nice split down the middle, and I have been happy with it thus far.
The serverless microservice architecture
As I mentioned earlier, five different serverless microservices work together so that Parler can turn out high-quality audio from written blog posts.
At the front, we have the parler-io service, which is effectively the front-end displayed to the user along with the supporting AWS architecture. Not surprisingly, the website is entirely static, built using Gatsby V2, and hosted via a combination of AWS S3 and CloudFront. If that is an interesting front-end architecture that you would like to explore, consider checking out my Learn AWS By Using It course where we learn all about that.
The front-end communicates with a couple of different API endpoints. First, we have the load-post service, which consists of an API Gateway and an AWS Lambda function. It is responsible for taking in your RSS feed and returning the collection of posts that you can choose to convert.
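To make the shape of a load-post style function concrete, here is a minimal sketch, not the actual Parler code. Every name in it (Post, extractPosts, makeLoadPostHandler, the feedUrl query parameter) is an assumption made for illustration, the event and result types are small local stand-ins for the API Gateway proxy shapes, and the regex-based parsing is deliberately naive where a real service would use a proper XML parser.

```typescript
// Hypothetical sketch of a load-post style handler; none of these names come from Parler.
interface Post {
  title: string;
  link: string;
}

// Small local stand-ins for the API Gateway proxy event and result shapes.
interface ApiEvent {
  queryStringParameters?: { [key: string]: string | undefined } | null;
}

interface ApiResult {
  statusCode: number;
  body: string;
}

// Turns a feed URL into raw XML. Injected so the sketch can run without network access.
type FeedFetcher = (url: string) => Promise<string>;

// Read the text of the first <tag>...</tag> in a chunk of XML, unwrapping CDATA if present.
function readTag(xml: string, tag: string): string {
  const match = new RegExp(`<${tag}[^>]*>([\\s\\S]*?)</${tag}>`).exec(xml);
  if (!match) return "";
  return match[1].replace(/^\s*<!\[CDATA\[([\s\S]*?)\]\]>\s*$/, "$1").trim();
}

// Naive RSS 2.0 item extraction: one Post per <item> element.
function extractPosts(xml: string): Post[] {
  const items = xml.match(/<item[\s>][\s\S]*?<\/item>/g) ?? [];
  return items.map((item) => ({
    title: readTag(item, "title"),
    link: readTag(item, "link"),
  }));
}

// Build the Lambda-style handler around whichever fetcher the caller provides.
function makeLoadPostHandler(fetchFeed: FeedFetcher) {
  return async (event: ApiEvent): Promise<ApiResult> => {
    const feedUrl = event.queryStringParameters?.feedUrl;
    if (!feedUrl) {
      return { statusCode: 400, body: JSON.stringify({ error: "feedUrl is required" }) };
    }
    const posts = extractPosts(await fetchFeed(feedUrl));
    return { statusCode: 200, body: JSON.stringify({ posts }) };
  };
}
```

Injecting the fetcher keeps the parsing logic testable on its own, which matters when each service is built and tested independently.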
We then have the post-conversion service, which is another API Gateway and Lambda combination. It has one endpoint that takes in the post you want to convert and does the necessary processing to forward it on to the conversion service. More on that service in a moment.
Next, there is the biggest microservice, the status service. Its primary job, as far as the front-end is concerned, is to return the status of a conversion: in essence, whether the requested post has been converted, is still processing, or has errored somewhere. It also has additional logic for returning the embedded player for a given conversion. In terms of AWS architecture, it consists of multiple API Gateway and Lambda endpoints, two event-driven Lambda functions, and DynamoDB as the primary storage.
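The three outcomes described above map naturally onto a TypeScript union type. The following is a rough sketch under stated assumptions rather than the real status service: the field names (conversionId, audioUrl, errorMessage), the StatusStore interface, and the response shapes are all hypothetical, and the store interface simply stands in for whatever sits in front of DynamoDB.

```typescript
// Hypothetical status model covering the three states a conversion can be in.
type ConversionStatus = "processing" | "converted" | "errored";

interface ConversionRecord {
  conversionId: string;
  status: ConversionStatus;
  audioUrl?: string; // assumed to be present once a conversion has finished
  errorMessage?: string; // assumed to be present when a conversion has errored
}

interface ApiResult {
  statusCode: number;
  body: string;
}

// Translate a stored record (or its absence) into an HTTP-style response.
function toStatusResponse(record: ConversionRecord | undefined): ApiResult {
  if (!record) {
    return { statusCode: 404, body: JSON.stringify({ error: "conversion not found" }) };
  }
  switch (record.status) {
    case "converted":
      return {
        statusCode: 200,
        body: JSON.stringify({ status: record.status, audioUrl: record.audioUrl }),
      };
    case "errored":
      return {
        statusCode: 200,
        body: JSON.stringify({ status: record.status, errorMessage: record.errorMessage }),
      };
    case "processing":
      return { statusCode: 200, body: JSON.stringify({ status: record.status }) };
  }
}

// Storage abstraction; the real service would back this with DynamoDB.
interface StatusStore {
  get(conversionId: string): Promise<ConversionRecord | undefined>;
}

// Lambda-style handler that looks up a conversion by a path parameter.
function makeGetStatusHandler(store: StatusStore) {
  return async (event: {
    pathParameters?: { conversionId?: string } | null;
  }): Promise<ApiResult> => {
    const conversionId = event.pathParameters?.conversionId;
    if (!conversionId) {
      return { statusCode: 400, body: JSON.stringify({ error: "conversionId is required" }) };
    }
    return toStatusResponse(await store.get(conversionId));
  };
}
```

Because the switch covers every member of the union, adding a fourth status later makes the compiler point at this function, which is one of the reasons typings are appealing here.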
The final microservice is the conversion service, which handles the actual conversion of a blog post into audio. This consists of another Lambda function that takes a conversion job, loads the post, and passes it along to a collection of voice conversion services including AWS Polly. Conversion status is sent via an SNS topic that the status service can listen on.
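As a sketch of how a listener on that SNS topic might look, the function below reads status updates out of the event Lambda receives from an SNS subscription, where each record carries the published payload as a string in Sns.Message. The message format (a JSON object with conversionId and status) and all function names are assumptions, and a plain object stands in for the real data store.

```typescript
// Hypothetical message format published by the conversion service.
type ConversionStatus = "processing" | "converted" | "errored";

interface StatusMessage {
  conversionId: string;
  status: ConversionStatus;
}

// Minimal slice of the event a Lambda function receives from an SNS subscription.
interface SnsEvent {
  Records: Array<{ Sns: { Message: string } }>;
}

const KNOWN_STATUSES: ConversionStatus[] = ["processing", "converted", "errored"];

// Parse one SNS message body, returning undefined for anything malformed.
function parseStatusMessage(raw: string): StatusMessage | undefined {
  try {
    const data = JSON.parse(raw);
    const validId = typeof data?.conversionId === "string";
    const validStatus = KNOWN_STATUSES.indexOf(data?.status) !== -1;
    if (validId && validStatus) {
      return { conversionId: data.conversionId, status: data.status };
    }
  } catch {
    // Not JSON at all; treat it the same as any other malformed message.
  }
  return undefined;
}

// Apply every valid update in the event to the store and report how many were applied.
function applyStatusEvent(
  event: SnsEvent,
  statuses: Record<string, ConversionStatus>
): number {
  let applied = 0;
  for (const record of event.Records) {
    const message = parseStatusMessage(record.Sns.Message);
    if (message) {
      statuses[message.conversionId] = message.status;
      applied += 1;
    }
  }
  return applied;
}
```

Skipping malformed messages rather than throwing is a design choice for the sketch; a real listener might instead route them to a dead-letter queue.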
All five services are built, tested, and deployed independently using the combination of AWS CodeCommit, CodePipeline, and CodeBuild. As mentioned earlier, that build and deployment infrastructure is provisioned and maintained using Terraform.
Now that we have covered the lay of the land, let's cover some pain points I have come across and things I'd like to improve as this project scales.
Observations so far
In building out parler.io to this early-stage alpha release, I have learned a few valuable lessons about microservices in general. Here are a few that I have tried to generalize so they are useful to other projects.
- Pick a toolchain that makes sense for you. While it feels odd, my Terraform and Serverless Framework tooling is working pretty well for me. Certainly, I could choose one or the other, and that would minimize the tools at play. However, I would have to do some translation from one to the other. All of this to say, pick the tools that help you deliver quickly and iteratively, but don't fear a pivot in the future either. What makes sense now may not make sense six months from now.
- Managing dependencies across service boundaries is hard. This is a common situation in the world of microservices (serverless or not). As developers, we want to abide by DRY (don't repeat yourself) as much as possible. It feels odd to say, but breaking DRY, or at least minimizing shared code across microservices, is beneficial in the long run. Where I have found shared code to be necessary is with things like logging, monitoring, and models shared between front-end and back-end services. The logging and monitoring code I have moved into a separate repo that can be leveraged in other services. For the shared models, I am still trying to determine which path I want to go down.
- What starts as a clear microservice might become more than one in the future. I talked about this briefly up above: the status service has become a bit large. There is more than one set of business logic in here, and that likely means this service should be broken down in the near future. My hypothesis for how this happened: laziness around refactoring my data model, which introduced coupling to the one data store there is, DynamoDB. Pay attention to how your data is accessed and segregated, as this can keep the boundaries between your services a little clearer.
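As an illustration of the kind of code that is worth sharing across services, here is a minimal sketch of a structured logger that could live in a separate repo. It is not the module Parler uses; the createLogger function, the entry fields, and the injectable sink are all assumptions. The default sink writes to standard output, which Lambda forwards to CloudWatch Logs.

```typescript
// Hypothetical shared logging module that several services could depend on.
type LogLevel = "info" | "warn" | "error";
type LogContext = Record<string, unknown>;

interface LogEntry {
  level: LogLevel;
  service: string;
  message: string;
  timestamp: string;
  context?: LogContext;
}

// Receives each serialized entry; injectable so tests can capture output.
type LogSink = (line: string) => void;

interface Logger {
  info(message: string, context?: LogContext): void;
  warn(message: string, context?: LogContext): void;
  error(message: string, context?: LogContext): void;
}

// Create a logger that stamps every entry with the owning service's name.
function createLogger(
  service: string,
  sink: LogSink = (line) => console.log(line)
): Logger {
  const write = (level: LogLevel) => (message: string, context?: LogContext): void => {
    const entry: LogEntry = {
      level,
      service,
      message,
      timestamp: new Date().toISOString(),
      ...(context ? { context } : {}),
    };
    // One JSON object per line keeps the output easy to search and filter.
    sink(JSON.stringify(entry));
  };
  return { info: write("info"), warn: write("warn"), error: write("error") };
}
```

A service would then call something like createLogger("status") once and pass the logger around, so the only cross-service dependency is this small, stable interface.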
These are some observations and lessons I have learned with respect to microservices in general. I don't believe any one of those three is specific to serverless microservices per se. However, here are a few observations I have had with respect to serverless microservices in Amazon Web Services.
- Serverless Framework is great for provisioning, deploying, and maintaining the infrastructure for a given service, but it could be better. There are a few subtle things in the framework that are odd or missing, as is the case for most frameworks. For example, there are no SQS triggers for Lambda functions out of the box.
- One AWS account per microservice. This is not the current state of the Parler architecture, but it is likely the direction I will head. By segregating services at an account level we can gain a few benefits. First, our resource limits can be scoped specifically to the service running in the account. Second, other resources like VPCs, security groups, permissions, etc. can be specific to a single service. Lastly, we can avoid being our own noisy neighbor in AWS Lambda, where one function consumes all of the account's available concurrency and effectively starves the other functions.
- AWS Code* services are quite good. I mentioned earlier that I have a Terraform module that stands up all of the necessary infrastructure for a new microservice. This includes a new CodeCommit repo, CodePipeline, and two CodeBuild stages. This setup has been awesome. I essentially get CI/CD for a given microservice by applying one Terraform template and updating a buildspec.yml to run the steps I need. These services haven't gotten the best rep, as there are a lot of other tools in the CI/CD space (e.g. Circle CI, Travis, etc.), but I think folks should give them a second look.
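The noisy neighbor point from the account-per-microservice observation can be shown with a toy model. The sketch below is a deliberate simplification, not how Lambda schedules work internally: it treats concurrency as a single pool handed out in arrival order, and the limit of 1000 and the demand figures are assumed numbers chosen for illustration.

```typescript
// Toy model of a shared concurrency pool; names and numbers are illustrative assumptions.
interface FunctionDemand {
  name: string;
  demand: number; // concurrent executions the function would like to run
}

// Grant concurrency in arrival order until the pool for this account is exhausted.
function allocateConcurrency(
  limit: number,
  demands: FunctionDemand[]
): Record<string, number> {
  const granted: Record<string, number> = {};
  let remaining = limit;
  for (const { name, demand } of demands) {
    const share = Math.min(demand, remaining);
    granted[name] = share;
    remaining -= share;
  }
  return granted;
}

// One shared account: a burst in the conversion function leaves nothing for status.
const sharedAccount = allocateConcurrency(1000, [
  { name: "conversion", demand: 1200 },
  { name: "status", demand: 50 },
]);

// One account per service: each function draws from its own pool.
const conversionAccount = allocateConcurrency(1000, [{ name: "conversion", demand: 1200 }]);
const statusAccount = allocateConcurrency(1000, [{ name: "status", demand: 50 }]);

console.log(sharedAccount); // { conversion: 1000, status: 0 }
console.log(conversionAccount, statusAccount); // { conversion: 1000 } { status: 50 }
```

In the shared case the status function is throttled even though its own demand is tiny, which is the starvation scenario that separate accounts avoid.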
Wrapping it up
I would not classify anything in this article as broad advice that you should apply to your own serverless architecture. These are just some thoughts and experiences I have gathered early on in the development of parler.io.
That said, I think these are useful to keep in mind when it comes to building your next microservice or product.
The power, flexibility, and agility that serverless gives you are hard to deny. There are certainly cases where it still isn't practical. If you have an AWS Lambda function that is constantly running, the cost of that is likely going to get out of hand.
But for a large number of workloads, serverless microservices are going to unlock the ability for you to focus solely on the problem that a service is trying to solve. If you automate your deployment, use infrastructure as code, and have a few tools to keep things lightweight, you can really accelerate your application development using serverless.
If you're interested in learning more about parler.io, check it out and give it a run for yourself. I'm always looking for feedback on how to improve the service as I ramp up to an MVP release.