David Pereira

Posted on Feb 2, 2021 • Edited on Feb 6, 2021

Learning: Microservices

#architecture #microservices #learning

Introduction
Orchestration VS Choreography
Integration technology
Conclusion

Introduction

I became really interested in microservices architecture over the past year. That interest got me reading Building Microservices: Designing Fine-Grained Systems by Sam Newman, which introduced me to a lot of new ideas. Another catalyst for my interest is my job, as I want to better understand the challenges that come with this type of architecture and get familiar with the solutions.

In this post, I'll focus on the parts I'm most interested in and still have doubts about, then share at the end some links that helped me in my study. Also, I will try to keep specific technology mentions to a minimum, because I care more about the fundamentals and the problems in this realm than about any specific tech solution.

Throughout my journey learning about microservices, I Googled a lot 😅, read and listened to a lot of opinions, and tried getting familiar with all the terminology used in this space.

In summary, choosing a microservices architecture over a monolith is a good option only if you have a very good reason (as Sam Newman has said on many occasions 😄). This makes sense: even though this architecture buys you a suite of options that could benefit your business and developer experience, it comes with additional complexity and other trade-offs.

So before continuing with the other topics, one tip I'd give you if you are new to microservices is: understand the implications of this decision and the value it provides to your use case. We can watch a lot of talks and hear tons of cool stories, but we should prioritize our own work context and apply only what fits. For example, if you are starting a new project, you may not yet have a deep understanding of your business domain, so it could be extra hard to draw service boundaries around each business context.

With that being said, I want to touch on the subject that I enjoy most and would like to continue to learn even more in the future.

Orchestration VS Choreography

In the book, Sam Newman says we should prefer choreography over orchestration. This advice is not unique to the book; it seems to be widely shared across the industry. In most systems I've worked on, though, orchestration is used alongside choreography, just in different parts of the system 😃.

Imagine a Web API that exposes many operations regarding a customer and is consumed by a client-side application, like a web browser or a mobile app. In this case, we want a request/response model, because the client-side application makes a request and the user wants to see the UI change right away. Let's say we have an operation to create a new customer, which needs to:

Link the account to a GitHub account
Get capabilities associated with the price plan chosen
Send an email with a welcome letter and links to useful documentation after it's done
Notify two internal systems that want to know about the existence of this new customer

Now, we can go the orchestration route or the choreography route. In my experience, we started with orchestration for every workflow and later moved specific actions to choreography. Let's dive into an example to showcase that transition and the decisions behind it.

By choosing orchestration first, we get an explicit workflow of execution, because one service (our Web API) contains all the business logic that decides when to call each of the others. This means it also has to call the API of every external service that needs to be told about events regarding a customer. In our example, two internal systems need to be notified of the creation of this customer. So, our Web API makes those HTTP calls and sends the customer information in the payload.
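To make the orchestration route concrete, here is a minimal sketch of the "create customer" workflow. Every name in it (`Customer`, `CreateCustomerOrchestrator`, the injected callables) is hypothetical, and the step order is my assumption based on the list above, with the welcome email sent last. In a real system each callable would wrap an HTTP call to another service.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Customer:
    email: str
    plan: str
    github_login: str
    capabilities: List[str] = field(default_factory=list)


class CreateCustomerOrchestrator:
    """The orchestrator owns the workflow: it decides what runs and in which order."""

    def __init__(
        self,
        link_github: Callable[[Customer], None],
        get_capabilities: Callable[[str], List[str]],
        notify_systems: List[Callable[[Customer], None]],
        send_welcome_email: Callable[[Customer], None],
    ) -> None:
        self._link_github = link_github
        self._get_capabilities = get_capabilities
        self._notify_systems = notify_systems
        self._send_welcome_email = send_welcome_email

    def create_customer(self, email: str, plan: str, github_login: str) -> Customer:
        customer = Customer(email=email, plan=plan, github_login=github_login)
        self._link_github(customer)
        customer.capabilities = self._get_capabilities(plan)
        # The orchestrator must know every system that wants to hear about
        # new customers; adding one means changing this service's setup.
        for notify in self._notify_systems:
            notify(customer)
        self._send_welcome_email(customer)
        return customer
```

The upside is that the whole workflow is readable in one method. The downside is visible in `notify_systems`: the Web API has to know about each downstream system.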

At this point, we should ask ourselves: "Does it make sense for these calls to live in our service?" I believe this should be a team discussion that takes into account the team's context, but for this example, let's say we realized the answer was no, because those calls create a dependency on the external services. The Dependency Inversion Principle (DIP) states that high-level modules should not depend on low-level modules (both should depend on abstractions), and our Web API is the higher-level module here. It wants to notify other services that an event has happened, but it doesn't care who needs to be notified. Adding a new service to be notified of this event shouldn't require changes to the service that publishes the event.
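A minimal sketch of that last point, assuming a hypothetical in-memory `EventBus` as a stand-in for a real message broker. The topic name `customer.created` and the function names are made up for illustration.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """In-memory stand-in for a broker: topics map to lists of handlers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver the event to every subscriber and return how many got it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


def create_customer(bus: EventBus, email: str) -> dict:
    """The publisher announces the event and does not know who listens."""
    customer = {"email": email}
    bus.publish("customer.created", customer)
    return customer
```

Notifying a third internal system is now one extra `bus.subscribe("customer.created", ...)` call made by that system; `create_customer` stays untouched.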

Therefore, we switch to a choreography style of service interaction using the publisher-subscriber pattern. Of course, this approach also introduces new challenges. Before, if we couldn't get a response from the external service's API, we could log the error and see the HTTP status code.

Now we're using a message broker to implement the pub-sub pattern, so if a message isn't being processed by the subscribers, we need to put something in place to detect that and potentially solve it as well. I've looked a bit into the dead-letter queue pattern to solve this problem, but I need to dig deeper to understand how to handle particular situations with it. For example, if we have 400 messages going to the dead-letter queue, how do we set up batch processing so that workers process only 200 messages and the rest remain on the queue (assuming this is a requirement we have)?
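As a thought experiment for the 400/200 question, here is a minimal sketch that uses an in-memory `deque` as a stand-in for the dead-letter queue. Real brokers have their own mechanics for batch sizes and redelivery, so treat `drain_batch` and its parameters as hypothetical. Putting failed messages back at the end of the queue is my own assumption.

```python
from collections import deque
from typing import Callable, Deque


def drain_batch(
    dead_letters: Deque[dict],
    handler: Callable[[dict], None],
    batch_size: int = 200,
) -> int:
    """Reprocess at most batch_size messages and leave the rest on the queue.

    Returns the number of messages handled successfully.
    """
    succeeded = 0
    failed = []
    for _ in range(min(batch_size, len(dead_letters))):
        message = dead_letters.popleft()
        try:
            handler(message)
            succeeded += 1
        except Exception:
            # Still failing: keep it so that it is not lost.
            failed.append(message)
    dead_letters.extend(failed)
    return succeeded
```

With 400 messages on the queue and a handler that always succeeds, one call handles the first 200 and leaves the other 200 where they were, in their original order.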

Another challenge is tracing and knowing what was triggered by an event. I still need to learn how to set up strong monitoring, distributed tracing and logging for this scenario.
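One common building block here is a correlation ID that every event carries, so that logs from different services can be stitched together. The sketch below is only that building block, not a tracing setup; the field names (`event_id`, `correlation_id`, `caused_by`) and the `new_event` helper are hypothetical.

```python
import uuid
from typing import Optional


def new_event(name: str, payload: dict, parent: Optional[dict] = None) -> dict:
    """Build an event envelope that remembers which event caused it."""
    return {
        "event_id": str(uuid.uuid4()),
        # All events in one workflow share the correlation ID of the first one.
        "correlation_id": parent["correlation_id"] if parent else str(uuid.uuid4()),
        # Direct cause, so we can answer "what was triggered by this event?".
        "caused_by": parent["event_id"] if parent else None,
        "name": name,
        "payload": payload,
    }
```

A subscriber that reacts to `customer.created` by publishing its own event would pass the incoming event as `parent`, and searching the logs for one `correlation_id` would then return the whole chain.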

Integration technology

In a microservices architecture, we are likely to have services on different machines. This means we need to integrate them somehow, since they need to communicate with each other in order to do some processing for the user in question.

There are many ways to integrate services: communication protocols like SOAP, HTTP, or FTP, and messaging protocols like AMQP or XMPP. But one way I find quite common for integrating two or more services is through a database. Have you used a DB to integrate two systems that need to read and/or modify the same data? I know I have 😅, but at the time of making this decision, there were more pros than cons. It was easier, and changes to the database schema were rare. If a change was made to the schema, we'd update both code repositories to reflect that change.

With that being said, this still couples the two services. Implementation details like the schema should be hidden from other services, so that when they change, we don't need to change the clients of this service.
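Here is a small sketch of what hiding the schema could look like, using an in-memory SQLite database. `CustomerService`, the table layout, and the returned dictionary shape are all hypothetical; in practice the methods would sit behind an HTTP or RPC endpoint rather than a direct method call.

```python
import sqlite3
from typing import Optional


class CustomerService:
    """Only this class knows the table layout; clients see dictionaries."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customers ("
            "id INTEGER PRIMARY KEY, full_name TEXT, email TEXT)"
        )

    def add_customer(self, name: str, email: str) -> int:
        cursor = self._conn.execute(
            "INSERT INTO customers (full_name, email) VALUES (?, ?)",
            (name, email),
        )
        self._conn.commit()
        return cursor.lastrowid

    def get_customer(self, customer_id: int) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT id, full_name, email FROM customers WHERE id = ?",
            (customer_id,),
        ).fetchone()
        if row is None:
            return None
        # The contract says "name"; the column happens to be "full_name".
        return {"id": row[0], "name": row[1], "email": row[2]}
```

If the `full_name` column is later renamed or split, only the SQL inside this class changes. A second service that queried the table directly would have to change too.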

RPC

When I first learned about RPC, I thought it was something quite different from HTTP. It was fun to learn sockets and Java RMI... but they don't compare to gRPC. While learning and researching, I found a lot of people recommending and using gRPC as their integration technology. Even though I don't have much experience with it, I think it does a good job on at least one important factor for RPC implementations: it doesn't hide the fact that a remote call isn't the same as a local call. In gRPC, the client has to decide whether it wants a blocking stub or an async stub, and it has to create that object explicitly. You can think of a stub as a wrapper around the network connection that marshals and unmarshals the payload.
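To illustrate the blocking versus async distinction, here is a plain-Python sketch. It is not the gRPC API: the stub classes, the `Transport` callable standing in for the network, and the JSON wire format are all assumptions made for the example.

```python
import json
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

# Hypothetical stand-in for the network: request bytes in, response bytes out.
Transport = Callable[[bytes], bytes]


class BlockingCustomerStub:
    """The caller waits until the response has been unmarshalled."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def get_customer(self, customer_id: int) -> dict:
        request = json.dumps({"method": "GetCustomer", "id": customer_id}).encode()
        response = self._transport(request)  # marshalled bytes go "over the wire"
        return json.loads(response.decode())


class AsyncCustomerStub:
    """The caller gets a Future back immediately and collects the result later."""

    def __init__(self, transport: Transport, executor: ThreadPoolExecutor) -> None:
        self._blocking = BlockingCustomerStub(transport)
        self._executor = executor

    def get_customer(self, customer_id: int) -> Future:
        return self._executor.submit(self._blocking.get_customer, customer_id)


def fake_server(request: bytes) -> bytes:
    """Pretend remote service used to exercise the stubs."""
    call = json.loads(request.decode())
    return json.dumps({"id": call["id"], "name": "Ada"}).encode()
```

Because the client has to construct one stub or the other, the call site makes it obvious that a network hop is involved and how the caller intends to wait for it.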

Reading Building Microservices also opened my eyes to some limitations of HTTP, for example in use cases where low latency is a requirement. The overhead HTTP adds to each request can be a problem there, and WebSockets or an RPC framework sitting on top of UDP might be a better fit for streaming data between a client and a server.

HTTP/2, though, is a whole different story when it comes to performance and low latency. HTTP/1.1 versus HTTP/2 is a topic of its own, so I might delve deeper into it later on.
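One way to get a feel for that overhead is to count bytes. The sketch below builds the text of an HTTP/1.1 request head and compares its size to the body. The `/ticks` path and the helper name are made up, and real requests usually carry more headers than any small example.

```python
from typing import Dict, Tuple


def http_overhead(
    headers: Dict[str, str],
    body: bytes,
    request_line: str = "POST /ticks HTTP/1.1",
) -> Tuple[int, int]:
    """Return (bytes of request line plus headers, bytes of body)."""
    head = request_line + "\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    head += "\r\n"  # blank line that separates the head from the body
    return len(head.encode("ascii")), len(body)
```

For a stream of tiny messages, such as a price tick of a few bytes, the head can easily outweigh the body on every single request, which is the kind of cost a persistent connection avoids.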

Backends for Frontends (BFF)

This was a completely new idea to me, but interesting nonetheless 😃. The Backends for Frontends (BFF) pattern is one way of solving a common problem in web applications, which is having multiple UIs for our business capabilities. For example, we want a web application and a mobile application to consume our web API on the server side. The problem arises when we want to offer different user experiences, and the usual design of the web API doesn't fit the requirements of the mobile application.

And thus, we create back-end services for a specific UI, where these services only contain behavior specific to the user experience that UI offers. The web API behind these back-end services still has all the business logic, unless your use case calls for applying some business logic only to a specific UI.

In December, I saw a discussion on Twitter regarding what technologies people use for this pattern. I haven't been in this scenario at work, but the technology seems to be dictated by the development team of the client application. Personally, I would like to experiment a bit more with GraphQL, since a query language is very appealing to client-side apps.
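A minimal sketch of the idea, with one function per UI. `core_get_customer` is a hypothetical stand-in for the general-purpose web API, and the field names are invented; a real BFF would call that API over the network.

```python
from typing import Callable


def core_get_customer(customer_id: int) -> dict:
    """Hypothetical stand-in for the web API that owns the business logic."""
    return {
        "id": customer_id,
        "name": "Ada Lovelace",
        "email": "ada@example.com",
        "plan": {"name": "pro", "capabilities": ["sso", "audit-log"]},
        "billing_history": [{"month": "2021-01", "amount": 49}],
    }


def mobile_bff_customer_summary(
    customer_id: int, fetch: Callable[[int], dict] = core_get_customer
) -> dict:
    """Mobile screen: small payload, only what the summary view shows."""
    customer = fetch(customer_id)
    return {
        "id": customer["id"],
        "displayName": customer["name"].split()[0],
        "plan": customer["plan"]["name"],
    }


def web_bff_customer_page(
    customer_id: int, fetch: Callable[[int], dict] = core_get_customer
) -> dict:
    """Web page: richer view, including capabilities and billing history."""
    customer = fetch(customer_id)
    return {
        "id": customer["id"],
        "name": customer["name"],
        "email": customer["email"],
        "capabilities": customer["plan"]["capabilities"],
        "billing_history": customer["billing_history"],
    }
```

Both functions only reshape data for their UI. Anything that decides what a plan includes stays behind `core_get_customer`.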

Conclusion

I hope you enjoyed reading this blog post. At the end of the day, this topic is quite sizeable and I'm not expecting to learn everything at once. After writing this post, I realize I want more hands-on experience, but taking the first step into being aware of the problems, solutions, and terminology is also important 😃. I'd say there are more concepts to talk about (e.g. deployment, testing, scaling DBs) and patterns, so perhaps I'll make a Learning Microservices part 2 post 😅.

If you have experienced splitting a monolith into smaller pieces, or if you started working on a microservices system and needed to get up to speed on how it works, feel free to share your opinions and thoughts in the comments. I'd love to read them and respond!
