Miroslav Malkin

Posted on Aug 10, 2021 • Edited on Sep 26, 2023

Mixing Clean Architecture

#elixir #erlang #architecture #codequality

TL;DR This article gently introduces the CleanMixer tool, which helps visualize and control the architecture of an Elixir project. Throughout the article, CleanMixer serves as a running theme for introducing architecture principles, best practices, and the reasons behind them.

Disclaimer Most of the theoretical material I will cover comes from Robert Martin's excellent book "Clean Architecture".

Introduction

First of all, let's introduce a definition of what a component is. Any set of source files can be a component. It is something physical, for example, a namespace (in Elixir, that is a set of files in a separate folder with the same module name prefix for it). It could be an umbrella app; it could be a hex package.

If we are talking about a component from a logical point of view, a component is an abstraction of some functionality, in other words, some DDD Bounded Context, some functional area or business capability. Or it could be a purely technical abstraction, for example, an adapter for accessing Kafka. The relationship between components is physical: a source file of one component mentions a module name or a source file of another component. It should not be confused with logical coupling, where one component discovers its dependencies only at runtime using Dependency Injection.

Let's use the Clean Mixer tool to visualize the architecture of some imaginary project and see what the picture looks like when the principles are violated.

mix clean_mixer.plantuml -v

What are the problems here?

Firstly, it is unclear which component is the domain or the core of the application. Which component contains the most critical business logic of our application? For what purpose was this service created?

Secondly, the layers are poorly visible. It is not clear where the Domain Core is, but on the other hand, it is also not clear where all the adapters and infrastructure components are. For example, the Queues component is used by many other components. Is it an important domain object, or is it some small detail of interaction with the outside world? It is unclear.

As a result, the whole picture looks tangled. Everything is together, and everything depends on everything else. It is not clear where the essential parts are. The main reason for this is that there are cyclic dependencies, and cycles in dependencies are a strong indicator that the Stable Dependency Principle is violated. We will talk about it very soon. There are often two arrows between a pair of components in the picture: a red one, where the principle is violated, and a black one, where it is not.

Another problem is that all the components are concrete. It is better to depend on abstractions. If one component depends on an abstraction, a clear boundary is drawn between the components. Thanks to this boundary, the components can change largely independently of each other, without an avalanche-like propagation of changes through the system. The Abstractness metric here is minimal for most components, meaning they do not contain interface definitions (Elixir behaviours).

Now the principles

Principles of cohesion

How are the components formed? What are they created from? We will cover two principles here. They are the Common Closure Principle and the Common Reuse Principle.

Common Closure Principle

Let's start with the Common Closure Principle. It says that files that change for the same reason must reside in the same component. And naturally, the opposite: files that change for different reasons and at different rates should be located in different components.

These architectural principles are very similar to more widely known SOLID principles but in a more generalized way. The SOLID counterpart of the Common Closure Principle is, of course, the mighty Single Responsibility Principle.

This principle follows from the fact that maintainability, that is, the convenience of maintaining a project, is more critical than reusability. When some functionality lives in a single place, i.e., in one component, it is naturally more convenient to change. On the other hand, functionality that is not split into small independent pieces is more difficult to reuse, because a client should not depend on what it does not use (the Interface Segregation Principle). Maintainability is especially important in the early stages of a project, when producing a lot of good, understandable code matters more than packaging that code for reuse in other hypothetical projects.

Common Reuse Principle

The next one is the Common Reuse Principle. Files that are used together must be in the same component. Sounds very familiar. Indeed, this architectural principle is a generalization of the SOLID Interface Segregation Principle. It follows from the fact that if a component has good cohesion and implements some logically integral piece of functionality, then its files should have many dependencies among themselves. It is difficult for clients to depend on one file in such a component and not on the others, because they need all the functionality implemented by this component. Clients don't want to use a small piece of a large multipurpose component that has everything in the world. You should not have to depend on what you don't need. Files that are only loosely related to the others should not be in that component, because its clients might not want to depend on them.

Principles of coupling

Now let's talk about the coupling principles, that is, how the components are related. Very often, arbitrary relations between components indicate that something is wrong with cohesion as well. Relationships between components in this context mean source file dependencies (some code in a source file refers to a module name of another component).

Acyclic Dependency Principle

The first and the most critical principle is the Acyclic Dependency Principle. It's straightforward: there should be no cycles in the component dependency graph. You can always get rid of cycles. Dependency cycles make components interdependent and make independent work on different parts of the system more difficult.

For example, let's say our component is a hex package. The hex package has a version. Suppose we had made changes to package B that forced us to increment its major version. Now we need to update all clients of this package. So we need to update package A. And since we change something in package A, we need to update its version in package C. Because of the circular connections of the three components here, we also need to update the version of package C in package B. So if we make changes to one component, we need to change all components in their cycle.

All three components physically form one monolithic system. If these components were microservices, it would mean a nasty lockstep deployment of all of them at once. In the world of microservices, that is the nightmarish beast known as a distributed monolith.
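To make the package example concrete, here is a minimal sketch that finds such cycles with Erlang's standard `:digraph` modules. The module name `CycleSketch` and the `{client, dependency}` edge format are assumptions made for illustration; this is not how CleanMixer itself is implemented.

```elixir
defmodule CycleSketch do
  # Each edge is {client, dependency}: the client refers to the dependency.
  # Returns the groups of components that are locked together in a cycle.
  def cycles(edges) do
    graph = :digraph.new()

    try do
      for {from, to} <- edges do
        :digraph.add_vertex(graph, from)
        :digraph.add_vertex(graph, to)
        :digraph.add_edge(graph, from, to)
      end

      graph
      |> :digraph_utils.cyclic_strong_components()
      |> Enum.map(&Enum.sort/1)
      |> Enum.sort()
    after
      # digraph is backed by ETS tables, so release them explicitly.
      :digraph.delete(graph)
    end
  end
end

# A uses B, C uses A, B uses C: the three packages from the example above.
IO.inspect(CycleSketch.cycles([{:a, :b}, {:c, :a}, {:b, :c}]))
# Removing the B -> C dependency breaks the cycle.
IO.inspect(CycleSketch.cycles([{:a, :b}, {:c, :a}]))
```

The first call reports one group containing all three packages, which is exactly the set that has to be released together; the second call reports no cycles.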

Breaking Cycles

How do I break the cycles? There are two approaches here.

Breaking Cycles with a new component

One way to do it is to move common code into a new component. Let's imagine we have two components: the component of Happy Doge and the component of Good Doge. But, as we know, all Doges are good happy doggos. Therefore, the Happy Doge uses the docility of the Good Doge, and the Good Doge uses the happiness of the Happy Doge. We can move this shared functionality into a new base Doge component and use it from both of these components.

It's a great feeling if while trying to avoid a circular dependency, we suddenly realize that our system is missing some business-related component (the bounded context in DDD).
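The Doge refactoring could look roughly like the sketch below. All module and function names here are hypothetical; the point is only that both components now depend on the extracted `Doge` component and no longer on each other.

```elixir
# Hypothetical shared component extracted from both sides of the cycle.
defmodule Doge do
  def happiness, do: :happy
  def docility, do: :docile
end

# Before the refactoring HappyDoge called GoodDoge and vice versa.
# Now each of them refers only to Doge.
defmodule HappyDoge do
  def traits, do: [Doge.happiness(), Doge.docility()]
end

defmodule GoodDoge do
  def traits, do: [Doge.docility(), Doge.happiness()]
end

IO.inspect(HappyDoge.traits())
IO.inspect(GoodDoge.traits())
```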

Breaking cycles with DI

The second way to break the loop is to reverse the direction of a dependency. If you have a cyclic graph, then to break it, it is enough to point one dependency in the opposite direction. How to do it? Let's assume that we have a domain component, and it uses some functionality from the authorization component. But we do not want the domain to be dependent on anything. We want it to be the core of the system. What should we do? Within the domain component, we define an abstract interface and implement it inside the authorization component. The domain no longer depends on any functionality in the authorization component. Instead, the authorization component must implement the abstract interface that the domain requires. The concrete implementation is injected into the domain component at runtime. That is good old-fashioned dependency injection, which, in my opinion, is highly underused in Elixir.

The most straightforward way to implement an interface in Elixir is a behaviour. Every time I say interface, you can think of a behaviour.

defmodule Domain.UseCases.CreateUser do
  alias Domain.User
  alias Domain.UserRegistry

  @spec create_user(String.t(), UserRegistry.t()) :: {:ok, User.t()} | {:error, term}
  def create_user(username, user_registry) do
    # user_registry is any module that implements the UserRegistry behaviour
    case user_registry.exists?(username) do
      {:ok, true} ->
        # ...
      other ->
        # ...
    end
  end
end

# The interface is defined inside the domain component
defmodule Domain.UserRegistry do
  @type t :: module
  @callback exists?(username :: String.t()) :: {:ok, boolean} | {:error, term}
end

# The implementation lives in the authorization component
defmodule Auth.LDAP do
  alias Domain.UserRegistry
  @behaviour UserRegistry

  @impl UserRegistry
  def exists?(_username) do
    {:ok, true}
  end
end
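One practical payoff of this inversion is that the use case can run against any module that implements the behaviour. The following self-contained sketch shows the injection itself; every name in it (`Sketch.UserRegistry`, `Sketch.CreateUser`, `Sketch.InMemoryRegistry`) and the return values of the use case are assumptions made for illustration, not part of the original project.

```elixir
defmodule Sketch.UserRegistry do
  @callback exists?(username :: String.t()) :: {:ok, boolean} | {:error, term}
end

defmodule Sketch.CreateUser do
  # The registry module is passed in as an argument,
  # so this use case never names a concrete adapter.
  def create_user(username, user_registry) do
    case user_registry.exists?(username) do
      {:ok, false} -> {:ok, %{username: username}}
      {:ok, true} -> {:error, :already_exists}
      {:error, reason} -> {:error, reason}
    end
  end
end

# A stand-in adapter, for example for tests; a real one could talk to LDAP.
defmodule Sketch.InMemoryRegistry do
  @behaviour Sketch.UserRegistry

  @impl Sketch.UserRegistry
  def exists?(username), do: {:ok, username in ["alice"]}
end

# The caller decides at runtime which implementation gets injected.
IO.inspect(Sketch.CreateUser.create_user("alice", Sketch.InMemoryRegistry))
IO.inspect(Sketch.CreateUser.create_user("bob", Sketch.InMemoryRegistry))
```

Passing the module as a plain argument is only one option; reading it from application configuration at startup is another common choice.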

Stable Dependency Principle

Do you remember that odd red arrow that was on the opening diagram? It is time to make sense of it. That arrow was red because it violated the Stable Dependency Principle - dependencies should point in the direction of stability.

But what is stability? It is important to note that stability is not the opposite of volatility. It is not how often the source files of that component change. Stability is a definition in terms of dependencies. It determines how hard it is to change a component without breaking other components in your system. It measures how much work it takes to change it.

There are stable components and unstable components. Component A is a stable one. Three components use it. This means that it is responsible to them, since they depend on it. If it changes its interface or internal functionality, these components may also need to change. This is what makes the component stable: it is harder to change.

Component B is unstable. It has no dependencies, and nobody uses it. Therefore, it can change as often as it wants.

Some components, by their very nature, must be volatile. We just want it. A good architectural principle is to divide components that change frequently and those that change rarely. Components that change frequently should be unstable. Components that rarely change can be stable.

Unstable components should depend on components that are more difficult to change, and not vice versa. Volatile components change frequently, and we don't want to modify a bunch of other components' code every time they change (a code smell known as Shotgun Surgery).

Conversely, a volatile component should not be a dependency of a component that is difficult to change.

From this, we draw a simple conclusion. Dependencies should point in the direction of stability. Thus, if you go through the dependency graph, each next component in the system should be more stable than the previous ones.

To assess this more quantitatively, we need to introduce simple IN and OUT metrics. IN is the number of inbound dependencies, and OUT is the number of outbound dependencies. Those connections are source file-based.

For example, we have a Cc component. Inside it, there are two public files. They are used by two files from the Ca component, one from the Cb component and one from the Cd component. This means that its IN metric is equal to four.
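As a minimal sketch of how IN and OUT could be counted, the code below represents each file-level dependency as a pair of `{component, file}` tuples and counts the distinct files on the other side of the component boundary. The module name `FanSketch`, the file names, and the exact counting rule are assumptions for illustration; CleanMixer may count differently.

```elixir
defmodule FanSketch do
  # A dependency is {{source_component, source_file}, {target_component, target_file}}.

  # IN: distinct files outside `component` that depend on files inside it.
  def fan_in(deps, component) do
    deps
    |> Enum.filter(fn {{src, _}, {dst, _}} -> dst == component and src != component end)
    |> Enum.map(fn {source, _target} -> source end)
    |> Enum.uniq()
    |> length()
  end

  # OUT: distinct files outside `component` that its own files depend on.
  def fan_out(deps, component) do
    deps
    |> Enum.filter(fn {{src, _}, {dst, _}} -> src == component and dst != component end)
    |> Enum.map(fn {_source, target} -> target end)
    |> Enum.uniq()
    |> length()
  end
end

# The Cc example from the text; the file names are made up.
deps = [
  {{:ca, "a1.ex"}, {:cc, "public1.ex"}},
  {{:ca, "a2.ex"}, {:cc, "public1.ex"}},
  {{:cb, "b1.ex"}, {:cc, "public2.ex"}},
  {{:cd, "d1.ex"}, {:cc, "public2.ex"}}
]

IO.inspect(FanSketch.fan_in(deps, :cc), label: "IN of Cc")
IO.inspect(FanSketch.fan_out(deps, :cc), label: "OUT of Cc")
```

With this data, IN of Cc is four, matching the example, and OUT is zero because none of its files refers to another component.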

Instability metric

I = OUT / (IN + OUT)

Based on the IN and OUT metrics, you can calculate the Instability metric. The Instability of a component is the share of its outgoing connections among all its connections. If the Instability of a component is equal to one, then no one depends on it. It has no reason not to change, so it can be a volatile component. If Instability is zero, then the component is very stable. Other components depend on it, so it is difficult to change. On the other hand, it does not depend on other components, so it will change only for important reasons: either an unavoidable change in business logic or a purposeful refactoring that hopefully won't change its interface.
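The formula translates directly into code. In this small sketch the module name `InstabilitySketch` is hypothetical, and returning `nil` for a component with no connections at all is my own assumption, since the formula is undefined in that case.

```elixir
defmodule InstabilitySketch do
  # I = OUT / (IN + OUT)
  # An isolated component has no connections, so the ratio is undefined (assumption: nil).
  def instability(0, 0), do: nil
  def instability(in_count, out_count), do: out_count / (in_count + out_count)
end

# Nobody depends on the component: maximally unstable.
IO.inspect(InstabilitySketch.instability(0, 3))
# The component depends on nothing: maximally stable.
IO.inspect(InstabilitySketch.instability(4, 0))
# Four inbound and one outbound connection.
IO.inspect(InstabilitySketch.instability(4, 1))
```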

Let's consider the case of violation of the Stable Dependency Principle. I think it is a bit easier to use the Stability metric. It is simply the inverted Instability.

S = 1 − I = IN / (IN + OUT)

Suppose initially we had a stable component in the system. We designed it this way, and it contains the core of the system's functionality, one that we expect not to change very often. But at some point in time, while adding some new functionality to the core of the system, one of the developers saw that the initially volatile component Flexible had the code he needed, and he just jumped to reuse it.

A new connection was created from the stable component to the volatile one. This connection violates the Stable Dependency Principle. The Stable component has a Stability of 2/3: two incoming connections out of three in total. The Flexible component has a Stability of 1/3: one incoming connection and two outgoing ones. Recall that under the Stable Dependency Principle, stability should increase in the direction of the connections. Ideally, the component that is lower in the picture should be more stable than the component above it, but here it is just the opposite. This connection breaks the principle, and with it our original intent that a rarely changing component stays stable.

A component that changes frequently, an unstable component, should have few or no incoming connections, and yet they have appeared.
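The same check can be sketched in a few lines. The code below computes Stability from component-level edges and reports every dependency that points from a more stable component to a less stable one. The module name `SdpSketch` and the component names other than Stable and Flexible are hypothetical; they only reproduce the connection counts from the example.

```elixir
defmodule SdpSketch do
  # Edges are {source, target}: the source depends on the target.
  # S = IN / (IN + OUT). Every component that appears in an edge has
  # at least one connection, so the denominator is never zero here.
  def stability(edges, component) do
    in_count = Enum.count(edges, fn {_from, to} -> to == component end)
    out_count = Enum.count(edges, fn {from, _to} -> from == component end)
    in_count / (in_count + out_count)
  end

  # A dependency violates the principle when its source is more stable than its target.
  def violations(edges) do
    Enum.filter(edges, fn {from, to} -> stability(edges, from) > stability(edges, to) end)
  end
end

edges = [
  {:client_a, :stable},
  {:client_b, :stable},
  {:stable, :flexible},   # the new, questionable connection
  {:flexible, :lib_x},
  {:flexible, :lib_y}
]

IO.inspect(SdpSketch.stability(edges, :stable), label: "S(stable)")
IO.inspect(SdpSketch.stability(edges, :flexible), label: "S(flexible)")
IO.inspect(SdpSketch.violations(edges), label: "violations")
```

Only the connection from the stable component to the flexible one is reported, because it is the only edge along which Stability decreases.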

Solution? Once again, it is the Dependency Inversion. Component C defines the interface it needs. Component D implements this interface, and the necessary functionality is injected into component C.

Stable Abstraction Principle

The next principle is the Stable Abstraction Principle. Stable components should be abstract. Indeed, we said that some components in the system must be stable, and many incoming connections to them cannot be avoided. But it means that when such a component changes, a lot of other code in the system can break. This can be prevented by depending on abstractions. Stable components should be abstract, thereby loosening coupling by defining explicit interfaces. Unstable components need to be concrete because they implement some specific functionality, and that is why they have value. Dependencies must point in the direction of stability, and therefore dependencies must point in the direction of increasing abstractness.

You can take a simple metric A, the share of abstract files among all the files in the component.

A = AbstractFiles / TotalFiles
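Here is one possible way to approximate this metric at runtime, assuming one module per file and treating a module as abstract when it defines behaviour callbacks. Such modules export `behaviour_info/1`, which is what the sketch checks. The module name `AbstractnessSketch` is hypothetical, and the way CleanMixer decides what counts as abstract may differ.

```elixir
defmodule AbstractnessSketch do
  # A module that declares @callback attributes exports behaviour_info/1.
  def abstract?(module) do
    Code.ensure_loaded?(module) and function_exported?(module, :behaviour_info, 1)
  end

  # A = abstract modules / all modules (the list must not be empty).
  def abstractness(modules) do
    Enum.count(modules, &abstract?/1) / length(modules)
  end
end

# GenServer defines callbacks; Enum, String and Map do not.
IO.inspect(AbstractnessSketch.abstractness([GenServer, Enum, String, Map]))
```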

The Main Sequence

We can visualize these metrics of Abstractness and Instability. This plot has two zones to avoid.

The first is the Zone of pain. It contains very concrete components. And they have very high stability. These are difficult to change because if you try to change such components, you may need to change a bunch of others.

On the other hand, there is a zone of uselessness. It contains components that are abstract but very unstable. It is unclear why a component that changes all the time, and that no one depends on, would define an abstraction. This is most likely some kind of rubbish, like unnecessary interfaces.

We can then assume that the most valuable and problem-free components are the ones that are as far away as possible from these two extreme points. These components form what Bob Martin called the Main Sequence. Distance from the main sequence can be a good metric. The further a component is from the main sequence, the more suspicious it is, so it is prudent to take a close look at it. Why did such a bad guy appear in the system, and what is wrong with him? Maybe something is wrong with his connections, and a Dependency Inversion is needed somewhere. Or maybe you need to repartition the responsibilities between other components or introduce some new ones.
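In Martin's book the distance from the main sequence is defined as D = |A + I − 1|, where A is Abstractness and I is Instability. A minimal sketch of that formula follows; the module name `MainSequenceSketch` is hypothetical, and whether CleanMixer reports this exact value is an assumption I have not verified.

```elixir
defmodule MainSequenceSketch do
  # D = |A + I - 1|: 0 means the component sits on the main sequence,
  # 1 means it sits in one of the two corners to avoid.
  def distance(abstractness, instability) do
    abs(abstractness + instability - 1)
  end
end

# Concrete and stable: the zone of pain.
IO.inspect(MainSequenceSketch.distance(0.0, 0.0))
# Abstract and unstable: the zone of uselessness.
IO.inspect(MainSequenceSketch.distance(1.0, 1.0))
# Balanced: right on the main sequence.
IO.inspect(MainSequenceSketch.distance(0.5, 0.5))
```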

There is a particular case that is essential to mention: nearly immutable components. Although they are in the zone of pain, they are not dangerous because they rarely change. The classic example is the standard libraries of the language. All of our code is permeated with them, but this is not a big deal because the developers of these libraries take on a solid obligation to keep their interfaces and functionality stable.

Clean and Hexagonal Architectures

Let's try to bring these principles together in the frameworks of Clean and Hexagonal Architectures. I will talk about them together as if they were the same thing, because they indeed share the same basic principles.

The first principle is that the domain is always at the center of all dependencies. Domain neither depends on nor directly uses any source files from other components. Only other components have physical connections to a domain because they implement the interfaces that are defined in it.

While Clean Architecture considers the domain to be central, Hexagonal architecture emphasizes that the inner parts of the architecture define the interfaces implemented by the outer parts.

At the core of the system is the Domain; along the edges, there are all sorts of adapters to implement an API, query the database, access Kafka, etc. Fundamentally, these two architectures are very similar in their values.

To reiterate, the Domain is the most stable part of the application. Therefore, all other components depend on it. But these dependencies should be to data structures that are fundamental entities of a specific domain. Or they must be abstract interfaces that are defined by the Domain.

Both architectures have many things in common in their layering as well. In the center is the Domain Entities Layer. These are pure data structures and entities with basic behavior that is stable in the Domain. If, for example, you want to create a new application with new functionality in the same domain, then, theoretically, you could take these structures out into a new application and build some new functionality based on them with no or few changes. This is the most stable part of the application - the Domain, the part that changes relatively rarely and only for important reasons: changes in your knowledge about the domain.

Around the inner core is the Domain Services Layer, that is, use cases. These are those entities that implement business processes using the core Domain structures. They, by definition, are application-specific. Therefore, they are less stable and need to change more often.

Next is the Application Services Layer. This would include technical but important concerns such as transaction management for the database.

Along the edges of the system are Adapters. This is the lowest-level logic: I/O. Those can be HTTP API controllers or clients, Kafka, queues, etc. When moving from the center of the system to the edges, stability and reusability decrease.

Reusability: Domain entities > Domain Services > Application Services > Adapters

Stability: Domain entities > Domain Services > Application Services > Adapters

Clean Mixer Example

Let's try to analyze a specific example. Look at the picture. First, we see where the core of our system is. It consists of im (Instant Messaging) and im/business_chat.

It is important to note that all dependencies for components inside our Core Domain are only incoming. Therefore, these components are very stable.

At the top, we have the app_server component, and it is very unstable because it has many outgoing dependencies. It contains all the dirty initialization: building supervisor trees, configuring dependency injection, etc.

We also can see the app_services components — it's the Application Services Layer, which lies between the very stable application core and the very unstable component app_server.

Red arrows point to the Stable Dependency Principle violation. But the difference in Stability metrics is not that big. So it does not seem that these violations bring any problems worth looking at at this specific moment.

Also, note that components within the Core Domain are more abstract than other components (A metric).

It is also quite interesting that two very stable components have many incoming connections: the xmpp and xep components. Those implement low-level protocols. This is the kind of functionality that someday soon we might like to move to a separate hex package and use in other applications that work with these protocols. Therefore, we want these components to be stable so that they can be reused more conveniently in other applications.

It is also worth mentioning the data_io/interface component. This is an entirely abstract component that contains only the most basic data structures and interface definitions. It was created to break cyclic dependencies.

Automated architecture tests

But fixing the current state of the system is only the first step. How do we automatically check for architectural regressions? This can be done reliably only by automated tests. This is especially important if your target architecture is a Modular Monolith. Without rigorous automated control, it can quickly regress to a Big Ball of Mud.

Some examples of tools made just for that are ArchUnit for Java and NetArchTest for C#. Well, now Elixir has Clean Mixer.

test/arch_test.exs

ExUnit.start(capture_log: true, trace: true)

defmodule MessagingServer.Architecture.ArchTest do
 use ExUnit.Case

 alias CleanMixer.Workspace

 @domain [
   "im",
   "im/business_chats"
 ]

 @core_infrastructure [
   "xmpp",
   "xep"
 ]

 @application_services [
   "jabber/app_services",
   "im/app_services",
   "data_io/app_services"
 ]

 @adapters [
   "chatbot_platform_api",
   "app_server",
   "network"
 ]

 @interface_components [
   "data_io/interface"
 ]

 setup_all do
   workspace = CleanMixer.project(include_hex: true) |> Workspace.new(timeout_ms: 30_000)
   %{ws: workspace}
 end

 defp dependencies_of(workspace, name) do
   Workspace.dependencies_of(workspace, name)
   |> Workspace.reject_hex_packs()
   |> Enum.map(& &1.target)
 end

 defp dependency?(workspace, source_name, target_name) do
   Workspace.dependency?(workspace, source_name, target_name)
 end

 defp component(workspace, name) do
   Workspace.component(workspace, name)
 end

 defp format_cycle(cycle) do
   cycle |> Enum.map(& &1.name) |> Enum.join(" -> ")
 end

The rules are described in the form of regular ExUnit tests. I split the components in the system into several layers, which are similar to the layers in the Clean Architecture. In each subsequent layer, stability decreases, and instability increases. The most stable layer is the Core Domain, then goes the Core Infrastructure - components that implement protocols. The next level is Application Services. And then the Adapters.

There are two exceptional cases: the interface components, which we do not want to depend on anything, and the protocol components, which should be as stable as possible, with a minimum of outgoing dependencies, so that they are easy to reuse later.

In the first test, we don't want our system to have circular dependencies between components.

 test "there shall be no circular dependencies between components", %{ws: ws} do
   assert Workspace.component_cycles(ws) |> Enum.map(&format_cycle/1) == []
 end

In the next test, we don't want Core Domain components to depend on application services.

 for comp <- @domain, app_service <- @application_services do
   test "domain component #{comp} shall not depend on application service #{app_service}", %{ws: ws} do
     refute dependency?(ws, unquote(comp), unquote(app_service))
   end
 end

We don't want our Domain Components to depend on Adapters:

 for comp <- @domain, adapter <- @adapters do
   test "domain component #{comp} shall not depend on adapter #{adapter}", %{ws: ws} do
     refute dependency?(ws, unquote(comp), unquote(adapter))
   end
 end

We check that the Core Infrastructure is independent of app_services:

 for comp <- @core_infrastructure, app_service <- @application_services do
   test "core infrastructure component #{comp} shall not depend on application service #{app_service}", %{ws: ws} do
     refute dependency?(ws, unquote(comp), unquote(app_service))
   end
 end

We check that the utils component is independent of any other component. Naturally, we want it to be stable and reusable.

 test "`utils` shall have no dependencies", %{ws: ws} do
   assert dependencies_of(ws, "utils") == []
 end
end

All in all, we check that each less stable layer depends only on more stable components. In fact, we simply check the Stable Dependency Principle.

We can run these tests with the following command.

> mix run --no-start test/arch_test.exs

Running these tests on each commit in CI gives me a great deal of assurance that our architecture remains what I expect.

To summarize:

Prohibit cyclic dependencies between components.
Separate domain and infrastructure. The domain should be at the center of your application, infrastructure at the edges.
Explicitly define interfaces and control dependency directions using Dependency Inversion; prefer to depend on interfaces, not implementations.
Visualize the current state and automate checks for architectural regressions.

DEV Community