Andreas Jim

Posted on May 8, 2020 • Originally published at Medium on Apr 21, 2020

Tagless Final in Scala: Best Practices

#programming #functional #taglessfinal #scala

Notwithstanding some criticism, the tagless final pattern continues to gain popularity in the Scala community. I want to share some things I learned while working with tagless final for about two years.

The tagless final pattern enables the programmer to declare modules with a very specific set of functions, and to compose programs based on these modules. The foremost concerns of this approach are correctness and predictability. Today’s article is a collection of best practices and lessons learned; it will hopefully help you utilize the pattern to its full potential and avoid common pitfalls. The concepts and principles presented here are nothing new and will be familiar to every seasoned programmer; the article merely provides some guidance on how to apply them in the context of tagless final programs.

Coding Style Recommendations

Naming

Let’s start with the most controversial of them all — naming. Although not directly related to tagless final, I find it worth sharing that the following naming scheme has proved to be practical:

ch.becompany.authn.AuthnDomain
ch.becompany.authn.AuthnDsl
ch.becompany.authn.AuthnService
ch.becompany.db.DbDomain
ch.becompany.db.DbDsl

The package and class names are redundant (as opposed to, e.g., ch.becompany.authn.Dsl), but this scheme has some advantages:

Autocomplete is a bit quicker since you don’t have to cycle between alternatives.
Generally, since each DSL has a unique name, no imports have to be renamed to avoid name collisions. This ensures that the same class name is used everywhere in the codebase, which aids readability.
Even when you need only one DSL in the scope, you can immediately see which DSL it is.

Context Bound Style and the apply Function

Scala offers two syntactic styles for declaring implicit dependencies: implicit parameters and context bounds. I generally prefer the context bound style because it makes the constraints for a type parameter immediately obvious, and it doesn’t require naming parameters. Here’s an example of the context bound style with multiple type parameters:

def registerUser[
F[_] : Monad : AuthnDsl,
G[_] : LogDsl
]: F[Unit] = ???

The only disadvantage of the context bound style that I’m aware of is that sometimes the compiler has trouble inferring the type parameter when the function is used, and requires a type ascription.

The companion object of the DSL trait should provide an apply function which can be used to resolve the DSL instance in the current implicit scope:

object AuthnDsl {
  def apply[F[_]](implicit ev: AuthnDsl[F]): AuthnDsl[F] = ev
}

This style has the additional benefit that the type of the DSL, in this case F, is immediately apparent:

AuthnDsl[F].register(…)
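The two styles and the apply function fit together as in the following minimal sketch. It is dependency-free and uses Option as a stand-in effect type; the register signature and the optionAuthn interpreter are assumptions made for illustration, not the actual AuthnDsl used in this article.

```scala
object ContextBoundExample {

  // Hypothetical DSL with a single operation (signature assumed for this sketch).
  trait AuthnDsl[F[_]] {
    def register(email: String, password: String): F[Unit]
  }

  object AuthnDsl {
    // Summoner: resolves the DSL instance from the current implicit scope.
    def apply[F[_]](implicit ev: AuthnDsl[F]): AuthnDsl[F] = ev
  }

  // Context bound style: the constraint on F is visible next to the type parameter.
  def registerUser[F[_]: AuthnDsl](email: String, password: String): F[Unit] =
    AuthnDsl[F].register(email, password)

  // Equivalent implicit parameter style: the instance has to be named.
  def registerUserExplicit[F[_]](email: String, password: String)(
      implicit dsl: AuthnDsl[F]): F[Unit] =
    dsl.register(email, password)

  // Toy interpreter: Option stands in for a real effect type, None signals rejection.
  implicit val optionAuthn: AuthnDsl[Option] = new AuthnDsl[Option] {
    def register(email: String, password: String): Option[Unit] =
      if (email.contains("@") && password.nonEmpty) Some(()) else None
  }

  val accepted: Option[Unit] = registerUser[Option]("jane@example.com", "secret")
  val rejected: Option[Unit] = registerUserExplicit[Option]("not-an-email", "secret")
}
```

Both functions compile to the same thing; the context bound version simply leaves the instance unnamed and retrieves it through AuthnDsl[F].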

Design Considerations

Constraints and Overconstraining

A core concept of tagless final is declaring dependencies. The following function declares a dependency on an AuthnDsl instance for the type F:

def registerUser[F[_] : AuthnDsl]: F[Unit] =
  AuthnDsl[F].register(…)

Each dependency can also be viewed as a constraint on the type F, since it reduces the set of all possible concrete type instances for F. Each function should only declare those constraints that its implementation actually needs, for the following reasons:

The fewer constraints a function has, the more types it supports. In other words, functions with fewer type constraints are more generic and can be applied in more situations.
Each additional constraint increases the capabilities of a function. The more capabilities a function has, the less predictable it becomes.

The relationship between constraints and reasoning is one of the key benefits of programming with effects and the tagless final approach: Provided that a function’s implementation is disciplined, i.e. it doesn’t directly invoke any side effects, its signature alone lets you predict which operations the resulting effect instance can contain.

Consider the following example:

def registerUser[F[_] : AuthnDsl : EmailDsl : Monad]: F[Unit] = ???

Just by looking at the function signature, without knowing anything about its implementation, you can be sure that the effect instance returned by this function can only

Execute effects related to authentication,
Execute effects related to e-mail, and
Combine the results using monadic operations.

Now let’s see what happens when we replace the Monad constraint with the Sync constraint from the cats-effect library:

def registerUser[F[_] : AuthnDsl : EmailDsl : Sync]: F[Unit] = ???

This change opens Pandora’s box. The function can now suspend arbitrary side effects in the returned effect instance — it could, to quote a coworker, “launch a rocket”. In addition, it can potentially instantiate other DSL interpreters which declare the Sync constraint. It is impossible to know what executing the effect instance returned by this registerUser function will actually do.

The practice of declaring unnecessary constraints, also called overconstraining, violates the principle of least power and should generally be considered a code smell.
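One way to see the predictability argument in action is to run a minimally constrained program against a pure interpreter that merely records the operations it is asked to perform. The sketch below is an illustration under assumptions: Monad is a hand-rolled stand-in for cats.Monad to keep the example dependency-free, and the Log effect, the DSL signatures and the interpreters are hypothetical.

```scala
object LeastPowerExample {

  // Hand-rolled stand-in for cats.Monad (assumption, keeps the sketch dependency-free).
  trait Monad[F[_]] {
    def pure[A](a: A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  }

  // Hypothetical DSLs with one operation each.
  trait AuthnDsl[F[_]] { def register(email: String): F[Unit] }
  trait EmailDsl[F[_]] { def sendWelcome(email: String): F[Unit] }

  // The signature admits only authentication effects, e-mail effects and sequencing.
  def registerUser[F[_]: Monad: AuthnDsl: EmailDsl](email: String): F[Unit] =
    implicitly[Monad[F]].flatMap(implicitly[AuthnDsl[F]].register(email)) { _ =>
      implicitly[EmailDsl[F]].sendWelcome(email)
    }

  // A pure effect that records the names of the executed operations.
  final case class Log[A](ops: List[String], value: A)

  implicit val logMonad: Monad[Log] = new Monad[Log] {
    def pure[A](a: A): Log[A] = Log(Nil, a)
    def flatMap[A, B](fa: Log[A])(f: A => Log[B]): Log[B] = {
      val next = f(fa.value)
      Log(fa.ops ++ next.ops, next.value)
    }
  }

  implicit val logAuthn: AuthnDsl[Log] = new AuthnDsl[Log] {
    def register(email: String): Log[Unit] = Log(List(s"register($email)"), ())
  }

  implicit val logEmail: EmailDsl[Log] = new EmailDsl[Log] {
    def sendWelcome(email: String): Log[Unit] = Log(List(s"sendWelcome($email)"), ())
  }

  // Running the program against the recording interpreters yields its full trace.
  val trace: List[String] = registerUser[Log]("jane@example.com").ops
}
```

Because Log cannot suspend arbitrary code, the same program would no longer type-check against it if registerUser demanded a Sync-like capability, which is exactly the loss of predictability described above.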

Type Classes and Coherence

Sometimes an application requires multiple implementations for a service. Consider for instance our authentication DSL: Our application may provide authentication against multiple providers, e.g. an LDAP back-end and a user table in a database.

Let’s assume that we want to provide a function to authenticate against multiple back-ends. The obvious solution is to implement a dedicated DSL interpreter for each authentication mechanism and to pass the interpreters explicitly:

import cats.Monad
import cats.implicits._
import ch.becompany.Domain.User

object AuthnService {

  // Tries each authenticator in order and returns the first successful result.
  def authenticate[F[_]: Monad](email: String,
                                password: String,
                                authenticators: AuthnDsl[F]*):
      F[Either[String, User]] =
    authenticators.toList
      .collectFirstSomeM(
        _.authenticate(email, password).map(_.toOption)
      )
      .map(_.toRight("Authentication failed"))

}

While this is entirely possible using the tagless final pattern, it is only recommended if the DSL is not considered a type class. The coherence principle of type classes states that, for each type class Dsl[F[_]], there may only be one interpreter for a specific type F in scope. The main goal of coherence is safety: Using different type class interpreters (e.g. different Order interpreters for the Int type) can lead to incorrect programs. Even though Scala doesn't enforce coherence, it is still advisable to keep this principle in mind before providing multiple interpreters for a DSL which is encoded as a type class.
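The following sketch illustrates how mixing instances can silently break a program. It uses scala.math.Ordering from the standard library in place of the Order type class mentioned above, and the insert and contains helpers are hypothetical, written only for this example.

```scala
object CoherenceExample {

  // Inserts `a` into a list that is assumed to be sorted according to `ord`.
  def insert[A](sorted: List[A], a: A)(implicit ord: Ordering[A]): List[A] = {
    val (smaller, rest) = sorted.span(ord.lt(_, a))
    smaller ::: a :: rest
  }

  // Membership test that relies on the list being sorted according to `ord`:
  // it only inspects the leading elements that are not greater than `a`.
  def contains[A](sorted: List[A], a: A)(implicit ord: Ordering[A]): Boolean =
    sorted.takeWhile(ord.lteq(_, a)).contains(a)

  // Two different instances for the same type.
  val ascending: Ordering[Int]  = Ordering.Int
  val descending: Ordering[Int] = Ordering.Int.reverse

  // Built with the ascending instance: List(1, 3, 5)
  val numbers: List[Int] =
    List(3, 1, 5).foldLeft(List.empty[Int])((acc, n) => insert(acc, n)(ascending))

  // Same instance for building and querying: the element is found.
  val foundWithSameInstance: Boolean = contains(numbers, 5)(ascending)

  // Different instance for querying: the element is missed although it is present.
  val foundWithOtherInstance: Boolean = contains(numbers, 5)(descending)
}
```

Nothing in the types prevents the second call, which is why the discipline has to come from the programmer when the compiler doesn't enforce coherence.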

Modularization Using DSLs

Most programs of a certain complexity can be expressed as a series of layers. The lowest layers deal with technical intricacies and are nowadays usually hidden in third-party libraries. When we move up the stack of layers, technical aspects give way to higher-level functionality that more directly represents the business requirements of our program.

Create Instances at the End of the World

In functional programming, the top-level or main code block of the application is often referred to as the end of the world. This is the place where everything happens that cannot be postponed or delegated anywhere else, such as executing side effects or instantiating DSLs. Program code should generally not create a DSL instance itself, but consume it as a dependency. This way, the client code (ultimately the main block or test case) has the freedom to provide the desired implementations. Furthermore, if a DSL instance requires an expressive dependency à la cats.effect.Sync, passing the instance instead of the dependency limits the power of the enclosing module. Compare for instance the following functions:

def login[F[_] : Sync](username: String,
                       password: String): F[Unit] = {
  // requires Sync
  implicit val authnDsl: AuthnDsl[F] = AuthnDsl.interpreter[F] 
  authnDsl.authenticate(username, password)
}

def login[F[_] : AuthnDsl](username: String,
                           password: String): F[Unit] =
  // resolves the instance from the implicit scope via the companion's apply
  AuthnDsl[F].authenticate(username, password)

The first example could potentially suspend arbitrary side effects in F and is therefore, judged solely based on its signature, much less predictable than the second example. Furthermore, since it hard-codes the DSL implementation, it cannot be used in conjunction with other AuthnDsl interpreters, such as mocks.

Another benefit of creating a DSL instance at the end of the world, i.e. in a single place, is the guarantee that the same implementation is being used throughout the complete application.
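As a minimal sketch of this principle, the program code below only consumes an AuthnDsl, while the instance is chosen at the outermost level, here the body of a test case. The Boolean return type, the Id effect and the mockAuthn interpreter are assumptions made for illustration.

```scala
object EndOfWorldExample {

  // Hypothetical DSL (return type simplified for this sketch).
  trait AuthnDsl[F[_]] {
    def authenticate(username: String, password: String): F[Boolean]
  }

  object AuthnDsl {
    def apply[F[_]](implicit ev: AuthnDsl[F]): AuthnDsl[F] = ev
  }

  // Program code: consumes the DSL as a dependency and never creates it.
  def login[F[_]: AuthnDsl](username: String, password: String): F[Boolean] =
    AuthnDsl[F].authenticate(username, password)

  // Identity effect, sufficient for a synchronous in-memory mock.
  type Id[A] = A

  // Hypothetical mock interpreter that accepts exactly one password.
  def mockAuthn(validPassword: String): AuthnDsl[Id] = new AuthnDsl[Id] {
    def authenticate(username: String, password: String): Id[Boolean] =
      password == validPassword
  }

  // "End of the world" of a test case: the only place where an instance is created.
  val accepted: Boolean = {
    implicit val dsl: AuthnDsl[Id] = mockAuthn("secret")
    login[Id]("alice", "secret")
  }

  val rejected: Boolean = {
    implicit val dsl: AuthnDsl[Id] = mockAuthn("secret")
    login[Id]("alice", "wrong")
  }
}
```

The main block of the real application would provide a production interpreter in the same way, without any change to login.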

Minimize DSL Scope

When modularizing a program using DSLs, each DSL should ideally only address a single concern. This principle is congruent with the Unix philosophy: make each program (or in our case, DSL) do one thing well. Minimizing the scope of DSLs means that code which uses them can wield less power. Furthermore, implementations generally require fewer dependencies, which improves code safety and predictability.

Contain Dangerous Dependencies

As we have mentioned before, some dependencies are particularly expressive and therefore powerful. One prominent category consists of those which can be used to suspend arbitrary side effects, such as the Sync and Async type classes or the IO data type from the cats-effect library. A code unit which has access to these dependencies wields virtually limitless power and is therefore unpredictable. As a consequence, it is desirable to limit the usage of these dependencies, if possible to the main block of the application.

Ideally, only the lowest-level technical DSLs which directly rely on side effects like I/O should be given the power of suspending arbitrary effects. If at one point you find yourself in a situation where you require access to Sync or a similar class, think about whether it would be appropriate to factor out the operation into a dedicated low-level DSL, and instantiate it at the end of the world.

DSL vs. Service

When should you implement a DSL, and when is it better to use a plain object (which we will call a service) to group a set of related functions? As we have already discussed, one reason to introduce a DSL is to hide precarious implementation dependencies such as cats.effect.Sync. Another purpose is to provide polymorphic modules, i.e. modules that have different implementations for different types – or, in some cases, can even have multiple implementations for a single type.

A DSL has a key drawback compared to a service: Each interpreter of the DSL has to implement every function of the DSL. All functions of the interpreter share a single set of constraints — the constraints declared by the function producing the interpreter. If some function implementations don’t use all of these constraints, these functions are technically overconstrained.

Because of the drawback mentioned above, only polymorphic functions — i.e. those which require different implementations in different interpreters — should be declared in the DSL. Everything else can be provided by a service object.
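The split could look like the following sketch: the DSL contains only the operation that needs a different implementation per interpreter, while derived functions live in a service object and declare just the constraints they use. Functor is a hand-rolled stand-in for cats.Functor, and UserDsl, UserService and the in-memory interpreter are hypothetical names introduced for this example.

```scala
object DslVsServiceExample {

  // Hand-rolled stand-in for cats.Functor (assumption, keeps the sketch dependency-free).
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  // DSL: only the operation whose implementation differs between interpreters.
  trait UserDsl[F[_]] {
    def findRoles(username: String): F[List[String]]
  }

  // Service: derived functions, each declaring only the constraints it needs.
  object UserService {
    def isAdmin[F[_]: Functor: UserDsl](username: String): F[Boolean] =
      implicitly[Functor[F]].map(implicitly[UserDsl[F]].findRoles(username))(_.contains("admin"))

    // Needs no constraint at all, so it stays a plain function.
    def normalize(username: String): String = username.trim.toLowerCase
  }

  // Identity effect and in-memory interpreter, e.g. for tests.
  type Id[A] = A

  implicit val idFunctor: Functor[Id] = new Functor[Id] {
    def map[A, B](fa: Id[A])(f: A => B): Id[B] = f(fa)
  }

  implicit val inMemoryUsers: UserDsl[Id] = new UserDsl[Id] {
    private val roles = Map("alice" -> List("admin", "user"), "bob" -> List("user"))
    def findRoles(username: String): Id[List[String]] = roles.getOrElse(username, Nil)
  }

  val aliceIsAdmin: Boolean = UserService.isAdmin[Id]("alice")
  val bobIsAdmin: Boolean   = UserService.isAdmin[Id]("bob")
}
```

Had isAdmin been a member of UserDsl, every interpreter would have to re-implement it, and it would inherit whatever constraints the interpreter needs for findRoles.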

Error Handling

When it comes to error handling and propagation, there are basically two choices: Declaring the error channel in the signatures of the DSL functions, or using a constraint.

The first approach is straightforward:

trait AuthnDsl[F[_]] {
  def authenticate(username: String,
                   password: String): F[Either[Error, User]]
}

This option is particularly useful if the client code of the function is expected to be interested in any potential errors and to be able to handle them appropriately. This certainly applies to an authentication failure.

With the second approach, the declaration of the DSL doesn’t concern itself with error propagation and leaves this subject to the interpreter. Here’s an example:

trait AuthnDsl[F[_]] {
  def authorize(user: User): F[List[Role]]
}

object AuthnDsl {
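  // MonadError takes the effect type first, then the error type (kind-projector syntax)
  def interpreter[F[_] : MonadError[*[_], Throwable]]: AuthnDsl[F] = ???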
}

Here the implementation declares a MonadError constraint, which means that the instance of the type F is expected to provide an error handling mechanism for a specific error type, in this case Throwable. This approach makes sense for errors which probably can't be handled by the client code and have to be propagated, sometimes even to the end of the world. In our example this would for instance be a network communication error occurring when accessing an LDAP server. As mentioned in the previous section, this approach results in overconstraining in case some of the DSL members require error handling and others don't, an implication to keep in mind when designing your DSLs.

Of course both approaches can be combined, e.g. using an error channel in the return type for business errors and an implementation constraint for low-level technical errors.
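A combined design might look like the following sketch. It is dependency-free, so scala.util.Try stands in for an effect type F that comes with an error channel for Throwable; the AuthnError type, the tryInterpreter function and its ldapAvailable flag are hypothetical and exist only to simulate a technical failure.

```scala
import scala.util.{Failure, Success, Try}

object ErrorHandlingExample {

  final case class User(name: String)

  // Business errors: part of the DSL signature, the client is expected to handle them.
  sealed trait AuthnError
  case object InvalidCredentials extends AuthnError

  trait AuthnDsl[F[_]] {
    def authenticate(username: String, password: String): F[Either[AuthnError, User]]
  }

  // Technical errors: left to the interpreter, here carried by Try's failure case.
  // `ldapAvailable` is a hypothetical switch that simulates an unreachable server.
  def tryInterpreter(ldapAvailable: Boolean): AuthnDsl[Try] = new AuthnDsl[Try] {
    def authenticate(username: String, password: String): Try[Either[AuthnError, User]] =
      if (!ldapAvailable) Failure(new RuntimeException("LDAP server unreachable"))
      else if (password == "secret") Success(Right(User(username)))
      else Success(Left(InvalidCredentials))
  }

  val success: Try[Either[AuthnError, User]] =
    tryInterpreter(ldapAvailable = true).authenticate("alice", "secret")

  val businessError: Try[Either[AuthnError, User]] =
    tryInterpreter(ldapAvailable = true).authenticate("alice", "wrong")

  val technicalError: Try[Either[AuthnError, User]] =
    tryInterpreter(ldapAvailable = false).authenticate("alice", "secret")
}
```

Client code pattern-matches on the inner Either to react to invalid credentials, while the outer failure simply propagates until some layer decides to deal with it.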

DSLs for Common Side Effects

As we have learned earlier, having direct access to low-level type classes for expressing side effects, like cats.effect.Sync, is potentially dangerous. For this reason it is advisable to provide (or use existing) DSLs for frequently used effectful functionality like logging or generating timestamps and UUIDs. It is also worth considering concealing functions from third-party libraries requiring expressive dependencies, such as http4s, behind façade DSLs to avoid leaking Sync & friends into the codebase.
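Such a DSL can be very small. The sketch below shows a hypothetical UuidDsl with a live interpreter and a deterministic one for tests. To stay dependency-free, a plain thunk stands in for an effect type such as cats.effect.IO; all names are assumptions made for this example, and existing libraries like fuuid cover this use case in practice.

```scala
import java.util.UUID

object SideEffectDslExample {

  // Hypothetical façade DSL for UUID generation.
  trait UuidDsl[F[_]] {
    def randomUuid: F[UUID]
  }

  object UuidDsl {
    def apply[F[_]](implicit ev: UuidDsl[F]): UuidDsl[F] = ev
  }

  // Business code only sees UuidDsl, not the capability to suspend arbitrary effects.
  def newSessionId[F[_]: UuidDsl]: F[UUID] = UuidDsl[F].randomUuid

  // Minimal deferred effect: a thunk that runs only when invoked.
  type Thunk[A] = () => A

  // Low-level interpreter: the only place that touches the actual side effect.
  val liveUuid: UuidDsl[Thunk] = new UuidDsl[Thunk] {
    def randomUuid: Thunk[UUID] = () => UUID.randomUUID()
  }

  // Deterministic interpreter for tests.
  def fixedUuid(value: UUID): UuidDsl[Thunk] = new UuidDsl[Thunk] {
    def randomUuid: Thunk[UUID] = () => value
  }

  val fixed: UUID = UUID.fromString("00000000-0000-0000-0000-000000000001")

  // Test wiring: the program is built first and executed afterwards.
  val testId: UUID = {
    implicit val dsl: UuidDsl[Thunk] = fixedUuid(fixed)
    val program: Thunk[UUID] = newSessionId[Thunk]
    program()
  }
}
```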

Limitations

I came across a couple of limitations which I haven’t found any solutions or workarounds for. Please don’t hesitate to provide your insights in the comment section.

It is not possible to abstract over a set of constraints — each function has to declare the complete set independently.
It is not possible to identify unused constraints in IntelliJ IDEA, which is particularly problematic since unused constraints imply overconstraining and are therefore a code smell. Scala 2.13 provides the compiler option -Wunused:implicits, but I haven't tried it yet and I don't know whether it works with context bounds.

Open Source Libraries

Here is a list of some freely available libraries for the cats-effect ecosystem. Some of them provide DSLs directly, others have to be wrapped in a DSL to be used with the tagless final pattern.

cats-tagless — Transform and compose tagless final encoded algebras
log4cats — Logging (Logger DSL)
fuuid — Generating UUIDs
pureconfig — Loading configuration files
doobie — Database access
http4s — HTTP (Client DSL)
fetch — Data access

Final Words

Despite its shortcomings, I believe that tagless final is a valuable addition to the toolbox of the functional Scala programmer. I hope this article contained some helpful advice. I would be very happy to hear about your experience with tagless final — please feel free to use the comment section to let us know about your thoughts, insights and additional best practices. Thank you very much for reading!

Cover image: Photo by Josue Isai Ramos Figueroa on Unsplash
