Building Repositories with Spring Data

#domaindrivendesign #spring #jpa #ddd

Yesterday, we learned how to build aggregates with Spring Data. Now that we have our aggregates in place, we need to build repositories for storing and retrieving them.

Building repositories with Spring Data is very easy. All you need to do is declare your repository interface and have it extend the Spring Data interface JpaRepository. However, this also makes it easy to accidentally create repositories for local entities (which may happen if you have developers unfamiliar with DDD but familiar with JPA). Therefore, I always declare my own base repository interface like this:

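import java.io.Serializable;
import org.springframework.dao.EmptyResultDataAccessException;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.JpaSpecificationExecutor;
import org.springframework.data.repository.NoRepositoryBean;
// BaseAggregateRoot comes from the previous post; @NotNull is whichever nullability annotation your project uses.

@NoRepositoryBean // <1>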
public interface BaseRepository<Aggregate extends BaseAggregateRoot<ID>, ID extends Serializable> // <2>
        extends JpaRepository<Aggregate, ID>,  // <3>
                JpaSpecificationExecutor<Aggregate> { // <4>

    default @NotNull Aggregate getById(@NotNull ID id) { // <5>
        return findById(id).orElseThrow(() -> new EmptyResultDataAccessException(1));
    }
}

<1> This annotation tells Spring Data not to try to instantiate this interface directly.
<2> We limit the entities served by the repository to aggregate roots only.
<3> We extend JpaRepository.
<4> I personally prefer specifications to query methods. We'll return to why in a little bit.
<5> The built-in findById method returns an Optional. In many cases, when you fetch an aggregate by its ID, you assume it exists. Unwrapping the Optional at every call site is a waste of time and code, so you might as well do it once in the repository.
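
The difference between the two lookups is easiest to see in isolation. What follows is only a rough sketch in plain Java with no Spring on the classpath: Lookup and InMemoryLookup are hypothetical stand-ins for the repository, and NoSuchElementException stands in for EmptyResultDataAccessException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;

class LookupSketch {

    // Hypothetical stand-in for the repository; only the two lookups are modelled.
    interface Lookup<T, ID> {
        Optional<T> findById(ID id);

        // Same shape as the default method in BaseRepository: unwrap or fail loudly.
        default T getById(ID id) {
            return findById(id).orElseThrow(
                    () -> new NoSuchElementException("No aggregate with ID " + id));
        }
    }

    // In-memory implementation used only for illustration.
    static final class InMemoryLookup<T, ID> implements Lookup<T, ID> {
        private final Map<ID, T> store = new HashMap<>();

        void put(ID id, T value) {
            store.put(id, value);
        }

        @Override
        public Optional<T> findById(ID id) {
            return Optional.ofNullable(store.get(id));
        }
    }

    public static void main(String[] args) {
        InMemoryLookup<String, Long> customers = new InMemoryLookup<>();
        customers.put(1L, "Acme");

        // The caller expects the aggregate to exist, so no Optional handling is needed.
        System.out.println(customers.getById(1L));

        // The caller is unsure, so the Optional is still available.
        System.out.println(customers.findById(2L).isPresent());
    }
}
```

Callers that genuinely do not know whether the aggregate exists can keep using findById; getById is for the common case where a missing aggregate is an error.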

With this base interface in place, the repository for a Customer aggregate root could look something like this:

public interface CustomerRepository extends BaseRepository<Customer, CustomerId> {
    // No need for additional methods
}

This is all you need for retrieving and saving aggregates. Now let us have a look at how to implement queries.

Query Methods and Specifications

The most straightforward way to create queries in Spring Data is to define carefully named findBy-methods (if you are not familiar with these, check the Spring Data reference documentation).

I find these useful for simple queries that look for aggregates based on one or two keys only; for example, in a PersonRepository you could have a method called findBySocialSecurityNumber and in a CustomerRepository you could have a method called findByCustomerNumber. However, for more advanced or complex queries I try to avoid using findBy-methods.

I do this mainly for two reasons: First, the method names tend to become very long and pollute the code wherever they are used.

Second, very specific needs from application services may sneak into the repository, and after a while your repositories are full of query methods that do almost the same thing with small variations. I want to keep my domain model as clean as possible, so I construct these queries using specifications instead.

When you query by specification, you start by building a specification object that describes the result you want from your query. Specification objects can also be combined using the logical operators AND and OR. For maximum flexibility, I try to keep my specifications as small as possible. If needed, I create composite specifications for commonly used combinations.
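
The composition idea does not depend on JPA. As a rough sketch, java.util.function.Predicate can stand in for a specification: small building blocks are combined with and and or, and a frequently used combination gets its own composite method. The Customer class and all method names here are hypothetical and exist only for this illustration.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class SpecificationCompositionSketch {

    // Hypothetical, simplified customer used only in this sketch.
    static final class Customer {
        final String name;
        final boolean active;
        final LocalDate lastInvoiceDate;

        Customer(String name, boolean active, LocalDate lastInvoiceDate) {
            this.name = name;
            this.active = active;
            this.lastInvoiceDate = lastInvoiceDate;
        }
    }

    // Small, single-purpose "specifications".
    static Predicate<Customer> byName(String name) {
        return customer -> customer.name.equalsIgnoreCase(name);
    }

    static Predicate<Customer> activeOnly() {
        return customer -> customer.active;
    }

    static Predicate<Customer> byLastInvoiceDateAfter(LocalDate date) {
        return customer -> customer.lastInvoiceDate.isAfter(date);
    }

    // A composite for a combination that is assumed to be used often.
    static Predicate<Customer> activeAndInvoicedAfter(LocalDate date) {
        return activeOnly().and(byLastInvoiceDateAfter(date));
    }

    // Applies a specification to an in-memory list and returns the matching names.
    static List<String> namesMatching(List<Customer> customers, Predicate<Customer> spec) {
        return customers.stream()
                .filter(spec)
                .map(customer -> customer.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("Acme", true, LocalDate.of(2020, 3, 1)),
                new Customer("Acme", false, LocalDate.of(2020, 3, 1)),
                new Customer("Globex", true, LocalDate.of(2019, 1, 1)));

        System.out.println(namesMatching(customers, byName("acme").and(activeOnly())));
        System.out.println(namesMatching(customers, activeAndInvoicedAfter(LocalDate.of(2020, 1, 1))));
    }
}
```

Because each building block checks one thing only, new queries are usually a matter of combining existing blocks instead of adding yet another method.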

Spring Data has built-in support for specifications. To create a specification, you have to implement the Specification interface. This interface relies on the JPA Criteria API, so you need to familiarize yourself with that if you have not used it before (here is Hibernate's documentation about it).

The Specification interface contains a single method that you have to implement, toPredicate. It takes the query root, the criteria query, and the criteria builder as input and produces a JPA Criteria predicate.

The easiest way of creating specifications is by making a specification factory. This is best illustrated with an example:

import java.time.LocalDate;
import org.springframework.data.jpa.domain.Specification;

public class CustomerSpecifications {

    public @NotNull Specification<Customer> byName(@NotNull String name) {
        return (root, query, criteriaBuilder) -> criteriaBuilder.like( // <1>
            root.get(Customer_.name), // <2>
            name
        );
    }

    public @NotNull Specification<Customer> byLastInvoiceDateAfter(@NotNull LocalDate date) {
        return (root, query, criteriaBuilder) -> criteriaBuilder.greaterThan(root.get(Customer_.lastInvoiceDate), date);
    }

    public @NotNull Specification<Customer> byLastInvoiceDateBefore(@NotNull LocalDate date) {
        return (root, query, criteriaBuilder) -> criteriaBuilder.lessThan(root.get(Customer_.lastInvoiceDate), date);
    }

    public @NotNull Specification<Customer> activeOnly() {
        return (root, query, criteriaBuilder) -> criteriaBuilder.isTrue(root.get(Customer_.active));
    }
}

<1> Here I'm just doing a simple like query, but in a real-world specification you would probably want to be more thorough, paying attention to wildcards, case matching and so on.
<2> Customer_ is a metamodel class generated by the JPA implementation (such as Hibernate).
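
To give an idea of what "more thorough" could mean, here is a small sketch of a helper that turns user input into a case-insensitive "contains" pattern and escapes the LIKE wildcards. The LikePatterns class, its method name, and the choice of backslash as the escape character are assumptions made for this illustration; check how your database treats the escape character before relying on it.

```java
import java.util.Locale;

class LikePatterns {

    // Escape character handed to the query together with the pattern (an assumption of this sketch).
    static final char ESCAPE = '\\';

    // Builds a lower-cased "contains" pattern in which any %, _ or escape character
    // typed by the user is matched literally instead of acting as a wildcard.
    static String containsIgnoreCase(String term) {
        StringBuilder pattern = new StringBuilder("%");
        for (char ch : term.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch == '%' || ch == '_' || ch == ESCAPE) {
                pattern.append(ESCAPE);
            }
            pattern.append(ch);
        }
        return pattern.append('%').toString();
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("Acme"));
        System.out.println(containsIgnoreCase("50%_Off"));
    }
}
```

Inside byName, the pattern could then be passed to the three-argument form of like, with the column lower-cased as well: criteriaBuilder.like(criteriaBuilder.lower(root.get(Customer_.name)), LikePatterns.containsIgnoreCase(name), LikePatterns.ESCAPE).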

You would then use the specifications in the following way:

import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;

public class CustomerService {

    private final CustomerRepository repository;
    private final CustomerSpecifications specifications;

    public CustomerService(CustomerRepository repository, CustomerSpecifications specifications) {
        this.repository = repository;
        this.specifications = specifications;
    }

    public Page<Customer> findActiveCustomersByName(String name, Pageable pageable) { // <1>
        return repository.findAll(
            specifications.byName(name).and(specifications.activeOnly()), // <2>
            pageable
        );
    }
}

<1> Never ever write methods that return a result set without an upper bound (at least in production code). Either use pagination (like I do here) or use a finite and reasonable limit on how many records the query can return.
<2> The two specifications are combined using the and operator.
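
Pagination alone does not give you an upper bound if the caller is free to ask for a page of a million rows. One way to guard against that, sketched here with a hypothetical PageBounds helper and an arbitrarily chosen limit of 100, is to clamp the requested page size before it reaches the repository.

```java
class PageBounds {

    // Assumed upper limit; pick whatever is reasonable for your application.
    static final int MAX_PAGE_SIZE = 100;

    // Returns a page size between 1 and MAX_PAGE_SIZE, whatever the caller asked for.
    static int clampPageSize(int requested) {
        if (requested < 1) {
            return 1;
        }
        return Math.min(requested, MAX_PAGE_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(clampPageSize(20));
        System.out.println(clampPageSize(5000));
    }
}
```

In the service, the clamped size could be applied by rebuilding the page request, for example with PageRequest.of(pageable.getPageNumber(), PageBounds.clampPageSize(pageable.getPageSize()), pageable.getSort()).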

A Note About Repositories and QueryDSL

Spring Data also supports QueryDSL. In this case you are not working with specifications but with QueryDSL predicates directly. The design principle is pretty much the same so if you feel more comfortable with QueryDSL than with the JPA Criteria API there is no reason for you to change.

Specifications and Testing

There is one noticeable drawback to using specifications instead of query methods, and it has to do with unit testing. Since the specifications use the JPA Criteria API under the hood, there is no easy way of making assertions on the contents of a given specification object without constructing and analyzing its JPA predicate, which is a nontrivial process.

However, there are ways around this. The most obvious way is to just ignore checking the incoming specifications when mocking repositories in your unit tests and use separate integration tests to test your specifications, for example with an in-memory H2 database. In many cases this may be just good enough.

There is also another way that avoids the use of integration tests but requires some extra work upfront. If you take a closer look at the specifications factory, you will see that the factory methods are not static but instance methods and the class itself is not final. This means that you can mock or stub the entire factory. Also, since the factory methods only return objects that implement the Specification interface, you can mock or stub that interface as well. This means that as long as you avoid using the static helper methods on the Specification interface (which use the JPA Criteria API), you can build a mock specification factory that returns mock specifications that can then be analyzed and used as the basis for test assertions. Unfortunately this post is not the right place to dig deeper into this so I'll just leave it as an exercise to the reader.

In the next post, we are going to look at how to use value objects as aggregate IDs. Stay tuned!