Eddy Nguyen

Posted on Jul 15, 2021 • Originally published at 99designs.com.au

Schema-driven development in 2021

#webdev #graphql #protobuf #prisma

Schema-driven development is an important concept to know in 2021. A lot of full-stack applications are built with schema-driven technologies. Schema-driven development could help teams build products better and faster. What exactly is schema-driven development? What are the benefits of schema-driven development? We will explore the answers to these questions in this article.

What is schema-driven development?

The schema is a contract

First, let's understand what a schema is. A schema is a contract between two sides of a system. The schema communicates the type of requests that can be made and the expected type of response.

Depending on the context where the schema is used, each side could play a different role. For example, in traditional web applications, the side making a request is the client or browser. The side returning a response is the application server.

Example of a client making a request to a server based on a schema specification

A schema is written in a specific, unambiguous language defined by the technology that you use. A schema language is usually programming-language agnostic and communicates common software ideas such as objects, enumerations and field types (e.g. string, integer). Schemas are often written in formats you may have encountered before, such as YAML and JSON (OpenAPI and JSON Schema documents, for example). The client and server should understand these concepts to fulfil their part of the contract.
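To make the idea of a contract concrete, here is a minimal TypeScript sketch. All names in it (`Course`, `Dish`, `GetDishRequest`, `GetDishResponse`, `getDish`) are made up for illustration and do not come from any real schema language or API; the point is only to show the kind of information a schema carries: an enumeration, an object with typed fields, and the request and response shapes each side agrees on.

```typescript
// Hypothetical contract for illustration only.

// An enumeration: the value must be one of a fixed set.
type Course = "STARTER" | "MAIN" | "DESSERT";

// An object with typed fields.
interface Dish {
  id: string;
  title: string;
  course: Course;
  priceInCents: number; // an integer by convention
}

// The request the client is allowed to make...
interface GetDishRequest {
  id: string;
}

// ...and the response the server promises to return.
interface GetDishResponse {
  dish: Dish | null; // null when no dish has that id
}

// The server's side of the contract: any implementation must match this signature.
type GetDish = (request: GetDishRequest) => GetDishResponse;

// A toy in-memory implementation that honours the contract.
const dishes: Dish[] = [
  { id: "1", title: "Pho", course: "MAIN", priceInCents: 1500 },
];

const getDish: GetDish = (request) => ({
  dish: dishes.filter((d) => d.id === request.id)[0] ?? null,
});
```

A client that only knows these types can call `getDish({ id: "1" })` and rely on the shape of the result without reading the implementation.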

Think of a system as a restaurant where the client is the "customer" and the server is the "wait staff". The schema is then the "menu". Given a menu, the customer can quickly scan through to find the "dish" (or resource) to consume. In some cases, the customer can see the ingredients and tell the wait staff to remove items from the final dish.

The schema is the menu. Via Unsplash

Schema-driven development ( SDD ) definition

SDD prioritises the design of the schema and treats it as a first-class citizen for communicating the responsibilities of the client and the server. This contract usually becomes the API.

Let’s say we have a web application being built by two teams: Frontend and Backend.

A non-SDD development process may look like this:

1. Frontend team contacts Backend team for data that Backend team owns.
2. Backend team receives the request and starts to create the endpoint that returns the data Frontend team needs.
3. Backend team finishes the endpoint and lets Frontend team know.
4. Frontend team tries out the endpoint but it may not have what they need.
5. Backend team goes back and fixes the endpoint.
6. Repeat steps 3-5 until Frontend team gets what they need.

On the other hand, the SDD process may look like this:

1. Frontend team proposes changes to the schema.
2. Backend team understands exactly what they need to implement.
3. While Backend team is creating an endpoint that fulfils their part of the contract, Frontend team already knows what that looks like and can mock it out.
4. Both teams finish their side of the contract.

Note: Just because you are working with a technology that has a schema does not mean you are applying SDD. The schema design and discussions must be front and centre for the process to be called SDD.

Benefits of schema-driven development

Schema-driven development has many benefits that help teams build better applications, faster.

1. Better cross-team communications

Effective communication is unambiguous, concise and intentional. These are also the attributes of a schema language. When a team proposes changes to a schema, the team on the other side should know what must be done.

In fact, I have been on numerous projects where complicated changes could be understood just from reading the schema proposals.

Without writing a single line of code.
Without countless chat messages.
Without tortuous video call meetings.

We just nodded at each other, and knew exactly what must be done. We understood each other as if we had somehow developed Professor X's level of telepathic power.

2. Better API design

SDD forces developers to think about designing the contract in abstract schema language. This keeps implementation details out of the conversation. If two teams are involved, they can see whether they can fulfil their part of the contract and suggest changes early. This usually results in a well-thought-out API design for both sides of the system.

If a side cannot fulfil its part of the schema after the implementation phase has started, it can let the other side know and propose different schema changes. This saves a lot of time and money because errors caught in the design process or early in the implementation phase are much cheaper to fix.

3. Independent client and server development

Each side of a schema should know exactly what the other side can provide. This knowledge allows development to happen at the same time because we can mock out each side’s payload. In an SDD web application, the frontend team knows an endpoint’s response from the schema and can therefore mock that response to code their part without waiting for the backend team to do any work.
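Here is a rough sketch of what that mocking could look like in TypeScript. The `FarmApi` interface, the `mockFarmApi` object and the `renderFarmTitle` function are hypothetical names invented for this example, and the call is kept synchronous for brevity (a real client would almost certainly be asynchronous). The idea is that frontend code depends only on the agreed shape, so a mock can stand in until the real backend exists.

```typescript
// Shapes the frontend reads off the agreed schema (a hypothetical Farm that houses Bulls).
interface Bull {
  id: string;
  name: string;
}

interface Farm {
  id: string;
  name: string;
  bulls: Bull[];
}

// Hypothetical client-side view of the contract.
interface FarmApi {
  getFarm(id: string): Farm | null;
}

// Mock used while the real backend is still being built.
const mockFarmApi: FarmApi = {
  getFarm: (id) =>
    id === "farm-1"
      ? {
          id: "farm-1",
          name: "Sunny Acres",
          bulls: [
            { id: "bull-1", name: "Ferdinand" },
            { id: "bull-2", name: "Angus" },
          ],
        }
      : null,
};

// Frontend code depends only on the interface, so the real client can be swapped in later.
function renderFarmTitle(api: FarmApi, id: string): string {
  const farm = api.getFarm(id);
  return farm === null ? "Farm not found" : `${farm.name} (${farm.bulls.length} bulls)`;
}
```

When the backend is ready, only the object passed as `api` needs to change; `renderFarmTitle` stays the same.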

4. Clear entity relationships

In software development, we usually need to identify various entities and how they interact with each other. A schema is a great way to represent these relationships.

For example, a Farm entity may house many Bull entities and one Bull entity can only stay in one Farm entity at a time. This relationship can be written in the following made-up Bull Schema Language:

// Bull Schema Language ( BS Language )
entity Bull {
  id: String
  name: String
}

entity Farm {
  id: String
  name: String
  bulls: Bull[] // An array of Bulls
}

5. Type-safety

Type-safety is important when building medium to large applications. A lot of bugs can be caught at compile time by using languages like Go or TypeScript.

A schema has information about the types and interfaces of entities, requests and responses. There is usually tooling in the ecosystem that can generate code which conforms to the types declared in the schema.

When code can be generated from the schema, it cuts down a lot of time and effort in development as developers can focus on the business logic, rather than writing boilerplate-y code to send requests and responses. Some examples of code generators are GraphQL Code Generator and Swagger Codegen.
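As a small illustration of the type-safety benefit, the sketch below uses hand-written TypeScript types standing in for what a generator might emit for the Farm and Bull entities. This is an assumption for illustration: real generator output differs from tool to tool and is usually more verbose.

```typescript
// Stand-ins for generated types (hand-written here; actual output depends on the tool).
interface Bull {
  id: string;
  name: string;
}

interface Farm {
  id: string;
  name: string;
  bulls: Bull[];
}

// Business logic written against the types gets checked by the compiler.
function bullNames(farm: Farm): string[] {
  return farm.bulls.map((bull) => bull.name);
}

const sunnyAcres: Farm = {
  id: "farm-1",
  name: "Sunny Acres",
  bulls: [
    { id: "bull-1", name: "Ferdinand" },
    { id: "bull-2", name: "Angus" },
  ],
};

// @ts-expect-error: `bulls` is required by the type, so the compiler rejects this object.
const incompleteFarm: Farm = { id: "farm-2", name: "Empty Paddock" };
```

The `@ts-expect-error` comment tells the TypeScript compiler that the next line is supposed to fail type checking, which is exactly the kind of mistake that would otherwise surface only at runtime.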

Examples of schema-driven development technologies

In 2021, SDD can be applied end-to-end when building applications. In this section, let’s look at some examples of SDD technologies.

GraphQL

GraphQL was developed by Facebook, initially for its mobile apps, and has since been widely adopted across web apps.

GraphQL's schema language is called Schema Definition Language (SDL). Given a schema, GraphQL clients can declare the fields they want to query. This can drastically reduce the amount of data sent over the network because the client usually only needs a subset of an entity's attributes. Learn more about GraphQL schema.

There are numerous implementations of GraphQL clients and servers. You will be spoiled for choice when you are in this space. You can check out a list of outstanding GraphQL related libraries here.

To build clients and servers, you will need consistent tooling and support. The GraphQL open-source community is truly astonishing in this regard. The Guild is the main powerhouse coordinating a myriad of GraphQL projects.

Remember how I was telling you code can be generated from the schema? @dotansimha takes home the cake for creating a highly versatile plugin-based GraphQL Code Generator that works for various clients and servers.

There is also gqlgen, a Go library and code generator for building GraphQL servers. If you want the performance of Go in a fully featured GraphQL server, with type-safety and heaps of other functionality, gqlgen is the one you’d want. 🤩

Here's an example in GraphQL schema:

type Query {
  bull(id: ID!): Bull
  farm(id: ID!): Farm
}

type Bull {
  id: ID!
  name: String!
}

type Farm {
  id: ID!
  name: String!
  bulls: [Bull!]!
}

Common use case	Browser/Mobile to Server/s communication
Clients	Relay, Apollo Client, urql, and more
Servers	Apollo Server, gqlgen, Nexus, and more
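To illustrate how a client declares only the fields it wants, here is a hedged TypeScript sketch of a query against the Farm and Bull schema above. The operation name `FarmNames`, the `FarmNamesData` type and the `buildGraphQLRequestBody` helper are assumptions made up for this example; in practice a client library such as the ones listed above would build and send the request for you.

```typescript
// A query document that asks only for names, not ids.
const FARM_NAMES_QUERY = `
  query FarmNames($id: ID!) {
    farm(id: $id) {
      name
      bulls {
        name
      }
    }
  }
`;

// Shape of the returned `data` for this query: it mirrors the selection, not the full schema types.
// `farm` is nullable because the schema declares `farm(id: ID!): Farm` without a trailing `!`.
interface FarmNamesData {
  farm: { name: string; bulls: { name: string }[] } | null;
}

// Hypothetical helper: the usual JSON body for a GraphQL request sent over HTTP POST.
function buildGraphQLRequestBody(query: string, variables: Record<string, unknown>): string {
  return JSON.stringify({ query, variables });
}

const requestBody = buildGraphQLRequestBody(FARM_NAMES_QUERY, { id: "farm-1" });

// What a matching response payload could look like (sample data, not real output).
const sampleData: FarmNamesData = {
  farm: { name: "Sunny Acres", bulls: [{ name: "Ferdinand" }] },
};
```

Because `id` is not part of the selection, it is not part of `FarmNamesData` either, which is how the type system keeps the client honest about what it actually asked for.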

A quick note about GraphQL and SDD

If you are coming from the wonderful world of GraphQL, you might hear the terms schema-first development, Schema Definition Language first ( SDL-first ) and code-first being thrown around a lot. I will use this section to clarify what each term means.

Schema-first development is the process of building software where schema-design is prioritised. This is the same as schema-driven development.
SDL-first is an implementation approach where code is often generated from the schema.
Code-first is an implementation approach where resolvers are created first and the schema is generated from the code.

Both SDL-first and code-first approaches in the context of GraphQL can be schema-driven development if the schema design is the number one focus. It might seem like you are doing SDD by using the SDL-first approach because both processes start from making changes to the schema. However, it is not SDD if the team providing the resource ( i.e. the Backend team in the example in "schema-driven development definition" section ) does not want to discuss the schema changes up front with the team consuming the resource ( i.e. the Frontend team ).
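To give a feel for the code-first direction, here is a deliberately tiny, hand-rolled sketch in TypeScript. The `ObjectTypeDef` shape and the `printSDL` function are hypothetical and exist only for this example; real code-first libraries such as Nexus offer far richer APIs and also wire up resolvers. The point is simply that the code is the source of truth and the SDL is derived from it.

```typescript
// Hypothetical, minimal "code-first" representation of object types.
type FieldMap = Record<string, string>; // field name -> GraphQL type expression

interface ObjectTypeDef {
  typeName: string;
  fields: FieldMap;
}

// The types are declared in code...
const typeDefs: ObjectTypeDef[] = [
  { typeName: "Bull", fields: { id: "ID!", name: "String!" } },
  { typeName: "Farm", fields: { id: "ID!", name: "String!", bulls: "[Bull!]!" } },
];

// ...and the SDL text is generated from them.
function printSDL(defs: ObjectTypeDef[]): string {
  return defs
    .map((def) => {
      const lines = Object.keys(def.fields).map((field) => `  ${field}: ${def.fields[field]}`);
      return `type ${def.typeName} {\n${lines.join("\n")}\n}`;
    })
    .join("\n\n");
}

const generatedSDL = printSDL(typeDefs);
```

In an SDL-first setup the arrow points the other way: the SDL file is written by hand and types or resolver signatures are generated from it. Either direction can be schema-driven as long as the resulting schema is what the teams discuss first.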

For those who are not familiar with GraphQL, you have just stumbled upon something that I'd like to call GraphQL: Civil War. You can do a Google search on SDL-first vs code-first to learn more about the ideology of each approach.

gRPC/Twirp

gRPC and Twirp are Remote Procedure Call frameworks. gRPC is developed by Google and Twirp is developed by Twitch. They both use Protocol Buffers (Protobuf) as the Interface Definition Language. Protobuf is also the serialisation protocol for structured data.

gRPC and Twirp are commonly used in a micro-service architecture. The client is the service making the request and the server is the service returning the response. Services can choose to send data encoded using Protobuf or JSON. Protobuf messages are typically smaller and faster to serialise than JSON, while JSON is useful if you need to debug calls between services.

They both support code generation for clients and servers in many languages: Go, PHP, Ruby, TypeScript, etc. You can see gRPC supported languages here and Twirp supported languages here. This works out great as companies with micro-service architecture may choose different languages for each service.

At 99designs, we have backends in different languages: Go, PHP and Ruby. By using Twirp, we allow teams to code in their preferred language while maintaining a consistent, type-safe and convenient way to send data from one service to another.

Here's an example in Protobuf:

service FarmingApi {
  rpc GetBull(GetBullRequest) returns (Bull);
  rpc GetFarm(GetFarmRequest) returns (Farm);
}

message Bull {
  string id = 1;
  string name = 2;
}

message GetBullRequest {
  string id = 1;
}

message Farm {
  string id = 1;
  string name = 2;
  repeated Bull bulls = 3;
}

message GetFarmRequest {
  string id = 1;
}

Common use case	Server to server communication
Clients	Automatically generated for different languages
Servers	Automatically generated for different languages
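For a sense of what a JSON-encoded call looks like on the wire, here is a hedged TypeScript sketch for the `GetBull` method above. It assumes Twirp's default routing of `POST <base URL>/twirp/<package>.<Service>/<Method>`; since the example `.proto` declares no package, the package segment is left out. The `HttpCall` type, the `buildTwirpJsonCall` helper and the `http://localhost:8080` base URL are made up for illustration. In a real project you would use the generated client instead of building requests by hand.

```typescript
// Mirrors the GetBullRequest message from the Protobuf example.
interface GetBullRequest {
  id: string;
}

// Hypothetical description of an HTTP call, kept free of any HTTP library.
interface HttpCall {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Assumption: default Twirp routing with no Protobuf package declared.
function buildTwirpJsonCall(
  baseUrl: string,
  service: string,
  rpcMethod: string,
  message: object
): HttpCall {
  return {
    method: "POST",
    url: `${baseUrl}/twirp/${service}/${rpcMethod}`,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(message),
  };
}

const getBullRequest: GetBullRequest = { id: "bull-1" };
const getBullCall = buildTwirpJsonCall("http://localhost:8080", "FarmingApi", "GetBull", getBullRequest);
```

Being able to read and replay a body like `{"id":"bull-1"}` with ordinary HTTP tools is what makes the JSON encoding handy for debugging.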

Prisma

Prisma is an Object–relational mapping ( ORM ) library which represents database models through a central schema written in Prisma Schema Language ( PSL ).

The Prisma schema is effortlessly easy to read. It clearly shows model relationships and field types. On top of that, other important information such as database connection is also stored in the schema. Learn more about Prisma schema.

From the schema, a TypeScript Prisma Client can be generated that can be used in Node.js applications - including Next.js! A Go Prisma Client is also in the works.

On the other side of the schema, Prisma has a sophisticated Prisma Migrate feature to keep the database in sync with the schema. What's great about this is that it works for multiple types of database: PostgreSQL, MySQL and SQLite with support coming for other types!

Personally, I normally don't care for traditional ORMs. I think they are a type of abstraction that requires a lot of work to maintain while hardly making the relationships of entities clearer to developers. I'm a lazy developer so anything that's remotely hard is a turn off.

Prisma is the only ORM that hits the sweet spot for me. It is easy to see the relationships of entities without connecting to the database and clicking through 200 tables to build up a mental map of how all the pieces connect. The TypeScript Prisma Client is generated with strong type-safety. It's almost like I have two other senior developers watching over me. And when I'm about to do something silly they would be like: "Tsk tsk don't do that, do this!". 😍

Here's an example in Prisma schema:

datasource db {
  provider = "mysql"
  url      = env("PRISMA_DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}

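model Bull {
  id     String @id @default(uuid())
  name   String
  // The side holding the foreign key declares which field points at which column.
  farm   Farm   @relation(fields: [farmId], references: [id])
  farmId String
}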

model Farm {
  id    String @id @default(uuid())
  name  String
  bulls Bull[]
}

Common use case	Server to database communication
Clients	TypeScript Prisma Client, Go Prisma Client
Supported databases	PostgreSQL, MySQL and SQLite ( as of July 2021 )

🙌 Summary

Schema-driven development is becoming more relevant in 2021 with many benefits: better cross-team communications, better API design, independent client and server development, clear entity relationships and type-safety. You can easily build a full-stack application with SDD technologies such as GraphQL, Twirp and Prisma.

I hope you had fun reading this article as much as I did researching and writing it. For similar posts about software engineering, follow me on Twitter @eddeee888. 🤙
