Anvil Engineering

Originally published at useanvil.com

Introduction to the Twelve-Factor App

Web applications can come and go, but once you have an application/service that starts gaining traction, your simple
setup on shared hosting will eventually start to show its limits. The app you started has grown. It now uses a few
third-party services and relies on more than a few libraries and packages. Your manual deployment processes have also
grown and are now more prone to human error. What can a developer do to move towards more maintainability and
scalability?

One proven way is the “Twelve-Factor App” methodology. This blog post will cover factors I-IV of the
Twelve-Factor App methodology. Many of these twelve concepts are also building blocks for even more
complex systems like Kubernetes, but let’s not get ahead of ourselves. We can define our starting point as:

An application that:

  • is a web application
  • may already be hosted/deployed on a service like Amazon’s AWS or Google Cloud
  • needs a lot of manual attention, especially when updating production, managing application settings, and dealing with infrastructure changes.

As we step through all the different factors, we should keep in mind their benefits, both short and long-term, and how
each concept leads to more maintainability and scalability. More importantly, though this is not exactly measurable,
this should reduce your stress when dealing with this part of the application development cycle (no guarantees!).

At the end of this series, your app will (to name a few benefits):

  • have a higher level of security when it comes to configurations and credentials used in your app
  • have a more robust system for debugging issues
  • be more resilient if services change
  • be on a path towards more automated processes, such as build and deploy steps.

I. Codebase

One codebase tracked in revision control, many deploys - https://12factor.net/codebase

First up: your codebase. The soul of your application. On the surface, the concepts are pretty simple:

  1. code must be in a repository
  2. only one codebase per app

One codebase tracked in revision control ...

These days, you’re likely already using a version control system, and it will most likely be git.
There are also other version control systems, such as Subversion (svn)
or Mercurial (hg). This post will discuss these concepts as they relate to git.

If you’re not using a code versioning system on your project, you should stop reading this and start using one right
now. It’s that important.

A repository acts as a backup in case of data loss on your system and has a number of other benefits, like
tracking historical code changes and easing collaboration when a team is working on a single codebase. Pick a
version control system (probably git) and use a service like GitHub or GitLab, or even host your own.
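
If you’re starting from scratch, getting a codebase under version control takes only a few commands. Here’s a minimal sketch using git; the remote URL and branch name are placeholders for wherever you host your repo:

# Initialize a repository in the project directory and make a first commit
git init
git add .
git commit -m "Initial commit"

# Add a hosted remote (placeholder URL) and push the main branch
git remote add origin git@github.com:your-org/your-app.git
git push -u origin main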

One codebase ... many deploys

There’s some nuance with these specific words. They are highly subjective and might not apply in all situations, but
here’s one way to look at it.

“One codebase” refers to a single app. For example, if you have a company blog and a customer management system in the
same repository, that violates this rule; each project should get its own repo. Technically, an app’s
codebase can contain references to other codebases’ repositories (
see: git submodules), but the majority of the time you would want
to define those as “dependencies”, which will be covered in section II.

“Many deploys” is where a developer must be mindful of what’s being committed into the repository. “One codebase … many
deploys” can be interpreted as a single source of truth (your git repo) that all environments use. More importantly, all
environments (development, production, test, etc.) should be able to use the codebase as it is in the repo with very
little intervention.

An example of this: you have an API backend. For testing purposes you've been including extra data for debugging each
API response:

# Taken from https://fastapi.tiangolo.com/#example
from typing import Optional

from fastapi import FastAPI

app = FastAPI()

@app.get('/items/{item_id}')
def get_item(item_id: int, q: Optional[str] = None):
    return {
        "id": item_id,
        "q": q,
        "debug": "I need this during development. Please ignore."
    }

With “one code base … many deploys” this would be a problem. When deploying to a production environment, a developer
would need to remove the "debug" item every time production is updated. There are many frameworks that use some sort
of global variable to check if an application is running in production mode, but that will be covered later in section
III.

In this case, we want to ensure that these debug statements don’t appear in production. For now, we can rewrite the above as:

# Taken from https://fastapi.tiangolo.com/#example
from typing import Optional

from fastapi import FastAPI

app = FastAPI()

@app.get('/items/{item_id}')
def get_item(item_id: int, q: Optional[str] = None):
    ret = {
        "id": item_id,
        "q": q,
    }
    # Assuming there's a `DEBUG_MODE` global variable set in our app's config.
    if DEBUG_MODE:
        ret["debug"] = "I need this during development. Please ignore."
    return ret

II. Dependencies

Explicitly declare and isolate dependencies - https://12factor.net/dependencies

As with the last section, modern development processes already have this concept ingrained in their workflows.

Why is declaring dependencies important? Reproducibility. Code deploys should be as identical as possible between
environments. This is extremely important, for example, if someone were to debug production issues on their own local
development setup.

For example, with Python it’s generally understood that pip is used for package management (with
a requirements.txt file) and virtualenv is used to isolate application dependencies
from system dependencies. More recently, this can also be done with newer tools
like Pipenv and Poetry.
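
As a minimal sketch (assuming your packages are already listed in a requirements.txt), isolating and installing dependencies with the built-in venv module (virtualenv works similarly) looks something like this:

# Create an isolated environment for this app
python -m venv .venv

# Activate it so `python` and `pip` point at the isolated environment
source .venv/bin/activate

# Install the app's declared dependencies into the isolated environment
pip install -r requirements.txt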

Also, in the JavaScript world, npm and yarn are both generally understood to be the main package management tools (
with package.json
and yarn.lock), and optionally nvm (or similar) can be used to select a Node runtime.
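
The JavaScript equivalent is similar in spirit. A rough sketch, assuming the project pins its Node version in a .nvmrc file and declares its packages in package.json:

# Select the Node runtime pinned for this project (reads the .nvmrc file)
nvm use

# Install the dependencies declared in package.json (and recorded in the lockfile)
npm install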

Python is especially significant in this section since it’s widely used in modern operating systems under the hood. In
Linux, many system scripts use Python, which is why distributions are pinned to specific Python versions. Similarly, as
of the time of writing, Macs still ship Python 2.7 as the main Python binary (/usr/bin/python) to keep compatibility
with legacy software, even though Python 2.7 has been deprecated since January 1, 2020. If you want to run your app
with a specific version of Python, especially newer versions, isolating your runtime and dependencies is a must.

Another way to isolate dependencies, though more involved and complex, is to use Docker.
See Docker’s official documentation for more information. With
Docker, you can completely isolate your runtime, application code, and dependencies from your host (i.e.,
your computer’s system dependencies) via containers. This also brings another level of reproducibility, as
the Dockerfile defines all the steps necessary to bring up a working version of the app.
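
A minimal sketch of that workflow, assuming your repository contains a Dockerfile and the app listens on port 8000 (the image name here is a placeholder):

# Build an image from the Dockerfile in the current directory
docker build -t my-app .

# Run the app in an isolated container, publishing its port to the host
docker run --rm -p 8000:8000 my-app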

Using these tools to isolate your dependencies gives you full visibility into what your app needs to run and build.

III. Config

Store config in the environment - https://12factor.net/config

Application configs are a critical piece of this twelve-part puzzle. Since configs will definitely differ across
environments, we need a way for our application’s configs to be more dynamic and change depending on how the app is
deployed. Not only will this streamline deployments, but it also has additional benefits, such as securing sensitive
data (e.g. database usernames and passwords), since that data is never committed into the source repo as code. This
also helps satisfy the requirements in section I.

Let’s look at an example:

# Taken from: https://fastapi.tiangolo.com/advanced/async-sql-databases/?h=async+datab#import-and-set-up-sqlalchemy
import databases
from fastapi import FastAPI

# Use in dev
# SQLAlchemy specific code, as with any other app
DATABASE_URL = "sqlite:///./test.db"

# Use in production
# DATABASE_URL = "postgresql://user:password@postgresserver/db"

database = databases.Database(DATABASE_URL)

With the above, you can see that DATABASE_URL is hardcoded for both the development and production cases, and we’ve
exposed our username and password in our code.

If this code were committed and open-sourced (or otherwise somehow made public), that would be a pretty significant
security problem.

What can we do instead? Use environment variables.

Without going too deep into what environment variables (or env vars) are, we can think of them as variables that exist
alongside your app. They are also accessible from your app, making them very useful as dynamic variables that change
depending on your app’s runtime environment.

What are we working towards? Here’s a quote from
the official Twelve-Factor App site:

A litmus test for whether an app has all config correctly factored out of the code is whether the codebase could be made open source at any moment, without compromising any credentials.

We definitely don’t want the code above publicly available as it is, so let’s modify it with
Python’s os.environ mapping. First, we need to define our
database env var:

# In our server’s shell, set the `DATABASE_URL` variable. 
# `export` allows programs outside of this current shell session to use `DATABASE_URL`.
root@28b24b724f54:/# export DATABASE_URL="postgresql://mydbuser:mysecretpass@thePGserver/db"

# Let’s also set our environment type
root@28b24b724f54:/# export ENV=development

# Let’s see what Python would give us with `os.environ`.
root@28b24b724f54:/# python
Python 3.6.14 (default, Jul 22 2021, 16:21:31) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ
environ({
 'DATABASE_URL': 'postgresql://mydbuser:mysecretpass@thePGserver/db',
 'HOSTNAME': '28b24b724f54', 'PYTHON_VERSION': '3.6.14', 'ENV': 'development',
 'PWD': '/', 'HOME': '/root', 'LANG': 'C.UTF-8', 'GPG_KEY': '0D96DF4D4110E5C43FBFB17F2D347EA6AA65421D',
 'TERM': 'xterm', 'SHLVL': '1', 'PYTHON_PIP_VERSION': '21.2.4',
 'PYTHON_GET_PIP_SHA256': 'fa6f3fb93cce234cd4e8dd2beb54a51ab9c247653b52855a48dd44e6b21ff28b',
 'PYTHON_GET_PIP_URL': 'https://github.com/pypa/get-pip/raw/c20b0cfd643cd4a19246ccf204e2997af70f6b21/public/get-pip.py',
 'PATH': '/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', '_': '/usr/local/bin/python'
})

Looks good. You can see that DATABASE_URL and ENV are now part of our env vars, along with some other predefined
variables that your operating system and/or installed programs have configured.

Let’s really update the code now:

# Taken from: https://fastapi.tiangolo.com/advanced/async-sql-databases/?h=async+datab#import-and-set-up-sqlalchemy

# This is a new import needed to use env vars
import os
import databases
from fastapi import FastAPI

# Now, we only need to define DATABASE_URL once, and it no longer contains
# sensitive information.
DATABASE_URL = os.environ.get('DATABASE_URL')

# We should also check if our environment wasn't set up properly and raise if
# `DATABASE_URL` wasn't set
if not DATABASE_URL:
    raise AssertionError('DATABASE_URL not configured or detected. '
                         'Please check your environment variables.')

database = databases.Database(DATABASE_URL)

# We can also do specific things if our app is running in development mode with this variable
IS_DEVELOPMENT = os.environ.get('ENV') == 'development'
if IS_DEVELOPMENT:
    print('My app is running in development mode')
    # Do other things... like set up local logging, connect to services only available on your local environment.

IV. Backing services

Treat backing services as attached resources - https://12factor.net/backing-services

Taken from the official Twelve-Factor App page:

A backing service is any service the app consumes over the network as part of its normal operation.
Examples include datastores (such as MySQL or CouchDB), messaging/queueing systems (such as RabbitMQ or Beanstalkd),
SMTP services for outbound email (such as Postfix), and caching systems (such as Memcached).

With the legwork done in the previous sections, this is now much easier to manage. Our DATABASE_URL is dynamic and can
be changed whenever we need. On the extreme side, we could even swap out our database system entirely, provided that
our code's dependencies can handle its query language, schema differences, etc. That last bit may be excessive, as it
requires a significant amount of development effort for something your app may never need. Thankfully, there are
frameworks out there that give developers the ability to swap out some resources for free.

Django is one such framework; it provides easily swappable email functionality, for
example. On the email backends page, we can see
that backends can be switched just by changing the EMAIL_BACKEND setting.
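
As an illustration, a project’s settings.py could pick the backend based on an environment variable, so development prints emails to the console while production sends real mail over SMTP. This is only a sketch: the ENV variable follows the convention from section III, the settings names are Django’s standard ones, and reading them from env vars this way is our own choice:

# settings.py (sketch)
import os

# Print emails to the console in development; send via SMTP everywhere else.
if os.environ.get('ENV') == 'development':
    EMAIL_BACKEND = 'django.core.mail.backends.console.EmailBackend'
else:
    EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
    EMAIL_HOST = os.environ.get('EMAIL_HOST')
    EMAIL_HOST_USER = os.environ.get('EMAIL_HOST_USER')
    EMAIL_HOST_PASSWORD = os.environ.get('EMAIL_HOST_PASSWORD')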

The loose coupling of services keeps the application from being tied down to a specific service; however, this
comes at a cost. In an ideal scenario, a developer would (hopefully) not have to plan to use MySQL, PostgreSQL, and
Oracle at the same time. They would have a single database system with its credentials in an env var that changes
depending on the current environment. At the other extreme, it's unreasonable to expect a project to be able to handle
many external services just to conform to this "rule".

To be continued!

This post covered sections I-IV of the Twelve-Factor App methodology. Hopefully, this has given you more insight into
some steps you can take to make an app more flexible and scalable. In a future blog post, we will cover the next few
concepts. Here at Anvil, we follow many of these concepts in our development process, and we believe that sharing our
experiences helps everyone create awesome products.

If you’re developing something cool with PDFs or paperwork automation, let us know
at developers@useanvil.com. We’d love to hear from you.
