This article will walk through how we correctly persist static & media files for a Django application hosted on Heroku. As a bonus, it will also explain how we can satisfy the additional constraint of specifying private versus public media files based on model definitions.
Before I begin, note that this post extends a TestDriven.io article that was written a while back. I refer to it often when setting up my projects, and have built some extra functionality on top of it over the years. I decided to create a more focused post that references Heroku & Bucketeer with these extra features after helping an individual on StackOverflow.
So without further ado, let's first dive into what static & media files are and how Heroku dynos manage their filesystem.
What are Media & Static Files
If you are working with a Django project, then you inevitably have all of your Python application code written around a bunch of .py files. These are the code paths of your application, and the end-user - hopefully - never actually sees these files or their contents.
Outside of these business-logic files, it is common to serve files to users directly from your server's file system. For these static files, Django doesn't need to run any code; the framework looks up the file and returns its contents for the requesting user to view.
Some examples of static files include:
- Non-templated HTML
- CSS & JavaScript files to make your page look nice
- User profile pictures
- Generated PDFs
Media files in Django are a particular variant of static files. Media files are read from the server's file system as well. Unlike static files, though, they are usually uploaded by users or generated by your application and are associated with a model's FileField or ImageField. In the examples above, user profile pictures and generated PDFs are typical examples of media files.
Django with Media & Static Files
When a new media file is uploaded to a Django web application, the framework looks at the DEFAULT_FILE_STORAGE settings configuration to determine how to store that file. By default, it uses the django.core.files.storage.FileSystemStorage class, which is what most projects start with. This implementation looks at the MEDIA_ROOT configuration that is defined in the settings.py file and copies the uploaded file contents to a deterministically-created file path under that given MEDIA_ROOT.
For example, if the MEDIA_ROOT is set as /var/www/media, all uploaded files will be copied and written to a location under /var/www/media/.
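As a rough illustration of that path construction, the sketch below joins a MEDIA_ROOT with a field's upload_to prefix and the uploaded filename using only the standard library. The helper name resolve_media_path and the sample values are hypothetical; Django's real FileSystemStorage also sanitizes names and avoids collisions, which this sketch leaves out.

```python
from pathlib import PurePosixPath


def resolve_media_path(media_root: str, upload_to: str, filename: str) -> str:
    """Return the on-disk path an upload would land at (simplified)."""
    # Keep only the final path component so a crafted filename such as
    # "../../etc/passwd" cannot escape the media root.
    safe_name = PurePosixPath(filename).name
    return str(PurePosixPath(media_root) / upload_to / safe_name)


print(resolve_media_path("/var/www/media", "avatars", "me.png"))
```

The point is only that the final location is always somewhere beneath MEDIA_ROOT on the server's own disk, which is exactly what becomes a problem on Heroku.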
Heroku with Media & Static Files
Storing these static files on your server's disk file system is okay until you start to work with a containerization platform such as Heroku. To explain why this is the case, it helps to take a step back.
When downloading files on your personal computer, it's okay that these get written to the file system - usually under ~/Downloads or somewhere similar. This is because you expect your computer's file system to persist across restarts and shutdowns; if you download a file and restart your computer, that downloaded file should still be there once the computer has finished restarting.
Heroku uses containerization to execute customer workloads. One fact of this environment is that the associated file systems do not persist across restarts and reschedules. Heroku dynos are ephemeral, and they can be destroyed, restarted, and moved without any warning, which replaces the associated filesystem. This means that any uploaded files referenced by FileFields and ImageFields are deleted without a trace every time the dyno is restarted, moved, or scaled.
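To make the failure mode concrete, here is a small analogy using only the standard library. A temporary directory stands in for the dyno's disk; this is a simulation under that assumption, not how Heroku itself is implemented, and the function name is made up for the example.

```python
import tempfile
from pathlib import Path


def simulate_dyno_restart() -> bool:
    """Write an 'upload' to scratch space, tear it down, report if it survived."""
    # The temporary directory plays the role of the dyno's local disk.
    with tempfile.TemporaryDirectory() as dyno_disk:
        upload = Path(dyno_disk) / "media" / "profile.png"
        upload.parent.mkdir(parents=True)
        upload.write_bytes(b"fake image bytes")
        assert upload.exists()  # visible while this "dyno" is alive
    # Leaving the block discards the directory, like a restart replaces the disk.
    return upload.exists()


print(simulate_dyno_restart())
```

The database row that points at the file would survive the restart, but the file it references would not.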
Complete Example Codebase
I will be stepping through the process of configuring the Django application for Heroku & S3-compatible storage, but feel free to reference the repository below for the complete code to browse through.
dstarner / django-heroku-static-file-example
Used in my blog post of detailing private & public static files for a Heroku-served Django application
Note: This does include a $5.00 / month Bucketeer add-on as a part of the one-click deployment.
Bootstrapping Django on Heroku
This tutorial aims to help you retrofit an existing Django project with S3-compatible storage, but I'll quickly go through the steps I used to set up the example Django application. It may help those new to Django & Heroku or those who encounter bugs following the rest of the setup process.
You can view the tagged project before the storage change at commit 299bbe2.
- Bootstrapped a Django project example
  - Uses poetry for dependency management
  - All of the Django code is under the example package, and the manage.py file is in the root. I've always found this structure cleaner than the Django apps defined in the project root.
- Configured the project for Heroku
  - django-heroku package to automatically configure ALLOWED_HOSTS, DATABASE_URL, and more. This reduces the headache of deploying Django on Heroku considerably
  - A Procfile that runs a gunicorn process for managing the WSGI application
  - An app.json is defined with some fundamental configuration values and resources defined for the project to work
  - A release process definition in the Procfile and an associated scripts/release.sh script that runs staticfile collection and database migrations
Introducing Heroku's Bucketeer Add-On
Before we can start managing static and media files, the Django application needs a persistent place to store the files. Again, we can look to Heroku's extensive list of Add-Ons for S3-compatible storage. Our choice will be one called Bucketeer.
Heroku's Bucketeer add-on provides an AWS S3 storage bucket to upload and download files for our application. The Django application will use this configured bucket to store files uploaded through the server and will serve them from S3 when a user requests the files.
If you'd like to learn more about AWS S3, the widely-popular data storage solution that Bucketeer is built upon, you can read the S3 user documentation.
It is worth mentioning that the base plan for Bucketeer - Hobbyist - is $5 per month. If you plan on spinning up the one-click example posted above, it should only cost a few cents if you proactively destroy the application when you are done using it.
Including the Bucketeer Add-On
To include the Bucketeer add-on in our application, we can configure it through the Heroku CLI, web dashboard, or via the project's app.json file. We will use the third method of including the add-on in an app.json file.
If the project does not have one already, we can create the basic structure listed below, with the critical part being the addition of the "addons" configuration. This array defines the "bucketeer:hobbyist" resource that our application will use, and Heroku will install the add-on into our application if it does not already exist. We also include the "as" keyword, which will prefix the associated configuration variables with the term BUCKETEER. This prefix is helpful to keep the generated configuration value names deterministic because, by default, Heroku will generate the prefix as a random color.
{
  // ... rest above
  "addons": [
    // ...other addons...
    {
      "plan": "bucketeer:hobbyist",
      "as": "BUCKETEER"
    }
  ]
}
With the required resources being defined, we can start integrating with our storage add-on.
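Once the add-on is attached, the four config vars that the settings later in this post read should be present in the dyno's environment. Here is a small sanity check, assuming those variable names and the BUCKETEER prefix chosen above; the helper missing_bucketeer_vars is made up for this sketch and is not part of Heroku's or Bucketeer's tooling.

```python
import os

# Names read by the settings module later in this post.
REQUIRED_SUFFIXES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "BUCKET_NAME",
    "AWS_REGION",
)


def missing_bucketeer_vars(env=None, prefix="BUCKETEER"):
    """Return the expected config var names that are absent or empty."""
    env = os.environ if env is None else env
    expected = [f"{prefix}_{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected if not env.get(name)]


if __name__ == "__main__":
    missing = missing_bucketeer_vars()
    print("Missing:", ", ".join(missing) if missing else "none")
```

Running something like this in a one-off dyno can catch a forgotten "as" prefix before Django fails on a missing setting.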
Implementing Our Storage Solution
The django-storages package is a collection of custom, reusable storage backends for Django. It aids immensely in saving static and media files to different cloud & storage provider options. One of the supported storage providers is S3, which our Bucketeer add-on is built on. We will leverage the S3 django-storages backend to handle different file types.
Installing django-storages
Begin by installing the django-storages package and the related boto3 package used to interface with AWS's S3. We will also lock our dependencies to ensure poetry and our Heroku deployment continue to work as expected.
poetry add django-storages boto3 && poetry lock
Then, just like most Django-related packages, django-storages will need to be added to the project's INSTALLED_APPS in the project's settings.py file. This will allow Django to load the appropriate code flows as the application starts up.
# example/config/settings.py
INSTALLED_APPS = [
    # ... django.X.Y apps above
    'storages',
    # ... custom project apps below
]
Implementing Static, Public & Private Storage Backends
We will return to the settings.py file later to configure the usage of django-storages, but before that can be done, we will implement three custom storage backends:
- A storage backend for static files - CSS, JavaScript, and publicly accessible images - that will be stored in version control - aka git - and shipped with the application
- A public storage backend for dynamic media files that are not stored in version control, such as uploaded files and attachments
- A private storage backend for dynamic media files that are not stored in version control and that require extra access to be viewed, such as per-user reports and potentially profile images. Files managed by this backend require an access key and will block access to those without a valid key.
We can extend from django-storages's S3Boto3Storage storage backend to create these. The following code can be copied and pasted directly into your project. The different settings attributes read in the module will be written shortly, so do not expect this code to work if you import it right now.
# FILE: example/utils/storage_backends.py
from django.conf import settings
from storages.backends.s3boto3 import S3Boto3Storage


class StaticStorage(S3Boto3Storage):
    """Used to manage static files for the web server"""
    location = settings.STATIC_LOCATION
    default_acl = settings.STATIC_DEFAULT_ACL


class PublicMediaStorage(S3Boto3Storage):
    """Used to store & serve dynamic media files with no access expiration"""
    location = settings.PUBLIC_MEDIA_LOCATION
    default_acl = settings.PUBLIC_MEDIA_DEFAULT_ACL
    file_overwrite = False


class PrivateMediaStorage(S3Boto3Storage):
    """
    Used to store & serve dynamic media files using access keys
    and short-lived expirations to ensure more privacy control
    """
    location = settings.PRIVATE_MEDIA_LOCATION
    default_acl = settings.PRIVATE_MEDIA_DEFAULT_ACL
    file_overwrite = False
    custom_domain = False
The attributes listed in each storage backend class perform the following:
- location: This dictates the parent directory used in the S3 bucket for associated files. This is concatenated with the generated path provided by a FileField or ImageField's upload_to method.
- default_acl: This dictates the access policy required for reading the files, through values of None, public-read, and private. django-storages and the S3Boto3Storage parent class will translate these into object policies.
- file_overwrite: In most cases, it's better not to overwrite existing files if we update a specific path. With this set to False, a unique suffix will be appended to the path to prevent naming collisions.
- custom_domain: Disabled here, but you can enable it if you want to use AWS's CloudFront and django-storages to serve from it.
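To picture how location and file_overwrite interact, the following standard-library sketch builds an object key the way the list above describes. It is an approximation under stated assumptions: build_object_key and the sample paths are hypothetical, and the seven-character random suffix is modeled loosely on Django's default naming behavior rather than copied from django-storages.

```python
import posixpath
import secrets
import string


def build_object_key(location, upload_to, filename, existing_keys, file_overwrite=False):
    """Combine the backend location with the field's path into a bucket key."""
    key = posixpath.join(location, upload_to, filename)
    if file_overwrite or key not in existing_keys:
        return key
    # On a collision, add a short random suffix before the extension.
    root, ext = posixpath.splitext(key)
    alphabet = string.ascii_letters + string.digits
    while key in existing_keys:
        suffix = "".join(secrets.choice(alphabet) for _ in range(7))
        key = f"{root}_{suffix}{ext}"
    return key


taken = {"media/public/logos/acme.png"}
print(build_object_key("media/public", "logos", "acme.png", taken))
print(build_object_key("media/public", "logos", "acme.png", taken, file_overwrite=True))
```

The first call returns a suffixed key because the plain one is taken; the second returns the original key and would replace the existing object.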
Configure Settings to Use the Storage Backends
With our storage backends defined, we can configure them to be used in different situations via the settings.py file. However, it is challenging to use S3 and these different cloud storage backends while in development, and I've always been a proponent of keeping all resources and files "local" to the development machine, so we will create a logic path that will:
- By default, use the local filesystem to store static and media files for convenience. The Django server will be responsible for serving these files directly.
- Use the custom S3 storage backends when an environment variable is enabled. We will use the S3_ENABLED variable to control this, enabling it in our Heroku configuration variables.
First, we will assume that you have a relatively vanilla settings.py file concerning the static- & media-related variables. For reference, a new project should have a block that looks similar to the following:
# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/4.0/howto/static-files/
STATIC_URL = 'static/'
STATIC_ROOT = BASE_DIR / 'collected-static'
We will design a slightly advanced control flow that will seamlessly handle the two cases defined above. In addition, it will provide enough control to override each part of the configuration as needed.
Since there are already default values for the static file usage, we can add default values for media file usage. These will be used when serving files locally from the server while in development mode.
STATIC_URL = '/static/'
STATIC_ROOT = BASE_DIR / 'collected-static'
MEDIA_URL = '/media/'
MEDIA_ROOT = BASE_DIR / 'collected-media'
To begin the process of including S3, let's create the controls to manage whether we should serve static & media files from the local server or through the S3 storage backend. We will create three variables:
- S3_ENABLED: controls whether media & static files should use S3 storage by default
- LOCAL_SERVE_MEDIA_FILES: controls whether media files are served from the local server instead of S3. Defaults to the negated S3_ENABLED value
- LOCAL_SERVE_STATIC_FILES: controls whether static files are served from the local server instead of S3. Defaults to the negated S3_ENABLED value
from decouple import config # import explained below
# ...STATIC and MEDIA settings here...
# The following configs determine if files get served from the server or an S3 storage
S3_ENABLED = config('S3_ENABLED', cast=bool, default=False)
LOCAL_SERVE_MEDIA_FILES = config('LOCAL_SERVE_MEDIA_FILES', cast=bool, default=not S3_ENABLED)
LOCAL_SERVE_STATIC_FILES = config('LOCAL_SERVE_STATIC_FILES', cast=bool, default=not S3_ENABLED)
if (not LOCAL_SERVE_MEDIA_FILES or not LOCAL_SERVE_STATIC_FILES) and not S3_ENABLED:
    raise ValueError('S3_ENABLED must be true if either media or static files are not served locally')
In the example above, we are using the python-decouple package to make it easier to read and cast environment variables to Python variables. I highly recommend this package when working with settings.py configurations. We also include a value check to ensure consistency across these three variables. If all three variables are defined in the environment but conflict with one another, the program will throw an error.
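To see which combinations pass that check, here is a standalone sketch of the same logic that does not depend on python-decouple. The as_bool helper is a stand-in for config(..., cast=bool), and the set of accepted truthy and falsy strings is an assumption for this example rather than a statement of the library's exact behavior.

```python
TRUTHY = {"1", "true", "yes", "on"}
FALSY = {"0", "false", "no", "off", ""}


def as_bool(raw):
    """Parse an environment string into a boolean, rejecting unknown values."""
    value = str(raw).strip().lower()
    if value in TRUTHY:
        return True
    if value in FALSY:
        return False
    raise ValueError(f"Not a boolean: {raw!r}")


def resolve_flags(env):
    """Return (s3_enabled, serve_media_locally, serve_static_locally)."""
    s3_enabled = as_bool(env.get("S3_ENABLED", "false"))
    # Both local-serve flags default to the opposite of S3_ENABLED.
    default = str(not s3_enabled)
    media = as_bool(env.get("LOCAL_SERVE_MEDIA_FILES", default))
    static = as_bool(env.get("LOCAL_SERVE_STATIC_FILES", default))
    if (not media or not static) and not s3_enabled:
        raise ValueError(
            "S3_ENABLED must be true if either media or static files are not served locally"
        )
    return s3_enabled, media, static


print(resolve_flags({}))
print(resolve_flags({"S3_ENABLED": "True"}))
print(resolve_flags({"S3_ENABLED": "True", "LOCAL_SERVE_STATIC_FILES": "True"}))
```

The third call shows the mixed mode the design allows: media goes to S3 while static files are still served by the Django process. Setting LOCAL_SERVE_MEDIA_FILES to false without enabling S3 is the case that raises.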
We can now start configuring the different configuration variables required by our file storage backends based on those control variables' value(s). We begin by including some S3 configurations required whether we are serving static, media, or both types of files.
if S3_ENABLED:
    AWS_ACCESS_KEY_ID = config('BUCKETEER_AWS_ACCESS_KEY_ID')
    AWS_SECRET_ACCESS_KEY = config('BUCKETEER_AWS_SECRET_ACCESS_KEY')
    AWS_STORAGE_BUCKET_NAME = config('BUCKETEER_BUCKET_NAME')
    AWS_S3_REGION_NAME = config('BUCKETEER_AWS_REGION')
    AWS_DEFAULT_ACL = None
    AWS_S3_SIGNATURE_VERSION = config('S3_SIGNATURE_VERSION', default='s3v4')
    AWS_S3_ENDPOINT_URL = f'https://{AWS_STORAGE_BUCKET_NAME}.s3.amazonaws.com'
    AWS_S3_OBJECT_PARAMETERS = {'CacheControl': 'max-age=86400'}
The above defines some of the variables required by the django-storages S3 backend and sets the values to environment configurations that are provided by the Bucketeer add-on. As previously mentioned, all of the add-on environment variables are prefixed with BUCKETEER_. The S3_SIGNATURE_VERSION environment variable is not required and most likely does not need to be included.
With the S3 configuration in place, we can reference the LOCAL_SERVE_MEDIA_FILES and LOCAL_SERVE_STATIC_FILES control variables to override the default static and media file settings when those files should be served via S3.
if not LOCAL_SERVE_STATIC_FILES:
    STATIC_DEFAULT_ACL = 'public-read'
    STATIC_LOCATION = 'static'
    STATIC_URL = f'{AWS_S3_ENDPOINT_URL}/{STATIC_LOCATION}/'
    STATICFILES_STORAGE = 'example.utils.storage_backends.StaticStorage'
Notice the last line where STATICFILES_STORAGE is set to the custom backend we created. That ensures it follows the location & ACL (Access Control List) policies that we configured initially. With this configuration, all static files will be placed under /static/ in the bucket, but feel free to update STATIC_LOCATION if desired.
We can configure a very similar situation for media files.
if not LOCAL_SERVE_MEDIA_FILES:
    PUBLIC_MEDIA_DEFAULT_ACL = 'public-read'
    PUBLIC_MEDIA_LOCATION = 'media/public'
    MEDIA_URL = f'{AWS_S3_ENDPOINT_URL}/{PUBLIC_MEDIA_LOCATION}/'
    DEFAULT_FILE_STORAGE = 'example.utils.storage_backends.PublicMediaStorage'
    PRIVATE_MEDIA_DEFAULT_ACL = 'private'
    PRIVATE_MEDIA_LOCATION = 'media/private'
    PRIVATE_FILE_STORAGE = 'example.utils.storage_backends.PrivateMediaStorage'
The big difference here is that we have configured two different storage backends for media files: one for publicly accessible objects and one for objects that require an access token. When the file is requested, this token will be generated internally by django-storages, so you do not have to worry about anonymous public access.
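If the idea of a short-lived access token is unfamiliar, the sketch below shows the general shape of an expiring signed URL using only the standard library. This is a deliberately simplified HMAC scheme for illustration; it is not AWS Signature Version 4 and not what django-storages or boto3 actually execute. The secret, the host bucket.example.com, and both function names are hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical; stands in for the bucket credentials


def _signature(path, expires):
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path, ttl_seconds=3600, now=None):
    """Return a URL that is only valid until its expiry timestamp."""
    expires = int(time.time() if now is None else now) + ttl_seconds
    query = urlencode({"Expires": expires, "Signature": _signature(path, expires)})
    return f"https://bucket.example.com{path}?{query}"


def is_valid(url, now=None):
    """Check that the signature matches the path and has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["Expires"][0])
        signature = params["Signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    expected = _signature(parsed.path, expires)
    return current < expires and hmac.compare_digest(signature, expected)


url = sign_url("/media/private/report.pdf", ttl_seconds=60, now=1_000)
print(is_valid(url, now=1_030))
print(is_valid(url, now=1_100))
```

The link works inside its 60-second window and is rejected afterwards, and changing the path invalidates the signature. That is the property the private backend relies on: only someone who was handed a fresh link by the application can read the object.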
Local Development Serving
Since we will have S3_ENABLED set to False in our local development environment, it will serve static and media files locally through the Django server instead of from S3. We will need to configure the URL routing to handle this scenario. We can configure our urls.py file to serve the appropriate files like so:
from django.conf import settings
from django.conf.urls.static import static
from django.contrib import admin
from django.urls import path

urlpatterns = [
    path('admin/', admin.site.urls),
]

if settings.LOCAL_SERVE_STATIC_FILES:
    urlpatterns += static(settings.STATIC_URL, document_root=settings.STATIC_ROOT)

if settings.LOCAL_SERVE_MEDIA_FILES:
    urlpatterns += static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)
This will locally serve the static or media files based on the values of the LOCAL_SERVE_STATIC_FILES and LOCAL_SERVE_MEDIA_FILES settings variables we defined.
Enabling S3 Storage
We can enable these storages and our add-on in the app.json file to start using these storage backends. Because both flags default to the negated S3_ENABLED value, this effectively disables LOCAL_SERVE_STATIC_FILES and LOCAL_SERVE_MEDIA_FILES so that both are served via S3 when deployed to Heroku.
{
  // ...rest of configs...
  "env": {
    // ...rest of envs...
    "S3_ENABLED": {
      "description": "Enable to upload & serve static and media files from S3",
      "value": "True"
    }
  }
}
Using the Private Storage
By default, Django will use the PublicMediaStorage class for uploading media files, meaning the contents will be publicly accessible to anyone with the link. However, a model can utilize the PrivateMediaStorage backend when desired, which will create short-lived access tokens that prevent the public from viewing the associated object.
Below is an example of using public and private media files on the same model.
from django.db import models

from example.utils.storage_backends import PrivateMediaStorage


class Organization(models.Model):
    """A sample Organization model with public and private file field usage"""

    # Uses DEFAULT_FILE_STORAGE, i.e. the public media backend
    logo = models.ImageField(help_text='A publicly accessible company logo')

    expense_report = models.FileField(
        help_text='The private expense report requires a short-lived access token',
        storage=PrivateMediaStorage()  # will create private files
    )
You can see the code for this complete example at commit 265becc. This configuration will allow your Django project on Heroku to scale efficiently with Bucketeer.
In a future post, we will discuss how to upload and set these files using vanilla Django & Django REST Framework.
As always, if you find any bugs, issues, or unclear explanations, please reach out to me so I can improve the tutorial & experience for future readers.
Take care everyone