Mike Brooks
Drupal File Persistence in Azure Kubernetes Service (AKS)

Introduction

As we know, websites are a composite of structured and unstructured data. Text on the webpage may be hard-coded or dynamically aggregated from a database into an HTML file. Image files such as a jpg in-line with page text are independently requested from a file storage location.

Content management system (CMS) frameworks such as Drupal and WordPress expect to read and write unstructured data files to a directory path, typically on the web server. In the case of Drupal, this file storage location is the /sites/default/files/ directory. For WordPress, it is the wp-content/uploads/ directory.

Containerized applications that read and write unstructured data require an external file store resource that persists the files outside the container. This is because containers, by their nature, are ephemeral. Should the container restart or be updated with a new version, then "zap" ⚡ state written to the running container is gone. The container solution for file persistence is to mount storage to the machine (VM or bare metal) that hosts the container.

In the case of Kubernetes, our containers run in so-called "pods" which can horizontally scale across a cluster of VM nodes. As such, a "read write many" solution is necessary so that all pods have access to a common file share not pinned to a single node in the cluster.

In this article we cover a technique to achieve such file persistence in the context of Drupal running in Microsoft's managed service for Kubernetes, Azure Kubernetes Service (or AKS for short).

While our example uses Drupal, the principles are common to other CMS frameworks based on PHP, such as WordPress and Joomla. With some refactoring, the techniques used in this solution can be applied to other CMS frameworks.

Drupal file persistence use cases

This tutorial considers three Drupal use cases for file persistence:

  • Drupal file system for a single-site Drupal instance
  • Drupal watchdog log, as well as logs for PHP errors and Apache access
  • Drupal administration files such as a hash_salt for one-time login links, cancel links, form tokens, etc.

Prerequisites

Despite the rudiments covered in the introduction above, this article assumes experience with Drupal, Docker containers, Kubernetes and Azure.

This article does not describe all of the prerequisites to run Drupal on AKS. For a more comprehensive view, visit this GitHub repository to which I contributed:

The method described here uses Azure Files, a managed file storage solution from Microsoft. A static (pre-created) Azure Storage account and File shares are necessary. You can create these from the Azure Portal by following the steps in Quickstart: Create and manage Azure file shares with the Azure portal. As in the Quickstart, for non-production scenarios I recommend the StorageV2 (general-purpose v2) account kind with locally-redundant storage (LRS) replication. In this tutorial we'll create three File shares corresponding to the aforementioned use cases:

  • files
  • apache2logs
  • configs

In Azure Storage Explorer, the resulting Azure Storage account and File shares should look like this:

Azure Storage File share screen capture

Alternatively, you can Create an Azure file share using the Azure CLI.
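
As a rough sketch of the CLI route, the script below creates a StorageV2 account with LRS replication and the three File shares. The resource group, region, account name and quota are hypothetical placeholders, and the script assumes you are already logged in with az login. It defaults to a dry run that only prints the commands, so nothing is created until you set DRY_RUN=0.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names: replace with your own resource group, region and account.
RESOURCE_GROUP="${RESOURCE_GROUP:-my-drupal-rg}"
LOCATION="${LOCATION:-eastus}"
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-mydrupalstorage}"
SHARE_QUOTA_GIB="${SHARE_QUOTA_GIB:-5}"

# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# General-purpose v2 account with locally-redundant storage.
run az storage account create \
  --name "$STORAGE_ACCOUNT" \
  --resource-group "$RESOURCE_GROUP" \
  --location "$LOCATION" \
  --kind StorageV2 \
  --sku Standard_LRS

# One File share per use case.
for share in files apache2logs configs; do
  run az storage share-rm create \
    --resource-group "$RESOURCE_GROUP" \
    --storage-account "$STORAGE_ACCOUNT" \
    --name "$share" \
    --quota "$SHARE_QUOTA_GIB"
done
```

The 5 GiB quota simply mirrors the capacity requested later in the PersistentVolume example; adjust it to your needs.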

Configuration steps

Step 1 - Dockerfile commands

In the Dockerfile you use to build your container image, include commands to define the directory paths which we will later assign to volumeMounts in our Kubernetes deployment manifest. Let's consider the Dockerfile commands for each File share:

files

RUN mkdir -p /var/www/html/docroot/sites/default/files

This command assumes that our website DocumentRoot is set to /var/www/html/docroot. This can be done in an apache2.conf file that has the line:
DocumentRoot /var/www/html/docroot

apache2logs

In our example Dockerfile we use this base image:
FROM php:7.3-apache-stretch

We are going to use the default location for Apache logs /var/log/apache2/, and also use this directory for PHP and Drupal logs.

For the PHP log we include the following command:

RUN { \
  echo 'error_log=/var/log/apache2/php-error.log'; \
  echo 'log_errors=On'; \
  echo 'display_errors=Off'; \
  } >> /usr/local/etc/php/php.ini

For the Drupal watchdog log we include the following command:

RUN echo "local0.* /var/log/apache2/drupal.log" >> /etc/rsyslog.conf

This command assumes a local installation of Rsyslog in our container.

configs

RUN mkdir -p /var/www/html/config

Step 2 - PersistentVolume and PersistentVolumeClaim resources

In this step we compose a Kubernetes manifest with instructions to define PersistentVolume and PersistentVolumeClaim resources in our AKS cluster.

Before we do that, however, we need to create a Kubernetes secret to hold the credentials to access our Azure Storage account. The secret manifest YAML is as follows:

apiVersion: v1
kind: Secret
metadata:
  name: sa-secrets
type: Opaque
data:
  azurestorageaccountkey: <BASE_64_ENCODED_STORAGE_ACCOUNT_KEY>
  azurestorageaccountname: <BASE_64_ENCODED_STORAGE_ACCOUNT_NAME>

This Secret resource will be referenced by our PersistentVolume resource.

You can find the Azure Storage credentials in the Access keys setting in Azure portal, for example:

Storage account screen capture of Access keys

To output base64 encoded strings for use in your secrets manifest, use the command:
echo -n "<string to encode>" | base64 -w 0
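
To tie the encoding step to the manifest, here is a minimal sketch that encodes both values and writes the Secret YAML to a file. The account name and key are placeholders, and the output file name sa-secrets.yaml is my own choice. It assumes GNU base64 (the -w 0 flag disables line wrapping), which is what Git Bash and most Linux distributions ship.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder credentials: substitute the values from the Access keys setting.
ACCOUNT_NAME="${ACCOUNT_NAME:-mydrupalstorage}"
ACCOUNT_KEY="${ACCOUNT_KEY:-not-a-real-key}"

# printf '%s' emits no trailing newline, so none ends up in the encoded value.
ENCODED_NAME="$(printf '%s' "$ACCOUNT_NAME" | base64 -w 0)"
ENCODED_KEY="$(printf '%s' "$ACCOUNT_KEY" | base64 -w 0)"

# Write the Secret manifest; the file name is arbitrary.
cat > sa-secrets.yaml <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: sa-secrets
type: Opaque
data:
  azurestorageaccountkey: ${ENCODED_KEY}
  azurestorageaccountname: ${ENCODED_NAME}
EOF

echo "Wrote sa-secrets.yaml; apply it with: kubectl apply -f sa-secrets.yaml"
```

If you would rather not handle the encoding yourself, kubectl create secret generic sa-secrets with two --from-literal arguments should produce an equivalent Secret, since kubectl encodes the literal values for you.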

OK. Now that we've gotten that out of the way, we can turn our attention back to our PersistentVolume and PersistentVolumeClaim resources. The YAML example below illustrates the creation of a PersistentVolume and a PersistentVolumeClaim to be used for our public file system path.

Please note...

  • Our use of the azurefile storageClassName in the PersistentVolume spec. This StorageClass as well as azurefile-premium are Kubernetes resources provided out-of-the-box in AKS.
  • In the mountOptions we set the file share access control. uid=1000 refers to user d8admin added in our Dockerfile and gid=33 refers to the group, which happens to be www-data provided by the container's base image (see the Addendum for a how-to).

apiVersion: v1
kind: PersistentVolume
metadata:
  name: filespv
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile
  capacity:
    storage: 5Gi
  azureFile:
    secretName: sa-secrets
    shareName: files
    readOnly: false
  claimRef:
    namespace: default
    name: filespvc
  mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=1000
  - gid=33
  - mfsymlinks
  - nobrl

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: filespvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile
  resources:
    requests:
      storage: 5Gi

Similarly, PersistentVolume and PersistentVolumeClaim resources can be created for the apache2logs and configs File shares.
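
One way to avoid copy-and-paste for the other two shares is to stamp the manifests out of a small template. The sketch below writes one YAML file per share. The claim names apache2pvc and configpvc match the Deployment used in Step 3; the PersistentVolume names apache2pv and configpv and the output file names are assumptions made by analogy with filespv. The spec is copied from the files example, so adjust the size or mount options if your logs or configs need something different.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Usage: write_pv_manifest <share-name> <resource-prefix> [size]
# Writes <prefix>-pv.yaml containing a PersistentVolume and its claim.
write_pv_manifest() {
  local share="$1" prefix="$2" size="${3:-5Gi}"
  cat > "${prefix}-pv.yaml" <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${prefix}pv
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile
  capacity:
    storage: ${size}
  azureFile:
    secretName: sa-secrets
    shareName: ${share}
    readOnly: false
  claimRef:
    namespace: default
    name: ${prefix}pvc
  mountOptions:
    - dir_mode=0777
    - file_mode=0777
    - uid=1000
    - gid=33
    - mfsymlinks
    - nobrl
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${prefix}pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: azurefile
  resources:
    requests:
      storage: ${size}
EOF
}

write_pv_manifest apache2logs apache2
write_pv_manifest configs config

echo "Apply with: kubectl apply -f apache2-pv.yaml -f config-pv.yaml"
```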

Step 3 - Add volumes & volumeMounts fields to the K8s Deployment

The final step to ensure file persistence is to make our Drupal pod(s) aware of the Azure Files' persistent volume claims, and associate each claim with a path in our container.

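In our Kubernetes deployment YAML manifest we add volumeMounts to each container and volumes to the pod template, that is, the .spec.template.spec.containers[].volumeMounts and .spec.template.spec.volumes fields.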

For example:

    spec:
      containers:
        - image: <registry>/<image-name>:<tag>
          name: drupal
          ports:
          - containerPort: 80
          volumeMounts:
          - mountPath: /var/log/apache2
            name: apache2-vol
          - mountPath: /var/www/html/docroot/sites/default/files
            name: files-vol
          - mountPath: /var/www/html/config
            name: config-vol
      volumes:
        - name: config-vol
          persistentVolumeClaim:
            claimName: configpvc
        - name: files-vol
          persistentVolumeClaim:
            claimName: filespvc
        - name: apache2-vol
          persistentVolumeClaim:
            claimName: apache2pvc

A complete deployment example can be found in the aforementioned GitHub repo.
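
Once the Deployment is applied, a quick smoke test can confirm that files really do outlive the pods. The sketch below assumes the Deployment is named drupal (only the container name is given above, so treat that as hypothetical) and uses a marker file name of my own invention. Like any script that touches a live cluster, it defaults to a dry run that prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
set -euo pipefail

DEPLOYMENT="${DEPLOYMENT:-drupal}"   # hypothetical Deployment name
FILES_DIR="/var/www/html/docroot/sites/default/files"
MARKER="persistence-check.txt"       # arbitrary marker file name

# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# The claims should report STATUS "Bound" before the pods can mount them.
run kubectl get pvc filespvc apache2pvc configpvc

# Write a marker file through one pod of the Deployment...
run kubectl exec "deployment/${DEPLOYMENT}" -- touch "${FILES_DIR}/${MARKER}"

# ...replace every pod and wait for the rollout to finish...
run kubectl rollout restart "deployment/${DEPLOYMENT}"
run kubectl rollout status "deployment/${DEPLOYMENT}"

# ...then confirm from a fresh pod that the marker survived.
run kubectl exec "deployment/${DEPLOYMENT}" -- ls -l "${FILES_DIR}/${MARKER}"
```

If the last command lists the marker file after the restart, the share is mounted and persisting as intended. The marker should also be visible in the files share in Azure Storage Explorer.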

In Closing

In this article we have covered the basic configurations necessary to enable file persistence for a Drupal deployment in Azure Kubernetes Service.

Drupal containerization and Kubernetes orchestration are a craft and there are many opinions on how to do it. The technique described here is one of many options for file persistence. I encourage you to explore the storage options to determine what is best for your use case, whether it is Azure Files, an NFS server, GlusterFS, or even a storage abstraction layer such as Rook or Portworx.

Please feel free to share your questions and comments in the discussion below. Thanks - Mike 😃

👏 Acknowledgements

I would like to thank Gousiya Sayyad for her technical know-how and review of this post.

Addendum

To get the uid and gid values for my mountOptions properties, I bash into my container running locally, for example:

SNP+mike@MIKE-T570 MINGW64 ~
$ winpty docker exec -it drupal8aks bash
root@6cc24a823fb3:/var/www/html# id -u d8admin
1000
root@6cc24a823fb3:/var/www/html# id -g www-data
33
