Rimpal Johal for AWS Community Builders

Posted on Mar 27, 2023

Restore Postgresql Database in a Docker Container

#programming #docker #containers #containerapps

Introduction

In this blog post, we will discuss a common challenge faced by many developers when setting up and configuring a PostgreSQL database in a Docker container. The problem we encountered was the need to create a PostgreSQL database with the PostGIS extension enabled and to import data from an existing SQL backup file. This process involves a series of steps, such as initializing the database, starting the PostgreSQL server, creating the database, enabling the PostGIS extension, and importing the data from the backup file.

To address this challenge, we'll use Docker, a platform that allows us to easily build, ship, and run applications in containers. Containers provide an isolated environment for running applications, which helps streamline the development and deployment process. In our case, we'll create a Docker container with PostgreSQL, a powerful and widely-used open-source relational database management system. Additionally, we'll be using the PostGIS extension, which adds support for geographic objects, allowing location queries to be run in SQL.

In this blog post, we will walk you through the entire process of creating a Docker container with a PostgreSQL database and PostGIS extension enabled. We'll also demonstrate how to import data from an existing SQL backup file into the newly created database. This tutorial will serve as a comprehensive guide for developers who need to set up a PostgreSQL database with the PostGIS extension in a Docker container and import data from a backup file.

Setting up the PostgreSQL Docker container

In this section, we'll guide you through the process of setting up a PostgreSQL Docker container with the PostGIS extension enabled. We'll start by creating a custom Dockerfile, which will define the necessary instructions and configurations for our container.

Here's the Dockerfile we used to create our PostgreSQL container:

# Base image with PostgreSQL and PostGIS
FROM postgis/postgis

# Set up environment variables
ENV PGDATA=/var/pgdata
ENV POSTGRES_USER=postgres
ENV POSTGRES_PASSWORD=mysecretpassword
ENV POSTGRES_DB=mypostgresDB

# Create a directory to store PostgreSQL data and logs
RUN mkdir -p ${PGDATA} /tmp /var/log/postgresql && chown -R postgres:postgres ${PGDATA} /tmp /var/log/postgresql

WORKDIR /data

# Expose the PostgreSQL port
EXPOSE 5432

# Set the user to run the container
USER postgres

# Copy the entrypoint script to the container
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

# Run the entrypoint script
CMD ["/entrypoint.sh"]

In this Dockerfile, we start with the postgis/postgis base image, which already includes PostgreSQL and the PostGIS extension. We then set up environment variables for the PostgreSQL data directory, the default PostgreSQL user, password, and database name. These variables will be used later in our entrypoint script.

We create directories to store PostgreSQL data and logs, ensuring that the postgres user has the necessary permissions to access these directories. The PostgreSQL port (5432) is exposed to allow connections from outside the container.

Finally, we copy our custom entrypoint.sh script into the container and set it as the command to be executed when the container starts. The entrypoint script is responsible for initializing the PostgreSQL data directory, starting the PostgreSQL server, creating the database with the PostGIS extension, and importing data from the SQL backup file.

By using this Dockerfile, we're able to create a customized PostgreSQL container with the PostGIS extension enabled, tailored to our specific requirements. This setup ensures that our database is configured and ready to use, with the necessary data imported, as soon as the container starts running.

Creating the entrypoint.sh script

The entrypoint.sh script is an essential part of our PostgreSQL Docker container setup, as it handles the initialization and configuration of the database. In this section, we'll walk you through the contents of the entrypoint.sh script, explaining each step in detail.

Here's the content of the entrypoint.sh script:

#!/bin/bash

# Initialize the PostgreSQL data directory
initdb -D ${PGDATA}

# Start PostgreSQL in the background
pg_ctl -D ${PGDATA} -l /var/log/postgresql/logfile start

# Wait for PostgreSQL to start
wait_postgresql() {
  while ! pg_isready -q; do
    echo "Waiting for PostgreSQL to start..."
    sleep 1
  done
}
wait_postgresql

# Create the Postgres database named "mypostgresDB" with the "template_postgis" template
createdb mypostgresDB --template=template_postgis

# Extract the schema and data from the backup file
pg_restore -f /tmp/schema_new.sql -s /data/mypostgresDB-backup.sql 2>&1 | tee /var/log/postgresql/schema_extract.log
pg_restore -f /tmp/new_data.sql -a /data/mypostgresDB-backup.sql 2>&1 | tee /var/log/postgresql/data_extract.log

# Import the schema and data into the new database
psql --dbname=mypostgresDB -f /tmp/schema_new.sql 2>&1 | tee /var/log/postgresql/schema_import.log
psql --dbname=mypostgresDB -f /tmp/new_data.sql 2>&1 | tee /var/log/postgresql/data_import.log

# Keep PostgreSQL running
tail -f /var/log/postgresql/logfile

Initialize the PostgreSQL data directory: The initdb command initializes the data directory for the PostgreSQL server, setting up the necessary files and folders.
Start PostgreSQL in the background: The pg_ctl command starts the PostgreSQL server in the background, with logs being written to the specified logfile.
Wait for PostgreSQL to start: The wait_postgresql function checks if the PostgreSQL server is ready to accept connections. It uses the pg_isready command to query the server status and waits until the server is ready.
Create the PostgreSQL database: The createdb command creates a new PostgreSQL database named mypostgresDB using the template_postgis template, which includes the PostGIS extension.
Extract schema and data from the backup file: The pg_restore command is used to extract the schema and data from the SQL backup file. We create two separate files: schema_new.sql for the schema and new_data.sql for the data. The extraction process logs are written to /var/log/postgresql/.
Import schema and data into the new database: The psql command imports the schema and data from the extracted files into the newly created mypostgresDB database. The import process logs are written to /var/log/postgresql/.
Keep PostgreSQL running: The tail -f command ensures that the PostgreSQL server keeps running by continuously monitoring the logfile.

By using this entrypoint.sh script, we ensure that our PostgreSQL container is fully configured and ready to use when it starts, with the necessary data imported from the backup file.

Running the Docker container and mounting the backup file

In this section, we'll cover the steps to run the Docker container and mount the backup file during the process. This allows the entrypoint.sh script to access the backup file and perform the required operations to restore the database schema and data.

First, build the Docker image using the Dockerfile created earlier. Navigate to the directory containing the Dockerfile and run the following command:

docker build -t postgres-postgis .

This command builds a new Docker image with the tag postgres-postgis.

Download the SQL backup file: Before running the Docker container, ensure that you have downloaded the SQL backup file from the S3 bucket to your local system. This file will be mounted as a volume in the Docker container to restore the database.

Run the Docker container: To run the Docker container and mount the backup file, use the following command:

docker run -d --name my-postgres -p 5432:5432 -v /path/to/your/local/backupfile.sql:/data/mypostgresDB-backup.sql postgres-postgis

Replace /path/to/your/local/backupfile.sql with the path to the downloaded SQL backup file on your local system. This command does the following:

Runs the Docker container in detached mode (-d) with the name my-postgres.
Maps the local port 5432 to the container's port 5432 (-p 5432:5432), allowing you to connect to the PostgreSQL server running inside the container.
Mounts the backup file as a volume in the container at /data/mypostgresDB.sql. This allows the entrypoint.sh script to access the backup file during the container startup.

Once the container is up and running, the entrypoint.sh script will initialize the PostgreSQL server, create the database, extract the schema and data from the backup file, and import them into the new database. You can then connect to the PostgreSQL server using your preferred client and verify that the database has been restored correctly.

By following these steps, you have successfully created a Docker container for a PostgreSQL server with the PostGIS extension and restored a database using an SQL backup file.

Conclusion

In this blog post, we have demonstrated how to set up a PostgreSQL Docker container with the PostGIS extension and restore a database using an SQL backup file. We walked through the process of creating a Dockerfile to define the container, implementing an entrypoint.sh script to manage the database restoration process, and running the Docker container while mounting the backup file.

By leveraging Docker and PostgreSQL, we have created an easily deployable and scalable solution for managing and restoring geospatial databases. This approach not only simplifies the database restoration process but also ensures a consistent environment for your database server across different stages of development, testing, and production.

With this knowledge, you can now confidently use Docker and PostgreSQL to manage your geospatial databases, and restore them from backup files as needed.

DEV Community

Restore Postgresql Database in a Docker Container

Oldest comments (0)