DEV Community

Cover image for Dockerize & Backup A Postgres Database
Matthew Cale
Matthew Cale

Posted on • Edited on

Dockerize & Backup A Postgres Database

What is this?

This is a technical guide for doing common management tasks with a Postgres database running in Docker.

Why make this?

As a quick reference for taking database backups and restoring them.

Steps

Run The Postgres Image

Docker makes it incredibly easy to get going with a Postgres database with persistent storage. Let's try it out.

  • Let's just run the Postgres container pointing its volume location to a spot on disk we like. For me that's: ~/docker/volumes/postgres.
# In your terminal
docker run \
 --name pg-db \
 -e POSTGRES_PASSWORD=shiny_metal_pass \
 -d \
 -p 6543:5432 \
 -v $HOME/docker/volumes/postgres:/var/lib/postgresql/data \
 postgres
Enter fullscreen mode Exit fullscreen mode
  • Let's dissect this line by line:

    • docker run -- Runs a specified image (in our case it's postgres, see last line of command). Source
    • --name pg-db -- Sets the name of the container running the postgres image. Source
    • -e POSTGRES_PASSWORD=shiny_metal_pass -- Sets an environment variable which will be the default password for the superuser. Note: This should be a secret 🤫. Source
    • -d -- Detach from the terminal instance. i.e. run in the background.
    • -p 6543:5432 -- Map the local machine's port 6543 to the container's port 5432. So from our machine to connect we will use 6543 which in turn will hit the port in the container running on 5432.
    • -v $HOME/docker/volumes/postgres:/var/lib/postgresql/data -- Map any data created in the container at /var/lib/postgresql/data to a desired location on local disk: $HOME/docker/volumes/postgres. Source
    • postgres -- The name of the image in Docker hub we want to create a container from.
  • When I run it I see the following

create-container

  • When I check our volume location (note: for me this was an empty location prior to invoking the command above) I see this:

check-mount

  • Finally, let's see if we can connect to our database. I am using TablePlus but the same concepts apply with any general purpose database IDE or psql (found here). For me the connection looks like this:

connect-sql

  • And when we connect we can run the following to get a lay of the land.
-- From a SQL Editor (PSQL / TablePlus Session)
SELECT datname FROM pg_database;
-- RESULTS:
-- postgres
-- template1
-- template0
Enter fullscreen mode Exit fullscreen mode

Create Some Persistent Data

So, we've got a database! What now? Well let's store some data in it!

  • Start by creating a new database called todo_it_well. You guessed it this, YATA (yet another todo app). This time we are doing it in SQL later we will also show you how to use the createdb tool.
-- From a SQL Editor
CREATE DATABASE todo_it_well;
-- Query 1 OK: CREATE DATABASE
Enter fullscreen mode Exit fullscreen mode
  • Switch to our new database. For me, in TablePlus, this just means editing my connection to use todo_it_well instead of postgres for the chosen database.

connect-todo-sql

  • Next, create some tables. We will, for this demonstration, create a users table and tasks table so that users can track their tasks or "todos". The only almost complex bit of this is declaring a relationship between tasks and users. This post is not about SQL so don't think too hard on these commands for now just know that they make a viable todo application.
-- From a SQL Editor
CREATE TABLE "user" (
   id SERIAL PRIMARY KEY NOT NULL,
   created_at TIMESTAMP NOT NULL DEFAULT NOW(),
   updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
   email TEXT NOT NULL,
   "password" TEXT NOT NULL,
   display_name TEXT
);

CREATE TABLE "task" (
   id SERIAL PRIMARY KEY NOT NULL,
   created_at TIMESTAMP NOT NULL DEFAULT NOW(),
   updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
   owner_id INT NOT NULL,
   description TEXT NOT NULL,
   label TEXT,
   completed_on TIMESTAMP,
   removed_on TIMESTAMP,
   CONSTRAINT fk_owner 
    FOREIGN KEY(owner_id) 
    REFERENCES "user"(id)
);
-- Query 1 OK: CREATE TABLE
-- Query 2 OK: CREATE TABLE
Enter fullscreen mode Exit fullscreen mode
  • Before we finish this step let's create some real data in here! Let's insert some users first and then some tasks for them!
-- From a SQL Editor
INSERT INTO "user"
  ("email", "password", "display_name")
VALUES 
  ('matt@email.fun', 'password-gasworld', 'matt-the-magic-monkey'),
  ('jennie@yahoo.fun', 'password-sassworld', 'dat-girl-jen'),
  ('timothy24@email.fun', 'pw-vw', 'yo-boi-tom-o-tee');
-- users created with ids: 1, 2 and 3
Enter fullscreen mode Exit fullscreen mode
  • Next, the tasks:
-- From a SQL Editor
INSERT INTO "task"
   ("owner_id", "description", "label")
VALUES 
   (1, 'read this awesome post', 'tech'),
   (1, 'do dishes', NULL),
   (2, 'read chapter of lotr', 'lit'),
   (2, 'sign up to volunteer', NULL),
   (3, 'call brother', NULL),
   (3, 'make grocery list', 'health'),
   (3, 'rake leaves', 'house');
Enter fullscreen mode Exit fullscreen mode
  • Validate that our data looks as expected:
-- From a SQL Editor
SELECT 
   u.display_name, 
   t.description
FROM 
   "user" u 
INNER JOIN
   "task" t ON u.id = t.owner_id
WHERE 
   u.id = 2
   AND t.completed_on IS NULL;

-- RESULTS:
-- display_name description
-- dat-girl-jen read chapter of lotr
-- dat-girl-jen sign up to volunteer
Enter fullscreen mode Exit fullscreen mode

Create A Backup

So here we are with some valuable customer data that we really do not wanna lose. A good practice for any developer is to take regular (hopefully automated) database backups. Doing this in Postgres isn't too hard so let's get to it. I should state there are several ways to do this. This is just one of many ways to get this job done. I think it is the most straightforward way; though, there are definitely faster ways.

This approach consists of issuing a set of commands to the container that could likely be CRON job-ified without too much effort.

  • First let's create a backups directory at a spot that we share with our container's filesystem and our host machine:
# In your terminal
docker exec -t pg-db bash -c 'mkdir /var/lib/postgresql/data/backups'
Enter fullscreen mode Exit fullscreen mode
  • This command (run from the host machine) executes a command (-c) mkdir on the container and creates a directory called backups which is located at the spot on the container that is shared with the host machine. You can verify this on the host by running:
# In your terminal
ls $HOME/docker/volumes/postgres | grep backups
# my machine outputs: backups
Enter fullscreen mode Exit fullscreen mode
  • Next, let's create our backup
# In your terminal
docker exec -t pg-db bash -c 'pg_dump todo_it_well -U postgres  --file=/var/lib/postgresql/data/backups/todo_it_well_db-$(date +%Y-%m-%d).sql'
Enter fullscreen mode Exit fullscreen mode

Let's breakdown this command:

  • docker exec -t pg-db bash -c
    • We tell Docker we want to run a command on the pg-db container
  • pg_dump todo_it_well -U postgres
    • We invoke the pg_dump tool on the container telling it to target the todo_it_well db and run commands as the postgres user.
  • --file=/var/lib/postgresql/data/backups/todo_it_well_db-$(date +%Y-%m-%d).sql

    • This is the spot on the container we want to place our back up at and as you can see it is the shared backups directory we just made. There is a little date logic that interpolates the current date as YEAR-MONTH-DAY.
  • If this went as planned we should have made our database backup and we can verify it on our host machine by running:

# In your terminal
ls $HOME/docker/volumes/postgres/backups
# my machine outputs: todo_it_well_db-2021-11-07.sql
cat $HOME/docker/volumes/postgres/backups/todo_it_well_db-2021-11-07.sql
# output includes: -- PostgreSQL database dump
Enter fullscreen mode Exit fullscreen mode
  • If you see some SQL in your output then congratulations! You have successfully created a database backup and your data strategy for your awesome app is just the tiniest bit more durable. 🎉

Drop Data And Restore It

Great, so our data has been backed up! This means that you can restore folks back to a particular point in time when things were as they should be. So long as you have an application that can reliably serve this data (ruminates on articles using Goose to accomplish database versioning) you can restore functionality and value to your users.

BUT, backing up your data is only half the battle. Actually restoring it is another matter. Let's practice the act of restoring our data.

First, I should confess that this article commits a sin of simplicity in that I wanted to get going as simply as possible. You more than likely should not place your backups in the exact same spot as the rest of your mounted volume data, but it was useful in getting us going faster.

  • So to start, we are going to put our backup temporarily out of this spot so that we can pretend we had mounted two volumes:
# In your terminal
# Let's move our data to our home directory and pretend this is where we safely mounted our backups when starting the container, ok?
cp \
$HOME/docker/volumes/postgres/backups/todo_it_well_db-2021-11-07.sql \
$HOME/todo_it_well_db-2021-11-07.sql
Enter fullscreen mode Exit fullscreen mode
  • Next, verify that our data is where we expect it to be:
# In your terminal
ls $HOME | grep todo_it_well_db-2021-11-07.sql
# my machine outputs todo_it_well_db-2021-11-07.sql
Enter fullscreen mode Exit fullscreen mode
  • Ok cool, we are pretending the we have a spot outside the normal Postgres files where we keep our data safely backed up. Now, the fun part. OMG! We just got attacked 😰. All records were destroyed by the evil hacker! Let's stop our container and delete all of our persistent data.
# In your terminal
# Stop the container forcibly
docker rm -f pg-db
# Delete our persistent data
rm -rf $HOME/docker/volumes/postgres
Enter fullscreen mode Exit fullscreen mode
  • Verify the data is indeed gone.
# In your terminal
ls $HOME/docker/volumes/postgres
# my machine outputs: No such file or directory
Enter fullscreen mode Exit fullscreen mode
  • Now we have discovered the attack and want to bring our application back online. So we start up our container as before:
# In your terminal
docker run \
 --name pg-db \
 -e POSTGRES_PASSWORD=shiny_metal_pass \
 -d \
 -p 6543:5432 \
 -v $HOME/docker/volumes/postgres:/var/lib/postgresql/data \
 postgres
Enter fullscreen mode Exit fullscreen mode
  • However, sadly, our data, including our database, is gone and we cannot connect to the todo_it_well database.

connect-fail-sql

  • Fear not, 🦸‍♀️ we have a backup! Simply copy the backup into the container:
# In your terminal
docker cp $HOME/todo_it_well_db-2021-11-07.sql pg-db:/var
Enter fullscreen mode Exit fullscreen mode
  • Next, you will need to recreate the database (so your backup data has a place to restore to):
# In your terminal
docker exec -t pg-db bash -c "createdb \
-h localhost \
-p 5432 \
-U postgres \
todo_it_well"
Enter fullscreen mode Exit fullscreen mode
  • Finally, run the psql SQL file command and take a deep breath, you are recovering from a very stressful attack 😉.
docker exec -t pg-db bash -c \
"psql \
-U postgres \
-d todo_it_well \
-f /var/todo_it_well_db-2021-11-07.sql"
Enter fullscreen mode Exit fullscreen mode
  • Lastly, let's verify that our data was properly restored by running our query from earlier:
-- From a SQL Editor
SELECT 
   u.display_name, 
   t.description
FROM 
   "user" u 
INNER JOIN
   "task" t ON u.id = t.owner_id
WHERE 
   u.id = 2
   AND t.completed_on IS NULL;

-- RESULTS
-- display_name description
-- dat-girl-jen read chapter of lotr
-- dat-girl-jen sign up to volunteer
Enter fullscreen mode Exit fullscreen mode

There you have it! You have done a ton! You set up a dockerized Postgres DB, you created your schemas and planted some data in your database, you took a back up of the data and then restored the data from back up after losing it. You rock 🎸.

Closing Thoughts

There are some issues with our set up and certainly there are some ease of use and durability improvements we could make, but this is a fully working solution. Some of the areas we should checkout:

  • Custom Dockerfile that includes a backups location.
  • A run command that references the previously mentioned backups location mount.
  • A cron job run on the host or on the container that creates backups.
  • A secure, but accessible spot for these backups to stored. Maybe DigitalOcean Spaces, why not 🦀?

Resources

Top comments (4)

Collapse
 
andlewis profile image
Andrew Lewis

Kinda curious about why there's a photo of an LDS temple in the header. Not complaining, just curious.

Collapse
 
mattcale profile image
Matthew Cale • Edited

The answer is........ i thought it was a pretty steeple lol

I honestly had no idea it was LDS. Is that Joseph Smith up there? That's the Angel Moroni up there. In any case much love to my mormon sisters and brothers I mean absolutely zero offense in using the image. I genuinely think it's lovely and hope it doesn't offend anyone in that community. BTW if it does offend someone I hope you'll reach out and let me know (kindly lol), and also hopefully explain why so I can be sure to consider it in the future.

Thanks for learnin' me something Andrew Lewis!

Collapse
 
andlewis profile image
Andrew Lewis

Oh, totally not offensive, just a surprise to be sure. That's the Angel Moroni. You can read about it here: thechurchnews.com/temples/2020-07-...

Thread Thread
 
mattcale profile image
Matthew Cale

📖 -- i did not anticipate learning this when i started down this path! thanks again!