
AWS GitHub & S3 Backup

Introduction:
Data backup and disaster recovery services are critical to protecting a business's most valuable asset: its data.
Losing this data can result in severe consequences, including financial loss, reputational damage, and operational disruption. It is therefore essential to understand the importance of data backup and disaster recovery planning, and to implement effective strategies to safeguard your business's assets.

AWS S3 BACKUPS STORAGE:


Always consider both folders:

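For reference, the backup script shown below uploads each directory as a compressed archive under a date-based prefix, so objects in the bucket follow this pattern (names in angle brackets are placeholders):

s3://<bucket>/files/<YYYY-MM-DD>/<namespace>/<directory>/<directory>.tar.gz

# List the date folders currently stored under files/
aws s3 ls s3://<bucket>/files/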

Backup procedures:
GitHub:

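The manifests below cover the file and database backups; for the GitHub side, here is a minimal sketch of one way to archive a repository into the same bucket (the repository URL, working directory, and github/ prefix are assumptions, not part of the original setup):

#!/bin/bash
set -euo pipefail

TIMESTAMP=$(date "+%Y-%m-%d")
REPO_URL="https://github.com/<org>/<repo>.git"   # placeholder repository
WORK_DIR="/tmp/github-backup"

mkdir -p "$WORK_DIR"
cd "$WORK_DIR"

# A mirror clone keeps all branches, tags, and refs
git clone --mirror "$REPO_URL" repo.git

# Compress the mirror and upload it alongside the other backups
tar -czf "repo-${TIMESTAMP}.tar.gz" repo.git
aws s3 cp "repo-${TIMESTAMP}.tar.gz" "s3://${BUCKET}/github/${TIMESTAMP}/repo-${TIMESTAMP}.tar.gz"

# Clean up the working copy
cd /
rm -rf "$WORK_DIR"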

Dockerfile:

# Minimal Debian image with the AWS CLI and the backup script
FROM debian:12-slim

ENV DEBIAN_FRONTEND=noninteractive

# Install the AWS CLI used by the backup script
RUN apt-get update && \
    apt-get install -y awscli

# Copy the backup script into the image and make it executable
COPY script.sh /usr/local/bin/script.sh
RUN chmod +x /usr/local/bin/script.sh

CMD ["/usr/local/bin/script.sh"]
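
The Dockerfile and script above are packaged into the image that the CronJobs below reference. Building and pushing might look like this (the ghcr.io path matches the manifests; the tag is an example):

# Build the backup image and push it to the registry referenced by the CronJob
docker build -t ghcr.io/backup/backup-files:latest .
docker push ghcr.io/backup/backup-files:latest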

Script:

#!/bin/bash

# Directory paths
SOURCE_DIR="/app/storage"
DEST_DIR="/app/backup"
TIMESTAMP=$(date "+%Y-%m-%d")

# Ensure the destination directory exists
mkdir -p "$DEST_DIR"

# Count total directories, excluding lost+found
TOTAL_DIRS=$(find "$SOURCE_DIR" -mindepth 1 -maxdepth 1 -type d ! -name 'lost+found' | wc -l)

# Counter for processed directories
PROCESSED_DIRS=0

# Retention period in days: date folders older than this are deleted from S3
RETENTION_PERIOD=13

# Loop over the directories in the source directory
for dir in "$SOURCE_DIR"/*; do
    if [ -d "$dir" ]; then
        # Get the directory name
        dir_name=$(basename "$dir")

        # Skip the lost+found directory
        if [ "$dir_name" = "lost+found" ]; then
            continue
        fi

        echo "Processing directory: $dir"

        # Copy the directory to the destination
        cp -r "$dir" "$DEST_DIR/$dir_name"

        # Compress the copied directory
        tar -czf "$DEST_DIR/$dir_name.tar.gz" -C "$DEST_DIR" "$dir_name"

        # Upload to AWS S3
        aws s3 cp "$DEST_DIR/${dir_name}.tar.gz" "s3://${BUCKET}/files/${TIMESTAMP}/${NAMESPACE}/${dir_name}/${dir_name}.tar.gz"

        # Clean up the local copy and archive
        rm -rf "$DEST_DIR/$dir_name" "$DEST_DIR/$dir_name.tar.gz"

        # Increment the processed directories counter
        PROCESSED_DIRS=$((PROCESSED_DIRS+1))

        # Calculate and display progress
        PROGRESS=$(( (PROCESSED_DIRS * 100) / TOTAL_DIRS ))
        echo "Progress: $PROGRESS% ($PROCESSED_DIRS/$TOTAL_DIRS directories processed)"

        # Delete date folders older than the retention period
        RETENTION_DATE=$(date -d "${TIMESTAMP} -${RETENTION_PERIOD} days" "+%Y-%m-%d")

        # List all date folders
        FOLDER_LIST=$(aws s3 ls "s3://${BUCKET}/files/" | awk '$0 ~ /PRE/ {print $2}' | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}/' | sed 's/\/$//')

        # Loop through each folder and delete it if it is older than the retention period
        for folder in $FOLDER_LIST; do
            FOLDER_TIMESTAMP=$(date -d "${folder}" "+%s")
            RETENTION_TIMESTAMP=$(date -d "${RETENTION_DATE}" "+%s")
            if [ "$FOLDER_TIMESTAMP" -lt "$RETENTION_TIMESTAMP" ]; then
                aws s3 rm "s3://${BUCKET}/files/${folder}/" --recursive --quiet
            fi
        done
    fi

done

With RETENTION_PERIOD set to 13, date folders older than 13 days are removed, so the bucket keeps roughly 14 days of daily backups (the current day plus the previous 13) and you can restore from any day within that window.
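
Restoring is not part of the script; a minimal sketch, assuming the same bucket layout as the upload step (values in angle brackets are placeholders):

# Download the archive for the day you want to restore
aws s3 cp s3://<bucket>/files/<YYYY-MM-DD>/<namespace>/<directory>/<directory>.tar.gz .

# Extract it back into the storage path (the archive contains the directory itself)
tar -xzf <directory>.tar.gz -C /app/storage/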

YAML file:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-files
  namespace: backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup-container
              image: ghcr.io/backup/backup-files:latest
              env:
                - name: NAMESPACE
                  value: "backup"
                - name: BUCKET
                  value: "<bucket-name>"
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_ACCESS_KEY_ID
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_SECRET_ACCESS_KEY
              volumeMounts:
                - name: my-pvc
                  mountPath: /app/storage
          restartPolicy: OnFailure
          imagePullSecrets:
            - name: registry-credentials-back
          volumes:
            - name: my-pvc
              persistentVolumeClaim:
                claimName: backend-upload-storage-pvc
      backoffLimit: 4
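
Before applying the manifest, the aws-secret and registry-credentials-back secrets it references must exist in the backup namespace. A minimal sketch (the credentials and the manifest file name are placeholders):

kubectl create namespace backup

kubectl create secret generic aws-secret \
  --namespace backup \
  --from-literal=AWS_ACCESS_KEY_ID=<access-key-id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret-access-key>

kubectl create secret docker-registry registry-credentials-back \
  --namespace backup \
  --docker-server=ghcr.io \
  --docker-username=<github-user> \
  --docker-password=<github-token>

kubectl apply -f backup-files-cronjob.yaml

# Trigger a one-off run to verify the job without waiting for the 03:00 schedule
kubectl create job --from=cronjob/backup-files backup-files-manual -n backup
kubectl logs -f job/backup-files-manual -n backup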

Postgres YAML file:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-postgres
  namespace: backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup-container
              image: ghcr.io/backup/backup-files:latest
              env:
                - name: NAMESPACE
                  value: "backup"
                - name: PG_HOST
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: host
                - name: PG_PORT
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: port
                - name: PG_USER
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: user
                - name: PG_PASS
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pguser-backup
                      key: password
                - name: BUCKET
                  value: "<bucket-name>"
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_ACCESS_KEY_ID
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: aws-secret
                      key: AWS_SECRET_ACCESS_KEY
          restartPolicy: OnFailure
          imagePullSecrets:
            - name: registry-credentials-back
      backoffLimit: 4
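
The script shown earlier only backs up files; the Postgres CronJob just wires in the PG_* connection details and AWS credentials. A minimal sketch of what the database backup script could look like (the use of pg_dumpall, the dump file name, and the postgres/ prefix are assumptions, and the image would also need postgresql-client installed alongside awscli):

#!/bin/bash
set -euo pipefail

TIMESTAMP=$(date "+%Y-%m-%d")
DUMP_FILE="/tmp/postgres-${TIMESTAMP}.sql.gz"   # hypothetical local path

# PGPASSWORD lets pg_dumpall authenticate non-interactively with the
# credentials injected from the postgres-pguser-backup secret
export PGPASSWORD="${PG_PASS}"

# Dump all databases and compress the output
pg_dumpall -h "${PG_HOST}" -p "${PG_PORT}" -U "${PG_USER}" | gzip > "${DUMP_FILE}"

# Upload to S3 under a date-based prefix, mirroring the file backup layout
aws s3 cp "${DUMP_FILE}" "s3://${BUCKET}/postgres/${TIMESTAMP}/postgres-${TIMESTAMP}.sql.gz"

rm -f "${DUMP_FILE}"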

Thank you for your time.
