Overview
So you need to share files across multiple servers in the cloud, only to find out that a block storage volume can be attached to just one host at a time! What do you do?!
Well, you have a few options:
- Create an NFS mount using another server
  - You could create a Network File System using another server, but this introduces a few challenges:
    - Your storage capacity is bound to one underlying server, creating a single point of failure.
    - You need some decent Linux chops to get this working and automate it.
- Create a filesystem mount using s3fs and DigitalOcean Spaces
  - Since DigitalOcean Spaces speaks Amazon's S3 protocol, we don't actually have to use AWS S3 to use s3fs; we just need storage that implements the protocol. (A rough manual example of what this looks like follows right after this list.)
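To make that second option concrete, here's roughly what mounting a Spaces bucket by hand with s3fs looks like on an Ubuntu host. This is just a sketch with placeholder bucket name, region, and credentials; it's exactly the set of steps we'll automate with Terraform and cloud-init below.
# Install s3fs (Ubuntu/Debian)
sudo apt-get update && sudo apt-get install -y s3fs
# Give s3fs the Spaces access key and secret in key:secret form
echo "SPACES_KEY_ID:SPACES_KEY_SECRET" | sudo tee /etc/passwd-s3fs
sudo chmod 600 /etc/passwd-s3fs
# Create a mount point and mount the bucket against the Spaces endpoint
sudo mkdir -p /tmp/mount
sudo s3fs my-bucket /tmp/mount -o url=https://nyc3.digitaloceanspaces.com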
Why
Why would you use object storage as a shared filesystem between your servers?
- It's cloud native.
- It's highly available.
- It's performant.
- It's cheap: $5/month for 250 GB!
- You don't have to maintain and secure a separate server for storage.
- You can use the object storage for other applications outside of your servers via HTTP.
How to
OK, so this sounds pretty great, right? You can have some amazing object storage power your storage needs and easily share files across your servers. Today, I'll show you how to do it with Terraform.
Create a DigitalOcean Spaces access key
First, you'll want to log in to the DigitalOcean console and create a new Spaces access key and secret. This pair will be used to authenticate with your DigitalOcean Spaces bucket so that only you can access the storage.
When you click "Generate New Key", type your key name into the text box and then click the blue check mark. Once the key name is confirmed, you'll see two new fields, the access key and its secret (save both for later).
DigitalOcean API Key
Now that you have your access key and secret for the Spaces bucket, you'll still need an API token for Terraform to create resources such as DigitalOcean droplets and the Spaces bucket itself. This can also be done in the "API" section of the console.
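Side note: if you'd rather not write these secrets to disk at all, Terraform will also read any input variable from an environment variable prefixed with TF_VAR_, so exporting something like the following (with your real values) works in place of putting them in the terraform.tfvars file we create next:
# Alternative to terraform.tfvars: pass the sensitive inputs via environment variables
export TF_VAR_do_token="your-digitalocean-api-token"
export TF_VAR_spaces_access_key_id="your-spaces-access-key-id"
export TF_VAR_spaces_access_key_secret="your-spaces-access-key-secret"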
Terraform Code
OK, so now we have our Spaces access key and secret as well as our DigitalOcean API token. We can move on to actually creating some droplets and a bucket, and sharing files between the droplets using s3fs.
A quick overview of what's happening below: we're creating a new bucket and 2 droplets that will share files back and forth. The configuration takes a few inputs, such as a region, a mount point (filesystem path), and a bucket name, and uses cloud-init to mount the bucket on the droplets when they first boot.
First, let's make a terraform.tfvars file that holds our configuration. It looks like this:
# Spaces Access Key ID
spaces_access_key_id = "XXX"
# Spaces Access Key Secret
spaces_access_key_secret = "XXX"
# DigitalOcean API Token
do_token = "XXX"
# SSH Key ID to be able to get into our new droplets (can leave this empty if no need to ssh)
ssh_key_id = ""
Now we need to create a file called main.tf with the following content. This will create our bucket and droplets and configure s3fs on the droplets so they can read and write files to the same bucket.
Please refer to the comments for a walkthrough of each component:
# Needed for terraform to initialize and
# install the digitalocean terraform provider
terraform {
  required_providers {
    digitalocean = {
      source = "digitalocean/digitalocean"
    }
  }
}
# Expected input, DigitalOcean Spaces Access Key ID
variable "spaces_access_key_id" {
  type      = string
  sensitive = true
}
# Expected input, DigitalOcean Spaces Access Key Secret
variable "spaces_access_key_secret" {
  type      = string
  sensitive = true
}
# Expected input, DigitalOcean API Token
variable "do_token" {
  type      = string
  sensitive = true
}
# SSH key in DigitalOcean that will allow us to get into our hosts
# (Not Necessarily Needed)
variable "ssh_key_id" {
  type      = string
  sensitive = true
  default   = ""
}
# DigitalOcean region to create our droplets and spaces bucket in
# Let's just go with nyc3
variable "region" {
  type    = string
  default = "nyc3"
}
# Name of our DigitalOcean Spaces bucket
variable "bucket_name" {
  type    = string
  default = "s3fs-bucket"
}
# Where to mount our bucket on the filesystem on the DigitalOcean droplets
# Let's just default to /tmp/mount for demo purposes
variable "mount_point" {
  type    = string
  default = "/tmp/mount"
}
# Configure the DigitalOcean provider to create our resources
provider "digitalocean" {
  token             = var.do_token
  spaces_access_id  = var.spaces_access_key_id
  spaces_secret_key = var.spaces_access_key_secret
}
# Create our DigitalOcean spaces bucket to store files
# that will be accessed by our droplets
resource "digitalocean_spaces_bucket" "s3fs_bucket" {
  name   = var.bucket_name
  region = var.region
}
# Let's create a sample file in the bucket called "index.html"
resource "digitalocean_spaces_bucket_object" "index" {
  region       = digitalocean_spaces_bucket.s3fs_bucket.region
  bucket       = digitalocean_spaces_bucket.s3fs_bucket.name
  key          = "index.html"
  content      = "<html><body><p>This page is empty.</p></body></html>"
  content_type = "text/html"
}
# Configure our DigitalOcean droplets via cloud-init
# Install the s3fs package
# Create a system-wide credentials file for s3fs to be able to access the bucket
# Create the mount point directory (/tmp/mount)
# Call s3fs to mount the bucket
locals {
  cloud_init_config = yamlencode({
    packages = [
      "s3fs"
    ],
    write_files = [{
      owner       = "root:root"
      path        = "/etc/passwd-s3fs"
      permissions = "0600"
      content     = "${var.spaces_access_key_id}:${var.spaces_access_key_secret}"
    }],
    runcmd = [
      "mkdir -p ${var.mount_point}",
      "s3fs ${var.bucket_name} ${var.mount_point} -o url=https://${var.region}.digitaloceanspaces.com"
    ]
  })
}
# Convert our cloud-init config to userdata
# Userdata runs at first boot when the droplets are created
data "cloudinit_config" "server_config" {
  gzip          = false
  base64_encode = false
  part {
    content_type = "text/cloud-config"
    content      = local.cloud_init_config
  }
}
# Create 2 DigitalOcean droplets that will both mount the same spaces bucket
# These 2 hosts will share files back and forth
resource "digitalocean_droplet" "s3fs_droplet" {
  count     = 2
  image     = "ubuntu-20-04-x64"
  name      = "s3fs-droplet-${count.index}"
  region    = var.region
  size      = "s-1vcpu-1gb"
  ssh_keys  = var.ssh_key_id != "" ? [var.ssh_key_id] : []
  user_data = data.cloudinit_config.server_config.rendered
}
# Output our ip addresses to the console so that we can easily copy/pasta to ssh in
output "s3fs_droplet_ipv4_addresses" {
  value = digitalocean_droplet.s3fs_droplet[*].ipv4_address
}
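One caveat before we apply this: the runcmd steps in cloud-init only execute on a droplet's first boot, so the bucket won't be remounted automatically if a droplet reboots. If you need the mount to persist, one option is an /etc/fstab entry for the bucket. A rough, untested sketch using the defaults from above (run on the droplet, or added as another runcmd entry) would look like this:
# Hypothetical extra step: remount the bucket on every boot via /etc/fstab
echo "s3fs-bucket /tmp/mount fuse.s3fs _netdev,allow_other,passwd_file=/etc/passwd-s3fs,url=https://nyc3.digitaloceanspaces.com 0 0" >> /etc/fstab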
Terraform Output
Now that we have our configuration defined above, we simply need to run terraform init && terraform apply -auto-approve to create our things!
❯ terraform apply -auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# digitalocean_droplet.s3fs_droplet[0] will be created
+ resource "digitalocean_droplet" "s3fs_droplet" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ graceful_shutdown = false
+ id = (known after apply)
+ image = "ubuntu-20-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "s3fs-droplet-0"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "nyc3"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ ssh_keys = (sensitive)
+ status = (known after apply)
+ urn = (known after apply)
+ user_data = "dc35535cfb286b2994e31baa83c32ef808b9bdff"
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
# digitalocean_droplet.s3fs_droplet[1] will be created
+ resource "digitalocean_droplet" "s3fs_droplet" {
+ backups = false
+ created_at = (known after apply)
+ disk = (known after apply)
+ graceful_shutdown = false
+ id = (known after apply)
+ image = "ubuntu-20-04-x64"
+ ipv4_address = (known after apply)
+ ipv4_address_private = (known after apply)
+ ipv6 = false
+ ipv6_address = (known after apply)
+ locked = (known after apply)
+ memory = (known after apply)
+ monitoring = false
+ name = "s3fs-droplet-1"
+ price_hourly = (known after apply)
+ price_monthly = (known after apply)
+ private_networking = (known after apply)
+ region = "nyc3"
+ resize_disk = true
+ size = "s-1vcpu-1gb"
+ ssh_keys = (sensitive)
+ status = (known after apply)
+ urn = (known after apply)
+ user_data = "dc35535cfb286b2994e31baa83c32ef808b9bdff"
+ vcpus = (known after apply)
+ volume_ids = (known after apply)
+ vpc_uuid = (known after apply)
}
# digitalocean_spaces_bucket.s3fs_bucket will be created
+ resource "digitalocean_spaces_bucket" "s3fs_bucket" {
+ acl = "private"
+ bucket_domain_name = (known after apply)
+ force_destroy = false
+ id = (known after apply)
+ name = "s3fs-bucket"
+ region = "nyc3"
+ urn = (known after apply)
}
# digitalocean_spaces_bucket_object.index will be created
+ resource "digitalocean_spaces_bucket_object" "index" {
+ acl = "private"
+ bucket = "s3fs-bucket"
+ content = "<html><body><p>This page is empty.</p></body></html>"
+ content_type = "text/html"
+ etag = (known after apply)
+ force_destroy = false
+ id = (known after apply)
+ key = "index.html"
+ region = "nyc3"
+ version_id = (known after apply)
}
Plan: 4 to add, 0 to change, 0 to destroy.
Changes to Outputs:
+ s3fs_droplet_ipv4_addresses = [
+ (known after apply),
+ (known after apply),
]
digitalocean_spaces_bucket.s3fs_bucket: Creating...
digitalocean_droplet.s3fs_droplet[1]: Creating...
digitalocean_droplet.s3fs_droplet[0]: Creating...
digitalocean_spaces_bucket.s3fs_bucket: Still creating... [10s elapsed]
digitalocean_droplet.s3fs_droplet[1]: Still creating... [10s elapsed]
digitalocean_droplet.s3fs_droplet[0]: Still creating... [10s elapsed]
digitalocean_droplet.s3fs_droplet[1]: Still creating... [20s elapsed]
digitalocean_droplet.s3fs_droplet[0]: Still creating... [20s elapsed]
digitalocean_spaces_bucket.s3fs_bucket: Still creating... [20s elapsed]
digitalocean_spaces_bucket.s3fs_bucket: Creation complete after 28s [id=s3fs-bucket]
digitalocean_spaces_bucket_object.index: Creating...
digitalocean_spaces_bucket_object.index: Creation complete after 0s [id=index.html]
digitalocean_droplet.s3fs_droplet[0]: Still creating... [30s elapsed]
digitalocean_droplet.s3fs_droplet[1]: Still creating... [30s elapsed]
digitalocean_droplet.s3fs_droplet[0]: Still creating... [40s elapsed]
digitalocean_droplet.s3fs_droplet[1]: Still creating... [40s elapsed]
digitalocean_droplet.s3fs_droplet[1]: Creation complete after 43s [id=283287872]
digitalocean_droplet.s3fs_droplet[0]: Creation complete after 43s [id=283287873]
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
Outputs:
s3fs_droplet_ipv4_addresses = [
"165.227.106.47",
"45.55.60.230",
]
Sharing files
Cool! Now that we have our bucket and droplets configured, let's SSH into both and check out that /tmp/mount path we set up in our Terraform configuration above.
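Using the IP addresses from the Terraform output above, connecting looks something like this (your addresses will differ):
# SSH into each droplet using the s3fs_droplet_ipv4_addresses output
ssh root@165.227.106.47
ssh root@45.55.60.230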
Let's walk through what's happening below. On both s3fs-droplet-0 and s3fs-droplet-1, I ran df -h | grep s3fs, which gives us the disk usage for all of the mounted volumes, filtered for the term s3fs to shorten the list. This shows that our bucket is mounted and available at /tmp/mount! Hooray! (The 256T size is just the nominal capacity s3fs reports, since object storage has no fixed size.)
root@s3fs-droplet-0:/tmp/mount# df -h | grep s3fs
s3fs 256T 0 256T 0% /tmp/mount
root@s3fs-droplet-1:/tmp/mount# df -h | grep s3fs
s3fs 256T 0 256T 0% /tmp/mount
Next, I ran ll /tmp/mount on both hosts so that we can see the contents of the bucket. The index.html file that I created in the Terraform code is there and is viewable by both droplets. Awesooooome!
root@s3fs-droplet-0:/tmp/mount# ll /tmp/mount/
total 5
drwx------ 1 root root 0 Jan 1 1970 ./
drwxrwxrwt 12 root root 4096 Jan 22 18:54 ../
-rw-r----- 1 root root 52 Jan 22 18:48 index.html
root@s3fs-droplet-1:/tmp/mount# ll /tmp/mount/
total 5
drwx------ 1 root root 0 Jan 1 1970 ./
drwxrwxrwt 12 root root 4096 Jan 22 18:54 ../
-rw-r----- 1 root root 52 Jan 22 18:48 index.html
OK, so next I ran a touch command on s3fs-droplet-0, which created a file in /tmp/mount:
root@s3fs-droplet-0:/tmp/mount# touch file_from_$(hostname)
I used $(hostname) to substitute the name of the droplet in the file name so that we can see said file on s3fs-droplet-1. Let's have a look and see if that file is viewable on the other server.
root@s3fs-droplet-1:/tmp/mount# ll /tmp/mount/
total 6
drwx------ 1 root root 0 Jan 1 1970 ./
drwxrwxrwt 12 root root 4096 Jan 22 18:54 ../
-rw-r--r-- 1 root root 0 Jan 22 19:00 file_from_s3fs-droplet-0
-rw-r----- 1 root root 52 Jan 22 18:48 index.html
It's there! We successfully shared files between our 2 droplets. If you open the Spaces bucket in the DigitalOcean console, you'll see the same files listed there as well.
WOOT WOOT! Since we're using a Spaces bucket, we can access these files from anywhere and in any application! NFS is looking pretty gross at this point. Yay for cloud object storage, and thanks to DigitalOcean for providing us with such a cool service!
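For example, since Spaces speaks the S3 protocol, any S3-compatible client can reach the same bucket from outside the droplets. Assuming you have the AWS CLI installed and configured with the Spaces access key and secret, commands like these should list and download the shared files (the bucket name and region match the Terraform defaults above):
# List the contents of the shared bucket via the Spaces S3-compatible endpoint
aws s3 ls s3://s3fs-bucket --endpoint-url https://nyc3.digitaloceanspaces.com
# Download the index.html object we created with Terraform
aws s3 cp s3://s3fs-bucket/index.html . --endpoint-url https://nyc3.digitaloceanspaces.com
And when you're done experimenting, terraform destroy cleans everything up; you may need to empty the bucket first (or set force_destroy = true on it), since we created extra objects outside of Terraform.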
Fin.