
Cloud on a budget: Preemptible instances to the rescue

Vinicius Carvalho (viniciusccarvalho) ・2 min read

Saving costs on cloud

For those on a tight budget, Google Cloud Preemptible VMs are one of my favorite offerings.

Preemptible VMs run for at most 24 hours before they are stopped (recycled), and they can be preempted sooner, which makes them a great fit for stateless workloads. The benefit of those VMs comes in pricing. Monthly prices for a VM running 24/7 in us-central1:

Configuration           Regular    Preemptible
E2 4vCPUs 16GB RAM      $97.84     $29.35

As you can see, there's a huge cost saving in leveraging them. Consider the cost savings for your testing environments alone.
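To put a number on that saving, here is a minimal sketch that computes the discount from the two prices in the table. The function name savings_pct is my own, not part of any tool, and it assumes awk is available.

```shell
#!/bin/bash
# Percentage saved when paying the preemptible price instead of the regular one.
# Arguments: regular price, preemptible price (same currency and billing period).
# savings_pct is a hypothetical helper name used only for this illustration.
savings_pct() {
  awk -v regular="$1" -v preemptible="$2" \
    'BEGIN { printf "%.1f\n", (regular - preemptible) / regular * 100 }'
}

# Figures from the table above (E2, 4 vCPUs, 16 GB RAM, us-central1).
echo "Savings: $(savings_pct 97.84 29.35)%"
```

With the prices above this works out to a discount of roughly 70%.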

Wait, what about serverless?

A lot of people may point out that you should use serverless for stateless workloads, since it's a managed offering.

And while that is indeed true for most cases, there are still several workloads that would not make sense in a serverless environment.

One of the main shortcomings of serverless offerings is the fact that you can't run long-running processes.

Another issue is the fact that many serverless offerings do not allow you to customize the sandbox environment (Google Cloud Run fixes this by letting you bring your own container with your prepackaged libraries, and AWS Lambda layers also help on that front).

Quasi-stateful machines

OK, so preemptible VMs can't really run stateful workloads, but that doesn't mean you can't have state. You can keep it by attaching a persistent disk to your instance.

The data on the boot disk gets discarded on every restart, but the data on the persistent disk survives.
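As a rough sketch of that setup, the commands below create a data disk and a preemptible VM that attaches it. The names, zone, and disk size are hypothetical placeholders, and startup.sh is assumed to be a local file holding a startup script. The small run wrapper is my own addition so the script only prints the gcloud commands unless you set DRY_RUN=0.

```shell
#!/bin/bash
# Sketch: create a persistent data disk and a preemptible VM that attaches it.
# All names, the zone and the disk size are hypothetical placeholders.
ZONE="us-central1-a"
DISK_NAME="batch-data"
VM_NAME="batch-worker"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

provision() {
  # The data disk lives independently of the VM.
  run gcloud compute disks create "$DISK_NAME" --size=100GB --zone="$ZONE"

  # Preemptible VM with the data disk attached as a non-boot disk.
  # startup.sh is assumed to exist next to this script.
  run gcloud compute instances create "$VM_NAME" \
    --zone="$ZONE" \
    --machine-type=e2-standard-4 \
    --preemptible \
    --disk="name=$DISK_NAME,device-name=data,mode=rw,boot=no" \
    --metadata-from-file=startup-script=startup.sh
}

provision
```

Running it as-is only echoes the two commands, which makes it easy to review them before spending any money.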

GCE instances can run scripts during startup, and you can attach one or more disks to them. The script below formats the disk at /dev/sdb (if it is not formatted already) and mounts it:

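#!/bin/bash
# Startup script: format /dev/sdb on first boot only, then mount it on every boot.

DEVICE=/dev/sdb
MOUNT_POINT=/mnt/data

# blkid exits non-zero when the device has no recognizable filesystem yet.
if ! sudo blkid "$DEVICE" > /dev/null; then
   sudo mkfs.ext4 -m 0 -F -E lazy_itable_init=0,lazy_journal_init=0,discard "$DEVICE"
fi

# Mount unconditionally, so an already formatted disk is remounted after a restart.
sudo mkdir -p "$MOUNT_POINT"
sudo mount -o discard,defaults "$DEVICE" "$MOUNT_POINT"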
Please be advised that a machine stop can lead to data corruption, so be careful when using this approach.
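One way to reduce that risk is a shutdown script, which Compute Engine runs on a best-effort basis when the instance is stopped or preempted (preemption only leaves a short window, around 30 seconds). The sketch below is an assumption on my part rather than something from my original setup: it flushes pending writes and unmounts the data disk. The function name flush_and_unmount is hypothetical, and /mnt/data is the mount point used by the startup script.

```shell
#!/bin/bash
# Sketch of a shutdown script: flush pending writes and unmount the data disk.
# MOUNT_POINT defaults to the path used by the startup script.
MOUNT_POINT="${MOUNT_POINT:-/mnt/data}"

# flush_and_unmount is a hypothetical helper name used for this illustration.
flush_and_unmount() {
  local target="$1"
  sync  # push buffered writes to disk
  # /proc/mounts lists the currently mounted filesystems on Linux.
  if grep -qsF " $target " /proc/mounts; then
    umount "$target"
  else
    echo "$target is not mounted, nothing to do"
  fi
}

flush_and_unmount "$MOUNT_POINT"
```

This does not make the disk safe for workloads that need strong durability guarantees; it only gives in-flight writes a better chance of landing before the machine goes away.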

Our cheap batch workload agent

Where I found this very useful was in running long jobs (2~3 hours a day) on beefier machines for video encoding tasks.

As video encoding requires a good chunk of CPU and a bunch of libraries installed on the OS, preemptible VMs were an easy choice. To give you an idea, running a node 4 hours a day for a month would cost:

Configuration           Regular    Preemptible
E2 16vCPUs 64GB RAM     $65.22     $19.57
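These figures can be approximated from the earlier 24/7 prices. The sketch below assumes that E2 pricing scales linearly with machine size (16 vCPUs being 4x the 4 vCPU machine) and that the job runs 4 hours out of every 24. Both are my assumptions, and partial_month_cost is a hypothetical helper name.

```shell
#!/bin/bash
# Scale a 24/7 price to a job that runs only a few hours per day.
# Arguments: 24/7 price, machine size multiplier, hours of runtime per day.
# partial_month_cost is a hypothetical helper name used for this illustration.
partial_month_cost() {
  awk -v full="$1" -v scale="$2" -v hours="$3" \
    'BEGIN { printf "%.2f\n", full * scale * hours / 24 }'
}

# 4x the 4 vCPU machine priced earlier, running 4 hours a day.
echo "Regular:     \$$(partial_month_cost 97.84 4 4)"
echo "Preemptible: \$$(partial_month_cost 29.35 4 4)"
```

The results land within a cent of the table, with the small difference coming from rounding.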

Not too bad, right? We start and stop the VMs via Cloud Functions and Cloud Scheduler (there's a place for serverless too).
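The scheduling side could look roughly like the sketch below: two Cloud Scheduler jobs that call HTTP-triggered functions at the start and end of a 4 hour window. The function URLs, job names, and schedules are hypothetical placeholders, authentication flags are left out for brevity, and the run wrapper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Sketch: two Cloud Scheduler jobs that call HTTP Cloud Functions which
# start and stop the VM. URLs, job names and schedules are placeholders.
LOCATION="us-central1"
START_URL="https://example.com/start-vm"  # hypothetical function endpoint
STOP_URL="https://example.com/stop-vm"    # hypothetical function endpoint

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

schedule_jobs() {
  # Start the VM every day at 02:00.
  run gcloud scheduler jobs create http start-batch-vm \
    --location="$LOCATION" --schedule="0 2 * * *" \
    --uri="$START_URL" --http-method=POST

  # Stop it again four hours later, at 06:00.
  run gcloud scheduler jobs create http stop-batch-vm \
    --location="$LOCATION" --schedule="0 6 * * *" \
    --uri="$STOP_URL" --http-method=POST
}

schedule_jobs
```

The functions behind those URLs would then call the Compute Engine API to start or stop the instance.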

So if you are on a budget, consider using preemptible VMs if your architecture involves batch or stateless workloads.

Happy coding.
