
Working with Azure VM Spot instance

Olivier Miossec (omiossec)

Azure Spot instances are a cost-saving option to run a virtual machine in Azure. The VM runs when there is unused capacity on the Azure infrastructure for a particular region. If this capacity is no longer available, the VM is deallocated.

The price is not fixed like standard instances. It changes over the day.

Take an F2s_v2 instance with Windows Server 2019. It costs $0.204 per hour in the North Europe region, but at the time I wrote this article (2020-02-26, early in the morning) the spot price was $0.0376 per hour in North Europe and $0.017 in UK South. As far as I know, prices are lower in the morning.
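To put these numbers in perspective, here is a minimal sketch of the saving they imply. The function name is only illustrative, and both spot prices are compared against the North Europe regular price because it is the only regular price quoted here; the UK South regular price may differ.

```python
def spot_discount(regular_price: float, spot_price: float) -> float:
    """Return how much cheaper the spot price is than the regular price, in percent."""
    return (regular_price - spot_price) / regular_price * 100


REGULAR_NORTH_EUROPE = 0.204  # USD per hour, F2s_v2, from the article

# Spot prices observed on 2020-02-26, from the article
observed_spot_prices = {"North Europe": 0.0376, "UK South": 0.017}

for region, spot_price in observed_spot_prices.items():
    saving = spot_discount(REGULAR_NORTH_EUROPE, spot_price)
    print(f"{region}: {saving:.1f}% cheaper than the regular North Europe price")
```

With the prices above, the saving is a little over 80% in North Europe and a little over 90% in UK South.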

To use a Spot VM, you have two choices:

  • set a maximum price: the machine is evicted when the spot price rises above the maximum price;
  • choose capacity only: the machine is evicted when Azure needs the capacity. In other words, your maximum price for the Spot VM is the current price of the regular VM.

As you can see, the Spot instance price is not fixed; it varies with supply and demand in a particular region. And, just like on a stock market, you can set the price above which you no longer want the instance.

Spot instances are in preview and limited to Pay-As-You-Go and Enterprise Agreement subscriptions. All VM series are available except the B-series and the promo versions.

For the moment, there is no API to monitor the current spot price of a VM size.

To deploy a Spot instance with ARM templates, use a recent API version of Microsoft.Compute/virtualMachines. The 2019-03-01 and 2019-07-01 versions are the only ones supporting Spot instances. The relevant properties are:

"priority": "string",
"evictionPolicy": "string",
"billingProfile": {
  "maxPrice": x
}

Where x is the maximum price for the instance: either -1 for capacity only, or a price in the form x.xxx.
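As an illustration, the fragment can be generated rather than typed by hand. This is a minimal sketch: the helper name `spot_properties` is hypothetical, and the values "Spot" and "Deallocate" are what I understand the API to accept for `priority` and `evictionPolicy`; check them against the API version you deploy with.

```python
import json
from typing import Optional


def spot_properties(max_price: Optional[float] = None) -> dict:
    """Build the Spot-related part of a virtualMachines 'properties' object.

    max_price=None means "capacity only", which is encoded as -1.
    """
    return {
        "priority": "Spot",              # assumed value for a Spot VM
        "evictionPolicy": "Deallocate",  # assumed value; the VM is deallocated on eviction
        "billingProfile": {
            "maxPrice": -1 if max_price is None else max_price,
        },
    }


# Capacity only
print(json.dumps(spot_properties(), indent=2))

# Maximum price of 0.05 USD per hour (an arbitrary example value)
print(json.dumps(spot_properties(0.05), indent=2))
```

The resulting dictionary can be merged into the `properties` object of the VM resource before the template is serialized.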

What happens if you try to deploy a Spot instance when your maximum price is lower than the current spot price? You will not be able to deploy the VM.

With ARM Template you will get this error:

  "error": {
    "code": "OperationNotAllowed",
    "message": "Unable to perform operation 'Create VM' since the provided max price '0.02 USD' is lower than the current spot price '0.0752 USD' for Azure Spot VM size 'Standard_F2s_v2'. For more information, see http://aka.ms/AzureSpot/errormessages."
  }

And with PowerShell:

New-AzVM: Unable to perform operation 'Create VM' since the provided max price '0.02 USD' is lower than the current spot price '0.0752 USD' for Azure Spot VM size 'Standard_F2s_v2'

Once deployed and running, the VM can be deallocated if Azure needs the resources. If you set a maximum price, the VM is also evicted when the current spot price rises above that maximum. If not, the VM is evicted only when Azure needs the capacity.
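The two price rules (deployment refused, then eviction) can be summed up in a few lines. This is only a model of the behaviour described in this article, not an Azure API; the function names are made up for the example.

```python
CAPACITY_ONLY = -1  # maxPrice value meaning "capacity only"


def can_deploy(max_price: float, spot_price: float) -> bool:
    """Deployment is refused when the maximum price is below the current spot price."""
    return max_price == CAPACITY_ONLY or max_price >= spot_price


def evicted_for_price(max_price: float, spot_price: float) -> bool:
    """Price-based eviction only; eviction for capacity can happen in both modes."""
    return max_price != CAPACITY_ONLY and spot_price > max_price


# Values from the error message above: max price 0.02, current spot price 0.0752
print(can_deploy(0.02, 0.0752))                  # False
print(evicted_for_price(0.05, 0.0752))           # True
print(evicted_for_price(CAPACITY_ONLY, 0.0752))  # False
```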

How do you know if the VM will be evicted?

Azure uses the Instance Metadata Service to announce the eviction. The Instance Metadata Service is a set of REST endpoints that you can reach only from inside the VM, through a non-routable IP address. There are four of them: instance, attested, identity and scheduledevents. Azure uses the last one, Scheduled Events, to announce the eviction event.

The URL of the service is:
http://169.254.169.254/metadata/scheduledevents?api-version=2019-01-01

Be sure to use the latest API version, 2019-01-01 at the time of writing.

You can use PowerShell to query the API, but you will need to add a key/value header, Metadata=true. Without this header, the request will fail.

$MetaDataHeaders = @{"Metadata"="true"}
Invoke-RestMethod -Method GET -uri "http://169.254.169.254/metadata/scheduledevents?api-version=2019-01-01" -Headers $MetaDataHeaders
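If PowerShell is not available on the VM, the same query can be sketched with the Python standard library. The function names are hypothetical, and bypassing any configured proxy is an assumption on my part, based on the address being reachable only from inside the VM.

```python
import json
import urllib.request

SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2019-01-01"
)


def build_request(url: str = SCHEDULED_EVENTS_URL) -> urllib.request.Request:
    """Create the GET request with the mandatory Metadata=true header."""
    return urllib.request.Request(url, headers={"Metadata": "true"})


def fetch_scheduled_events(timeout: float = 10.0) -> dict:
    """Query the Scheduled Events endpoint and return the decoded JSON document."""
    # An empty ProxyHandler makes urllib ignore proxy settings from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(build_request(), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_scheduled_events()` only works from inside an Azure VM; anywhere else the request times out or is refused.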

The first query takes a few seconds because the API is enabling the service (the service is disabled by default). If you don't query the API for 24 hours, the service is disabled again.
You should receive a JSON document with two fields: DocumentIncarnation, an integer, and Events, an array.
If the Events array is empty, nothing is scheduled. Otherwise, you need to look at the type of each event. There are five event types:

  • Freeze, the VM will be paused for a few seconds
  • Reboot, the VM will reboot
  • Redeploy, the VM will be redeployed
  • Terminate, the VM will be deleted (by the user)
  • Preempt, the Spot VM will be evicted

For the first four event types, you have at least 5 minutes to react, but for a Spot VM eviction you have only 30 seconds.

It means you need to run the query every second to have the time to perform any action before the eviction.

And as you may have noticed, the Events field is an array, so you may receive more than one event at a time.

<#
.SYNOPSIS
    Detect if any Preempt action is scheduled
#>

$MetaDataHeaders = @{"Metadata"="true"}
$Uri = "http://169.254.169.254/metadata/scheduledevents?api-version=2019-01-01"

while ($true) {
    try {
        $SchedulerData = Invoke-RestMethod -Method GET -Uri $Uri -Headers $MetaDataHeaders

        # Events is an array, so check every event type it contains
        if ($SchedulerData.Events.EventType -contains "Preempt") {
            # Take any action
            break
        }
    }
    catch {
        Write-Error -Message "Exception Type: $($_.Exception.GetType().FullName) $($_.Exception.Message)"
    }

    # Wait one second before the next query, including after a failed one
    Start-Sleep -Seconds 1
}

The action could be saving and closing files, closing an application, terminating sessions, and so on.
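For a VM without PowerShell, the same polling logic can be sketched in Python. The metadata call and the action are passed in as functions, so the loop can be tried with canned responses; `fetch`, `on_preempt` and the `EventId` values in the demo are illustrative names and data, not real output.

```python
import time
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


def preempt_events(document: Dict[str, Any]) -> List[Event]:
    """Return every Preempt event; the Events array may hold several entries."""
    return [e for e in document.get("Events", []) if e.get("EventType") == "Preempt"]


def wait_for_preempt(
    fetch: Callable[[], Dict[str, Any]],
    on_preempt: Callable[[List[Event]], None],
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Event]:
    """Poll until a Preempt event shows up, run the action once, then return."""
    while True:
        try:
            events = preempt_events(fetch())
        except OSError as error:  # network errors raised by the metadata query
            print(f"metadata query failed: {error}")
            events = []
        if events:
            on_preempt(events)
            return events
        sleep(interval)


# Demo with canned responses instead of a real metadata call
responses = iter([
    {"DocumentIncarnation": 1, "Events": []},
    {"DocumentIncarnation": 2, "Events": [{"EventId": "demo-1", "EventType": "Preempt"}]},
])
handled: List[Event] = []
wait_for_preempt(lambda: next(responses), handled.extend, sleep=lambda seconds: None)
print([event["EventId"] for event in handled])  # ['demo-1']
```

On a real VM, `fetch` would be a function that queries the Scheduled Events URL with the Metadata header, and `on_preempt` would save the work in progress within the 30-second window.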

The maximum price option offers more visibility. You know how much you can spend on a solution before it goes down.

The question is what kind of solution can tolerate this. There are some basic scenarios: you need to set up a lab for a few hours, or you want to run batch processing and don't care if the VMs are down for a while.

But the applications running in spot VMs must handle unexpected interruption.

You can build a complex architecture to take advantage of the spot VMs. Imagine the situation where the application is present in three regions:

  • North Europe
  • West Europe
  • France Central

These three regions will have different spot prices, so you can deploy a workload where the price is lowest. You will need immutable VMs: the application must be deployed within the VM.
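Choosing a region by price could look like the sketch below. Since there is no API to read spot prices for the moment, the price table has to be filled in by hand; the West Europe and France Central values are invented for the example, and `cheapest_region` is a hypothetical helper.

```python
from typing import Dict, Optional


def cheapest_region(spot_prices: Dict[str, float], max_price: float) -> Optional[str]:
    """Return the region with the lowest spot price that does not exceed max_price."""
    affordable = {region: price for region, price in spot_prices.items() if price <= max_price}
    if not affordable:
        return None  # nothing fits the budget; fall back to regular instances
    return min(affordable, key=affordable.get)


# Hand-maintained prices in USD per hour; only the North Europe value comes from the article
spot_prices = {
    "North Europe": 0.0376,
    "West Europe": 0.041,     # invented value
    "France Central": 0.035,  # invented value
}

print(cheapest_region(spot_prices, max_price=0.04))
print(cheapest_region(spot_prices, max_price=0.01))
```

With these sample values, the first call selects France Central and the second returns None because no region fits the budget.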

To avoid being left in the dark, you should deploy regular instances too. You should also have a mechanism to deploy a new Spot VM with a new price when VMs start to be evicted in a region. The eviction script running on each VM can trigger an Azure Function to do that.

The function will remove the VM from an Azure Application Gateway backend pool, try to create a new Spot VM with a different price and add it to the backend pool.

Finally, the three regional Azure Application Gateways are connected through Azure Front Door to ensure global availability.

Azure Spot VM is in preview and you will not get any SLA. It's a good option if you are price sensitive and can handle an interruption.
