axurcio

Originally published at insight-services-apac.github.io

A Guide to Azure Site Recovery - Part 2

In the previous post I covered the significance of Azure Site Recovery (ASR) and the under-the-hood components that make it up. In this continuation I cover onboarding Virtual Machines to ASR and using Recovery Plans, along with Infrastructure as Code practices and their challenges.

Onboarding Virtual Machines for ASR

In a typical migration journey, ASR onboarding can be performed once all the technical and business validations are complete and signed off, and the workload is past the point of rollback. It is critical to understand and categorize the RTO and RPO levels of the Virtual Machine and the services on it before onboarding into ASR. The initial replication and subsequent snapshots begin according to your Replication policy as soon as you associate the VM with ASR.

Switching to a different Replication policy is only possible after turning off replication, which deletes all the snapshots taken under the previous policy. Hence, the initial onboarding has to be done with the right rationale. On most occasions, the recovery configuration of the VM will be the same as its primary configuration; however, with ASR this is configurable too.

What is a Recovery Plan?

Recovery plans group VMs into recovery groups through which you can plan and define your failover. The groups define the order of failover and can run tasks before or after failover. This is similar to an on-premises DR playbook maintained by the Infrastructure and Product teams. While migrating your workloads, it is essential to transform these playbooks and incorporate them into Recovery plans.

Automation using Recovery Plans

Group Actions in Recovery Plans help automate tasks, which can reduce the overall RTO. There are two types of Group Actions.

  • Manual actions are steps that need to be performed in Azure or elsewhere before or after a group of VMs is failed over or failed back. When the step is reached, a prompt with the preconfigured description, if any, lets the administrator complete the manual tasks, and the plan waits for acknowledgement.

  • Runbook actions are tasks that integrate with runbooks in an Automation account. They can run before or after a group of VMs is failed over or failed back, for example scripts on the VMs that update config files or make registry changes. Runbooks have visibility into the recovery process through the Recovery Plan context that ASR passes to them.

Sample Recovery Plan Context Passed to a Runbook

{
    "RecoveryPlanName":"Test-RecoveryPlan",
    "FailoverType":"Test",
    "FailoverDirection":"PrimaryToSecondary",
    "GroupId":"Group2",
    "VmMap":{
        "d8daf0e6-34a7-4608-b09d-6a3251fe5ac5":{
            "SubscriptionId":"nnnnnn-nnnnn-nnnnn",
            "ResourceGroupName":"yyy-yyy-yyy-yy",
            "CloudServiceName":null,
            "RoleName":"VM-Name",
            "RecoveryPointId":"a53eea11-4e14-462a-b5ec-e18e455dada5",
            "RecoveryPointTime":"/Date(1636597591395)/"
        }
    }
}
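
As a rough illustration of how a runbook might read this context, the sketch below parses a JSON string shaped like the sample and lists the VMs in the group being failed over. In a real runbook ASR supplies the context as a parameter; here it is a hard-coded string, and the helper name Get-ContextVm and the test-failover branching are assumptions made for illustration.

```powershell
# Hypothetical helper: flattens the VmMap of a recovery plan context into one object per VM.
function Get-ContextVm {
    param(
        [Parameter(Mandatory = $true)]
        [object] $RecoveryPlanContext
    )

    # VmMap is keyed by an ID, so enumerate its properties rather than indexing by name.
    foreach ($entry in $RecoveryPlanContext.VmMap.PSObject.Properties) {
        [PSCustomObject]@{
            VmMapId           = $entry.Name
            VmName            = $entry.Value.RoleName
            ResourceGroupName = $entry.Value.ResourceGroupName
        }
    }
}

# Stand-in for the context ASR would supply; values copied from the sample.
$sampleContext = @'
{
    "RecoveryPlanName": "Test-RecoveryPlan",
    "FailoverType": "Test",
    "FailoverDirection": "PrimaryToSecondary",
    "GroupId": "Group2",
    "VmMap": {
        "d8daf0e6-34a7-4608-b09d-6a3251fe5ac5": {
            "ResourceGroupName": "yyy-yyy-yyy-yy",
            "RoleName": "VM-Name"
        }
    }
}
'@ | ConvertFrom-Json

$contextVms = @(Get-ContextVm -RecoveryPlanContext $sampleContext)

# Illustrative branching only: a runbook may want to skip disruptive steps in a test failover.
$isTestFailover = $sampleContext.FailoverType -eq 'Test'

foreach ($vm in $contextVms) {
    Write-Output ("{0}: {1} in {2} (test failover: {3})" -f $sampleContext.GroupId, $vm.VmName, $vm.ResourceGroupName, $isTestFailover)
}
```

Enumerating the VmMap properties keeps the runbook independent of the IDs, which differ for every protected VM.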


ASR via Infrastructure as Code

The Recovery Services Vault can be a single pane of control for ASR; however, under the hood there are plenty of components to configure when it is built and maintained via code. Typically the Recovery Services Vault configuration is part of Azure Foundations, which is the best place to configure it.

| Resource | Provider | Significance |
| --- | --- | --- |
| Replication Policy | Microsoft.RecoveryServices/vaults/replicationPolicies | Configuration that details the frequency of recovery snapshots and the retention of those snapshots. |
| Replication Fabric | Microsoft.RecoveryServices/vaults/replicationFabrics | Source and Target Regions are represented as Fabrics. |
| Replication Protection Container | Microsoft.RecoveryServices/vaults/replicationFabrics/replicationProtectionContainers | Logical containers underneath a Fabric to group Virtual Machines for Source and Target regions. |
| Replication Protection Container Mappings | Microsoft.RecoveryServices/vaults/replicationFabrics/replicationProtectionContainers/replicationProtectionContainerMappings | Associates the Protection Containers with a Replication Policy. Ideally this has to be performed for every replication policy we intend to use. |
| Replication Network Mappings | Microsoft.RecoveryServices/vaults/replicationFabrics/replicationNetworks/replicationNetworkMappings | Maps the Source and Target Networks and vice versa. |

ASR Onboarding

Some common pain points in ASR onboarding are:

  • Configuration such as disks, specification and Resource Groups can vary for every virtual machine.
  • Complex parameter files are hard to maintain.
  • If extensive parameters are not supplied, the logic to fetch the details from the VMs has to be built in ARM, Bicep or Terraform, which is complex to write and maintain.
  • Maintaining the source repository. Most foundations code includes the Recovery Services vault, and teams generally do not mix non-foundations components into it. The recommendation is to maintain the ASR components, excluding the foundations, in a separate repository.

To solve these pain points, I found that using an Azure PowerShell script as a wrapper can be highly beneficial. This preprocessing logic dynamically fetches the VM details and enables replication for the chosen VMs.

  • A simple CSV file listing the Virtual Machines can be used as input to this preprocessing logic.
  • CSVs are easy to configure and maintain.
  • PowerShell integrates well with the Azure providers.
  • Logic can be developed to read the specifications and disk details and to formulate the details required for enabling replication.
  • Finally, this preprocessing logic can either produce a complex parameter JSON file for deployment via ARM templates or Bicep, or run Azure cmdlets to enable replication directly.

Virtual Machines CSV Layouts

| Column Name | Description |
| --- | --- |
| vmName | Name of the Virtual Machine |
| replicationPolicy | Name of the Replication Policy - Platinum, Gold, Silver, Bronze, NonProd depending upon RTO and RPO |
| resourceGroup | Name of Resource Group |
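
Because every later step depends on this file, it can help to check each row before any replication is enabled. The following is a minimal sketch of such a check, assuming the three columns and the policy names listed in the layout; the function name Test-OnboardingRow and the sample rows are hypothetical.

```powershell
# Hypothetical validator for the onboarding CSV; column and tier names follow the layout above.
function Test-OnboardingRow {
    param(
        [Parameter(Mandatory = $true)]
        [object] $Row
    )

    $requiredColumns = 'vmName', 'replicationPolicy', 'resourceGroup'
    $knownPolicies   = 'Platinum', 'Gold', 'Silver', 'Bronze', 'NonProd'
    $problems        = @()

    foreach ($column in $requiredColumns) {
        # A missing column and an empty cell both count as null or whitespace here.
        if ([string]::IsNullOrWhiteSpace($Row.$column)) {
            $problems += "missing $column"
        }
    }

    if ($Row.replicationPolicy -and ($knownPolicies -notcontains $Row.replicationPolicy)) {
        $problems += "unknown replication policy '$($Row.replicationPolicy)'"
    }

    # Emit the VM name with any problems found; an empty list means the row is usable.
    [PSCustomObject]@{
        VmName   = $Row.vmName
        IsValid  = ($problems.Count -eq 0)
        Problems = $problems
    }
}

# Inline sample data standing in for Import-Csv $vmCsvPath.
$sampleRows = @'
vmName,replicationPolicy,resourceGroup
app-vm-01,Gold,rg-app
db-vm-01,Diamond,rg-data
web-vm-01,Silver,
'@ | ConvertFrom-Csv

$validation = @($sampleRows | ForEach-Object { Test-OnboardingRow -Row $_ })

$validation | Where-Object { -not $_.IsValid } | ForEach-Object {
    Write-Output ("{0}: {1}" -f $_.VmName, ($_.Problems -join '; '))
}
```

Failing fast on a bad row is cheaper than discovering a wrong policy after the initial replication has already started, since changing the policy later means turning replication off.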

ASR Onboarding Snippet

  #Import CSV from vmCsvPath
  $vmCsv = Import-Csv $vmCsvPath
  $enableReplicationJobs = New-Object System.Collections.ArrayList

  #Enable Replication for Each VM
  #Enable-Replication is a custom wrapper function; the fragments further down appear to come from its body
  foreach ($vm in $vmCsv)
  {
   Write-Output ("Processing VM: "+$vm.vmName)
   $vmName = $vm.vmName
   $sourceResourceGroup = $vm.resourceGroup
   $replicationPolicy = $vm.replicationPolicy

   Enable-Replication -vmName $vmName -replicationPolicy $replicationPolicy -sourceRg $sourceResourceGroup -targetRg $asrResourceGroup -rsvVault $rsvVault
  }

  #Assumed setup inside the wrapper (not part of the original fragments):
  #fetch the VM details and start with an empty disk list
  $vmDetails = Get-AzVM -ResourceGroupName $sourceRg -Name $vmName
  $diskList = New-Object System.Collections.ArrayList

  #Adding Os Disk
  $osDisk = New-AzRecoveryServicesAsrAzureToAzureDiskReplicationConfig -DiskId $vmDetails.StorageProfile.OsDisk.ManagedDisk.Id `
            -LogStorageAccountId $primaryASRStorageAccountId -ManagedDisk -RecoveryReplicaDiskAccountType $vmDetails.StorageProfile.OsDisk.ManagedDisk.StorageAccountType `
            -RecoveryResourceGroupId $targetResourceGroupId -RecoveryTargetDiskAccountType $vmDetails.StorageProfile.OsDisk.ManagedDisk.StorageAccountType
  #The OS disk must be in the list too, otherwise only data disks are replicated
  $rc = $diskList.Add($osDisk)

  #Adding Data Disk
  foreach($dataDisk in $vmDetails.StorageProfile.DataDisks)
  {
    Write-Output "Adding Data disks for Replication"
    $disk = New-AzRecoveryServicesAsrAzureToAzureDiskReplicationConfig -DiskId $dataDisk.ManagedDisk.Id `
             -LogStorageAccountId $primaryASRStorageAccountId -ManagedDisk -RecoveryReplicaDiskAccountType $dataDisk.ManagedDisk.StorageAccountType `
             -RecoveryResourceGroupId $targetResourceGroupId -RecoveryTargetDiskAccountType $dataDisk.ManagedDisk.StorageAccountType
    $rc = $diskList.Add($disk)
  }

  #Enabling Replication (no whitespace may follow a line-continuation backtick)
  $job = New-AzRecoveryServicesAsrReplicationProtectedItem -AzureToAzure -Name $vmName -RecoveryVmName $vmName -ProtectionContainerMapping $primaryProtectionContainerMapping `
         -AzureVmId $vmDetails.ID -AzureToAzureDiskReplicationConfiguration $diskList -RecoveryResourceGroupId $targetResourceGroupId `
         -RecoveryAzureSubnetName $targetSubnetName -RecoveryAzureNetworkId $targetVirtualNetworkId


Recovery Plans

Recovery plans can be simple or complex depending on your Virtual Machine footprint and your appetite for automation. A complex recovery plan can have multiple groups, each comprising a set of Virtual Machines. Each group can have pre and post actions that are either manual tasks or tasks automated with the help of runbooks.

All this can be overwhelming if the configuration is maintained as parameters in JSON files. As with the pain points in ASR onboarding, achieving the right balance between logic and flexibility in parameters is challenging.

Preprocessing logic in Azure PowerShell can again be used to consume simple parameters in the form of CSV files and build the complex JSON required for ARM template deployment.

  • The script iterates over the VM and Recovery plan CSV files for the recovery plan being processed.
  • It identifies the grouping of the VMs and builds the groups by referencing the pre and post actions in the group actions CSV file.
  • Finally, it builds a JSON parameter file that can be used to deploy via ARM template or Bicep.

Recovery Plan CSV Layouts

| Column Name | Description |
| --- | --- |
| vmName | Name of the Virtual Machine |
| recoveryPlan | Name of the Recovery Plan |
| group | Group in Recovery Plan - 1, 2, 3 etc. |

Group Action CSV Layouts

| Column Name | Description |
| --- | --- |
| recoveryPlan | Name of the Recovery Plan |
| group | Group in Recovery Plan - 1, 2, 3 etc. |
| startAction | Type of Start Action - Manual, Runbook |
| startActionName | Name of the Start Action |
| startActionDescription | Description of Start Action |
| endAction | Type of End Action - Manual, Runbook |
| endActionName | Name of the End Action |
| endActionDescription | Description of End Action |
| failoverType | Type of Failover - TestFailover, PlannedFailover |
| failoverDirections | Direction of Failover - PrimaryToRecovery, RecoveryToPrimary |
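
To make the relationship between the two files concrete, the sketch below groups VM rows by recovery plan and group number and looks up the matching start action for each group. It is one possible way to do the join, under the assumption that the recoveryPlan and group columns are the shared key; the sample data and variable names are made up.

```powershell
# Inline sample data standing in for the two CSV files; all names are made up.
$vmRows = @'
vmName,recoveryPlan,group
db-vm-01,Payments,1
app-vm-01,Payments,2
app-vm-02,Payments,2
'@ | ConvertFrom-Csv

$actionRows = @'
recoveryPlan,group,startAction,startActionName,startActionDescription
Payments,2,Manual,ConfirmDb,Confirm the database is online
'@ | ConvertFrom-Csv

# Build one entry per (plan, group) pair, ordered by group number.
$planGroups = @(
    $vmRows |
        Group-Object -Property recoveryPlan, group |
        ForEach-Object {
            $first = $_.Group[0]

            # Find the matching group action row, if the actions CSV has one.
            $action = $actionRows | Where-Object {
                $_.recoveryPlan -eq $first.recoveryPlan -and $_.group -eq $first.group
            } | Select-Object -First 1

            [PSCustomObject]@{
                RecoveryPlan = $first.recoveryPlan
                Group        = [int] $first.group
                VmNames      = @($_.Group.vmName)
                StartAction  = if ($action) { $action.startAction } else { $null }
            }
        } |
        Sort-Object -Property RecoveryPlan, Group
)

$planGroups | ForEach-Object {
    Write-Output ("{0} group {1}: {2}" -f $_.RecoveryPlan, $_.Group, ($_.VmNames -join ', '))
}
```

Casting the group number to an integer before sorting keeps group 10 after group 2, which a plain string sort would not.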

Recovery Plan Snippet


        #Look up the primary fabric and container once, outside the loop
        $primaryFabric = Get-AzRecoveryServicesAsrFabric | Where-Object {$_.FabricSpecificDetails.Location -like $primaryRegion}
        $primaryContainer = Get-AzRecoveryServicesAsrProtectionContainer -Name $PrimaryContainerName -Fabric $primaryFabric

        #Create Recovery Protected Items Array
        $replicationProtArray = $vmSubset | ForEach-Object {
            $protDetails = Get-AzRecoveryServicesAsrReplicationProtectedItem -Name $_.vmName -ProtectionContainer $primaryContainer
            $vmDetails = Get-AzVM -ResourceGroupName $_.resourceGroup -Name $_.vmName
            [PSCustomObject]@{
                id = $protDetails.Id
                virtualMachineId = $vmDetails.Id
            }
        }

        ###Start Group Action
        #Logic to transform to Manual action
        if ($action.startAction -eq 'Manual')
        {
            $startCustomDetails= [PSCustomObject]@{
                instanceType = 'ManualActionDetails'
                description = $action.startActionDescription
            }
            $finalStartAction = [PSCustomObject]@{
                actionName = $action.startActionName
                failoverTypes = [string[]] (Split-StringObject $action.failoverType)  
                failoverDirections = [string[]] (Split-StringObject $action.failoverDirections)
                customDetails = $startCustomDetails 
            }         

        }
        #Logic to transform to Runbook action 
        elseif( $action.startAction -eq 'Runbook') {
            $runbookName = $action.startActionName
            $runbookId = ($automationAccountId+"/runbooks/"+$runbookName)
            $startCustomDetails = [PSCustomObject]@{
                instanceType = 'AutomationRunbookActionDetails'
                runbookId = $runbookId
                description = $action.startActionDescription
                fabricLocation = 'Primary'
            }
            $finalStartAction = [PSCustomObject]@{
                actionName = $action.startActionName
                failoverTypes = [string[]] (Split-StringObject $action.failoverType)  
                failoverDirections = [string[]] (Split-StringObject $action.failoverDirections)
                customDetails = $startCustomDetails 
            }    
        }
        elseif( !$action ) {
            $finalStartAction = @()            
        }

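        #Create Recovery Group Array
        #($finalEndAction is not built in this excerpt; presumably it is derived from the endAction columns like the start action)
        $recoveryGroups = [PSCustomObject]@{
                groupType = "Boot"
                replicationProtectedItems = [array] $replicationProtArray
                startGroupActions = [array] $finalStartAction
                endGroupActions = [array] $finalEndAction
        }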

        #create Recovery Plan Finalized Param file with Array
        $recoveryPlanfile.parameters.recoveryVaultName.value = $rsvVault

        $recoveryPlanfile.parameters.recoveryPlanName.value = "RecoveryPlan-$recoveryPlan"

        $recoveryPlanfile.parameters.recoveryGroups.value = [array] $recoveryGroupsArray

        #Convert to Json Parameters
        $recoveryPlanJson = ConvertTo-Json -InputObject $recoveryPlanfile -Depth 10

        #Use one quoted path for both writing and deploying the parameter file
        $parameterFilePath = "$baseTemplatePath\RecoveryPlan-$recoveryPlan.parameters.json"
        $recoveryPlanJson | Set-Content -Path $parameterFilePath

        $DeploymentInputs = @{
                     Name = "RecoveryPlan-$recoveryPlan-$(Get-Date -Format yyyyMMdd)"
                     TemplateFile = $armTemplateFile
                     TemplateParameterFile = $parameterFilePath
                     Verbose = $true
                     ErrorAction = "Stop"
                   }

        New-AzResourceGroupDeployment @DeploymentInputs -ResourceGroupName $rsvRg
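
The snippet calls a helper, Split-StringObject, whose body is not shown. One plausible reading is that it splits a multi-valued CSV cell, such as the failoverType column, into separate values. The sketch below is a guess at that behaviour and not the author's implementation; the choice of comma and semicolon as separators is an assumption.

```powershell
# Hypothetical stand-in for the Split-StringObject helper, whose real body is not shown.
# Assumes multi-valued cells are separated by commas or semicolons.
function Split-StringObject {
    param(
        [Parameter(Position = 0)]
        [string] $Value
    )

    if ([string]::IsNullOrWhiteSpace($Value)) {
        return @()
    }

    # Split on either separator, trim each piece, and drop empty entries.
    $Value -split '[,;]' |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_ -ne '' }
}

$failoverTypes = [string[]] (Split-StringObject 'TestFailover; PlannedFailover')
$singleType    = [string[]] (Split-StringObject 'TestFailover')
$noTypes       = @(Split-StringObject '')

# The [string[]] cast keeps a single value as a JSON array rather than a bare string.
$singleTypeJson = ConvertTo-Json -InputObject $singleType -Compress
Write-Output $singleTypeJson
```

The cast matters because the recovery plan parameters expect arrays for failoverTypes and failoverDirections even when only one value is configured.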


This completes the onboarding of Virtual Machines into ASR and Recovery Plans; they are now ready for failover as part of DR drills or a real-life disaster. In the next post in this series, I will explain the failover scenarios and steps, along with day-to-day operations in Azure Site Recovery.
