
Todor Todorov for AWS Community Builders

Spinning up EC2 instance part of ASG faster - mission impossible? Or maybe not?!

We all know the serverless options out there: lightweight Docker images launched in seconds, or Lambda functions triggered almost instantly.
However, as much as I am a fan of serverless solutions such as ECS and EKS on Fargate, Lambda, etc., serverless is clearly not a panacea and cannot cover every use case.
Like every technology, EC2 is no exception: it has its advantages, and it has bottlenecks to consider too.
EC2 is in essence a virtual machine with a full-sized operating system, so we cannot escape the boot time factor. At the same time, AWS is a leader that pays attention to its customers' needs.
This is why, back in 2019, about a decade after EBS appeared, AWS presented a feature called EBS Fast Snapshot Restore (FSR). This feature lets us use a snapshot to create a fresh EBS volume with up to 16 TiB of space and 64K IOPS.
The main benefit for us is that a volume created from an FSR-enabled snapshot is fully initialized at creation, so its blocks are not lazily loaded on first access and the EC2 instance boots faster.

Use Case

Speed up the boot process of a new instance that is part of an Auto Scaling group (ASG), in a cost-effective way.

Question

Could I benefit from the FSR feature and at the same time avoid any potential extra cost?

Solution

Fast Snapshot Restore is not free of charge, so we need to handle its usage with care, just as we do with every other service that powers our business. First, we should know that it costs $0.75 for each hour that Fast Snapshot Restore is enabled for a snapshot in a particular Availability Zone. Second, we need to know when we actually benefit from the feature, and the answer is simple: during boot time only (once the instance is operational, we can disable FSR until the next time we need it). So far so good, but outages come with no upfront notice and we should be ready, which is why some automation will give us a hand here. If you are using an ASG, you are most probably aware of lifecycle hooks, in other words the opportunity to pause the launch or termination of an EC2 instance and trigger some actions before letting it proceed.
Here is a small diagram that will make the situation clear:

[Image: diagram of the solution]
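
To put the pricing mentioned above into numbers, here is a rough back-of-the-envelope sketch. It uses the $0.75 hourly rate quoted in this post; the 730-hour month, the 50-minute enablement window, and the one-hour minimum billing period are assumptions for illustration and should be checked against the current EBS pricing page. The function names are hypothetical.

```python
HOURLY_RATE_USD = 0.75  # per snapshot, per Availability Zone (figure quoted in this post)


def always_on_cost(hours_per_month=730, snapshots=1, zones=1):
    """Monthly cost if FSR stays enabled all the time (730 h/month is an assumption)."""
    return HOURLY_RATE_USD * hours_per_month * snapshots * zones


def on_demand_cost(launch_events, enabled_minutes=50, min_billed_minutes=60,
                   snapshots=1, zones=1):
    """Monthly cost if FSR is enabled only around each launch event.

    Assumes launch events do not overlap and that each enablement is billed
    for at least `min_billed_minutes` (an assumption to verify).
    """
    billed_hours = max(enabled_minutes, min_billed_minutes) / 60
    return HOURLY_RATE_USD * billed_hours * launch_events * snapshots * zones
```

Under these assumptions, `always_on_cost()` gives 547.5 (USD per month for one snapshot in one AZ), while `on_demand_cost(10)` gives 7.5 for ten separate launch events. The gap is the whole motivation for enabling FSR only around launches.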

The CloudFormation template that deploys the Step Function looks like this:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Enable/Disable FSR on newly launched instance in AutoScaling Group

Parameters:

  Asg:
    Description: 'Name of the Auto Scaling group whose instance launches should trigger FSR'
    Type: String
    Default: MyASG

  LambdaManageFrsZipLocation:
    Description: 'S3 key of the zip file with the Python Lambda function that manages FSR'
    Type: String
    Default: 'FRS.zip'

  LambdaStartStepFunctionZipLocation:
    Description: 'S3 key of the zip file with the Python Lambda function that starts the Step Function'
    Type: String
    Default: 'StartStepFunction.zip'

  LambdaCompleteLifecycleHookZipLocation:
    Description: 'S3 key of the zip file with the Python Lambda function that completes the lifecycle hook'
    Type: String
    Default: 'CompleteLifecycleHook.zip'

  Region:
    Description: 'Operational Region'
    Type: String
    Default: 'us-east-1'

  Environment:
    Description: 'Environment Name'
    Type: String
    Default: 'Prod'

  ActivateLifecycleWhenLaunchOrch:
    Description: 'Should FSR be activated when an instance is launching (Y/N)?'
    Type: String
    Default: Y
    AllowedValues:
      - Y
      - N

#---------------------------------------------------------------------------------------------------------------------------------------------------------
Conditions:
  ActivateLifecycleConditionStart: !Equals [!Ref 'ActivateLifecycleWhenLaunchOrch', 'Y']
#---------------------------------------------------------------------------------------------------------------------------------------------------------

Resources:
  myStepFunctionRole:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: !Sub 'StepFunctionRole-${Environment}-${AWS::Region}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: states.amazonaws.com
            Action: 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: ManageAmiStepFunctionPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Sid: VisualEditor0
                Effect: Allow
                Action:
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
                Resource: 'arn:aws:logs:*:*:*'
              - Sid: VisualEditor1
                Effect: Allow
                Action:
                  - 'lambda:InvokeFunction'
                Resource: '*'
              - Sid: SomeNewSid
                Effect: Allow
                Action: 'logs:CreateLogGroup'
                Resource: '*'

  myLambdaManageAmiRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Manage AMI'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'ManageFrsLambdaRole_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - logs:CreateLogGroup
              - logs:CreateLogStream
              - logs:PutLogEvents
              Resource: '*'
        - PolicyName: allowEC2
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - ec2:DescribeImages
              - ec2:CreateTags
              - ec2:DescribeFastSnapshotRestores
              - ec2:DisableFastSnapshotRestores
              - ec2:EnableFastSnapshotRestores
              Resource: '*'
        - PolicyName: allowDescribeAutoScaling
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - autoscaling:DescribeAutoScalingGroups
              Resource: '*'
        - PolicyName: allowIamPassRole
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - iam:PassRole
              - iam:ListAccessKeys
              Resource: !Sub "arn:aws:iam::${AWS::AccountId}:role/*"
        - PolicyName: allowKMS
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - kms:ReEncrypt*
              - kms:GenerateDataKey*
              - kms:CreateGrant
              - kms:DescribeKey*
              - kms:ListKeys
              - kms:ListAliases
              Resource: '*'
        - PolicyName: allowSSMCommands
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - ssm:SendCommand
              - ssm:GetCommandInvocation
              Resource: '*'

  myLambdatoStartStepFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Start Step Function'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'StartStepFunctionLambdaRole_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - logs:CreateLogGroup
              - logs:CreateLogStream
              - logs:PutLogEvents
              Resource: '*'
        - PolicyName: allowStates
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - states:StartExecution
              Resource: 
              - !GetAtt myFastRestoreStepFunction.Arn
        - PolicyName: allowSNS
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Resource: "*"
              Action:
              - sns:Publish
        - PolicyName: allowASG
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Resource: "*"
              Action:
              - autoscaling:CompleteLifecycleAction

  myLifecycleHookRole:
    Type: "AWS::IAM::Role"
    Properties:
        AssumeRolePolicyDocument:
            Version: "2012-10-17"
            Statement:
              -
                Effect: "Allow"
                Action:
                  - "sts:AssumeRole"
                Principal:
                    Service:
                      - "autoscaling.amazonaws.com"
        Path: /
        ManagedPolicyArns:
          - arn:aws:iam::aws:policy/service-role/AutoScalingNotificationAccessRole

  myLambdatoCompleteLifecycleHookRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Complete Lifecycle Hook'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'CompleteLifecycleHookLambdaRole_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - logs:CreateLogGroup
              - logs:CreateLogStream
              - logs:PutLogEvents
              Resource: '*'

        - PolicyName: allowASG
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Resource: "*"
              Action:
              - autoscaling:CompleteLifecycleAction

  myLambdatoCompleteLifecycleHook:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: !Sub 'cft-templates.${AWS::AccountId}.${AWS::Region}'
        S3Key: !Ref LambdaCompleteLifecycleHookZipLocation
      Description: 'Complete Lifecycle Hook'
      FunctionName: !Sub 'CompleteLifecycleHookLambda_${Environment}'
      Handler: CompleteLifecycleHook.lambda_handler
      Role: !GetAtt myLambdatoCompleteLifecycleHookRole.Arn
      Runtime: 'python3.8'
      Timeout: 300

  myFastRestoreStepFunction:
    Type: AWS::Serverless::StateMachine
    Properties:
      Name: !Sub 'ManageFastRestoreStepFunction_${Environment}'
      Type: STANDARD
      Role: !GetAtt myStepFunctionRole.Arn
      Definition:
        #Comment: Step Function to Enable FastRestore
        StartAt: Search for EC2 Id in ASG #Started by Lifecycle Hook 1 ?
        States:

          Search for EC2 Id in ASG:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'getASGInstanceId'
            Next: Get AMI ID

          Get AMI ID:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'getInstanceImageId'
            Next: Enable FastSnapshotRestores

          Enable FastSnapshotRestores:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'enableFastRestore'
            Next: Check FastRestore

          Test FastRestore:
            Type: Choice
            Choices:
            - Variable: '$.status'
              StringEquals: enabled
              Next: Parallel
            - Not:
                Variable: '$.status'
                StringEquals: enabled
              Next: Wait 20s Fast

          Wait 20s Fast:
            Type: Wait
            Seconds: 20
            Next: Check FastRestore

          Check FastRestore:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'showFastRestore'
            Next: Test FastRestore
          Parallel:
            Type: Parallel
            Branches:
              [
              {
              "StartAt": "Wait for Disable FastRestore",
              "States": {
                "Wait for Disable FastRestore": {
                  "Seconds": 3000,
                  "Type": "Wait",
                  "Next": "Disable FastSnapshotRestores"
                },
                "Disable FastSnapshotRestores": {
                  "Next": "success",
                  "OutputPath": "$.Payload",
                  "Parameters": {
                    "FunctionName": !GetAtt myLambdatoManageFrs.Arn,
                    "Payload": {
                      "AMIid.$": "$.AMIid",
                      "ASGName.$": "$.ASGName",
                      "cmd": "disableFastRestore",
                      "instance.$": "$.instance",
                      "lifeCycleHook.$": "$.lifeCycleHook",
                      "region.$": "$.region",
                      "status.$": "$.status"

                    }
                  },
                  "Resource": "arn:aws:states:::lambda:invoke",
                  "Type": "Task"
                },
                "success": {
                  "Type": "Pass",
                  "End": true
                }
              }
              },
              {
              "StartAt": "Complete Lifecycle Hook",
              "States": {
                "Complete Lifecycle Hook": {
                  "OutputPath": "$.Payload",
                  "Parameters": {
                    "FunctionName": !GetAtt myLambdatoCompleteLifecycleHook.Arn,
                    "Payload": {
                      "AMIid": "null",
                      "ASGName.$": "$.ASGName",
                      "cmd": "null",
                      "instance.$": "$.instance",
                      "lifeCycleHook.$": "$.lifeCycleHook",
                      "region": "null",
                      "status": "null"
                    }
                  },
                  "Resource": "arn:aws:states:::lambda:invoke",
                  "Type": "Task",
                  "Catch": [
                    {
                      "ErrorEquals": [
                        "States.ALL"
                      ],
                      "Next": "handle failure",
                      "ResultPath": "$.error"
                    }
                  ],
                  "End": true
                },
                "handle failure": {
                  "Type": "Pass",
                  "End": true
                }
              }
              }
              ]
            "End": true


  #Lambda to trigger the Step Function dedicated to enable FastRestore Snapshot on new EC2 instance launch in the ASG
  myLambdatoStartFastRestoreStepFunction:
    Type: AWS::Lambda::Function
    Properties:
      Environment:
        Variables:
          STATE_MACHINE_ARN: !GetAtt myFastRestoreStepFunction.Arn
      Code:
        S3Bucket: !Sub 'cft-templates.${AWS::AccountId}.${AWS::Region}'
        S3Key: !Ref LambdaStartStepFunctionZipLocation
      Description: 'Launch Step Function handling FSR enable/disable on EC2 launch'
      FunctionName: !Sub 'StartFastRestoreStepFunctionLambda_${Environment}'
      Handler: StartStepFunction.lambda_handler
      Role: !GetAtt myLambdatoStartStepFunctionRole.Arn
      Runtime: 'python3.8'
      Timeout: 300

  # Lifecycle hook to trigger the Enable/Disable of FSR when an instance is launched in the ASG
  myStartLifecycleHookTopic:
    Type: AWS::SNS::Topic

  myStartLifecycleHookSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Endpoint: !GetAtt myLambdatoStartFastRestoreStepFunction.Arn
      Protocol: "lambda"
      TopicArn: !Ref myStartLifecycleHookTopic

  myStartPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: "lambda:InvokeFunction"
      FunctionName: !GetAtt myLambdatoStartFastRestoreStepFunction.Arn
      Principal: sns.amazonaws.com
      SourceArn: !Ref myStartLifecycleHookTopic

  myStartLifecycleHookASG:
    Type: AWS::AutoScaling::LifecycleHook
    Condition: ActivateLifecycleConditionStart
    Properties:
      AutoScalingGroupName: !Ref Asg
      LifecycleTransition: "autoscaling:EC2_INSTANCE_LAUNCHING"
      DefaultResult: CONTINUE
      HeartbeatTimeout: 600
      NotificationMetadata: !Sub |-
        {
          "lifeCycleHook": "null",
          "ASGName": "${Asg}",
          "instance": "null",
          "region": "${Region}",
          "status": "null",
          "AMIid": "null",
          "cmd": "null"
        }
      NotificationTargetARN:
        Ref: myStartLifecycleHookTopic
      RoleARN: !GetAtt myLifecycleHookRole.Arn

  myLambdatoManageFrs:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: !Sub 'cft-templates.${AWS::AccountId}.${AWS::Region}'
        S3Key: !Ref LambdaManageFrsZipLocation
      Description: 'Manage Fast Snapshot Restore'
      FunctionName: !Sub 'ManageFRS_${Environment}'
      Handler: FRS.lambda_handler
      Role: !GetAtt myLambdaManageFrsRole.Arn
      Runtime: 'python3.8'
      Timeout: 300

  myLambdaManageFrsRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Manage FRS'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'ManageFRS_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
        - Effect: Allow
          Principal:
            Service:
            - lambda.amazonaws.com
          Action:
          - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - logs:CreateLogGroup
              - logs:CreateLogStream
              - logs:PutLogEvents
              Resource: '*'
        - PolicyName: allowEC2
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - ec2:CreateTags
              - ec2:Describe*
              - ec2:DisableFastSnapshotRestores
              - ec2:EnableFastSnapshotRestores
              Resource: '*'
        - PolicyName: allowDescribeAutoScaling
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - autoscaling:DescribeAutoScalingGroups
              Resource: '*'
        - PolicyName: allowIamPassRole
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
            - Effect: Allow
              Action:
              - iam:PassRole
              - iam:ListAccessKeys
              Resource: !Sub "arn:aws:iam::${AWS::AccountId}:role/*"

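
The "Check FastRestore", "Test FastRestore" and "Wait 20s Fast" states in the template form a polling loop. Below is a plain-Python sketch of the same control flow, meant only to make the loop easier to read. The `get_state` callable is a hypothetical stand-in for the `showFastRestore` command, and the `max_attempts` guard is an addition that the state machine above does not have.

```python
import time


def wait_until_enabled(get_state, wait_seconds=20, max_attempts=30, sleep=time.sleep):
    """Poll get_state() until it returns 'enabled'.

    Returns the number of polls that were needed; raises TimeoutError if the
    state never becomes 'enabled' within max_attempts polls.
    """
    for attempt in range(1, max_attempts + 1):
        if get_state() == "enabled":   # the "Test FastRestore" choice
            return attempt
        sleep(wait_seconds)            # the "Wait 20s Fast" state
    raise TimeoutError(f"FSR not enabled after {max_attempts} polls")
```

An FSR-enabled snapshot moves through the states `enabling`, `optimizing` and `enabled`, so the loop usually needs several rounds. Passing `sleep` as a parameter keeps the sketch easy to test without actually waiting.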

We will have several Lambda Functions in order to cover the logic:

FRS Lambda:

import boto3
from botocore.config import Config
import json
from datetime import tzinfo, timedelta, datetime
import time
import sys
import os

# --------------------------------------------------------------------------------------------------------------------------------
# To display debug messages (True or False)
DEBUG = True

def message(msg):
    if DEBUG:
        print(msg)
    return

def error(err, msg ):
    raise Exception (f"{msg} --- err = {err}")

def getASGInstanceId(asgClient, asgName):
    try:
        message("---- getInstanceId")

        # get object for the ASG we're going to update, filter by name of target ASG
        response = asgClient.describe_auto_scaling_groups(AutoScalingGroupNames=[asgName])

        if not response['AutoScalingGroups']:
            err = f"## No such ASG '{asgName}'"
            message(err)
            raise RuntimeError(err)

        # take the last instance listed in the ASG; its AMI is the one we manage FSR for
        instanceId = response.get('AutoScalingGroups')[0]['Instances'][-1]['InstanceId']

    except Exception as e:
        message(f"==== e = {e}")
        return(error(e, f"## Failed to get ASG Instance Id of ASG '{asgName}'"))

    message(f"InstanceId = {instanceId}")
    return instanceId


def getInstanceImageId(instanceId):
    # Look up the AMI that the given instance was launched from
    ec2_client = boto3.client('ec2')
    ec2_response = ec2_client.describe_instances(InstanceIds=[instanceId])
    AMIid = None
    for reservation in ec2_response['Reservations']:
        for instance in reservation['Instances']:
            AMIid = instance['ImageId']
            print(AMIid)
    return AMIid


def enableFastRestore(AMIid,asgName):
    ec2 = boto3.client('ec2')
    ec2response = ec2.describe_images(
        ImageIds=[
            AMIid
        ],
    )

    asg  = boto3.client('autoscaling')
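    # You may need to edit the filter here for your use case
    snapshotid = None
    for mapping in ec2response['Images'][0]['BlockDeviceMappings']:
        # instance-store mappings have no 'Ebs' key, so skip them
        ebs = mapping.get('Ebs')
        if ebs and 'SnapshotId' in ebs and ebs.get('VolumeSize', 0) < 40:
            snapshotid = ebs['SnapshotId']
    if snapshotid is None:
        raise RuntimeError(f"No EBS snapshot under 40 GiB found for AMI '{AMIid}'")
    print(snapshotid)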
    asgresponse = asg.describe_auto_scaling_groups(
                AutoScalingGroupNames=[
                            asgName,
                                ]
                        )

    print(asgresponse['AutoScalingGroups'][0]['AvailabilityZones'][0])
    az=asgresponse['AutoScalingGroups'][0]['AvailabilityZones'][0]
    try:
        fsrresponse = ec2.enable_fast_snapshot_restores(
                   AvailabilityZones=[
                         az,
                   ],
                   SourceSnapshotIds=[
                         snapshotid,
                   ]
        )
    except Exception as e:
        message(f"==== e = {e}")
        return(error(e, f"## Failed to Enable Fast Restore "))
    return fsrresponse

# deactivate FastRestore Snapshot

def disableFastRestore(AMIid,asgName):
    ec2 = boto3.client('ec2')
    ec2response = ec2.describe_images(
        ImageIds=[
            AMIid
        ],
    )

    asg  = boto3.client('autoscaling')
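    # You may need to edit the filter here for your use case
    snapshotid = None
    for mapping in ec2response['Images'][0]['BlockDeviceMappings']:
        # instance-store mappings have no 'Ebs' key, so skip them
        ebs = mapping.get('Ebs')
        if ebs and 'SnapshotId' in ebs and ebs.get('VolumeSize', 0) < 40:
            snapshotid = ebs['SnapshotId']
    if snapshotid is None:
        raise RuntimeError(f"No EBS snapshot under 40 GiB found for AMI '{AMIid}'")
    print(snapshotid)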
    asgresponse = asg.describe_auto_scaling_groups(
                AutoScalingGroupNames=[
                            asgName,
                                ]
                        )

    print(asgresponse['AutoScalingGroups'][0]['AvailabilityZones'][0])
    az=asgresponse['AutoScalingGroups'][0]['AvailabilityZones'][0]
    try:
        fsrresponse = ec2.disable_fast_snapshot_restores(
                   AvailabilityZones=[
                         az,
                   ],
                   SourceSnapshotIds=[
                         snapshotid,
                   ]
        )
    except Exception as e:
        message(f"==== e = {e}")
        return(error(e, f"## Failed to Disable Fast Restore "))
    return fsrresponse


def showFastRestore(AMIid,asgName):
    ec2 = boto3.client('ec2')
    ec2response = ec2.describe_images(
        ImageIds=[
            AMIid
        ],
    )

    asg  = boto3.client('autoscaling')
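    # You may need to edit the filter here for your use case
    snapshotid = None
    for mapping in ec2response['Images'][0]['BlockDeviceMappings']:
        # instance-store mappings have no 'Ebs' key, so skip them
        ebs = mapping.get('Ebs')
        if ebs and 'SnapshotId' in ebs and ebs.get('VolumeSize', 0) < 40:
            snapshotid = ebs['SnapshotId']
    if snapshotid is None:
        raise RuntimeError(f"No EBS snapshot under 40 GiB found for AMI '{AMIid}'")
    print(snapshotid)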
    asgresponse = asg.describe_auto_scaling_groups(
                AutoScalingGroupNames=[
                            asgName,
                                ]
                        )

    print(asgresponse['AutoScalingGroups'][0]['AvailabilityZones'][0])
    az=asgresponse['AutoScalingGroups'][0]['AvailabilityZones'][0]
    try:
        fsrresponse = ec2.describe_fast_snapshot_restores(
            Filters=[
               {
                'Name': 'snapshot-id',
                'Values': [
                             snapshotid,
                       ]
                    },
            ]
        )
    except Exception as e:
        message(f"==== e = {e}")
        return(error(e, f"## Failed to describe Fast Restore state "))
    fsrstate=fsrresponse['FastSnapshotRestores'][0]['State']
    return fsrstate

def lambda_handler(event, context):
    message("Received event: " + json.dumps(event, indent=2))
#----------------
    lifeCycleHook = event['lifeCycleHook']
    asgName = event['ASGName']
    instance = event['instance']
    region_name = event['region']
    status = event['status']
    AMIid = event['AMIid']
    cmd = event['cmd']

#----------------
    autoscalingClient = boto3.client('autoscaling')

    ec2Client   = boto3.client('ec2', region_name)

    ec2Resource   = boto3.resource('ec2', region_name)

    ssmClient   = boto3.client('ssm',  region_name)

#----------------

    message(f"---- CMD = {cmd}")

    if cmd == 'getASGInstanceId':
        message("---- Get instance Id from ASG")

        instance = getASGInstanceId(autoscalingClient, asgName)
        AMIid = getInstanceImageId(instance)
    elif cmd == 'getInstanceImageId':
        instance = getASGInstanceId(autoscalingClient, asgName)
        message("---- Get Instance Image Id on instance: " + instance)
        AMIid = getInstanceImageId(instance)
    elif cmd == 'enableFastRestore':
        message("---- Enable Fast Restore")
        instance = getASGInstanceId(autoscalingClient, asgName)
        AMIid = getInstanceImageId(instance)
        response = enableFastRestore(AMIid,asgName)
    elif cmd == 'disableFastRestore':
        instance = getASGInstanceId(autoscalingClient, asgName)
        message("---- Disable Fast Restore")
        AMIid = getInstanceImageId(instance)
        response = disableFastRestore(AMIid,asgName)
    elif cmd == 'showFastRestore':
        message("---- Show Fast Restore")
        instance = getASGInstanceId(autoscalingClient, asgName)
        AMIid = getInstanceImageId(instance)
        status = showFastRestore(AMIid,asgName)

    else:
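        # no exception object exists in this branch, so pass a plain description instead of `e`
        return(error("unknown command", f"Command '{cmd}' is not recognized or planned to be managed" ))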

    result = {"lifeCycleHook": lifeCycleHook, "ASGName": asgName, "instance": instance,  "region": region_name, "status": status,  "AMIid": AMIid, "cmd": cmd}
    message("##### RETURN result :  " + json.dumps(result, indent=2))
    return result
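
Picking the snapshot from the AMI's block device mappings is the step most likely to need adapting to your own images. As a minimal sketch, the selection can be written as a pure function over the `describe_images` response so that it can be tested without calling AWS. The name `pick_snapshot_ids`, the 40 GiB default, and the sample ids are illustrative assumptions, not part of the Lambda above.

```python
def pick_snapshot_ids(describe_images_response, max_size_gib=40):
    """Return the snapshot ids of EBS mappings smaller than max_size_gib."""
    images = describe_images_response.get("Images", [])
    if not images:
        return []
    snapshot_ids = []
    for mapping in images[0].get("BlockDeviceMappings", []):
        ebs = mapping.get("Ebs")  # absent for instance-store volumes
        if not ebs or "SnapshotId" not in ebs:
            continue
        if ebs.get("VolumeSize", 0) < max_size_gib:
            snapshot_ids.append(ebs["SnapshotId"])
    return snapshot_ids


# Made-up response fragment, shaped like the output of describe_images
sample_response = {"Images": [{"BlockDeviceMappings": [
    {"DeviceName": "/dev/xvda", "Ebs": {"SnapshotId": "snap-0aaa", "VolumeSize": 8}},
    {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
    {"DeviceName": "/dev/sdf", "Ebs": {"SnapshotId": "snap-0bbb", "VolumeSize": 100}},
]}]}
```

With this sample, `pick_snapshot_ids(sample_response)` returns `["snap-0aaa"]`: the instance-store mapping is skipped and the 100 GiB volume is filtered out. Returning a list instead of a single id also makes the case of several matching volumes explicit.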

Complete Lifecycle Hook Lambda:

import boto3
from botocore.exceptions import ClientError
import os
import json


asgClient = boto3.client('autoscaling')

def lambda_handler(event, context):
    print('The Lambda function is starting.')
    print("Received event: " + json.dumps(event, indent=2))

    autoScalingGroup = event['ASGName']
    instanceId = event['instance']
    lifeCycleHook = event['lifeCycleHook']

    actionResult = "CONTINUE"

    response = asgClient.complete_lifecycle_action(
        LifecycleHookName = lifeCycleHook,
        AutoScalingGroupName = autoScalingGroup,
        LifecycleActionResult = actionResult,
        InstanceId = instanceId
    )

    print(f"Complete lifecycle hook response : {response}")
    return

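
The Lambda above always reports `CONTINUE`. As a hedged sketch, the arguments for `complete_lifecycle_action` could be assembled in a small helper that validates the event first and can report `ABANDON` when an earlier step failed. The helper name `build_completion_args` and the `fsr_ok` flag are hypothetical; whether abandoning the launch is the right reaction depends on your use case.

```python
def build_completion_args(event, fsr_ok=True):
    """Build keyword arguments for autoscaling complete_lifecycle_action."""
    required = ("ASGName", "instance", "lifeCycleHook")
    missing = [key for key in required if not event.get(key)]
    if missing:
        raise ValueError(f"event is missing: {', '.join(missing)}")
    return {
        "LifecycleHookName": event["lifeCycleHook"],
        "AutoScalingGroupName": event["ASGName"],
        "InstanceId": event["instance"],
        # CONTINUE lets the instance go into service, ABANDON terminates it
        "LifecycleActionResult": "CONTINUE" if fsr_ok else "ABANDON",
    }
```

Inside the handler this would be used as `asgClient.complete_lifecycle_action(**build_completion_args(event))`.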

Start Step Function Lambda:

import boto3
from datetime import tzinfo, timedelta, datetime
from botocore.exceptions import ClientError
import os
import json
import ast

STATE_MACHINE_ARN = os.environ.get('STATE_MACHINE_ARN')

stepFnClient = boto3.client('stepfunctions')

def lambda_handler(event, context):
    print('The Lambda function is starting.')
    print("Received event: " + json.dumps(event, indent=2))

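    message = event['Records'][0]['Sns']['Message']
    msgJson = json.loads(message)
    instanceId=msgJson["EC2InstanceId"]
    # Execution names must be unique per state machine, so append the instance id:
    # two launches within the same minute would otherwise get the same name
    datename = datetime.now().strftime('%Y_%m_%d_%HH%M')
    EXECUTION_NAME = 'Life_Cycle_Hook_Start_' + datename + '_' + instanceId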
    lifecycleHookName=msgJson["LifecycleHookName"]
    lifecycleActionToken=msgJson["LifecycleActionToken"]
    autoScalingGroupName=msgJson["AutoScalingGroupName"]
    notificationMetadata=msgJson["NotificationMetadata"]
    notificationMetadataDict = json.loads(notificationMetadata)

    print(f"----- NotificationMetadataDict = {notificationMetadataDict}")
    notificationMetadataDict["lifeCycleHook"] = lifecycleHookName
    notificationMetadataDict["instance"] = instanceId
    notificationMetadataDict["ASGName"] = autoScalingGroupName

    print(f"----- NotificationMetadataDict = {notificationMetadataDict}")
    notificationMetadata = json.dumps(notificationMetadataDict, indent=2)
    print(f"----- NotificationMetadata = {notificationMetadata}")

    print('Starting step function ...')

    response = stepFnClient.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        name=EXECUTION_NAME,
        input=notificationMetadata
    )

    print(f"Execution arn of the Step Function : {response.get('executionArn')}")
    return

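
The message parsing in the Lambda above can also be isolated into a pure function, which makes it testable without SNS or Step Functions. The sketch below additionally ignores messages that carry no `EC2InstanceId`, such as the `autoscaling:TEST_NOTIFICATION` message that Auto Scaling sends when a hook with a notification target is created. The function name `build_execution_input` is hypothetical.

```python
import json


def build_execution_input(sns_event):
    """Turn an SNS-wrapped lifecycle notification into Step Function input.

    Returns None for messages that are not instance notifications
    (for example the test notification sent when the hook is created).
    """
    msg = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    if "EC2InstanceId" not in msg:
        return None
    # NotificationMetadata is itself a JSON string set on the lifecycle hook
    payload = json.loads(msg.get("NotificationMetadata") or "{}")
    payload["lifeCycleHook"] = msg["LifecycleHookName"]
    payload["instance"] = msg["EC2InstanceId"]
    payload["ASGName"] = msg["AutoScalingGroupName"]
    return payload
```

The handler would then call `json.dumps(...)` on the returned dictionary and pass it as `input` to `start_execution`, skipping the call entirely when the result is `None`.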

To sum up: I have tried to present an approach that lets you benefit from Fast Snapshot Restore while keeping the spend on the service to what is really needed.
