We all know about the serverless options out there: lightweight Docker images launched in seconds, or Lambda functions triggered almost instantly.
However, even though I am a fan of serverless solutions such as ECS Fargate, Lambda, EKS, etc., it is obvious that serverless is not a panacea and cannot cover every case.
Like every technology, EC2 is no exception: it has its advantages and its bottlenecks to consider.
In essence, EC2 is a virtual machine with a full-sized operating system, so we cannot escape the boot-time factor. At the same time, AWS is a leader that pays attention to its customers' needs.
This is why, back in 2019, a decade after EBS appeared, AWS presented a feature called EBS Fast Snapshot Restore (FSR). This feature lets us use a snapshot to create a fresh EBS volume with up to 16 TiB of space and 64K IOPS.
As you may already know, the main benefit of this feature is a faster EC2 boot: a volume created from an FSR-enabled snapshot is fully initialized at creation, so its blocks do not have to be lazily loaded on first access.
Use Case
Speed up the boot process of a new instance that is part of an Auto Scaling group, in a cost-effective way.
Question
Can I benefit from the FSR feature and at the same time avoid any potential extra cost?
Solution
Fast Snapshot Restore is not free of charge, so we need to handle its usage with care, just like all the other services that power our business. First, we should know that it costs $0.75 for each hour that Fast Snapshot Restore is enabled for a snapshot in a particular Availability Zone. Second, we need to know when we actually benefit from the feature, and the answer is simple: during boot time only. Once the instance is operational, we can disable FSR until the next time we need it.
So far so good, but outages come without notice and we should be ready, which is why some automation will give us a hand here. If you are using an Auto Scaling group (ASG), you are most probably aware of lifecycle hooks, in other words the ability to pause the launch or termination of an EC2 instance and trigger some actions before letting it proceed.
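To make the cost argument concrete, here is a minimal sketch of the arithmetic. It uses only the $0.75 hourly rate quoted above; the function name, the 30-day month and the "10 launches of about one hour each" scenario are assumptions chosen for illustration, and any minimum billing period is ignored.

```python
FSR_HOURLY_RATE_USD = 0.75  # per snapshot, per Availability Zone (rate quoted in this article)


def fsr_cost(hours_enabled, snapshots=1, availability_zones=1, rate=FSR_HOURLY_RATE_USD):
    """Return the FSR charge in USD for the given number of enabled hours."""
    return hours_enabled * snapshots * availability_zones * rate


# Hypothetical comparison for one snapshot in one Availability Zone
always_on = fsr_cost(24 * 30)   # FSR left enabled for a 30-day month
launch_only = fsr_cost(10 * 1)  # assumed: 10 launches, FSR enabled about 1 hour each
print(f"always on: ${always_on:.2f}, launch time only: ${launch_only:.2f}")
```

The gap between the two numbers is what the automation described in this article is meant to save.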
Here is a small diagram that will make the situation clear:
Here is how the setup looks in the CloudFormation template that deploys the Step Function:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Enable/Disable FSR on newly launched instance in AutoScaling Group
Parameters:
  Asg:
    Description: 'ASG whose newly launched instances trigger the FSR enable/disable cycle'
    Type: String
    Default: MyASG
  LambdaManageFrsZipLocation:
    Description: 'Name of the zip file of the Python Lambda function that manages FSR'
    Type: String
    Default: 'FRS.zip'
  LambdaStartStepFunctionZipLocation:
    Description: 'Name of the zip file of the Python Lambda function that starts the Step Function'
    Type: String
    Default: 'StartStepFunction.zip'
  LambdaCompleteLifecycleHookZipLocation:
    Description: 'Name of the zip file of the Python Lambda function that completes the lifecycle hook'
    Type: String
    Default: 'CompleteLifecycleHook.zip'
  Region:
    Description: 'Operational Region'
    Type: String
    Default: 'us-east-1'
  Environment:
    Description: 'Environment Name'
    Type: String
    Default: 'Prod'
  ActivateLifecycleWhenLaunchOrch:
    Description: 'Should FSR be activated when the instance is launching (Y/N)?'
    Type: String
    Default: 'Y'
    AllowedValues:
      - 'Y'
      - 'N'
#---------------------------------------------------------------------------------------------------------------------------------------------------------
Conditions:
  ActivateLifecycleConditionStart: !Equals [!Ref 'ActivateLifecycleWhenLaunchOrch', 'Y']
#---------------------------------------------------------------------------------------------------------------------------------------------------------
Resources:
  myStepFunctionRole:
    Type: 'AWS::IAM::Role'
    Properties:
      RoleName: !Sub 'StepFunctionRole-${Environment}-${AWS::Region}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: states.amazonaws.com
            Action: 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: ManageAmiStepFunctionPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Sid: VisualEditor0
                Effect: Allow
                Action:
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
                Resource: 'arn:aws:logs:*:*:*'
              - Sid: VisualEditor1
                Effect: Allow
                Action:
                  - 'lambda:InvokeFunction'
                Resource: '*'
              - Sid: SomeNewSid
                Effect: Allow
                Action: 'logs:CreateLogGroup'
                Resource: '*'
  myLambdaManageAmiRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Manage AMI'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'ManageFrsLambdaRole_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: '*'
        - PolicyName: allowEC2
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - ec2:DescribeImages
                  - ec2:CreateTags
                  - ec2:DescribeFastSnapshotRestores
                  - ec2:DisableFastSnapshotRestores
                  - ec2:EnableFastSnapshotRestores
                Resource: '*'
        - PolicyName: allowDescribeAutoScaling
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - autoscaling:DescribeAutoScalingGroups
                Resource: '*'
        - PolicyName: allowIamPassRole
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - iam:PassRole
                  - iam:ListAccessKeys
                Resource: !Sub "arn:aws:iam::${AWS::AccountId}:role/*"
        - PolicyName: allowKMS
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - kms:ReEncrypt*
                  - kms:GenerateDataKey*
                  - kms:CreateGrant
                  - kms:DescribeKey*
                  - kms:ListKeys
                  - kms:ListAliases
                Resource: '*'
        - PolicyName: allowSSMCommands
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - ssm:SendCommand
                  - ssm:GetCommandInvocation
                Resource: '*'
  myLambdatoStartStepFunctionRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Start Step Function'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'StartStepFunctionLambdaRole_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: '*'
        - PolicyName: allowStates
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - states:StartExecution
                Resource:
                  - !GetAtt myFastRestoreStepFunction.Arn
        - PolicyName: allowSNS
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Resource: "*"
                Action:
                  - sns:Publish
        - PolicyName: allowASG
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Resource: "*"
                Action:
                  - autoscaling:CompleteLifecycleAction
  myLifecycleHookRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Action:
              - "sts:AssumeRole"
            Principal:
              Service:
                - "autoscaling.amazonaws.com"
      Path: /
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AutoScalingNotificationAccessRole
  myLambdatoCompleteLifecycleHookRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Complete Lifecycle Hook'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'CompleteLifecycleHookLambdaRole_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: '*'
        - PolicyName: allowASG
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Resource: "*"
                Action:
                  - autoscaling:CompleteLifecycleAction
  myLambdatoCompleteLifecycleHook:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: !Sub 'cft-templates.${AWS::AccountId}.${AWS::Region}'
        S3Key: !Ref LambdaCompleteLifecycleHookZipLocation
      Description: 'Complete Lifecycle Hook'
      FunctionName: !Sub 'CompleteLifecycleHookLambda_${Environment}'
      Handler: CompleteLifecycleHook.lambda_handler
      Role: !GetAtt myLambdatoCompleteLifecycleHookRole.Arn
      Runtime: 'python3.8'
      Timeout: 300
  myFastRestoreStepFunction:
    Type: AWS::Serverless::StateMachine
    Properties:
      Name: !Sub 'ManageFastRestoreStepFunction_${Environment}'
      Type: STANDARD
      Role: !GetAtt myStepFunctionRole.Arn
      Definition:
        #Comment: Step Function to Enable FastRestore
        StartAt: Search for EC2 Id in ASG #Started by Lifecycle Hook 1 ?
        States:
          Search for EC2 Id in ASG:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'getASGInstanceId'
            Next: Get AMI ID
          Get AMI ID:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'getInstanceImageId'
            Next: Enable FastSnapshotRestores
          Enable FastSnapshotRestores:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'enableFastRestore'
            Next: Check FastRestore
          Test FastRestore:
            Type: Choice
            Choices:
              - Variable: '$.status'
                StringEquals: enabled
                Next: Parallel
              - Not:
                  Variable: '$.status'
                  StringEquals: enabled
                Next: Wait 20s Fast
          Wait 20s Fast:
            Type: Wait
            Seconds: 20
            Next: Check FastRestore
          Check FastRestore:
            Type: Task
            OutputPath: '$.Payload'
            Resource: arn:aws:states:::lambda:invoke
            Parameters:
              FunctionName: !GetAtt myLambdatoManageFrs.Arn
              Payload:
                ASGName.$: '$.ASGName'
                lifeCycleHook.$: '$.lifeCycleHook'
                instance.$: '$.instance'
                region.$: '$.region'
                status.$: '$.status'
                AMIid.$: '$.AMIid'
                cmd: 'showFastRestore'
            Next: Test FastRestore
          Parallel:
            Type: Parallel
            Branches:
              - StartAt: Wait for Disable FastRestore
                States:
                  Wait for Disable FastRestore:
                    Type: Wait
                    Seconds: 3000
                    Next: Disable FastSnapshotRestores
                  Disable FastSnapshotRestores:
                    Type: Task
                    OutputPath: '$.Payload'
                    Resource: arn:aws:states:::lambda:invoke
                    Parameters:
                      FunctionName: !GetAtt myLambdatoManageFrs.Arn
                      Payload:
                        AMIid.$: '$.AMIid'
                        ASGName.$: '$.ASGName'
                        cmd: 'disableFastRestore'
                        instance.$: '$.instance'
                        lifeCycleHook.$: '$.lifeCycleHook'
                        region.$: '$.region'
                        status.$: '$.status'
                    Next: success
                  success:
                    Type: Pass
                    End: true
              - StartAt: Complete Lifecycle Hook
                States:
                  Complete Lifecycle Hook:
                    Type: Task
                    OutputPath: '$.Payload'
                    Resource: arn:aws:states:::lambda:invoke
                    Parameters:
                      FunctionName: !GetAtt myLambdatoCompleteLifecycleHook.Arn
                      Payload:
                        AMIid: 'null'
                        ASGName.$: '$.ASGName'
                        cmd: 'null'
                        instance.$: '$.instance'
                        lifeCycleHook.$: '$.lifeCycleHook'
                        region: 'null'
                        status: 'null'
                    Catch:
                      - ErrorEquals:
                          - States.ALL
                        Next: handle failure
                        ResultPath: '$.error'
                    End: true
                  handle failure:
                    Type: Pass
                    End: true
            End: true
  #Lambda to trigger the Step Function dedicated to enable FastRestore Snapshot on new EC2 instance launch in the ASG
  myLambdatoStartFastRestoreStepFunction:
    Type: AWS::Lambda::Function
    Properties:
      Environment:
        Variables:
          STATE_MACHINE_ARN: !GetAtt myFastRestoreStepFunction.Arn
      Code:
        S3Bucket: !Sub 'cft-templates.${AWS::AccountId}.${AWS::Region}'
        S3Key: !Ref LambdaStartStepFunctionZipLocation
      Description: 'Launch Step Function handling FRS enable/disable on EC2 launch'
      FunctionName: !Sub 'StartFastRestoreStepFunctionLambda_${Environment}'
      Handler: StartStepFunction.lambda_handler
      Role: !GetAtt myLambdatoStartStepFunctionRole.Arn
      Runtime: 'python3.8'
      Timeout: 300
  # Lifecycle hook to trigger the Enable/Disable of FRS when instance is launched on the Orchestrator ASG
  myStartLifecycleHookTopic:
    Type: AWS::SNS::Topic
  myStartLifecycleHookSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Endpoint: !GetAtt myLambdatoStartFastRestoreStepFunction.Arn
      Protocol: "lambda"
      TopicArn: !Ref myStartLifecycleHookTopic
  myStartPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: "lambda:InvokeFunction"
      FunctionName: !GetAtt myLambdatoStartFastRestoreStepFunction.Arn
      Principal: sns.amazonaws.com
      SourceArn: !Ref myStartLifecycleHookTopic
  myStartLifecycleHookASG:
    Type: AWS::AutoScaling::LifecycleHook
    Condition: ActivateLifecycleConditionStart
    Properties:
      AutoScalingGroupName: !Ref Asg
      LifecycleTransition: "autoscaling:EC2_INSTANCE_LAUNCHING"
      DefaultResult: CONTINUE
      HeartbeatTimeout: 600
      NotificationMetadata: !Sub |-
        {
          "lifeCycleHook": "null",
          "ASGName": "${Asg}",
          "instance": "null",
          "region": "${Region}",
          "status": "null",
          "AMIid": "null",
          "cmd": "null"
        }
      NotificationTargetARN:
        Ref: myStartLifecycleHookTopic
      RoleARN: !GetAtt myLifecycleHookRole.Arn
  myLambdatoManageFrs:
    Type: AWS::Lambda::Function
    Properties:
      Code:
        S3Bucket: !Sub 'cft-templates.${AWS::AccountId}.${AWS::Region}'
        S3Key: !Ref LambdaManageFrsZipLocation
      Description: 'Manage Fast Restore Snapshot'
      FunctionName: !Sub 'ManageFRS_${Environment}'
      Handler: FRS.lambda_handler
      Role: !GetAtt myLambdaManageFrsRole.Arn
      Runtime: 'python3.8'
      Timeout: 300
  myLambdaManageFrsRole:
    Type: AWS::IAM::Role
    Properties:
      Description: 'Manage FRS'
      MaxSessionDuration: 3600
      Path: '/service-role/'
      RoleName: !Sub 'ManageFRS_${Environment}'
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: allowLogging
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: '*'
        - PolicyName: allowEC2
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - ec2:CreateTags
                  - ec2:Describe*
                  - ec2:DisableFastSnapshotRestores
                  - ec2:EnableFastSnapshotRestores
                Resource: '*'
        - PolicyName: allowDescribeAutoScaling
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - autoscaling:DescribeAutoScalingGroups
                Resource: '*'
        - PolicyName: allowIamPassRole
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - iam:PassRole
                  - iam:ListAccessKeys
                Resource: !Sub "arn:aws:iam::${AWS::AccountId}:role/*"
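The Check FastRestore, Test FastRestore and Wait 20s Fast states in the template form a polling loop: check the FSR state, continue if it is `enabled`, otherwise wait 20 seconds and check again. The same logic can be sketched in plain Python. This is only an illustration of the loop, not code used by the stack; the function name and the `max_attempts` guard are assumptions (the state machine above has no attempt limit).

```python
import time


def wait_until_enabled(get_state, interval_seconds=20, max_attempts=30, sleep=time.sleep):
    """Poll get_state() until it returns 'enabled'; return the number of checks made."""
    for attempt in range(1, max_attempts + 1):
        if get_state() == "enabled":  # Test FastRestore (Choice state)
            return attempt
        sleep(interval_seconds)       # Wait 20s Fast (Wait state)
    raise TimeoutError(f"FSR not enabled after {max_attempts} checks")


# Usage with a fake state source, so nothing here calls AWS or really sleeps
fake_states = iter(["enabling", "optimizing", "enabled"])
checks = wait_until_enabled(lambda: next(fake_states), sleep=lambda seconds: None)
print(f"enabled after {checks} checks")
```

Passing the state lookup and the sleep function as arguments keeps the loop testable without touching EC2.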
We will have several Lambda Functions in order to cover the logic:
FRS Lambda:
import json

import boto3

# --------------------------------------------------------------------------------------------------------------------------------
# To display debug messages (True or False)
DEBUG = True

# Only volumes smaller than this size (GiB) are considered; edit for your use case
MAX_VOLUME_SIZE_GIB = 40


def message(msg):
    if DEBUG:
        print(msg)


def error(err, msg):
    raise Exception(f"{msg} --- err = {err}")


def getASGInstanceId(asgClient, asgName):
    try:
        message("---- getInstanceId")
        # get the ASG object, filtered by the name of the target ASG
        response = asgClient.describe_auto_scaling_groups(AutoScalingGroupNames=[asgName])
        if not response['AutoScalingGroups']:
            err = f"## No such ASG '{asgName}'"
            message(err)
            raise RuntimeError(err)
        # take the last instance listed in the ASG; its AMI is used to find the snapshot
        instanceId = response['AutoScalingGroups'][0]['Instances'][-1]['InstanceId']
    except Exception as e:
        message(f"==== e = {e}")
        return error(e, f"## Failed to get ASG Instance Id of ASG '{asgName}'")
    message(f"InstanceId = {instanceId}")
    return instanceId


def getInstanceImageId(instanceId):
    ec2_client = boto3.client('ec2')
    ec2_response = ec2_client.describe_instances(InstanceIds=[instanceId])
    AMIid = None
    for reservation in ec2_response['Reservations']:
        for instance in reservation['Instances']:
            AMIid = instance['ImageId']
            message(AMIid)
    return AMIid


def getSnapshotAndAz(ec2, AMIid, asgName):
    # Find the snapshot behind the AMI and the first Availability Zone of the ASG
    ec2response = ec2.describe_images(ImageIds=[AMIid])
    snapshotid = None
    # You may need to edit the filter here for your use case
    for mapping in ec2response['Images'][0]['BlockDeviceMappings']:
        ebs = mapping.get('Ebs')  # instance-store mappings have no 'Ebs' key
        if ebs and 'SnapshotId' in ebs and ebs.get('VolumeSize', 0) < MAX_VOLUME_SIZE_GIB:
            snapshotid = ebs['SnapshotId']
    if snapshotid is None:
        raise RuntimeError(f"## No matching EBS snapshot found for AMI '{AMIid}'")
    message(snapshotid)
    asg = boto3.client('autoscaling')
    asgresponse = asg.describe_auto_scaling_groups(AutoScalingGroupNames=[asgName])
    az = asgresponse['AutoScalingGroups'][0]['AvailabilityZones'][0]
    message(az)
    return snapshotid, az


def enableFastRestore(AMIid, asgName):
    ec2 = boto3.client('ec2')
    snapshotid, az = getSnapshotAndAz(ec2, AMIid, asgName)
    try:
        fsrresponse = ec2.enable_fast_snapshot_restores(
            AvailabilityZones=[az],
            SourceSnapshotIds=[snapshotid]
        )
    except Exception as e:
        message(f"==== e = {e}")
        return error(e, "## Failed to Enable Fast Restore")
    return fsrresponse


# deactivate FastRestore Snapshot
def disableFastRestore(AMIid, asgName):
    ec2 = boto3.client('ec2')
    snapshotid, az = getSnapshotAndAz(ec2, AMIid, asgName)
    try:
        fsrresponse = ec2.disable_fast_snapshot_restores(
            AvailabilityZones=[az],
            SourceSnapshotIds=[snapshotid]
        )
    except Exception as e:
        message(f"==== e = {e}")
        return error(e, "## Failed to Disable Fast Restore")
    return fsrresponse


def showFastRestore(AMIid, asgName):
    ec2 = boto3.client('ec2')
    snapshotid, az = getSnapshotAndAz(ec2, AMIid, asgName)
    try:
        fsrresponse = ec2.describe_fast_snapshot_restores(
            Filters=[
                {'Name': 'snapshot-id', 'Values': [snapshotid]},
                {'Name': 'availability-zone', 'Values': [az]},
            ]
        )
    except Exception as e:
        message(f"==== e = {e}")
        return error(e, "## Failed to Describe Fast Restore")
    restores = fsrresponse['FastSnapshotRestores']
    if not restores:
        raise RuntimeError(f"## No Fast Restore entry for snapshot '{snapshotid}' in '{az}'")
    return restores[0]['State']


def lambda_handler(event, context):
    message("Received event: " + json.dumps(event, indent=2))
    #----------------
    lifeCycleHook = event['lifeCycleHook']
    asgName = event['ASGName']
    instance = event['instance']
    region_name = event['region']
    status = event['status']
    AMIid = event['AMIid']
    cmd = event['cmd']
    #----------------
    autoscalingClient = boto3.client('autoscaling')
    #----------------
    message(f"---- CMD = {cmd}")
    if cmd == 'getASGInstanceId':
        message("---- Get instance Id from ASG")
        instance = getASGInstanceId(autoscalingClient, asgName)
        AMIid = getInstanceImageId(instance)
    elif cmd == 'getInstanceImageId':
        instance = getASGInstanceId(autoscalingClient, asgName)
        message("---- Get Instance Image Id on instance: " + instance)
        AMIid = getInstanceImageId(instance)
    elif cmd == 'enableFastRestore':
        message("---- Enable Fast Restore")
        instance = getASGInstanceId(autoscalingClient, asgName)
        AMIid = getInstanceImageId(instance)
        response = enableFastRestore(AMIid, asgName)
    elif cmd == 'disableFastRestore':
        instance = getASGInstanceId(autoscalingClient, asgName)
        message("---- Disable Fast Restore")
        AMIid = getInstanceImageId(instance)
        response = disableFastRestore(AMIid, asgName)
    elif cmd == 'showFastRestore':
        message("---- Show Fast Restore")
        instance = getASGInstanceId(autoscalingClient, asgName)
        AMIid = getInstanceImageId(instance)
        status = showFastRestore(AMIid, asgName)
    else:
        # there is no caught exception in this branch, so raise directly
        raise ValueError(f"Command '{cmd}' is not recognized or planned to be managed")
    result = {"lifeCycleHook": lifeCycleHook, "ASGName": asgName, "instance": instance, "region": region_name, "status": status, "AMIid": AMIid, "cmd": cmd}
    message("##### RETURN result : " + json.dumps(result, indent=2))
    return result
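The snapshot lookup in the FRS Lambda depends on the shape of the `describe_images` response, so it helps to try it on a sample. The sketch below is a pure function that needs no AWS access; the function name, the sample image and its snapshot IDs are made up for illustration, and the 40 GiB threshold mirrors the filter used in the Lambda. Unlike the Lambda, it returns every matching snapshot rather than a single one.

```python
def pick_snapshot_ids(image, max_size_gib=40):
    """Return the snapshot IDs of EBS mappings smaller than max_size_gib."""
    snapshot_ids = []
    for mapping in image.get("BlockDeviceMappings", []):
        ebs = mapping.get("Ebs")
        # Instance-store (ephemeral) mappings carry no 'Ebs' block at all
        if not ebs or "SnapshotId" not in ebs:
            continue
        if ebs.get("VolumeSize", 0) < max_size_gib:
            snapshot_ids.append(ebs["SnapshotId"])
    return snapshot_ids


# Hypothetical entry shaped like one item of describe_images()['Images']
sample_image = {
    "ImageId": "ami-00000000000000000",
    "BlockDeviceMappings": [
        {"DeviceName": "/dev/xvda", "Ebs": {"SnapshotId": "snap-root", "VolumeSize": 8}},
        {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
        {"DeviceName": "/dev/sdf", "Ebs": {"SnapshotId": "snap-data", "VolumeSize": 500}},
    ],
}
print(pick_snapshot_ids(sample_image))
```

Iterating over the mappings that actually exist avoids assuming a fixed number of devices per AMI.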
Complete Lifecycle Hook Lambda:
import json

import boto3

asgClient = boto3.client('autoscaling')


def lambda_handler(event, context):
    print('The Lambda function is starting.')
    print("Received event: " + json.dumps(event, indent=2))
    autoScalingGroup = event['ASGName']
    instanceId = event['instance']
    lifeCycleHook = event['lifeCycleHook']
    actionResult = "CONTINUE"
    response = asgClient.complete_lifecycle_action(
        LifecycleHookName=lifeCycleHook,
        AutoScalingGroupName=autoScalingGroup,
        LifecycleActionResult=actionResult,
        InstanceId=instanceId
    )
    print(f"Complete lifecycle hook response : {response}")
    return
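The Lambda above always answers `CONTINUE`. If you would rather let the instance be replaced when the FSR steps fail, the arguments can be built by a small helper. This is a sketch under that assumption: `build_completion_args` and its `succeeded` flag are hypothetical, while `CONTINUE` and `ABANDON` are the two results that `complete_lifecycle_action` accepts.

```python
def build_completion_args(event, succeeded=True):
    """Map the Step Function payload to complete_lifecycle_action keyword arguments."""
    return {
        "LifecycleHookName": event["lifeCycleHook"],
        "AutoScalingGroupName": event["ASGName"],
        "InstanceId": event["instance"],
        # ABANDON tells Auto Scaling to terminate the launching instance
        "LifecycleActionResult": "CONTINUE" if succeeded else "ABANDON",
    }


# Hypothetical payload, shaped like the one the Parallel branch sends
sample_event = {"lifeCycleHook": "my-hook", "ASGName": "MyASG", "instance": "i-0123456789abcdef0"}
args = build_completion_args(sample_event, succeeded=False)
print(args)
# In the Lambda this would be used as: asgClient.complete_lifecycle_action(**args)
```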
Start Step Function Lambda:
import json
import os
from datetime import datetime

import boto3

STATE_MACHINE_ARN = os.environ.get('STATE_MACHINE_ARN')
stepFnClient = boto3.client('stepfunctions')


def lambda_handler(event, context):
    print('The Lambda function is starting.')
    print("Received event: " + json.dumps(event, indent=2))
    message = event['Records'][0]['Sns']['Message']
    msgJson = json.loads(message)
    # Auto Scaling also publishes a test notification when the hook is created; it carries no instance
    if msgJson.get("Event") == "autoscaling:TEST_NOTIFICATION":
        print('Test notification received, nothing to do.')
        return
    instanceId = msgJson["EC2InstanceId"]
    lifecycleHookName = msgJson["LifecycleHookName"]
    autoScalingGroupName = msgJson["AutoScalingGroupName"]
    notificationMetadata = msgJson["NotificationMetadata"]
    notificationMetadataDict = json.loads(notificationMetadata)
    print(f"----- NotificationMetadataDict = {notificationMetadataDict}")
    notificationMetadataDict["lifeCycleHook"] = lifecycleHookName
    notificationMetadataDict["instance"] = instanceId
    notificationMetadataDict["ASGName"] = autoScalingGroupName
    print(f"----- NotificationMetadataDict = {notificationMetadataDict}")
    notificationMetadata = json.dumps(notificationMetadataDict, indent=2)
    print(f"----- NotificationMetadata = {notificationMetadata}")
    # include the instance id so that two launches in the same minute do not collide on the execution name
    datename = datetime.now().strftime('%Y_%m_%d_%HH%M')
    EXECUTION_NAME = f'Life_Cycle_Hook_Start_{datename}_{instanceId}'
    print('Starting step function ...')
    response = stepFnClient.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        name=EXECUTION_NAME,
        input=notificationMetadata
    )
    print(f"Execution arn of the Step Function : {response.get('executionArn')}")
    return
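To see how the SNS message from the lifecycle hook becomes the Step Function input, the parsing can be isolated in a pure function and run against a sample event. This is a minimal sketch: `build_execution_input` and the sample values are hypothetical, while the field names (`EC2InstanceId`, `LifecycleHookName`, `AutoScalingGroupName`, `NotificationMetadata`) are the ones the Lambda above reads.

```python
import json


def build_execution_input(sns_event):
    """Return the Step Function input dict, or None for Auto Scaling test notifications."""
    msg = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    if msg.get("Event") == "autoscaling:TEST_NOTIFICATION":
        return None
    payload = json.loads(msg.get("NotificationMetadata") or "{}")
    # Replace the "null" placeholders from the template with the real values
    payload["lifeCycleHook"] = msg["LifecycleHookName"]
    payload["instance"] = msg["EC2InstanceId"]
    payload["ASGName"] = msg["AutoScalingGroupName"]
    return payload


# Hypothetical SNS event, reduced to the fields used above
metadata = {"lifeCycleHook": "null", "ASGName": "MyASG", "instance": "null",
            "region": "us-east-1", "status": "null", "AMIid": "null", "cmd": "null"}
sample_message = {
    "EC2InstanceId": "i-0123456789abcdef0",
    "LifecycleHookName": "my-hook",
    "AutoScalingGroupName": "MyASG",
    "NotificationMetadata": json.dumps(metadata),
}
sample_event = {"Records": [{"Sns": {"Message": json.dumps(sample_message)}}]}
print(json.dumps(build_execution_input(sample_event), indent=2))
```

The resulting dictionary has exactly the keys that every state in the Step Function expects in its payload.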
To sum up, I have tried to present an approach that lets you benefit from Fast Snapshot Restore while reducing spending on the service to what is really needed.