Intro/Backstory
Hi folks! Once we had to package some video files on EC2, but only 4 to 5 videos came in per day, so I thought: why keep an EC2 instance running 24/7? Why not make it behave more like a microservice? I solved the problem by launching EC2 instances from Lambda.
Sometimes you want to process something heavy but don't want a server running 24/7. Or, put simply, you may have wished AWS Lambda had more computing power to train your machine learning model, or to run a long operation (such as video processing) and then shut off after completion... if only Lambda had that power along with the low cost of an EC2 instance!
Well, turns out you can achieve it another way.
So, when an object lands in an S3 bucket, the Lambda is triggered and starts an EC2 instance to process the video. You can also trigger it through a URL (yes, it's cool).
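To make that concrete, here is a minimal sketch (not the exact handler used later in this post) of how a Lambda wired to an S3 ObjectCreated notification could read the bucket and key from the event, so the launched instance knows which video to work on -
import json

def handler(event, context):
    """ Illustrative only: read bucket/key from an S3 ObjectCreated notification. """
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # in the real handler you would pass these into run_instances via user data
        print(json.dumps({"bucket": bucket, "key": key}))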
Pros and Cons?
By launching EC2 instances from AWS Lambda you pay for both the EC2 time and the Lambda invocations. But the EC2 side becomes pay-as-you-go: as more events come in, more Lambda invocations run, more EC2 instances spin up to process in parallel, and in my use case the total bill was actually lower than keeping one instance running all the time.
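A rough back-of-the-envelope example (with illustrative pricing of roughly $0.05/hour for a t2.medium): keeping one instance up 24/7 costs about 720 h × $0.05 ≈ $36/month, while 5 one-hour jobs per day cost about 150 h × $0.05 ≈ $7.50/month, plus a few cents of Lambda time. The gap depends entirely on how busy your pipeline is, so run the numbers for your own workload.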
Sounds cool? Let's see how we can do it.
Step 1: Write a Python script (to be used in the Lambda)
The best thing about Python's boto3 library is that it is already available in the AWS Lambda runtime, so you don't have to install it and zip it along with your source code. You just write the code and provision it. That's it.
To run your application, use either:
- a launch template (configure the EC2 settings and provision the user data script there), or
- user data integrated inside the Lambda (i.e. if you are just pulling a docker image) - recommended, as it gives you more freedom. A docker-based user data sketch is shown after the snippet below.
init_script = f'''#!/bin/bash
echo "run something on your instance"
'''
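For example, if the instance only needs to pull and run a container, the user data could look something like this (a minimal sketch - the image name is a placeholder and it assumes Docker is already baked into the AMI) -
init_script = f'''#!/bin/bash
# illustrative only: pull and run the processing container
docker pull my-registry/video-packager:latest
docker run --rm my-registry/video-packager:latest
'''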
First of all, I do not want to hardcode the secrets (security group, region, SSH key name) inside the Lambda, so I read them from environment variables -
# env specific vars
AMI = os.environ['img_id']
PROFILE = os.environ['profile']
ENV = os.environ['env']
INSTANCE_TYPE = os.environ['instance_type']
SECURITY_GROUP = os.environ['sg']
KEY_NAME = os.environ['key']
INSTANCE_NAME = os.environ['instance_name']
REGION = os.environ['region']
Here's the script -
"""
Launch ec2 instance from lambda
author: ashraf minhaj
mail : ashraf_minhaj@yahoo
"""
import os
import boto3
# env specific vars
AMI = os.environ['img_id']
PROFILE = os.environ['profile']
ENV = os.environ['env']
INSTANCE_TYPE = os.environ['instance_type']
SECURITY_GROUP = os.environ['sg']
KEY_NAME = os.environ['key']
INSTANCE_NAME = os.environ['instance_name']
REGION = os.environ['region']
def create_instance():
""" launch ec2 instance. """
ec2 = boto3.client('ec2', region_name=REGION)
init_script = f'''#!/bin/bash
echo "run something on your instance"
'''
# logger.info(init_script)
instance = ec2.run_instances(
ImageId=AMI,
InstanceType=INSTANCE_TYPE,
KeyName=KEY_NAME,
MaxCount=1,
MinCount=1,
UserData=init_script,
InstanceInitiatedShutdownBehavior='terminate',
IamInstanceProfile={
'Name': PROFILE
},
TagSpecifications=[{
'ResourceType': 'instance',
'Tags': [{
'Key': 'Name',
'Value': INSTANCE_NAME
},
]
}
]
)
# logger.info("New instance created:")
instance_id = instance['Instances'][0]['InstanceId']
# logger.info(instance_id)
return instance['ResponseMetadata']['HTTPStatusCode']
def launcher_handler(event, context):
""" check mime, on success dump message in sqs. """
# logger.info(event)
# print(type(event))
instance_create_resp = create_instance()
if instance_create_resp == 200:
print("Job Successful")
else:
print("Instance creation went wrong")
Step 2: Provision the Lambda using Terraform
Now we will provision (create and send to aws) the lambda using Terraform.
File 1: variables.tf
First, all the variables are stored inside the variables file -
variable "aws_region" {
# default = "ap-southeast-1"
}
variable "component_prefix" {
default = "min"
}
variable "component_postfix_env_tag" {
# default = "test"
}
variable "lambda_artifacts_bucket" {
default = "lambda-xxxxx"
}
variable "archive_file_type" {
default = "zip"
}
# lambda
variable "launcher_name" {
default = "lambda-ec2-launcher"
}
variable "launcher_handler" {
default = "launcher_handler"
}
variable "launcher_key" {
default = "launcher.zip"
}
variable "launcher_timeout" {
default = "15"
}
variable "launcher_runtime" {
default = "python3.9"
}
# ec2
variable "ami_name" {
default = "xxxxx"
}
variable "instance_clone_name" {
default = ""
}
variable "instance_type" {
default = "t2.medium"
}
variable "ssh_key" {
default = "xxxx"
}
variable "security_groups" {
default = "launch-wizard-00"
}
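Note that aws_region and component_postfix_env_tag have their defaults commented out, so Terraform will prompt for them when you run a plan or apply; you can also supply them with the -var flag or a terraform.tfvars file, for example: terraform apply -var="aws_region=ap-southeast-1" -var="component_postfix_env_tag=test".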
File 2: lambda.tf
The Lambda needs IAM permissions to create EC2 instances. We need CloudWatch Logs write access (optional), ec2:RunInstances, iam:GetRole and iam:PassRole (so the Lambda can pass the instance profile), and ec2:CreateTags so the instance can be named at launch.
data "aws_caller_identity" "current" {
}
resource "aws_iam_role" "lambda_role" {
name = "lambda-role"
assume_role_policy = jsonencode(
{
"Version": "2012-10-17",
"Statement": [{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
)
}
# create policy
resource "aws_iam_policy" "lambda_policy" {
name = "lambda-policy"
policy = jsonencode({
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "arn:aws:logs:*:*:*"
},
{
"Effect": "Allow",
"Action": [
"iam:GetRole",
"iam:PassRole"
],
"Resource": "*"
},
{
"Action": [
"ec2:RunInstances"
],
"Effect": "Allow",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateTags"
],
"Resource": "arn:aws:ec2:${var.aws_region}:${data.aws_caller_identity.current.account_id}:*/*",
"Condition": {
"StringEquals": {
"ec2:CreateAction" : "RunInstances"
}
}
}
]
})
}
# attach policy to the role
resource "aws_iam_role_policy_attachment" "policy_attachment" {
role = "${aws_iam_role.lambda_role.name}"
policy_arn = "${aws_iam_policy.lambda_policy.arn}"
}
Now the Lambda file (the Python script) will be zipped and uploaded to S3, and that object will then be attached to the Lambda function. We also pass the environment variables inside the environment block -
# Zip the Lambda function on the fly
data "archive_file" "lambda_source" {
type = "${var.archive_file_type}"
source_dir = "../src/lambda/"
output_path = "../src/lambda/tmp.zip"
}
# upload zip to s3 and then update the lambda function from s3
resource "aws_s3_object" "lambda_object" {
source_hash = "${data.archive_file.lambda_source.output_base64sha256}"
bucket = "${aws_s3_bucket.s3_warehouse.bucket}"
key = "${var.launcher_key}"
source = "${data.archive_file.lambda_source.output_path}"
}
resource "aws_lambda_function" "lambda_lambda" {
function_name = "${local.lambda_lambda_component}"
source_code_hash = "${data.archive_file.lambda_source.output_base64sha256}"
s3_bucket = "${aws_s3_object.lambda_object.bucket}"
s3_key = "${aws_s3_object.lambda_object.key}"
role = "${aws_iam_role.lambda_role.arn}"
handler = "${var.component_prefix}-${var.launcher_name}.${var.launcher_handler}"
runtime = "${var.launcher_runtime}"
timeout = "${var.launcher_timeout}"
environment {
variables = {
img_id = "${data.aws_ami.current_ami.id}"
profile = "${aws_iam_instance_profile.instance_profile.name}"
instance_name = "${local.transcoder_clone}"
env = "${var.component_postfix_env_tag}"
instance_type = "${var.instance_type}"
sg = "${var.security_groups}"
key = "${var.ssh_key}"
region = "${var.aws_region}"
dest_bucket = "${local.destination_bucket_component}"
backend_api = "${var.backend_api}"
backend_api_access_key = "${var.backend_api_access_key}"
backend_api_secret_key = "${var.backend_api_secret_key}"
pallbearer_endpoint = "${aws_lambda_function_url.pallbearer_endpoint.function_url}"
cloudfront_dis_id = "${aws_cloudfront_distribution.destination_distribution.id}"
}
}
description = "Creates chef instance to cook video."
tags = {
app = "${var.component_prefix}"
Environment = "${var.component_postfix_env_tag}"
}
}
But how will our API or external services trigger it? Well, Lambdas can have endpoints too (Lambda function URLs). The good thing is that the URL does not cost anything extra; you only pay for the time the Lambda runs -
# trigger using this endpoint
resource "aws_lambda_function_url" "lambda_endpoint" {
function_name = aws_lambda_function.lambda_lambda.function_name
# authorization_type = "AWS_IAM"
authorization_type = "NONE"
cors {
allow_credentials = false
allow_origins = ["*"]
allow_methods = ["*"]
max_age = 0
}
}
Note that authorization_type = "NONE" means anyone with the URL can invoke the function; switch it to "AWS_IAM" (the commented-out option) if you want to restrict access. To get the URL, we can just use a Terraform output -
output "lambda_url" {
value = aws_lambda_function_url.lambda_endpoint.function_url
}
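Once deployed, anything that can make an HTTP request can kick off a job. Here is a minimal sketch using Python's requests library (the URL is a placeholder for the value from the terraform output above, and the JSON body is illustrative - the handler shown earlier ignores it, but you could extend the handler to read job parameters from the event) -
import requests

# placeholder: use the URL printed by terraform output lambda_url
LAMBDA_URL = "https://xxxxxxxx.lambda-url.ap-southeast-1.on.aws/"

resp = requests.post(LAMBDA_URL, json={"video": "s3://my-bucket/input.mp4"})
print(resp.status_code, resp.text)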
So, here's the full lambda.tf file -
resource "aws_iam_role" "lambda_role" {
name = "lambda-role"
assume_role_policy = jsonencode(
{
"Version": "2012-10-17",
"Statement": [{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
)
}
# create policy
resource "aws_iam_policy" "lambda_policy" {
name = "lambda-policy"
policy = jsonencode({
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "arn:aws:logs:*:*:*"
},
{
"Effect": "Allow",
"Action": [
"iam:GetRole",
"iam:PassRole"
],
"Resource": "*"
},
{
"Action": [
"ec2:RunInstances"
],
"Effect": "Allow",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"ec2:CreateTags"
],
"Resource": "arn:aws:ec2:${var.aws_region}:${data.aws_caller_identity.current.account_id}:*/*",
"Condition": {
"StringEquals": {
"ec2:CreateAction" : "RunInstances"
}
}
}
]
})
}
# attach policy to the role
resource "aws_iam_role_policy_attachment" "policy_attachment" {
role = "${aws_iam_role.lambda_role.name}"
policy_arn = "${aws_iam_policy.lambda_policy.arn}"
}
# Zip the Lambda function on the fly
data "archive_file" "lambda_source" {
type = "${var.archive_file_type}"
source_dir = "../src/lambda/"
output_path = "../src/lambda/tmp.zip"
}
# upload zip to s3 and then update the lambda function from s3
resource "aws_s3_object" "lambda_object" {
source_hash = "${data.archive_file.lambda_source.output_base64sha256}"
bucket = "${aws_s3_bucket.s3_warehouse.bucket}"
key = "${var.launcher_key}"
source = "${data.archive_file.lambda_source.output_path}"
}
resource "aws_lambda_function" "lambda_lambda" {
function_name = "${local.lambda_lambda_component}"
source_code_hash = "${data.archive_file.lambda_source.output_base64sha256}"
s3_bucket = "${aws_s3_object.lambda_object.bucket}"
s3_key = "${aws_s3_object.lambda_object.key}"
role = "${aws_iam_role.lambda_role.arn}"
handler = "${var.component_prefix}-${var.launcher_name}.${var.launcher_handler}"
runtime = "${var.launcher_runtime}"
timeout = "${var.launcher_timeout}"
environment {
variables = {
img_id = "${data.aws_ami.current_ami.id}"
profile = "${aws_iam_instance_profile.instance_profile.name}"
instance_name = "${local.transcoder_clone}"
env = "${var.component_postfix_env_tag}"
instance_type = "${var.instance_type}"
sg = "${var.security_groups}"
key = "${var.ssh_key}"
region = "${var.aws_region}"
dest_bucket = "${local.destination_bucket_component}"
backend_api = "${var.backend_api}"
backend_api_access_key = "${var.backend_api_access_key}"
backend_api_secret_key = "${var.backend_api_secret_key}"
pallbearer_endpoint = "${aws_lambda_function_url.pallbearer_endpoint.function_url}"
cloudfront_dis_id = "${aws_cloudfront_distribution.destination_distribution.id}"
}
}
description = "Creates chef instance to cook video."
tags = {
app = "${var.component_prefix}"
Environment = "${var.component_postfix_env_tag}"
}
}
# this endpoint is created with retries in mind only.
# it is not meant to be used by any other means.
resource "aws_lambda_function_url" "lambda_endpoint" {
function_name = aws_lambda_function.lambda_lambda.function_name
# authorization_type = "AWS_IAM"
authorization_type = "NONE"
cors {
allow_credentials = false
allow_origins = ["*"]
allow_methods = ["*"]
max_age = 0
}
}
output "lambda_url" {
value = aws_lambda_function_url.lambda_endpoint.function_url
}
File 3: main.tf or provider.tf
This is the file where we configure the cloud provider and backend -
provider "aws" {
region = "${var.aws_region}"
}
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.30"
}
}
}
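With the three files in place, the usual workflow applies: terraform init to download the AWS provider, terraform plan to review what will be created, and terraform apply to provision the Lambda, IAM role, policy, and S3 object. The lambda_url output is printed at the end of the apply.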
That is it actually. It's that easy.
Terminate instance after completing the job
Now our Lambda can be triggered using the URL, and we can start using EC2 like a big brother of Lambda. If you want the instance to shut itself down after your process finishes, add sudo shutdown -h now at the end of the user data. Since the instance is launched with InstanceInitiatedShutdownBehavior='terminate', that shutdown terminates the instance rather than just stopping it, so nothing keeps running (or billing). The user data then becomes -
init_script = f'''#!/bin/bash
echo "run something on your instance"
sudo shutdown -h now
'''
Conclusion
This solution can be beneficial for use cases like machine learning model training and video processing when the job frequency is low. To find out what suits your workload best, just try it and measure. Something will click.
Thanks for reading. Happy Coding!