Amazon Rekognition facilitates object and scene detection in images, offering a secure, stateless API that returns a list of related labels along with confidence levels.
In this tutorial, you'll build a serverless system to perform object detection upon image uploads to an Amazon S3 bucket. AWS Lambda will handle the processing logic, while Amazon DynamoDB will serve as the storage solution for the label detection results.
Object Detection Context and Limitations
Object and Scene Detection involves identifying objects and their context within an image. Object Detection locates specific instances of objects like humans, cars, buildings, etc. In Amazon Rekognition, Object Detection is akin to Image Labeling, extracting semantic labels from images. This approach also lends itself to scene detection, identifying both individual objects and overall scene attributes, such as "person" or "beach".
In AWS, object detection with Amazon Rekognition uses the DetectLabels API. You can supply the image either as raw image bytes or as a reference to an S3 object. Passing an S3 object avoids uploading the image again and supports images up to 15MB in size, compared to the 5MB limit for images passed as raw bytes.
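As a rough illustration of the two input styles, the sketch below builds the Image argument for DetectLabels either from raw bytes or from an S3 reference. The helper names (build_image_param, detect_labels) and the size check are assumptions made for this example and are not part of any AWS SDK. The actual API call needs boto3 and valid AWS credentials.

```python
MAX_BYTES_INLINE = 5 * 1024 * 1024  # 5MB limit for images sent as raw bytes


def build_image_param(bucket=None, key=None, image_bytes=None):
    """Return the Image argument for DetectLabels (hypothetical helper)."""
    if image_bytes is not None:
        if len(image_bytes) > MAX_BYTES_INLINE:
            raise ValueError("image too large to send as bytes; use S3 instead")
        return {"Bytes": image_bytes}
    if bucket and key:
        return {"S3Object": {"Bucket": bucket, "Name": key}}
    raise ValueError("provide either image_bytes or both bucket and key")


def detect_labels(image_param, max_labels=10, min_confidence=80):
    """Call DetectLabels; requires boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so build_image_param works without it

    client = boto3.client("rekognition")
    return client.detect_labels(
        Image=image_param, MaxLabels=max_labels, MinConfidence=min_confidence
    )
```

For example, build_image_param(bucket="demo21-2", key="images/cat.jpg") produces the S3Object form, where the object key is only a placeholder.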
The response typically follows a JSON structure similar to the following:
{
    "Labels": [
        {
            "Confidence": 97,
            "Name": "Person"
        },
        {
            "Confidence": 96,
            "Name": "Animal"
        },
        ...
    ]
}
The API provides an ordered list of labels, ranked by confidence level, starting from the highest.
The quantity of labels returned is determined by two parameters:
- MaxLabels: This sets the maximum number of labels returned by the API.
- MinConfidence: Labels with a confidence score below this threshold will not be included in the response.
Keep in mind that a high MinConfidence value can produce an empty response when no label reaches the threshold, and a low MaxLabels value further limits what comes back.
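To get a feel for how the two parameters interact, the following sketch emulates the filtering locally on a hand-written list of labels. It is not how Rekognition works internally. The filter_labels function and the sample values (including the "Beach" entry) are made up for illustration.

```python
def filter_labels(labels, max_labels, min_confidence):
    """Local emulation: drop low-confidence labels, sort, then truncate."""
    kept = [label for label in labels if label["Confidence"] >= min_confidence]
    kept.sort(key=lambda label: label["Confidence"], reverse=True)
    return kept[:max_labels]


# Illustrative sample data, not real Rekognition output.
sample = [
    {"Name": "Person", "Confidence": 97},
    {"Name": "Animal", "Confidence": 96},
    {"Name": "Beach", "Confidence": 71},
]

print(filter_labels(sample, max_labels=10, min_confidence=80))  # two labels
print(filter_labels(sample, max_labels=1, min_confidence=80))   # one label
print(filter_labels(sample, max_labels=10, min_confidence=99))  # empty list
```

The last call shows the empty-response case: no label reaches the threshold, so nothing is returned regardless of MaxLabels.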
Throughout the tutorial, we'll use an S3 bucket to store images, leveraging the API to extract labels from each newly uploaded image. We'll store each image-label pair in a DynamoDB table.
Let's start
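1) Create a DynamoDB table named "images" with a partition key named "PK" of type String (this is the key the Lambda code in this tutorial writes to)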
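If you would rather create the table from code than from the console, a minimal sketch could look like the following. The partition key name "PK" matches the attribute the Lambda function writes. The on-demand billing mode and the helper names are assumptions for this example. Running create_images_table requires boto3 and AWS credentials.

```python
def images_table_definition(table_name="images"):
    """Build the arguments for DynamoDB CreateTable (hypothetical helper)."""
    return {
        "TableName": table_name,
        # Single string partition key holding the S3 object key.
        "KeySchema": [{"AttributeName": "PK", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "PK", "AttributeType": "S"}],
        # Assumption: on-demand capacity, so no throughput settings are needed.
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_images_table():
    """Create the table and wait until it is ready to use."""
    import boto3  # imported lazily; only needed for the real call

    client = boto3.client("dynamodb")
    definition = images_table_definition()
    client.create_table(**definition)
    client.get_waiter("table_exists").wait(TableName=definition["TableName"])
```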
2) Create S3 bucket
Choose a unique name for the bucket. I will use the name "demo21-2".
Make sure to select "ACLs enabled".
Create a folder in the bucket and name it "images".
3) Create a Lambda function as follows:
Choose "Use a blueprint" when creating the function.
For the blueprint name, choose "Use Rekognition to detect faces"
You will need to give the Lambda function permissions to access the following services:
- "CloudWatch" to write logs
- "DynamoDB" to put and update items
- "Rekognition" to detect labels and faces
- "S3" to read the uploaded images
To do so, create a role with the following IAM policies and have the Lambda function assume it.
Basic execution role
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        }
    ]
}
Lambda policy
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "dynamodb:DescribeStream",
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:ListStreams"
            ],
            "Resource": "arn:aws:dynamodb:us-west-2:909737842772:table/images",
            "Effect": "Allow"
        },
        {
            "Action": [
                "rekognition:DetectLabels",
                "rekognition:DetectFaces"
            ],
            "Resource": "*",
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:*"
            ],
            "Resource": "*",
            "Effect": "Allow"
        }
    ]
}
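The "s3:*" statement above is broad. As a possible tightening, and assuming the function only needs to read uploaded images so that Rekognition can fetch them, the sketch below generates a narrower statement limited to s3:GetObject on the "images/" prefix. The helper name is hypothetical, and you should verify that the reduced permission covers your use case before adopting it.

```python
import json


def s3_read_statement(bucket, prefix="images/"):
    """Build a read-only S3 policy statement (hypothetical helper)."""
    return {
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        # Restrict access to objects under the given prefix of one bucket.
        "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
    }


print(json.dumps(s3_read_statement("demo21-2"), indent=4))
```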
For the S3 trigger section enter the following:
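As a sketch of what such a trigger amounts to, the snippet below builds an S3 notification configuration in code. The event type (all object-created events) and the "images/" prefix filter are assumptions based on the bucket layout described earlier, the function ARN must be supplied by you, and the helper names are hypothetical. Applying the configuration requires boto3 and AWS credentials.

```python
def lambda_trigger_config(function_arn, prefix="images/"):
    """Build an S3 notification configuration that invokes a Lambda function."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": function_arn,
                # Assumption: fire on every kind of object creation.
                "Events": ["s3:ObjectCreated:*"],
                # Only objects under the prefix trigger the function.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


def apply_trigger(bucket, function_arn):
    """Attach the trigger to the bucket (replaces existing notifications)."""
    import boto3  # imported lazily; only needed for the real call

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=lambda_trigger_config(function_arn),
    )
```

Note that when you set the trigger up outside the console, S3 also needs permission to invoke the function. The console adds that permission for you.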
You are now ready to test the function and check that it is triggered successfully and calls Amazon Rekognition.
Upload a picture to the "images" folder in your bucket.
Go to the CloudWatch log group for your function. You should see a log entry for the invocation.
Implementing the Object Detection Logic
In the Code source section, double-click the lambda_function.py file, and replace the code with the following:
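import urllib.parse

import boto3

# Clients are created once per execution environment and reused across invocations.
rekognition = boto3.client('rekognition', region_name='us-west-2')
table = boto3.resource('dynamodb').Table('images')


def detect_labels(bucket, key):
    # Ask Rekognition for at most 10 labels with at least 80% confidence.
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=80,
    )
    labels = [label_prediction['Name'] for label_prediction in response['Labels']]
    # Store the label names under the S3 object key.
    table.put_item(Item={
        'PK': key,
        'Labels': labels,
    })
    return response


def lambda_handler(event, context):
    # Extract the bucket name and object key from the first S3 event record.
    data = event['Records'][0]['s3']
    bucket = data['bucket']['name']
    # Object keys arrive URL-encoded (spaces become '+'), so decode them first.
    key = urllib.parse.unquote_plus(data['object']['key'])
    try:
        response = detect_labels(bucket, key)
        print(response)
        return response
    except Exception as e:
        # Log the error, then re-raise so the invocation is marked as failed.
        print(e)
        raise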
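The handler above reads only the first record of the event. As a small, self-contained sketch of the event parsing step, the helper below extracts the bucket and decoded key from every record. The function name and the sample event are made up for illustration, and the event is trimmed down to the fields the code reads.

```python
import urllib.parse


def extract_objects(event):
    """Return (bucket, key) pairs for all S3 records in a Lambda event."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Decode the URL-encoded key: '+' becomes a space, %XX becomes a character.
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        objects.append((bucket, key))
    return objects


# Trimmed-down, illustrative event with two uploaded objects.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "demo21-2"},
                "object": {"key": "images/my+beach+photo.jpg"}}},
        {"s3": {"bucket": {"name": "demo21-2"},
                "object": {"key": "images/caf%C3%A9.png"}}},
    ]
}

print(extract_objects(sample_event))
```

Decoding matters because a file uploaded as "my beach photo.jpg" would otherwise be looked up under the encoded name, which does not exist in the bucket.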
The code above does the following:
- Each uploaded image triggers the creation of a new DynamoDB item, with the S3 object key serving as the primary key.
- The item includes the labels returned by Amazon Rekognition, stored as a list of strings in DynamoDB.
- Storing labels in DynamoDB lets you retrieve them repeatedly without additional calls to Amazon Rekognition.
- Labels can be retrieved either by primary key or by scanning the DynamoDB table and filtering for specific labels using a CONTAINS DynamoDB filter.
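The two retrieval paths can be sketched as follows. Both helpers are hypothetical and expect a boto3 DynamoDB Table resource, for example boto3.resource('dynamodb').Table('images'), passed in as the table argument. The scan version follows pagination, since a single scan call may return only part of the table.

```python
def get_labels(table, key):
    """Look up the labels of one image by its primary key."""
    item = table.get_item(Key={"PK": key}).get("Item")
    return item["Labels"] if item else None


def find_images_with_label(table, label):
    """Scan the table and return the keys of items whose Labels contain label."""
    kwargs = {
        "FilterExpression": "contains(#labels, :label)",
        # Placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#labels": "Labels"},
        "ExpressionAttributeValues": {":label": label},
    }
    keys = []
    while True:
        page = table.scan(**kwargs)
        keys.extend(item["PK"] for item in page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return keys
        # Continue the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

A scan reads the whole table and filters afterwards, so this approach is best suited to small tables like the one in this tutorial.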
You can now test the function by uploading one or more images to the "images" folder in your bucket.
Go to the DynamoDB table and click "Explore items" in the left pane. You will find the items with the labels recognized by Amazon Rekognition.
Conclusion
In this tutorial, you set up an Amazon S3 bucket and configured an AWS Lambda function to trigger when images are uploaded to the bucket. You implemented the Lambda function to call the Amazon Rekognition API, label the images, and store the results in Amazon DynamoDB. Finally, you saw how the stored labels can be looked up in DynamoDB.
This serverless setup offers flexibility, allowing for customization of the Lambda function to address more complex scenarios with minimal effort.