Asna Siji

Solving it the AWS way.

My kids love to write and draw. To reduce screen time and improve their writing skills, I encourage them to scribble their thoughts on paper. So far so good. But keeping the hard copies is a bit difficult. I would like to convert these handwritten notes to soft copies in text format (foreseeing some Kindle plans once they grow into a small bundle :-) ).

Let’s solve this the AWS way.

Solution Description

Let’s build a solution using S3 replication and event notifications, a serverless compute layer with a Lambda function, and Textract. The architecture works as follows.

Two buckets are created: a source bucket and a destination bucket. The source bucket holds the images the user provides as input.
The drawings are uploaded to the source bucket with a tag, say ‘picture’. S3 Same-Region Replication is enabled on the source bucket, filtered on that tag.
A folder, say ‘images’, is created in the source bucket for the handwritten text. An S3 event notification is enabled on the source bucket for the prefix ‘images/’, so that any upload to this folder triggers a notification.
A Lambda function is set up to receive these event notifications. It invokes Textract, passing the uploaded image as input. Textract scans the image and returns the words it identifies to the Lambda function.
The Lambda function consolidates the words into sentences and saves them to the destination bucket as text.
An event notification is enabled on the destination bucket to publish to SNS, which sends an email once the process is complete.

Serverless Features in the Spotlight
S3 Replication

Amazon Simple Storage Service (S3) can automatically replicate S3 objects from one bucket to another to help you reduce costs and protect data.
The two main replication options are:
Cross-Region Replication (CRR): copies S3 objects between buckets in different AWS Regions.
Same-Region Replication (SRR): copies S3 objects between buckets in the same AWS Region.
S3 also offers Replication Time Control (RTC), which is designed to replicate 99.99% of new objects within fifteen minutes and is backed by a service level agreement.

S3 Event Notification

The Amazon S3 Event Notifications feature lets you receive notifications when certain events happen in your S3 bucket.
To enable it, add a notification configuration that names the events you want S3 to publish and the destination where S3 should send the notifications.

Computation Layer with Lambda

AWS Lambda is a serverless, event-driven compute service that runs your code without provisioning or managing servers.

Textract

Amazon Textract is a service that automatically extracts text, handwriting, and data from documents and images.
It uses machine learning (ML) to do this.
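
To make the Textract output a little more concrete, here is a minimal sketch of turning a response into plain text. It assumes the DetectDocumentText operation, whose response carries a `Blocks` array with `PAGE`, `LINE` and `WORD` entries. The helper name `blocksToText` and the sample data are made up for illustration; this is not the code from the repo used later in this post.

```javascript
// Hypothetical helper: keep only LINE blocks and join them with newlines.
function blocksToText(response) {
  const blocks = response.Blocks || [];
  return blocks
    .filter((block) => block.BlockType === 'LINE') // skip PAGE and WORD blocks
    .map((block) => block.Text)
    .join('\n');
}

// Hand-written sample in the shape Textract returns (not real output).
const sampleResponse = {
  Blocks: [
    { BlockType: 'PAGE' },
    { BlockType: 'LINE', Text: 'Once upon a time', Confidence: 98.1 },
    { BlockType: 'WORD', Text: 'Once', Confidence: 99.0 },
    { BlockType: 'LINE', Text: 'there was a cat', Confidence: 91.4 },
  ],
};

console.log(blocksToText(sampleResponse));
```

Working from `LINE` blocks rather than `WORD` blocks saves re-assembling the words yourself, at the cost of trusting Textract's own line grouping.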

SNS

Amazon Simple Notification Service (Amazon SNS) is a fully managed messaging service. It enables communication between applications through publish/subscribe (pub/sub) messaging. It also supports application-to-person communication by sending messages to users at scale via SMS, mobile push, and email.

Let’s Build.
Step 1 : Create Buckets

Navigate to the S3 console and create two buckets in the same Region, say ‘textractsourcebucket’ and ‘textractdestinationbucket’, with versioning enabled (replication requires versioning on both buckets).

Step 2 : Creating Roles

We need two roles: one for S3 replication and another for the Lambda function.

Role 1 : Let the S3 replication feature create this role itself in Step 3. Skip it for now.

Role 2 : Create a role with CloudWatch, Textract and S3 access for Lambda.
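
If you prefer a narrower policy than the broad managed ones, the permissions for Role 2 could look roughly like the sketch below. It assumes the function calls Textract's DetectDocumentText, reads from the source bucket and writes to the destination bucket. The function name `buildLambdaRolePolicy` and the `Sid` values are my own, so adjust the actions to whatever your code actually calls.

```javascript
// Hypothetical helper: build an IAM policy document for the Lambda role.
function buildLambdaRolePolicy(sourceBucket, destinationBucket) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'WriteLogs', // lets the function write to CloudWatch Logs
        Effect: 'Allow',
        Action: ['logs:CreateLogGroup', 'logs:CreateLogStream', 'logs:PutLogEvents'],
        Resource: '*',
      },
      {
        Sid: 'ReadNotes', // read the uploaded images
        Effect: 'Allow',
        Action: 's3:GetObject',
        Resource: `arn:aws:s3:::${sourceBucket}/*`,
      },
      {
        Sid: 'WriteText', // save the extracted text
        Effect: 'Allow',
        Action: 's3:PutObject',
        Resource: `arn:aws:s3:::${destinationBucket}/*`,
      },
      {
        Sid: 'ExtractText', // assumption: only DetectDocumentText is used
        Effect: 'Allow',
        Action: 'textract:DetectDocumentText',
        Resource: '*',
      },
    ],
  };
}

const examplePolicy = buildLambdaRolePolicy('textractsourcebucket', 'textractdestinationbucket');
console.log(JSON.stringify(examplePolicy, null, 2));
```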


Step 3 : Enabling replication

Navigate to the Management tab of the source bucket and create replication rules.

Create a replication rule and give it a name. Enable versioning if you have not already done so.

We filter the objects to replicate by tag. The drawings will be uploaded with the tag ‘picture’.

Select the destination bucket created in Step 1 as the destination.

Allow the S3 replication feature to create a role for you with the required permissions.

Save the replication configuration.


You can choose not to replicate existing objects.
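
For reference, the rule created in the console corresponds roughly to the replication configuration sketched below. The tag key `type`, the rule ID and the role ARN are placeholders I made up, since S3 tags are key/value pairs and the console generates the role for you. With a tag-based filter, the rule needs a priority and delete marker replication has to stay disabled.

```javascript
// Hypothetical helper: build the ReplicationConfiguration for a tag-filtered rule.
function buildReplicationConfig({ roleArn, destinationBucket, tagKey, tagValue }) {
  return {
    Role: roleArn,
    Rules: [
      {
        ID: 'replicate-drawings', // made-up rule name
        Status: 'Enabled',
        Priority: 1, // required when a Filter is present
        Filter: { Tag: { Key: tagKey, Value: tagValue } },
        DeleteMarkerReplication: { Status: 'Disabled' }, // required with tag filters
        Destination: { Bucket: `arn:aws:s3:::${destinationBucket}` },
      },
    ],
  };
}

const exampleReplication = buildReplicationConfig({
  roleArn: 'arn:aws:iam::111122223333:role/example-replication-role', // placeholder
  destinationBucket: 'textractdestinationbucket',
  tagKey: 'type', // assumption: the tag key is not stated in this post
  tagValue: 'picture',
});
console.log(JSON.stringify(exampleReplication, null, 2));
```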

Step 4 : Creating lambda function

The Lambda function receives the event notification from S3 when a handwritten note is uploaded. It calls Textract to extract the text, consolidates it, and saves it to the destination bucket as a text file.
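
The flow described above can be sketched as a small function with the AWS calls injected, so it can be tried locally without an AWS account. This is only an illustration under my own assumptions, not the index.js from the repo: `convertNote`, `toTextKey`, `detectLines` and `putText` are hypothetical names, and in a real function the last two would wrap the Textract and S3 SDK calls.

```javascript
// Swap the image extension for .txt, keeping the folder: images/a.jpg -> images/a.txt
function toTextKey(imageKey) {
  return imageKey.replace(/\.[^./]+$/, '') + '.txt';
}

// deps.detectLines and deps.putText are hypothetical wrappers around the
// Textract and S3 SDK calls, injected so the flow can run without AWS access.
async function convertNote({ bucket, key }, destinationBucket, deps) {
  const lines = await deps.detectLines(bucket, key); // array of strings
  const text = lines.join('\n');
  const textKey = toTextKey(key);
  await deps.putText(destinationBucket, textKey, text);
  return { textKey, lineCount: lines.length };
}

// In-memory stand-ins for a local dry run.
const savedObjects = {};
const fakeDeps = {
  detectLines: async () => ['My cat is fluffy.', 'She likes fish.'],
  putText: async (bucket, key, text) => {
    savedObjects[`${bucket}/${key}`] = text;
  },
};

const dryRun = convertNote(
  { bucket: 'textractsourcebucket', key: 'images/story1.jpg' },
  'textractdestinationbucket',
  fakeDeps
);
dryRun.then((result) => console.log(result, savedObjects));
```

Keeping the folder name in the output key means the text file lands next to where you would expect it in the destination bucket.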

Navigate to the Lambda console and create a new function with Node.js as the runtime.

Change the execution role from the default to an existing role and select the role created for Lambda in Step 2 (Role 2). Copy the contents of index.js from the GitHub repo below into the code section of the function, after modifying the region, bucket name and so on as needed.

https://github.com/asnakhader/textract

Create a test event to test the function. A sample is given below (obtained from the CloudWatch logs when the S3 event notification fires).

{
    "Records": [
        {
            "eventVersion": "2.1",
            "eventSource": "aws:s3",
            "awsRegion": "ap-south-1",
            "eventTime": "2022-09-23T04:59:35.441Z",
            "eventName": "ObjectCreated:Put",
            "userIdentity": {
                "principalId": "AWS:BBBBBBB"
            },
            "requestParameters": {
                "sourceIPAddress": "00.00.00.174"
            },
            "responseElements": {
                "x-amz-request-id": "XDG0P50ZJ02XX7JR",
                "x-amz-id-2": "c+xP2nQ780GjtppxeXERgXK9OJt7ZTLUqR941EE9/y74GhIBoQX5ZRb2leJWD40XpkFFK82HZR/lfT/lTR0n/Atnr26OmFK5wzx3F7npKJc="
            },
            "s3": {
                "s3SchemaVersion": "1.0",
                "configurationId": "storyUpload",
                "bucket": {
                    "name": "<your source bucket name>",
                    "ownerIdentity": {
                        "principalId": "BBBBBBB"
                    },
                    "arn": "arn:aws:s3:::<your source bucket name>"
                },
                "object": {
                    "key": "<your object name>",
                    "size": 1522574,
                    "eTag": "c5fa617910f9118efa01ddcbd82e1433",
                    "versionId": "Q9yd0M4L2otyIdURpFOgnZdi46C54y91",
                    "sequencer": "00632D3D375CC84842"
                }
            }
        }
    ]
}
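
As a rough illustration of what a function needs from an event like this, the sketch below pulls out the bucket name and object key from each record. Object keys arrive URL-encoded in S3 events, with spaces sent as `+`, so they are decoded first. The helper name `parseS3Event` and the sample key are made up for this example.

```javascript
// Hypothetical helper: list the bucket and decoded key of every S3 record.
function parseS3Event(event) {
  return (event.Records || [])
    .filter((record) => record.eventSource === 'aws:s3')
    .map((record) => ({
      bucket: record.s3.bucket.name,
      // S3 URL-encodes keys in events and sends spaces as '+'
      key: decodeURIComponent(record.s3.object.key.replace(/\+/g, ' ')),
    }));
}

// Trimmed-down event with a made-up key containing spaces.
const trimmedEvent = {
  Records: [
    {
      eventSource: 'aws:s3',
      eventName: 'ObjectCreated:Put',
      s3: {
        bucket: { name: 'textractsourcebucket' },
        object: { key: 'images/my+first+story.jpg' },
      },
    },
  ],
};

console.log(parseS3Event(trimmedEvent));
```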

Test the function to see if it returns a success response.

Step 5 : Setting up S3 Event Notification

Navigate to the S3 console and select the source bucket.

Create a folder, say ‘images’, to upload handwritten notes.

Navigate to the ‘Properties’ tab and create event notifications.


Create an event notification with the prefix images/ (object keys have no leading slash) and the suffix .jpg, since the handwritten notes will be uploaded to this folder as .jpg images.

Enable this for all object creation events and set the destination to the Lambda function created earlier.
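
Behind the console form, this is a notification configuration with a prefix and a suffix filter rule. Here is a minimal sketch of it. The Lambda ARN is a placeholder, and `matchesFilter` is a small helper of my own that mimics the matching locally. Since keys in the folder look like `images/story1.jpg`, the sketch uses `images/` as the prefix.

```javascript
// Hypothetical helper: notification configuration that invokes a Lambda function.
function buildLambdaNotification({ lambdaArn, prefix, suffix }) {
  return {
    LambdaFunctionConfigurations: [
      {
        Id: 'storyUpload',
        LambdaFunctionArn: lambdaArn,
        Events: ['s3:ObjectCreated:*'], // all object creation events
        Filter: {
          Key: {
            FilterRules: [
              { Name: 'prefix', Value: prefix },
              { Name: 'suffix', Value: suffix },
            ],
          },
        },
      },
    ],
  };
}

// Local stand-in for the filter: a key must start with the prefix and end with the suffix.
function matchesFilter(key, prefix, suffix) {
  return key.startsWith(prefix) && key.endsWith(suffix);
}

const exampleNotification = buildLambdaNotification({
  lambdaArn: 'arn:aws:lambda:ap-south-1:111122223333:function:example-textract-function', // placeholder
  prefix: 'images/',
  suffix: '.jpg',
});

console.log(JSON.stringify(exampleNotification, null, 2));
console.log(matchesFilter('images/story1.jpg', 'images/', '.jpg'));
console.log(matchesFilter('images/story1.jpg', '/images', '.jpg'));
```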


Step 6 : Create SNS Topic

Navigate to the SNS Console and create a topic.

Navigate to the Access Policy tab and attach a policy that allows the S3 bucket to publish messages to the SNS topic.

A sample policy is given below.
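
A minimal sketch of such a topic policy, written as a small function so the ARNs are easy to swap. The topic ARN and account ID are placeholders, and the `Sid` and function name are made up. It allows the S3 service to publish to the topic, but only on behalf of the named bucket in your own account.

```javascript
// Hypothetical helper: SNS access policy that lets one S3 bucket publish to the topic.
function buildTopicPolicy({ topicArn, bucketName, accountId }) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'AllowS3ToPublish',
        Effect: 'Allow',
        Principal: { Service: 's3.amazonaws.com' },
        Action: 'SNS:Publish',
        Resource: topicArn,
        Condition: {
          // restrict to events coming from this bucket in this account
          ArnLike: { 'aws:SourceArn': `arn:aws:s3:::${bucketName}` },
          StringEquals: { 'aws:SourceAccount': accountId },
        },
      },
    ],
  };
}

const exampleTopicPolicy = buildTopicPolicy({
  topicArn: 'arn:aws:sns:ap-south-1:111122223333:example-topic', // placeholder
  bucketName: 'textractdestinationbucket',
  accountId: '111122223333', // placeholder
});
console.log(JSON.stringify(exampleTopicPolicy, null, 2));
```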

Create a subscription for this topic with an email endpoint.

Confirm the subscription by following the confirmation link received by email.

Step 7 : Setting up email notification for process completion

Navigate to the destination bucket and create an event notification.

Select the previously created SNS topic as the destination and save the changes.

Step 8 : Testing

Navigate to the S3 source bucket and upload a drawing with the tag ‘picture’.

Upload the handwritten text images to the ‘images’ folder.

Navigate to the destination bucket and check the details. S3 replication has copied the uploaded image with the tag ‘picture’.

Check the details inside the ‘images’ folder.

The Lambda function with Textract has converted the handwritten notes uploaded as .jpg images into text format. An email notification is received indicating process completion. Woohoo!

Final Product

Excuse the spelling / grammar mistakes pls. Big shoutout to Textract for reading this :-)

At the end of the day, the whole purpose of technology is to ease human lives and save us from monotonous and repetitive work. Do you agree?

Tail End: Now that the kids have taken pictures of all their stories and given them to me, the ball is in my court to convert them and keep them safe.
