Vikas Solegaonkar for AWS Community Builders

Posted on May 13, 2021 • Edited on Jun 9, 2022 • Originally published at builtin.com

Asynchronous API with DynamoDB Streams

#aws #dynamodb #serverless #api

Responsiveness is one of the most important parameters for the success of any web application. And asynchronous processing is The Solution for attaining this responsiveness. A server request from the browser should return immediately - without wait for completion. The data flow should be designed in a way that it does not depend upon an immediate response from the server.

There are several architecture patterns to do this. But a major problem with asynchronous processing is error handling. How would the client know if the request failed? We should not lose data in the process. In fact, a fire and forget service, cannot afford to fail. Even if the processing failed for some reason, the data has to reach the DB.

DynamoDB streams provides a cool solution to this problem. Let's check it out

What is DynamoDB Streams

Most of the traditional databases have a concept of Triggers. These are events generated when some data change in the DB. DynamoDB Streams are quite similar. With one difference - instead of generating a distinct trigger per data change, it generates a stream of events that flows into a target - Lambda or Kinesis.

We can have a Lambda function triggered by such events - which can process this data. The incoming API call can directly dump the data into the DynamoDB - using API Gateway service integration. This ensures very low response time in the API. The DynamoDB streams can be configured to invoke a Lambda function that can

We have an advantage here - the data is already available in our DB before we start processing it. Even if the downstream processing fails, the data is already available in the DB. And in the unlikely event of DB insert/update failing, the API itself will return an error and the client will be able to handle it.

Thus, we have best of both the worlds - low response time as well as error resilience. Let us try to implement this in our AWS account.

Lambda Function

Start with creating a Lambda Function. This is simple. Just go to the Lambda Console, create a new Lambda Function. Call it StreamProcessor. (or any other name you like). Add the below code in it:



const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

const TABLE_NAME = "Restaurant";
const TTL = 300;

exports.handler = async(event) => {
    var plist = [];
    event.Records.forEach(record => {
        if (record.eventName == "INSERT") {
            plist.push(processInsert(record.dynamodb.NewImage));
        }
    });

    await Promise.all(plist);
    const response = {
        statusCode: 200,
        body: JSON.stringify('Hello from Lambda!'),
    };
    return response;
};

const processInsert = async(newImage) => {
    // Business Logic to process the input
    console.log("Processing: " + JSON.stringify(newImage));
    var error = await businesslogic(newImage);

    if (!error) {
        await ddb.update({
            TableName: TABLE_NAME,
            Key: { id: newImage.id.S },
            UpdateExpression: "set #error = :error",
            ExpressionAttributeValues: { ":error": error },
            ExpressionAttributeNames: { "#error": "error" }
        }).promise();
    }
    else {
        await ddb.update({
            TableName: TABLE_NAME,
            Key: { id: newImage.id.S },
            UpdateExpression: "set #ttl = :ttl",
            ExpressionAttributeValues: { ":ttl": TTL + Math.floor(Date.now() / 1000) },
            ExpressionAttributeNames: { "#ttl": "ttl" }
        }).promise();
    }
};

const businesslogic = async(input) => {
    return;    // return "Error Details" in case of any error
};

IAM Role for Lambda

This lambda function should have enough permissions to operate on the streams of DynamoDB, along with the update permissions. To get this, create a new IAM Role.

Along with the usual Lambda permissions, include the below for DB permissions



{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "dynamodb:DescribeStream",
                "dynamodb:UpdateItem",
                "dynamodb:GetShardIterator",
                "dynamodb:GetItem",
                "dynamodb:UpdateTable",
                "dynamodb:GetRecords"
            ],
            "Resource": [
                "arn:aws:dynamodb:us-east-1:1234567890:table/TableName",
                "arn:aws:dynamodb:us-east-1:1234567890:table/TableName/index/*",
                "arn:aws:dynamodb:us-east-1:1234567890:table/TableName/stream/*"
            ]
        }
    ]
}

DynamoDB Table

Next, we create the table in DynamoDB. This table should have a primary key - id. On the Exports & Streams tab, click on Enable DynamoDB Streams (Not the Kinesis Streams).

Along with that, add the Lambda Triggers to the stream. Click on Create Trigger and choose the Lambda Function we created above. And that sets up the stream + trigger.

Then, go the Additional Settings tab. There, we can enable the TTL as below.

That will setup the DynamoDB table for us.

API Gateway

Finally, we come to the API Gateway - to create an API. A client can invoke this API. Here, we configure the API Gateway to add the request into the database table. This is achieved using the AWS Service integration of the API Gateway. Let's work on that.

Create a new REST API on the API Gateway. And add a Put method in it. Integrate the request with DynamoDB as below:

And add this to the JSON Mapping Template.



{
    "TableName":"TableName",
    "Item": {
        "id": {"S": "$context.requestId"},
        "request": {"S": "$util.escapeJavaScript($input.body)"}
    }
}

For every API call, this will add a new Item to the table - with the Request Id as the key. This insert will trigger the Lambda function and take the processing ahead.

Benefits

One would ask - what is the big deal? Why not directly call the lambda from API Gateway? There are two aspects to this. Foremost, the response time. Directly adding data from API Gateway to DynamoDB is a lot faster than invoking a Lambda. So the client application sees superfast response.

Second, resilience and error handling get complex when we invoke Lambda directly from the API gateway? What do we do if the Lambda fails half way? Retry the same API? Invoke another API that will heal the damage? This is too much intelligence to be carried into the client. It is best that the server manages this for itself.

The client just dumps data into the DynamoDB, and is assured that its data is accepted and will be processed by the server. It will not be lost because it is already in the DB. The processing and error handling is then localized on the server - leading to a cleaner architecture.

Top comments (4)

Comment hidden by post author - thread only accessible via permalink

Rolf Streefkerk • May 14 '21

you're missing the feedback loop here, probably adding a websocket would solve that problem.

the other option is to use an API Gateway & Lambda Async function with a websocket, from the client's perspective it's virtually the same experience.

Vikas Solegaonkar • May 14 '21 • Edited

Thanks Rolf for your inputs.

Websocket with API Gateway - Lambda is async anyway. But my main concern in this application was resilience - without the client worrying about retries. Not really worried about the feedback.

The original application had Lambda picking directly from the API Gateway. It had a complex processing logic - that would often fail halfway. Error handling was messy because of this. So I moved it to this architecture where the input request is baselined in the DB - leaving the client free of any error handling responsibility.

Additionally, as the Lambda processes the request, it updates the DB record with the status. This helps it resume the processing if it errors half way. I know we can do this partly in Step functions. But the functionality is pretty complex and interwoven, so I had no option but to do it in code.

Do suggest if there is a better way to solve this problem - Resilience in Web sockets - with no retries on the client.

Emil • May 14 '21 • Edited

I would say that depends if you need a feedback at all. Mostly cases you can return http 201 and/or refresh the page and reload you data from the backend. A websocket feedback does not make sense in my view since its way more complicated and makes it synchronous somehow. You can then check the error state then.

The approach all behind that is a Saga

Rolf Streefkerk • May 14 '21

It does not make it synchronous, in some cases you want to inform the end-user about the status and such a websocket feedback loop makes sense.

Some comments have been hidden by the post's author - find out more

DEV Community

Asynchronous API with DynamoDB Streams

What is DynamoDB Streams

Lambda Function

IAM Role for Lambda

DynamoDB Table

API Gateway

Benefits

Top comments (4)

Read next

Effective Strategies for Application Monitoring in AWS

RDS MySQL Zero ETL Integration with Redshift

Autopost to Bluesky and Mastodon by API

Develop APIs 100x Faster: Unlock the Power of This All-in-One IntelliJ Plugin