Amazon DynamoDB single-table design with AWS Step Functions

In my previous blog post, I talked about performing Lambda-less outbound HTTP requests on AWS.

Continuing with the Lambda-less approach, in this article I would like to explore how one might implement some single-table design data schemes on AWS, using Amazon DynamoDB as the persistence layer and AWS Step Functions as the orchestration layer.

This blog post is based on the code from this GitHub repository.

Let us dive in.

A note about Amazon DynamoDB single-table design

This blog post assumes that you are familiar with the concept of a single-table design and Amazon DynamoDB. If you are not, do not fret!

There are great resources on the subject matter. In my personal opinion, this article by none other than Alex DeBrie is an excellent introduction to the concept.

Composite Keys

A heavily used concept in Amazon DynamoDB single-table design is the composite key. A composite partition or sort key is created by concatenating values together, for example an entity type with its ID.

The following is an example of using composite partition and sort keys to retrieve an entity from Amazon DynamoDB with the AWS SDK for JavaScript.

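const { DynamoDBClient, GetItemCommand } = require("@aws-sdk/client-dynamodb");

const client = new DynamoDBClient({ region: "us-west-2" });

const command = new GetItemCommand({
  TableName: "YOUR_TABLE_NAME",
  Key: {
    // Composite partition (PK) and sort (SK) keys.
    // The low-level client expects attribute values, hence the `S` (string) wrappers.
    PK: { S: "USER#<USER_ID>" },
    SK: { S: "METADATA#<USER_ID>" }
  }
});

// `await` is only valid inside an async function (or an ES module).
async function getUser() {
  const result = await client.send(command);
  return result.Item;
}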

Creating values such as USER#<USER_ID> is usually not a problem in an AWS Lambda environment – we have the power of a programming language at our disposal. But what about Lambda-less environments?
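
For comparison, this is roughly what building the keys looks like when a programming language is available. It is a minimal sketch: the buildUserKey helper and the UserKey type are hypothetical names used only for illustration and are not part of the AWS SDK.

```typescript
// Hypothetical helper: builds the composite keys for a user entity.
// The key layout (USER#<id> / METADATA#<id>) follows the example above.
type UserKey = {
  PK: { S: string };
  SK: { S: string };
};

function buildUserKey(userId: string): UserKey {
  return {
    // Template literals do the concatenation that is hard to express without code.
    PK: { S: `USER#${userId}` },
    SK: { S: `METADATA#${userId}` }
  };
}

console.log(buildUserKey("123"));
```

Keeping the concatenation in one helper means that every read and write of the entity agrees on the key layout.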

Composite Keys in AWS Step Functions world

To re-create the example above without an AWS Lambda function, one can use the AWS Step Functions integration with Amazon DynamoDB and the States.Format intrinsic function.

The following is an example of defining an AWS Step Functions task that integrates with Amazon DynamoDB and uses composite partition and sort keys to retrieve a user entity.

const dataTable = ...

new aws_stepfunctions_tasks.DynamoGetItem(this, "GetUserMetadataTask", {
  table: dataTable,
  key: {
    PK: aws_stepfunctions_tasks.DynamoAttributeValue.fromString(
      aws_stepfunctions.JsonPath.stringAt("States.Format('USER#{}', $.userId)")
    ),
    SK: aws_stepfunctions_tasks.DynamoAttributeValue.fromString(
      aws_stepfunctions.JsonPath.stringAt(
        "States.Format('METADATA#{}', $.userId)"
      )
    )
  }
});

The States.Format intrinsic function is excellent for our use case. Sadly, going the other way around, from a composite key to a "regular" ID, is not as simple as utilizing some built-in function. Let us explore that topic next.
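
To make the substitution concrete, the sketch below approximates what States.Format does with the {} placeholders. It is a local stand-in written for illustration, not the Step Functions implementation: the formatLikeStatesFormat name is hypothetical, and the escape sequences supported by the real intrinsic are deliberately ignored.

```typescript
// Local approximation of States.Format, for illustration only.
// It replaces each `{}` placeholder with the next argument, in order.
// Assumption: escape sequences handled by the real intrinsic are not supported here.
function formatLikeStatesFormat(
  template: string,
  ...args: Array<string | number>
): string {
  let index = 0;
  return template.replace(/\{\}/g, () => {
    if (index >= args.length) {
      throw new Error("Not enough arguments for the template");
    }
    return String(args[index++]);
  });
}

// Sample state input, mirroring `$.userId` from the task definition.
const sampleStateInput = { userId: "123" };

const formattedPk = formatLikeStatesFormat("USER#{}", sampleStateInput.userId);
const formattedSk = formatLikeStatesFormat("METADATA#{}", sampleStateInput.userId);

console.log(formattedPk, formattedSk);
```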

From Composite Key to ID

The first hurdle we encounter is that, at the time of writing and to the best of my knowledge, there is no way to split strings using the AWS Step Functions language. I also could not find a way to retrofit the States.Format function to perform the operation we need.

The solution to our problem lies not in some additional functionality of the AWS Step Functions language but in structuring the data in a certain way. By storing the parts used to create the composite key as separate attributes as well, we can return only the part that interests us. In our case, that would be the ID of the user entity.

All credit goes to my colleague Jason Legler, who mentioned this solution on Twitter.

// The same as the previous example
const getUserMetadataTask = ..

const transformUserMetadata = new aws_stepfunctions.Pass(
  this,
  "FormatUserMetadataTask",
  {
    inputPath: "$",
    parameters: {
      /**
       * The `userId` is duplicated. It exists as a separate attribute and is also embedded within the `PK` composite key.
       * The `getUserMetadataTask` returns the data in the Amazon DynamoDB attribute-value format, hence the reference to `userId.S`.
       */
      userId: aws_stepfunctions.JsonPath.stringAt("$.Item.userId.S")
    }
  }
);

const getUserTask = getUserMetadataTask.next(transformUserMetadata);
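
To make the data shape concrete, here is a sketch of the same selection written as a plain TypeScript function. The item shape and the extractUserId helper are assumptions made for illustration; in the state machine itself, the selection is expressed by the $.Item.userId.S path.

```typescript
// Assumed shape of the GetItem result for a user entity.
// `userId` is stored twice: embedded in `PK`/`SK` and as its own attribute.
type GetUserItemOutput = {
  Item: {
    PK: { S: string };
    SK: { S: string };
    userId: { S: string };
  };
};

// Hypothetical helper mirroring the `$.Item.userId.S` JSONPath selection.
function extractUserId(output: GetUserItemOutput): { userId: string } {
  return { userId: output.Item.userId.S };
}

const sampleGetItemOutput: GetUserItemOutput = {
  Item: {
    PK: { S: "USER#123" },
    SK: { S: "METADATA#123" },
    userId: { S: "123" }
  }
};

console.log(extractUserId(sampleGetItemOutput));
```

The trade-off is a slightly larger item in exchange for never having to parse the composite key.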

If the solution I presented does not fit your requirements, you might be interested in Stedi Mappings – a powerful data transformation API.
Full disclosure: I'm part of a team responsible for that service.

Retrieving item collections

The Amazon DynamoDB item collection is a concept not necessarily specific to single-table design, but it still plays a vital role in that context. To learn more about item collections, refer to this official AWS documentation page.
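
As a rough mental model, an item collection is the set of items that share a partition key. The sketch below imitates reading part of a collection from an in-memory array. The sample items and the queryCollection helper are made up for illustration and do not call Amazon DynamoDB; the string comparison is also a simplification that only matches DynamoDB ordering for plain ASCII sort keys.

```typescript
type TableItem = { PK: string; SK: string };

// Made-up sample data: one user with metadata and two cats, plus another user.
const sampleTableItems: TableItem[] = [
  { PK: "USER#1", SK: "METADATA#1" },
  { PK: "USER#1", SK: "CAT#a" },
  { PK: "USER#1", SK: "CAT#b" },
  { PK: "USER#2", SK: "CAT#c" }
];

// Hypothetical in-memory stand-in for `PK = :PK AND begins_with(SK, :SK)`.
// Matches are sorted by SK; `scanIndexForward = false` reverses the order.
function queryCollection(
  items: TableItem[],
  pk: string,
  skPrefix: string,
  scanIndexForward = true
): TableItem[] {
  const matches = items
    .filter((item) => item.PK === pk && item.SK.startsWith(skPrefix))
    .sort((a, b) => (a.SK < b.SK ? -1 : a.SK > b.SK ? 1 : 0));
  return scanIndexForward ? matches : matches.reverse();
}

console.log(queryCollection(sampleTableItems, "USER#1", "CAT#", false));
```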

Retrieving an item collection from Amazon DynamoDB used to be impossible, because the AWS Step Functions "optimized integrations" for Amazon DynamoDB did not include the Query operation. That changed with the introduction of AWS SDK service integrations.

With AWS SDK integrations available, we can perform the Query operation by invoking the Amazon DynamoDB service directly!
The following is an example of performing the Query operation, retrieving an item collection of cats belonging to a given user.

const dataTable = ...

new aws_stepfunctions_tasks.CallAwsService(this, "GetCatsFromDataTableTask", {
  service: "dynamodb",
  action: "query",
  parameters: {
    TableName: dataTable.tableName,
    KeyConditionExpression: "PK = :PK AND begins_with(SK, :SK)",
    /**
     * We cannot use `DynamoAttributeValue` here.
     * The `CallAwsService` construct is incompatible with the `DynamoAttributeValue` class.
     */
    ExpressionAttributeValues: {
      ":PK": {
        "S.$": "States.Format('USER#{}', $.userId)"
      },
      ":SK": {
        S: "CAT#"
      }
    },
    ScanIndexForward: false
  },
  iamResources: [dataTable.tableArn]
});

Remember that the data returned from Amazon DynamoDB uses the attribute-value format, in which every value is wrapped in a type descriptor such as S for strings. I would argue that in most cases, another step to transform the data is needed. If that is the case, you might find the Pass state we have used in the "From Composite Key to ID" section handy.
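
To illustrate the format in question, the sketch below flattens a Query response into plain objects. The sample response, the name attribute, and the flattenCats helper are all assumptions made for illustration; in a Lambda-less state machine, the equivalent mapping would have to be expressed with states rather than TypeScript.

```typescript
// Assumed shape of a single cat item as returned by the Query operation.
type CatItem = {
  PK: { S: string };
  SK: { S: string };
  name: { S: string }; // hypothetical attribute
};

type CatsQueryOutput = { Items: CatItem[]; Count: number };

// Hypothetical helper: strips the attribute-type wrappers (`S`) from each item.
function flattenCats(output: CatsQueryOutput): Array<{ name: string }> {
  return output.Items.map((item) => ({ name: item.name.S }));
}

const sampleQueryOutput: CatsQueryOutput = {
  Count: 2,
  Items: [
    { PK: { S: "USER#1" }, SK: { S: "CAT#b" }, name: { S: "Bella" } },
    { PK: { S: "USER#1" }, SK: { S: "CAT#a" }, name: { S: "Alfie" } }
  ]
};

console.log(flattenCats(sampleQueryOutput));
```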

Data integrity

The serverless community has written many great articles on the concept of Lambda-less and related topics. Paul Swail, in his piece The trade-offs with functionless integration patterns in serverless architectures, mentions "Risk to data integrity" as one of the downsides of such an architecture.

For Amazon DynamoDB specific points about data integrity, refer to this excellent blog post by Jeremy Daly.

In my opinion, the best thing one can do in our use case to ensure data integrity at the code level is to consolidate all the code that deals with Amazon DynamoDB operations for a particular entity into a single file, like so.

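// user.ts
import {
  aws_dynamodb,
  aws_stepfunctions,
  aws_stepfunctions_tasks
} from "aws-cdk-lib";
import { Construct } from "constructs";

// `scope` is passed in explicitly because `this` is not available in a plain function.
export function getSaveUserTask(scope: Construct, dataTable: aws_dynamodb.Table) {
  const transformSaveUserToDataTableTask = new aws_stepfunctions.Pass(
    scope,
    "TransformSaveUserToDataTableTask",
    {
      /** Parameters... */
    }
  );

  return new aws_stepfunctions_tasks.DynamoPutItem(
    scope,
    "SaveUserToDataTableTask",
    {
      /** Parameters, including `table: dataTable`... */
    }
  ).next(transformSaveUserToDataTableTask);
}

// user-machine.ts
import { getSaveUserTask } from "./user";

// Inside the construct that defines the state machine, where `this` is the scope.
getSaveUserTask(this, dataTable).next(anotherTask);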

Not a great solution, but I could not come up with a better way to compose the areas of concern in the same way that one might do it in an AWS Lambda environment (usually an entity class with public/static methods). Another solution might be to use AWS CDK State Machine Fragments.

The ultimate answer would be to thoroughly test the deployed state machines. But that opens another can of worms which I might touch on in another article.

Summary

In closing, implementing Amazon DynamoDB single-table designs with AWS Step Functions is possible, though it might be hard for more complex architectures. The main obstacles are the limited capabilities for data transformation and the risks to data integrity (some of which can be addressed at the Amazon DynamoDB API level).

I encourage you to give this way of building APIs a shot. I had a blast playing around with all the concepts listed in this article.

Consider following me on Twitter for more serverless content - @wm_matuszewski.

Thank you for your time.
