DEV Community

Pawel Zubkiewicz for AWS Community Builders

NEW: DynamoDB Streams Filtering in Serverless Framework

From this article, you will learn how to use the recently released Streams Filtering functionality with DynamoDB and Lambda.

We will go deeper than a basic sample of DynamoDB event action filtering: you will learn how to combine it with your business logic. I will use a DynamoDB single-table design setup for that.

What's new?

If you haven't heard, just before #reInvent2021 AWS dropped this huge update.

What's changed?

Before the update

Every action made in a DynamoDB table (INSERT, MODIFY, REMOVE) triggered an event that was sent over DynamoDB Streams to a Lambda function. Regardless of the action type, a Lambda function was always invoked. That had two repercussions:

  • You had to implement filter logic inside your Lambda code (if conditions) before executing your business logic (e.g. filtering INSERT actions to send a welcome email whenever a new User was added to the table).
  • You paid for every Lambda run, even though in most cases you were interested only in some events.

That situation was multiplied in single-table design, where you store multiple entity types in a single table, so in reality you have many INSERTs with subtypes (e.g. new user, new address, new order, etc.).

After the update

Now, you can filter out events that are not relevant to your business logic. By defining filter criteria, you control which events can invoke a Lambda function. Filtering evaluates events based on values that are in the message.

This solves the above-mentioned problems:

  • Logic evaluation is pushed to AWS (no more ifs in Lambda code).
  • No more needless Lambda executions.

All of that thanks to the small JSON snippet defining filter criteria.

Refactoring to the Streams Filtering

Since you're reading this article, it's safe to assume you're like me, already using DynamoDB Streams to invoke your Lambda functions.

Therefore, let me take you through the refactoring process. It's a simplified version of the code that I use in production.

In my DynamoDB table, I store two types of entities: Order and Invoice. My business logic requires me to do something only when Invoice is modified.
[Figure: Business logic conditions]
As you can see, it's just a single case out of six (two entity types times three actions: INSERT, MODIFY, REMOVE). Imagine what happens when you have more types in your table, and your business logic requires you to perform other actions as well.

Old event filtering

Let's start with those ugly if statements that I had before the update, when I had to filter events manually.

My Lambda's handler started by executing the parseEvent function:

const parseEvent = (event) => {
  const e = event.Records[0] // batch size = 1
  const isInsert = e.eventName === 'INSERT'
  const isModify = e.eventName === 'MODIFY'

  const isOrder = e.dynamodb.NewImage?.Type?.S === 'Order'
  const isInvoice = e.dynamodb.NewImage?.Type?.S === 'Invoice'

  const newItemData = e.dynamodb.NewImage
  const oldItemData = e.dynamodb.OldImage

  return {
    isInsert, isModify, isOrder, isInvoice, newItemData, oldItemData
  }
}

Next, I had to evaluate the condition in my handler:

const {
  isInsert, isModify, isOrder, isInvoice, newItemData, oldItemData
} = parseEvent(event)

if (isModify && isInvoice) {
  // perform business logic
  // uses newItemData & oldItemData values
}

New event filtering

The new functionality allows us to significantly simplify that code by pushing condition evaluation to AWS.

Just to recap, my business logic requires me to let in only MODIFY events that were performed on Invoice entities. Fortunately, I keep a Type value on my entities in the DynamoDB table (thanks Alex 🤝).

The DynamoDB event structure is well-defined, so basically what I need to do is make sure that:

  • eventName equals MODIFY, and
  • dynamodb.NewImage.Type.S equals Invoice.

All of that is defined in the filterPatterns section of the Lambda configuration. Below is a snippet from the Serverless Framework serverless.yml config file. Support for filterPatterns was introduced in version 2.68.0, so make sure you are using that version or newer.

    functionName:
      handler: src/functionName/function.handler
      # other properties
      events:
      - stream:
          type: dynamodb
          arn: !GetAtt DynamoDbTable.StreamArn
          maximumRetryAttempts: 1
          batchSize: 1
          filterPatterns:
            - eventName: [MODIFY]
              dynamodb:
                NewImage:
                  Type:
                    S: [Invoice]

And that's all you need to do to filter your DynamoDB Stream.

Amazing, isn't it?
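With the filter in place, the handler can drop the eventName and Type checks. Here is a minimal sketch of what that might look like, assuming batch size 1 as in the config above and a stream view type that includes both the new and old images. `getInvoiceImages` is a hypothetical helper name, not part of my original code:

```javascript
// Hypothetical helper: with filtering done by AWS, every record that reaches
// the function is already a MODIFY on an Invoice, so no checks are needed.
const getInvoiceImages = (event) => {
  const e = event.Records[0] // batch size = 1
  return {
    newItemData: e.dynamodb.NewImage,
    oldItemData: e.dynamodb.OldImage
  }
}

const handler = async (event) => {
  const { newItemData, oldItemData } = getInvoiceImages(event)
  // perform business logic
  // uses newItemData & oldItemData values
  return { newItemData, oldItemData }
}
```

The isInsert, isModify, isOrder and isInvoice flags disappear entirely, because the function is never invoked for the other five cases.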

Gotchas

Bear in mind that there can be several filters on a single source. In such a case, each filter works independently of the others. Simply put, there is OR, not AND, logic between them.

I learned that the hard way by mistakenly creating two filters:

          filterPatterns:
            - eventName: [MODIFY]
            - dynamodb:
                NewImage:
                  Type:
                    S: [Invoice]

by adding - in front of dynamodb:. It resulted in the wrong filter:

{
  "filters": [
    {
      "pattern": "{\"eventName\":[\"MODIFY\"]}"
    },
    {
      "pattern": "{\"dynamodb\":{\"NewImage\":{\"Type\":{\"S\":[\"Invoice\"]}}}}"
    }
  ]
}

That one catches all MODIFY actions OR anything that has Invoice as Type in the NewImage object, so DynamoDB INSERT actions on Invoices as well!

Correct filter:

{
  "filters": [
    {
      "pattern": "{\"eventName\":[\"MODIFY\"],\"dynamodb\":{\"NewImage\":{\"Type\":{\"S\":[\"Invoice\"]}}}}"
    }
  ]
}

You can view the filter in the Lambda console, under the Configuration->Triggers section.
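The difference between the wrong and the correct filter can be sketched in plain JavaScript. This is only an illustration of the OR/AND semantics described above and handles exact-value matches only. It is not how AWS implements filtering, and `matchesPattern` and `matchesAny` are hypothetical names:

```javascript
// A pattern matches when ALL of its keys match (AND within one filter).
// Leaf arrays list the accepted values; nested objects are matched recursively.
const matchesPattern = (pattern, data) =>
  Object.entries(pattern).every(([key, rule]) =>
    Array.isArray(rule)
      ? rule.includes(data?.[key])
      : matchesPattern(rule, data?.[key])
  )

// A record is delivered when ANY filter matches (OR between filters).
const matchesAny = (patterns, record) =>
  patterns.some((pattern) => matchesPattern(pattern, record))

const invoiceType = { NewImage: { Type: { S: ['Invoice'] } } }
const twoFilters = [{ eventName: ['MODIFY'] }, { dynamodb: invoiceType }]
const oneFilter = [{ eventName: ['MODIFY'], dynamodb: invoiceType }]

// An INSERT of an Invoice, which the business logic should ignore.
const insertedInvoice = {
  eventName: 'INSERT',
  dynamodb: { NewImage: { Type: { S: 'Invoice' } } }
}

console.log(matchesAny(twoFilters, insertedInvoice)) // true: second filter matches
console.log(matchesAny(oneFilter, insertedInvoice)) // false: eventName is not MODIFY
```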

Global Tables

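kolektiv reported in the comments below that this functionality does not work with Global Tables. Other commenters dispute this, pointing out that filtering happens in the Lambda event source mapping, independently of Global Table replication, so test it against your own setup.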

One more catch, you can't use filtering with global tables, your filter will not be evaluated and your function will not be called. Confirmed with aws support.

Thanks for pointing that out.

How much does it cost?

Nothing.

There is no information about any additional pricing. Also, Jeremy Daly confirmed that during re:Invent 2021.

In reality, this functionality saves you money: on maintenance, because Lambda code is easier to write and debug, and on operations, because functions are executed only in response to business-relevant events.

Low coupling

Before the update, people implemented event filtering logic in a single Lambda function and thus struggled with high coupling (unless they used some kind of dispatcher pattern).

Now, we can have several independent Lambda functions, each with its filter criteria, attached to the same DynamoDB Stream. That results in lower coupling between code that handles different event types. This will be very much appreciated by all single-table design practitioners.
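As a rough sketch of that setup, here are two functions attached to the same stream, each with its own filter. It is written as a JavaScript object because the Serverless Framework also accepts a serverless.js config file. The function names, handler paths, and the Order filter are hypothetical, while `DynamoDbTable` is the resource name used earlier in this article:

```javascript
// Shared stream settings, mirroring the serverless.yml snippet shown earlier.
const streamEvent = (filterPattern) => ({
  stream: {
    type: 'dynamodb',
    arn: { 'Fn::GetAtt': ['DynamoDbTable', 'StreamArn'] },
    batchSize: 1,
    filterPatterns: [filterPattern]
  }
})

// Builds the dynamodb part of a pattern for a given entity Type.
const typeIs = (type) => ({ NewImage: { Type: { S: [type] } } })

// Hypothetical functions; in serverless.js this object would sit under `functions`.
const functions = {
  onInvoiceModified: {
    handler: 'src/onInvoiceModified/function.handler',
    events: [streamEvent({ eventName: ['MODIFY'], dynamodb: typeIs('Invoice') })]
  },
  onOrderCreated: {
    handler: 'src/onOrderCreated/function.handler',
    events: [streamEvent({ eventName: ['INSERT'], dynamodb: typeIs('Order') })]
  }
}
```

Each function keeps a single pattern per stream event, so the AND logic inside the pattern applies and the two handlers never see each other's events.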

Update

I forgot to mention that you can do more than just evaluate a string-equality condition in the filter. Several comparison operators open up more possibilities.

Here is a table borrowed from the AWS Docs (if it's not OK to include it here, please let me know):

| Comparison operator | Example | Rule syntax |
|---|---|---|
| Null | UserID is null | "UserID": [ null ] |
| Empty | LastName is empty | "LastName": [""] |
| Equals | Name is "Alice" | "Name": [ "Alice" ] |
| And | Location is "New York" and Day is "Monday" | "Location": [ "New York" ], "Day": ["Monday"] |
| Or | PaymentType is "Credit" or "Debit" | "PaymentType": [ "Credit", "Debit"] |
| Not | Weather is anything but "Raining" | "Weather": [ { "anything-but": [ "Raining" ] } ] |
| Numeric (equals) | Price is 100 | "Price": [ { "numeric": [ "=", 100 ] } ] |
| Numeric (range) | Price is more than 10, and less than or equal to 20 | "Price": [ { "numeric": [ ">", 10, "<=", 20 ] } ] |
| Exists | ProductName exists | "ProductName": [ { "exists": true } ] |
| Does not exist | ProductName does not exist | "ProductName": [ { "exists": false } ] |
| Begins with | Region is in the US | "Region": [ {"prefix": "us-" } ] |
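To make the operators in the table above more concrete, here is a rough sketch of how they could be evaluated against plain JavaScript values. It only approximates the semantics listed in the table and is not AWS's implementation. `matchesRule` and `matchesField` are hypothetical names:

```javascript
const compare = {
  '=': (a, b) => a === b,
  '<': (a, b) => a < b,
  '<=': (a, b) => a <= b,
  '>': (a, b) => a > b,
  '>=': (a, b) => a >= b
}

// Evaluates a single rule (one element of a field's rule array) against a value.
const matchesRule = (rule, value) => {
  // Null, Empty and Equals: the rule is a literal that must match exactly.
  if (rule === null || typeof rule !== 'object') return value === rule
  if ('exists' in rule) return (value !== undefined) === rule.exists
  if (value === undefined) return false // remaining operators need a value
  if ('anything-but' in rule) return !rule['anything-but'].includes(value)
  if ('prefix' in rule) return typeof value === 'string' && value.startsWith(rule.prefix)
  if ('numeric' in rule) {
    if (typeof value !== 'number') return false
    // Operator/operand pairs, e.g. ['>', 10, '<=', 20]; all pairs must hold.
    for (let i = 0; i < rule.numeric.length; i += 2) {
      if (!compare[rule.numeric[i]](value, rule.numeric[i + 1])) return false
    }
    return true
  }
  return false
}

// A field matches when ANY rule in its array matches (the "Or" row).
const matchesField = (rules, value) => rules.some((rule) => matchesRule(rule, value))

console.log(matchesField(['Credit', 'Debit'], 'Debit')) // true
console.log(matchesField([{ numeric: ['>', 10, '<=', 20] }], 10)) // false
console.log(matchesField([{ prefix: 'us-' }], 'us-west-2')) // true
```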

Summary

I hope this short article convinced you to refactor your Lambda functions that are invoked by DynamoDB Streams. It's really simple and makes a huge difference in terms of code clarity and costs.

Discussion (8)

kolektiv

One more catch, you can't use filtering with global tables, your filter will not be evaluated and your function will not be called. Confirmed with aws support.

Kirk Kirkconnell

Are you saying the two cannot be used at the same time or that you cannot use the event filter to filter Global Tables traffic from source to destination regions? These are two VERY different things.

kolektiv

You cannot use stream filtering on a Global Table. Your Global Table will continue to replicate/sync but your stream filter will not be evaluated and your trigger will not fire.

simi-obs

I am sorry but this answer IMO is very misleading. You did not actually answer Kirk's question properly. He is correct when he says you cannot use the event filter to filter Global Tables traffic from source to destination regions

But you can actually use this feature (of event filtering for lambdas). I confirmed with AWS. Here is the link: repost.aws/questions/QUgOGCJJhAStm...

kolektiv

this is the reply from AWS support:

-->
That said, DynamoDB streams capture any modification to a DynamoDB table for example an insert,update or delete. We can attach a trigger to the stream, specifically a lambda function. This lambda function will be invoked every time a modification is made to the table. There is no option to filter this action on only certain items, the reason for this is that the streams are required to keep the replica table in the different region in sync with the base table.

We can however add logic to our trigger function to discard any items that do not contain the required/desired tag/value. However the function will still be triggered if the item updated/inserted or deleted does not contain the value/tag you want to filter on.
<--

So you can see according to AWS, on a global table your Lambda still gets called and ignores the filter.

Leeroy Hannigan

This is not correct information. The filtering is on the Event Source Mapping on the Lambda side which is completely decoupled from DynamoDB Stream. Event filtering works regardless, as Global Table replication system is completely separate from your Lambda trigger.

On a side-note, try this filter @koletiv

{
  "filters": [
    {
      "pattern": "{\"dynamodb\":{\"NewImage\":{\"region\":{\"S\":[\"us-west-2\"]}}}}"
    }
  ]
}

Pawel Zubkiewicz Author

@droizman have you tested that?

Vince Fulco (It / It's)

Great article. Thank you.

One typo, "Therefor, let me take you through the refactoring process."-->"Therefore, let me take you through the refactoring process."