DEV Community

loading...
Cover image for Flexible Upsert With DynamoDB

Flexible Upsert With DynamoDB

dengel29 profile image Dan ・5 min read

I recently made a switch from Postgres to Dynamodb and have been learning some pretty basic stuff. I’m working with a single table with a single kind of item so none of the tough NoSQL relational paradigm stuff, just some basic putting and updating. In using DynamoDB's built-in update function, I came across what I imagine is a pretty ordinary case that was not covered in documentation: how to update records allowing any possible combination of properties.

Update Expressions from the Amazon DynamoDB docs, for reference

The Case

Here’s the case:

  • I have an item that I want to insert into my table, that has an arbitrary amount of properties;
  • I expect it will be updated, adding to those initial properties another, different set of arbitrary properties.

In other words: I’ll create an item with x and y properties, and later on will update it with property z.

I read that the default behavior for DynamoDB’s update is like Postgres’ UPSERT, ie it will create the record in the database if it cannot find it by its primary key. So it sounded like I could do my insert and update logic in the same function, which would be great.

By its nature, DynamoDB (and NoSQL databases in general, as far as I understand) have the flexibility to create records with any properties. That’s why it was surprising to me that the problem I kept running into was specifically of that nature: DynamoDB would not let me update a record with a property it didn’t already have if I didn’t supply all its other already-existing properties.

In the most simplified example, this is what was happening:

  • I create an item with x and y properties;
  • I then update that item with z property;
  • The update fails because I did not supply x and yproperties. The error message was: "ExpressionAttributeValues cannot be NULL" (if you're new to DynamoDB I cover ExpressionAttributeValues below).

Here’s a good Stack Overflow post with a similar problem.

Coding the Problem

In my project, I’m building a database of video games. I want to save some info about each one: title, developer, publisher, releaseDate, and a lot more. But let’s start simple, with just one attribute.

In the example below, let’s imagine we have a table where the primary key (Key below) is the title, and we already have a record in the database that exists with that Key. This is how we would update a single attribute, developer.


const data = {title: 'Uncharted', developer: 'Naughty Dog'}
const params = {
  TableName: process.env.DYNAMODB_TABLE,
  Key: { title: data.title },
  UpdateExpression: 'SET developer =  :dev',
  ExpressionAttributeValues: {
   ':dev': data.developer
    }
  };
  await dynamodb.update(params).promise()
}

Enter fullscreen mode Exit fullscreen mode

So this works, but with this approach, we’ve totally hamstrung the flexibility of DynamoDB. We are expecting a specific bit of data to be sent in with the request, data.developer.

What happens if we want to add any other properties to this request? Well, if we don’t change the code at all and send a request with, say, a publisher property in the data with our developer and title, it simply won’t get written to the record. To accommodate new data to add, we'd have to change the UpdateExpression and the ExpressionAttributeValues like this:


const data = {title: 'Uncharted', developer: 'Naughty Dog', publisher: 'Sony'}
const params = {
  TableName: process.env.DYNAMODB_TABLE,
  Key: { title: data.title },
  UpdateExpression: 'SET developer =  :dev', publisher = :pub,
  ExpressionAttributeValues: {
    ':dev': data.developer,
    ':pub': data.publisher
    }
  };
  await dynamodb.update(params).promise()
}

Enter fullscreen mode Exit fullscreen mode

Again, this works, but if you ever update your object without every single attribute, the request will fail with the ExpressionAttributeValues cannot be NULL error. You're not allowed to update just the developer, or just the publisher with this approach.

This completely goes against what I’d read and learned about as one of the main selling points of NoSQL: flexibility.

How We Fix It

So looking at the Stack Overflow article I posted above gave me a few ideas that got me to a solution I feel good about.

If you look at the structure of the params that you actually supply to the dynamodb.update() function, it’s just a simple object with UpperCamelCased keys. So rather than hardcode the attributes we want to update, we can dynamically build our update function based on whatever attributes have been provided.

So the key things we need to build are the UpdateExpression and the ExpressionAttributeValues.

Programming UpdateExpression

UpdateExpression uses an SQL-like syntax to SET, ADD (increment), or REMOVE certain attributes on the item you’re updating. If you want to SET multiple attributes in a single expression it will look like what we did above:

SET latestUpdate = :updateTime, developer = :developer, publisher = :publisher

Those colon-prefixed keys like :updateTime will be found in the ExpressionAttributeValues. They're arbitrary placeholders, so name them anything you want as long as they match a value in the ExpressionAttributeValues. You can think of it as the value substitution (aka "Parameterized Queries") you find in SQL.

You can probably see how we go about solving this then:

  1. we take all the keys found in data object we're updating,
  2. map them to properties we want to update,
  3. create a colon-prepended key for them, then
  4. add those values to the ExpressionAttributeValues list.

The final result looks like this:

const data = {title: 'Uncharted', developer: 'Naughty Dog', publisher: 'Sony'}

const params = {
  TableName: process.env.DYNAMODB_TABLE,
  Key: { title: data.title },
};
params.UpdateExpression = "SET latestUpdate = :updateTime"
params.ExpressionAttributeValues = {}
for (let k in data) {
  if (k !== 'title') {
    params.UpdateExpression += `, ${k} = :${k}`
    params.ExpressionAttributeValues[`:${k}`] = data[k]
  }
}
params.ExpressionAttributeValues[':updateTime'] = Date.now()

await dynamodb.update(params).promise()
Enter fullscreen mode Exit fullscreen mode

You’ll notice a few things about the above code:

  1. we start our UpdateExpression with SET latestUpdate = :updateTime; the purpose of that is so we don’t need to deal with the edge case of the first item in our iteration through the keys. With that starting off our expression, we can simply concatenate the string with all the keys, starting with a colon to separate each value. Makes life easier, plus it's good practice to save that information. Two birds, one stone.
  2. We have to set the initial value of ExpressionAttributeValues to an empty object {}, so we can write the keys and values to it.
  3. In our for loop, there’s an if statement because we want to get every attribute except title, which is the Key to our table. Even if it hasn’t changed, we can’t put that in the UpdateExpression or DynamoDB will error. You're not allowed to put the Key in the UpdateExpression.

Now we should see no errors when we send two separate requests to the same endpoint, the first which will create the game, the second which will update it with a new attribute.

Personal Takeaway

I’m new to DynamoDB so I’m not sure if it’s crazy or clever that this was the way to achieve this type of behavior in the update function. I have my complaints about the default implementation of the update function, but at the same time it wasn't a huge leap of imagination to get the desired behavior, so perhaps in a sense this is as flexible as marketed.

In only a few days of playing with it, I'm starting to see the patterns one can utilize to fill in the gaps where desired built-ins might be missing. Let me know if this was helpful for you or if you've faced any similar blockers with DynamoDB!

Discussion

pic
Editor guide