Conor Woods

Posted on Jul 25, 2020

Why does writing code for DynamoDB get my spidey senses tingling?

#dynamodb #aws #productivity #discuss

I am not the best developer, but I'm certainly not the worst.
I've got a lot of years under my belt using a host of languages and technologies, but writing code to access DynamoDB feels wrong. Very wrong. Why?

All code is a liability; I've spent my career trying to write less of it and get more done. The only thing I like better than writing code is deleting it.

But I find myself writing reams of code and performing all sorts of mental gymnastics to do previously 'simple' things with DynamoDB. CRUD, sorting and filtering all seem to require way too much code and mental effort; something doesn't 'smell' right.

Is it just me? Doth this programmer complain too much? Have I not just found a good library to take away all the pain?

Top comments (9)

Andrew Berth • Jul 25 '20

DynamoDB requires different thinking compared to most mainstream solutions. Could it be you’re bringing habits over from other systems? I have sure been in that place.

Maybe if you could show how you’re currently using it, I might be able to be more specific.

Conor Woods • Jul 26 '20

You are absolutely correct, Andrew; those habits are ingrained and hard to shake.

But what I'm finding is even simple tasks require an amount of code and a degree of 'fluffiness' that I'm not comfortable writing.

For example, a fairly common and simple scenario is a customer looking to return 'products' based on category.
With an RDBMS, this could be solved with two relational tables and a SQL statement like "SELECT a.* FROM PRODUCTS a, PRODUCT_CATEGORIES b WHERE a.id = b.product_id AND (b.category_name LIKE 'drink' OR b.category_name LIKE 'food') ORDER BY a.product_name". (The parentheses matter: without them, AND binds tighter than OR and the second category would match without the join condition.)

What would the code look like to get a list of product items from DynamoDB, where a product can have many categories, based on a lookup of multiple categories?

Don't get me wrong, I like the promise of DynamoDB, and would not consider myself a novice at modelling and using it. But the code I need to write to solve tasks similar to the above, or even basic CRUD, seems very verbose currently, and more code means more liability.

Andrew Berth • Jul 26 '20

Can you tell me how you’re going about tackling such a problem right now in Dynamo? For example, what do your tables look like? What are the kind of API calls you’re doing?

Conor Woods • Jul 26 '20

Hi Andrew, I appreciate the response, but in truth I was hoping to see some code showing how you would personally solve the task using DynamoDB. Still, I'll tell you how I would approach it using DynamoDB, and how I'll probably end up approaching it in practice.

Just one of the API calls that I need to satisfy is as follows:
/products/?search=[freetext]&category=[cat1]&category=[cat2]

The data/schema for a product currently looks like the following:
{
  "entity": "Product",
  "sk": "ORG-1#PRODUCT",
  "val": {
    "name": "Product 1",
    "orgId": "1",
    "id": "EwfoHf7zAdRvNsiHw2SbxTeSnPb2",
    "categories": [
      {
        "name": "Business",
        "primary": true
      },
      {
        "name": "Executive"
      },
      {
        "name": "Career"
      }
    ],
    "status": "ACTIVE",
    "createDate": "2020-07-23T14:56:29.994Z"
  },
  "pk": "PRODUCT-EwfoHf7zAdRvNsiHw2SbxTeSnPb2",
  "updatedDateTime": "2020-07-23T14:56:29.994Z",
  "entityId": "EwfoHf7zAdRvNsiHw2SbxTeSnPb2"
}

If I were to solve this with DynamoDB, I would either use a GSI to store a composite key of the categories, or denormalize and duplicate the data. However, even if I could do a full-text search on the composite key (which I don't believe you can), it won't help with a multi-category search (unless I used a filter). And if I went down the denormalization route, I would still have the full-text issue, and I would also need to write the code and employ extra infrastructure to manage this (stream + SNS probably).
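As a rough sketch of what the denormalization route might involve (not a recommended design), each product could be fanned out into one extra item per category, keyed so that a GSI groups products by category. The attribute names `gsi1pk` and `gsi1sk`, the `ProductCategory` entity and the helper `category_items` are assumptions made up for illustration; only the product shape comes from the example above.

```python
from __future__ import annotations

from typing import Any


def category_items(product: dict[str, Any]) -> list[dict[str, Any]]:
    """Fan a product out into one item per category.

    gsi1pk / gsi1sk are hypothetical GSI key attributes: the partition key
    groups products by organisation and category, the sort key is the
    product name so a Query on the index comes back ordered by name.
    """
    val = product["val"]
    items = []
    for category in val["categories"]:
        items.append(
            {
                "pk": product["pk"],
                "sk": f"CATEGORY#{category['name']}",
                "entity": "ProductCategory",  # hypothetical entity type
                "gsi1pk": f"ORG-{val['orgId']}#CATEGORY#{category['name']}",
                "gsi1sk": val["name"],
                # Duplicated fields: these must be rewritten whenever the
                # product changes, which is the extra code being discussed.
                "name": val["name"],
                "status": val["status"],
            }
        )
    return items
```

For the example product this yields three items, one each for Business, Executive and Career. Keeping them in step with the product on every update or delete is exactly the maintenance cost the paragraph above describes.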

How I'll actually end up solving it is by introducing another piece of infrastructure like Elasticsearch and populating it using DynamoDB Streams.
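To give a feel for how much glue that involves, here is a minimal sketch of the stream-processing side, assuming the stream is configured to include new images and that product items are recognisable by a `PRODUCT-` prefix on `pk` (as in the example item). The search client is deliberately left out: `index_doc` and `delete_doc` are hypothetical callables standing in for whatever the search service's client provides, and `from_attr` and `sync_records` are names invented here.

```python
from __future__ import annotations

from decimal import Decimal
from typing import Any, Callable


def from_attr(attr: dict[str, Any]) -> Any:
    """Convert one DynamoDB-typed value, e.g. {"S": "x"}, to plain Python."""
    type_tag, value = next(iter(attr.items()))
    if type_tag in ("S", "BOOL"):
        return value
    if type_tag == "N":
        return Decimal(value)  # numbers arrive as strings
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [from_attr(v) for v in value]
    if type_tag == "M":
        return {k: from_attr(v) for k, v in value.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def sync_records(
    event: dict[str, Any],
    index_doc: Callable[[str, dict], Any],
    delete_doc: Callable[[str], Any],
) -> int:
    """Mirror product changes from a DynamoDB stream event into a search index.

    Returns the number of records that were applied.
    """
    applied = 0
    for record in event.get("Records", []):
        change = record["dynamodb"]
        doc_id = from_attr(change["Keys"]["pk"])
        if not doc_id.startswith("PRODUCT-"):
            continue  # some other entity in the same table
        if record["eventName"] == "REMOVE":
            delete_doc(doc_id)
        else:  # INSERT or MODIFY
            image = from_attr({"M": change["NewImage"]})
            index_doc(doc_id, image["val"])
        applied += 1
    return applied
```

A Lambda handler subscribed to the stream would call `sync_records(event, ...)` with real client functions. Error handling, retries and bulk indexing are omitted, so the production version would be longer still.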

My point is this: I'm jumping through hoops, adding extra infrastructure and writing way more code to solve previously simple tasks.

I am aware of the trade-offs, but I really wish there was an abstraction over DynamoDb to take the development pain away. I want to have my cake and eat it; I want the blazing fast speed and I want to write/support/maintain as little code as possible.

Maybe I'm just lazy, but something doesn't sit right with me.

Andrew Berth • Jul 26 '20

I did not write any code because, as I suspected, no amount of code is going to do what you want.

DynamoDB’s entire ‘thing’ is getting certain chunks of data (partitions) really fast, no matter how much data you’re dealing with. They do this by having all related data together: no multiple tables, no joins. They say you should duplicate your data in the format you’re going to want to access it, all to prevent you from having to compile stuff when you’re looking for it.

Getting all products from one or more categories should be no problem. With a GSI, as you said, you could perfectly make partitions of your data based on their category. Then you could read a category's products with a Query on that index, one Query per category. (GetItem and BatchGetItem only work against the base table's primary key, not a GSI.)
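A minimal sketch of that multi-category read, under assumptions: reads from a GSI go through Query, so this issues one Query per category and merges the pages client-side. It assumes an index named `gsi1` whose partition key `gsi1pk` looks like `ORG-<org>#CATEGORY#<name>` and whose items carry `pk` and `name` attributes; all of those names, and the function itself, are hypothetical. `table` is anything with a `query(**kwargs)` method shaped like a boto3 DynamoDB `Table` resource.

```python
from __future__ import annotations

from typing import Any


def products_in_categories(
    table: Any, org_id: str, categories: list[str]
) -> list[dict[str, Any]]:
    """Query a category index once per category and merge the results."""
    seen: dict[str, dict[str, Any]] = {}
    for category in categories:
        kwargs: dict[str, Any] = {
            "IndexName": "gsi1",  # hypothetical index name
            "KeyConditionExpression": "gsi1pk = :pk",
            "ExpressionAttributeValues": {
                ":pk": f"ORG-{org_id}#CATEGORY#{category}"
            },
        }
        while True:
            page = table.query(**kwargs)
            for item in page.get("Items", []):
                # A product in several categories shows up more than once.
                seen.setdefault(item["pk"], item)
            last_key = page.get("LastEvaluatedKey")
            if not last_key:
                break
            kwargs["ExclusiveStartKey"] = last_key  # fetch the next page
    # Each Query is ordered by the index sort key, but the merged set
    # across categories has to be sorted again in application code.
    return sorted(seen.values(), key=lambda item: item["name"])
```

Pagination, de-duplication and the final sort are all done by the caller here, work that the `OR` and `ORDER BY` of the SQL version get from the database.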

Free text search, on the other hand, is an entirely different beast. Free text search is all about getting some arbitrary values out of a certain group. DynamoDB was not made to do this kind of thing. The same goes for relational databases, really. The LIKE operator lets you do some searching, but in a really limited way. (And it’s slow, which Dynamo does not allow you to be.) That’s why they made an Elasticsearch integration.

Now, of course it would be nice to have one service that would do exactly what you need, quickly, and with an API so simple even a toddler could use it. I know I would use the features you’re describing in a heartbeat. But for now, it simply does not exist.

Conor Woods • Jul 26 '20

I don't disagree with anything there, Andrew.
By chance I came across this from The Burning Monk:
lumigo.io/aws-serverless-ecosystem...
Now we're getting somewhere.
Coupled with Jeremy Daly's DynamoDB Toolbox, it could scratch an itch.
github.com/jeremydaly/dynamodb-too...

Max • Jul 28 '20

I had the same love/hate relationship with DynamoDB because I treated it like an SQL DB.

It is a key-value store with maybe an additional index.
The only retrieval I do now is by key. No scans, minimal filtering. Loving it!

I maintain the relationships and integrity in Postgres. It has only the keys to keep the SQL DB small. All the payload lives in DDB. So far I am very happy with this arrangement.

Conor Woods • Jul 28 '20

I'm very interested to understand the Postgres bit and how it works, Max.
How do you keep the keys updated, and what does querying look like?

Max • Jul 29 '20 • Edited

Reads if ID is known: Vue - AppSync - DDB
Reads if ID is not known or related IDs are required: Vue - AppSync - Lambda - get IDs from Postgres - Lambda - DDB batch read - pack it up for AppSync - back to Vue
Search by IDs, relations, dates, numeric values or order: same as above
Search by text or complex relations: add Elasticsearch to the picture
Writes / deletes: Vue - AppSync - Lambda - full record incl. IDs into DDB - IDs only into PG
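The "DDB batch read" step in the flows above could look roughly like the sketch below. It is not the actual code behind this setup: it assumes the IDs have already come back from Postgres in the desired order, that the table has a single-attribute key called `pk`, and that `dynamodb` behaves like a boto3 DynamoDB service resource (only its `batch_get_item` method is used). The function name and key name are hypothetical.

```python
from __future__ import annotations

from typing import Any

MAX_KEYS_PER_BATCH = 100  # BatchGetItem accepts at most 100 keys per request


def batch_read(dynamodb: Any, table_name: str, ids: list[str]) -> list[dict[str, Any]]:
    """Fetch full records for the given IDs, preserving the order of `ids`."""
    unique_ids = list(dict.fromkeys(ids))  # duplicate keys are rejected
    found: dict[str, dict[str, Any]] = {}
    for start in range(0, len(unique_ids), MAX_KEYS_PER_BATCH):
        chunk = unique_ids[start:start + MAX_KEYS_PER_BATCH]
        request = {table_name: {"Keys": [{"pk": item_id} for item_id in chunk]}}
        while request:
            response = dynamodb.batch_get_item(RequestItems=request)
            for item in response.get("Responses", {}).get(table_name, []):
                found[item["pk"]] = item
            # Anything DynamoDB could not serve this round is retried.
            # Real code should back off between attempts.
            request = response.get("UnprocessedKeys") or {}
    # BatchGetItem returns items in no particular order, so the order
    # decided by Postgres is restored here. Missing IDs are skipped.
    return [found[item_id] for item_id in unique_ids if item_id in found]
```

Sorting and filtering stay in SQL, and DynamoDB is only ever asked for items by key, which matches the "only retrieval I do now is by key" approach described earlier in the thread.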

You need to denormalize and partition your data.

This architecture does not work well for anything transactional, but then I would not use DDB for anything transactional either.

You may be better off using Postgres alone if your project is within the size of a single DB server.

Feel free to message me if you want to know more.