DEV Community

DynamoDB PrimaryKey, HashKey, SortKey (RangeKey)

Tomer Ben David ・3 min read

Last week I came across DynamoDB. Over the past few years I have been fascinated by how the industry went from relational to NoSQL to NewSQL, spread in all directions, and then collapsed back into MySQL / Postgres and the like. The whole thing is both funny and fascinating. You can call it funnysacting.

In my past experience, whenever we used a KV store we paid big time. Scalability comes with a bill: you pay by losing features and you gain high performance, so you have to trade off. Economics 101.

Anyway, I started reading about DynamoDB, which is a managed KV store, and noticed that there is both complexity and power in its key structure and in its indexing. It took me some time to learn (I learn slow and forget fast :) how its key structure works, and some additional time to get how indexing works. I'm still unsure whether I got it or not. Either way, I want to share my visual understanding of it with you, the way I would have liked the topic to be presented to me:

This is the first thing that should be presented when talking about DynamoDB: there are 3 components. Once you have them in your head, move on. Let's begin.

There are 3 basic building blocks, or terms, you should get familiar with when first learning about DynamoDB. Your data resides in:

  1. Table
  2. Item
  3. KV Attribute.

This is a 3-level hierarchy. Any data you store first belongs to a table (some use a single table), then to an item, and the actual data values live in kv-attributes inside the item.
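To make the hierarchy concrete, here is a minimal sketch in plain Python. It is only a mental model, not how DynamoDB stores anything: the table name `Orders` and the attribute names are made up for illustration. Note that the two items do not carry the same set of attributes, just like the two items in the drawing further down.

```python
# Toy model of the 3-level hierarchy: Table -> Item -> KV attributes.
# All names here (Orders, customer_id, order_date, ...) are hypothetical.
orders_table = {
    "name": "Orders",
    "items": [
        # each item is a bag of key-value attributes
        {"customer_id": "c1", "order_date": "2020-01-15", "total": 30},
        # items in the same table may have different attributes
        {"customer_id": "c1", "order_date": "2020-02-03", "total": 12, "coupon": "WELCOME"},
    ],
}


def attribute_names(table):
    """Return the sorted attribute names of every item in the table."""
    return [sorted(item) for item in table["items"]]
```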

You can fetch an item or items by primary key. The way to fetch multiple items with a primary key query (which sounds weird at first) is to specify the hash key and then a range on the range key.

It's as if your key is split in two:

(part 1, which you have to specify fully; part 2, which you can specify a range on).
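Here is how I would sketch that split key in Python. This is a toy in-memory model, assuming string keys and an inclusive range; `TinyTable`, `put` and `query` are names I made up, not DynamoDB API calls. Part 1 (the hash key) must match exactly, and part 2 (the range key) is kept sorted so a range can be sliced out of it.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class TinyTable:
    """Toy in-memory table with a (hash key, range key) primary key."""

    def __init__(self):
        # hash key -> list of (range key, item), kept sorted by range key
        self._partitions = defaultdict(list)

    def put(self, hash_key, range_key, item):
        rows = self._partitions[hash_key]
        keys = [rk for rk, _ in rows]
        i = bisect_left(keys, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)  # same primary key: overwrite
        else:
            rows.insert(i, (range_key, item))  # keep range keys sorted

    def query(self, hash_key, low=None, high=None):
        """Exact match on hash_key, optional inclusive range on the range key."""
        rows = self._partitions.get(hash_key, [])
        keys = [rk for rk, _ in rows]
        start = 0 if low is None else bisect_left(keys, low)
        end = len(rows) if high is None else bisect_right(keys, high)
        return [item for _, item in rows[start:end]]


# Hypothetical data: orders keyed by (customer id, order date).
t = TinyTable()
t.put("c1", "2020-01-15", {"total": 30})
t.put("c1", "2020-02-03", {"total": 12})
t.put("c2", "2020-01-20", {"total": 99})

february_on = t.query("c1", low="2020-02-01")  # one hash key, a range on part 2
```

There is deliberately no way to query by range key alone: without the hash key the model would not know which partition to look in.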

More visually, since it's complex, this is the way I see it:

    +----------------------------------------------------------------------------------+
    |Table                                                                             |
    |+------------------------------------------------------------------------------+  |
    ||Item                                                                          |  |
    ||+-----------+ +-----------+ +-----------+ +-----------+                       |  |
    |||primaryKey | |kv attr    | |kv attr ...| |kv attr ...|                       |  |
    ||+-----------+ +-----------+ +-----------+ +-----------+                       |  |
    |+------------------------------------------------------------------------------+  |
    |+------------------------------------------------------------------------------+  |
    ||Item                                                                          |  |
    ||+-----------+ +-----------+ +-----------+ +-----------+ +-----------+         |  |
    |||primaryKey | |kv attr    | |kv attr ...| |kv attr ...| |kv attr ...|         |  |
    ||+-----------+ +-----------+ +-----------+ +-----------+ +-----------+         |  |
    |+------------------------------------------------------------------------------+  |
    |                                                                                  |
    +----------------------------------------------------------------------------------+

    +----------------------------------------------------------------------------------+
    |1. Always get item by PrimaryKey                                                  |
    |2. PK is (Hash,RangeKey): get MULTIPLE Items by Hash, filter/sort by range        |
    |3. PK is HashKey: just get a SINGLE ITEM by hashKey                               |
    |                                                      +--------------------------+|
    |                                 +---------------+    |getByPK => getBy(1        ||
    |                 +-----------+ +>|(HashKey,Range)|--->|hashKey, > < or startWith ||
    |              +->|Composite  |-+ +---------------+    |of rangeKeys)             ||
    |              |  +-----------+                        +--------------------------+|
    |+-----------+ |                                                                   |
    ||PrimaryKey |-+                                                                   |
    |+-----------+ |                                       +--------------------------+|
    |              |  +-----------+   +---------------+    |getByPK => get by specific||
    |              +->|HashType   |-->|get one item   |--->|hashKey                   ||
    |                 +-----------+   +---------------+    |                          ||
    |                                                      +--------------------------+|
    +----------------------------------------------------------------------------------+

So what is happening above? Notice the following. As we said, our data belongs to (Table, Item, KVAttribute), and every item has a primary key. The way you compose that primary key determines how you can access the data.

If you decide that your primary key is simply a hash key, great: you can get a single item with it. If, however, you decide that your primary key is hash key + sort key (the sort key is the range key), then you can also run a range query on your primary key, because you get your items by (HashKey + SomeRangeFunction(on range key)). So you can get multiple items with a single primary key query.
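To tie this to the actual service, here is a hedged sketch of what the two choices look like as request parameters. As far as I understand the DynamoDB API, a table declares its key in a `KeySchema` list with `KeyType` set to `HASH` or `RANGE`, and a query puts its condition in `KeyConditionExpression`. The helper names (`key_schema`, `query_params`), the table and attribute names are my own, I assume string-typed keys, and nothing here calls AWS: the functions only build dictionaries.

```python
def key_schema(hash_attr, range_attr=None):
    """Build a KeySchema list: the HASH key is mandatory, the RANGE key optional."""
    schema = [{"AttributeName": hash_attr, "KeyType": "HASH"}]
    if range_attr is not None:
        schema.append({"AttributeName": range_attr, "KeyType": "RANGE"})
    return schema


def query_params(table, hash_attr, hash_value, range_attr=None, prefix=None):
    """Build Query parameters: equality on the hash key and, optionally,
    a begins_with condition on the range key (string keys assumed)."""
    expr = "#h = :h"
    names = {"#h": hash_attr}
    values = {":h": {"S": hash_value}}
    if range_attr is not None and prefix is not None:
        expr += " AND begins_with(#r, :p)"
        names["#r"] = range_attr
        values[":p"] = {"S": prefix}
    return {
        "TableName": table,
        "KeyConditionExpression": expr,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


# Hypothetical table: hash key only vs. composite (hash + range) key.
simple_key = key_schema("customer_id")
composite_key = key_schema("customer_id", "order_date")

# All of customer c1's orders from 2020, via a prefix on the range key.
params = query_params("Orders", "customer_id", "c1", "order_date", "2020-")
```

With a client library such as boto3, dictionaries like `params` would be passed as keyword arguments to the client's `query` call; I left that out so the sketch runs without credentials.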

Note: I did not refer to secondary indexes.

Hopefully some fog was cleared.

Discussion (1)

Evaldas

As far as I remember, Cassandra was inspired by Amazon's Dynamo design, which DynamoDB also builds on, and shares exactly the same concept. The hash key there is called the partition key and is needed to locate the exact machine within the cluster that holds the desired data; otherwise you'd end up scanning the whole cluster, which you want to avoid.

There's a great course about Cassandra (and distributed database design in general) on DataStax Academy. It was an eye opener.
