
Ronen Botzer for Aerospike

Operations on Nested Data Types in Aerospike

A document store modeling approach

Photo by Yingchih on Unsplash

Aerospike version 4.6 (released in August 2019) added the ability to apply list and map operations to elements nested at an arbitrary depth. In this post we'll see how this works. I'll start with an overview, so if you're familiar with Aerospike you can skip the following section.

Overview

Aerospike is a high performance, row-oriented, distributed database. Objects in Aerospike are called records. They are similar to rows in relational databases. Records are uniquely identified by the 3-tuple (namespace, set, user-key). A namespace combines the database and tablespace concepts of a relational database. The set in Aerospike is similar to a schema-less table. The user-key is simply the unique identifier for a record in this set from the perspective of the application. This is similar to how a primary key uniquely identifies a single row of a table in a relational database. The entire record is stored contiguously in the storage medium defined by the namespace (on SSD, in persistent memory, or in DRAM).
An Aerospike record keeps its data in one or more bins, which are similar to the columns of a row in a relational database, just without a schema. Each bin holds a value of a supported data type: integer, double, string, bytes, list, map, geospatial. Each data type has an API of atomic, server-side operations. For example, the integer data type has an increment operation, which can be used to implement counters in the record.

The list and map data types are particularly interesting and flexible. As storage units they can embed other data types inside them, including nesting other lists and maps. Both have extensive APIs.

The Aerospike database supports single-record transactions. Multiple operations against a single record can be executed efficiently under a record lock, atomically and in isolation.

Tracking High Scores

In this example, we’ll track the high scores for classic video games using a nested data structure { player: [score, {attribute map}] }.
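As a rough illustration of this shape, here is the same structure written as a plain Python dictionary, using two of the entries that appear in the output later in this post. The variable names are only for illustration.

```python
# Each player's initials map to a two-element list: [score, attribute map].
scores = {
    "CPU": [9800, {"dt": "2017-12-05 01:01:11", "ts": 1512435671573}],
    "ACE": [34500, {"dt": "1979-04-01 09:46:28", "ts": 291807988156}],
}

# Unpack one player's entry into its two parts.
score, attributes = scores["ACE"]
print(score)             # the score is the first element of the list
print(attributes["dt"])  # the attribute map is the second element
```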

Scores can be added individually or in bulk using map_put_items(), the Python client’s implementation of the Aerospike map API’s add_items() operation.
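A minimal sketch of a bulk insert with the Python client might look like the following. The namespace, set, user-key, and bin name are assumptions made for this example, as is the helper function name; `client` is expected to be an already connected Aerospike client.

```python
# Hypothetical record key (namespace, set, user-key) and bin name.
KEY = ("test", "scores", "pacman")
BIN = "scores"

# Two of the scores shown later in this post.
NEW_SCORES = {
    "CPU": [9800, {"dt": "2017-12-05 01:01:11", "ts": 1512435671573}],
    "ETC": [9200, {"dt": "2018-05-01 13:47:26", "ts": 1525182446891}],
}


def add_scores(client, key, new_scores):
    """Add several player scores to the map bin with one map_put_items operation."""
    # Imported lazily so this sketch can be loaded without the client installed.
    from aerospike_helpers.operations import map_operations

    ops = [map_operations.map_put_items(BIN, new_scores)]
    return client.operate(key, ops)
```

With a connected client, calling `add_scores(client, KEY, NEW_SCORES)` would write both scores to the record in a single transaction.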

Each video game is tracked in a different record

At this point the scores can be retrieved by rank. The rank is established based on the ordering rules for the values of this map, which in this case are all lists with the tuple structure [score, {attribute map}].
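One way to fetch every element in rank order with the Python client is sketched below. The bin name and function name are assumptions, and the exact shape of the returned key/value data may depend on the client version.

```python
BIN = "scores"  # assumed bin name


def scores_by_rank(client, key):
    """Return all map elements of the scores bin, lowest rank first."""
    # Imported lazily so this sketch can be loaded without the client installed.
    import aerospike
    from aerospike_helpers.operations import map_operations

    ops = [
        # Start at rank 0 (the lowest value). No count is given, so the
        # range extends to the highest-ranked element.
        map_operations.map_get_by_rank_range(
            BIN, 0, aerospike.MAP_RETURN_KEY_VALUE
        )
    ]
    _key, _meta, bins = client.operate(key, ops)
    return bins[BIN]
```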

Returns all the map elements by ascending rank. Due to Aerospike’s ordering rules, the list values of this map are ordered primarily by the value of their first element (the score).
[ 'ETC',
  [9200, {'dt': '2018-05-01 13:47:26', 'ts': 1525182446891}],
  'CPU',
  [9800, {'dt': '2017-12-05 01:01:11', 'ts': 1512435671573}],
  'CFO',
  [17400, {'dt': '2017-11-19 15:22:38', 'ts': 1511104958197}],
  'EIR',
  [18400, {'dt': '2018-03-18 18:44:12', 'ts': 1521398652483}],
  'SOS',
  [24700, {'dt': '2018-01-05 01:01:11', 'ts': 1515114071923}],
  'ACE',
  [34500, {'dt': '1979-04-01 09:46:28', 'ts': 291807988156}]]
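The ranking above can be approximated in plain Python. This is a rough model of the result, not a description of how the server implements it: because the scores here are unique, sorting on the first element of each list value reproduces the same order.

```python
scores = {
    "ACE": [34500, {"dt": "1979-04-01 09:46:28", "ts": 291807988156}],
    "CFO": [17400, {"dt": "2017-11-19 15:22:38", "ts": 1511104958197}],
    "CPU": [9800, {"dt": "2017-12-05 01:01:11", "ts": 1512435671573}],
    "EIR": [18400, {"dt": "2018-03-18 18:44:12", "ts": 1521398652483}],
    "ETC": [9200, {"dt": "2018-05-01 13:47:26", "ts": 1525182446891}],
    "SOS": [24700, {"dt": "2018-01-05 01:01:11", "ts": 1515114071923}],
}

# Rank 0 is the lowest value; rank -1 is the highest.
by_rank = sorted(scores.items(), key=lambda item: item[1][0])

for rank, (player, (score, _attributes)) in enumerate(by_rank):
    print(rank, player, score)

top_player = by_rank[-1][0]  # the element at rank -1
```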

Before Aerospike version 4.6 there was no way to apply map operations to the attribute map nested inside the list values of the scores map. The Complex Data Types (CDT) operations were limited to the top-level elements of the list or map in question. Let’s assume that the attribute map optionally contains awards won by the players.
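A sketch of granting such an award with a nested operation could look like this. The bin name, function name, and default player are assumptions for the example. The context is a list of steps that leads from the top-level map to the nested attribute map.

```python
BIN = "scores"  # assumed bin name


def grant_unicorn(client, key, player="CFO"):
    """Add an awards map holding a unicorn to one player's attribute map."""
    # Imported lazily so this sketch can be loaded without the client installed.
    import aerospike
    from aerospike_helpers import cdt_ctx
    from aerospike_helpers.operations import map_operations

    # Path: scores map -> value at map key `player` -> list index 1,
    # which is the attribute map in [score, {attribute map}].
    ctx = [cdt_ctx.cdt_ctx_map_key(player), cdt_ctx.cdt_ctx_list_index(1)]

    # CREATE_ONLY: write only if the "awards" key is not there yet.
    # NO_FAIL: do not raise an error if it is already there.
    policy = {
        "map_write_flags": aerospike.MAP_WRITE_FLAGS_CREATE_ONLY
        | aerospike.MAP_WRITE_FLAGS_NO_FAIL
    }

    ops = [map_operations.map_put(BIN, "awards", {"🦄": 1}, policy, ctx)]
    return client.operate(key, ops)
```

Note that in this sketch the create-only flag applies to the `awards` key itself, so the operation does nothing for a player who already has an awards map.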

Grants the 🦄 award once and only once

In earlier versions of Aerospike, we would need to read the list value from the server to the application, add the awards map to the attribute map, then write the modified list back to the server. Leveraging the feature added in version 4.6, it can now be done atomically. I created a context that identifies the path to the attribute map, then applied a map_put operation at that spot. The map write flag MAP_WRITE_FLAGS_CREATE_ONLY ensures this award is granted once. The MAP_WRITE_FLAGS_NO_FAIL flag makes the operation tolerant if the 🦄 award was already in place: the transaction continues to the next operation (if there is one) and the client side doesn’t need to handle an exception.
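The same idea can address an element by rank instead of by key. The following is one possible sketch, with an assumed bin name and function name: it makes sure the top scorer has an awards map with a 🏆 counter, then increments that counter one level deeper.

```python
BIN = "scores"  # assumed bin name


def award_trophy(client, key):
    """Give a trophy to whichever player currently holds the top score."""
    # Imported lazily so this sketch can be loaded without the client installed.
    import aerospike
    from aerospike_helpers import cdt_ctx
    from aerospike_helpers.operations import map_operations

    # Path to the attribute map of the highest-ranked map value.
    attrs_ctx = [cdt_ctx.cdt_ctx_map_rank(-1), cdt_ctx.cdt_ctx_list_index(1)]
    # One level deeper: the awards map inside that attribute map.
    awards_ctx = attrs_ctx + [cdt_ctx.cdt_ctx_map_key("awards")]

    create_only = {
        "map_write_flags": aerospike.MAP_WRITE_FLAGS_CREATE_ONLY
        | aerospike.MAP_WRITE_FLAGS_NO_FAIL
    }

    ops = [
        # Initialize the awards map only if it does not exist yet.
        map_operations.map_put(BIN, "awards", {"🏆": 0}, create_only, attrs_ctx),
        # Then add one trophy inside the awards map.
        map_operations.map_increment(BIN, "🏆", 1, ctx=awards_ctx),
    ]
    return client.operate(key, ops)
```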

Give the player with the top score a 🏆 award
{ 'ACE': [ 34500,
           { 'awards': {'🏆': 1},
             'dt': '1979-04-01 09:46:28',
             'ts': 291807988156}],
  'CFO': [ 17400,
           { 'awards': {'🦄': 1},
             'dt': '2017-11-19 15:22:38',
             'ts': 1511104958197}],
  'CPU': [9800, {'dt': '2017-12-05 01:01:11', 'ts': 1512435671573}],
  'EIR': [ 18400,
           {'dt': '2018-03-18 18:44:12', 'ts': 1521398652483}],
  'ETC': [9200, {'dt': '2018-05-01 13:47:26', 'ts': 1525182446891}],
  'SOS': [ 24700,
           {'dt': '2018-01-05 01:01:11', 'ts': 1515114071923}]}

In the code section above, the context is extended one level deeper so that a 🏆 award is initialized if it doesn’t exist, then incremented. A context path has to lead to an existing element, so simply incrementing without initializing it first would risk a failure. Notice that the context path doesn’t have to be made of fixed keys and indexes. Here the 🏆 is given to the element with the highest rank (-1), whichever player that happens to be.
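Because several operations can run against one record in a single transaction, awarding another trophy and reading back the top three scores can be combined. The sketch below assumes the bin name and function name, and assumes that `operate_ordered` returns one result per operation in the order the operations were given.

```python
BIN = "scores"  # assumed bin name


def award_trophy_and_get_top_three(client, key):
    """Award a trophy to the top scorer, then return the three highest scores."""
    # Imported lazily so this sketch can be loaded without the client installed.
    import aerospike
    from aerospike_helpers import cdt_ctx
    from aerospike_helpers.operations import map_operations

    attrs_ctx = [cdt_ctx.cdt_ctx_map_rank(-1), cdt_ctx.cdt_ctx_list_index(1)]
    awards_ctx = attrs_ctx + [cdt_ctx.cdt_ctx_map_key("awards")]
    create_only = {
        "map_write_flags": aerospike.MAP_WRITE_FLAGS_CREATE_ONLY
        | aerospike.MAP_WRITE_FLAGS_NO_FAIL
    }

    ops = [
        map_operations.map_put(BIN, "awards", {"🏆": 0}, create_only, attrs_ctx),
        map_operations.map_increment(BIN, "🏆", 1, ctx=awards_ctx),
        # Rank -3 with a count of 3 covers the three highest-ranked elements.
        map_operations.map_get_by_rank_range(
            BIN, -3, aerospike.MAP_RETURN_KEY_VALUE, 3
        ),
    ]
    _key, _meta, results = client.operate_ordered(key, ops)
    # Assumption: the last entry is the (bin, value) pair of the read operation.
    return results[-1][1]
```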

Hand out another top score 🏆 award, then display the top three scores
[ 'EIR',
  [18400, {'dt': '2018-03-18 18:44:12', 'ts': 1521398652483}],
  'SOS',
  [24700, {'dt': '2018-01-05 01:01:11', 'ts': 1515114071923}],
  'ACE',
  [ 34500,
    {'awards': {'🏆': 2}, 'dt': '1979-04-01 09:46:28', 'ts': 291807988156}]]

This code lives in the aerospike-examples/aerospike-modeling repo on GitHub.

Originally published on Medium (Aerospike Developer Blog), November 2 2019
