Roman Right

Posted on Sep 13, 2022 • Edited on Sep 20, 2022

Announcing AnnaDB - next-gen NoSQL database

#database #programming #opensource #rust

I'm excited to introduce AnnaDB - the next-generation developer-first NoSQL data store.

I work with many small projects daily - proofs of concept and experiments with new frameworks or patterns. For these purposes, I needed a database that works with flexible data structures, as I change them frequently during my experiments. And it must support relations out of the box, as links to other objects are a natural part of the structures' design. I tried many (if not all) databases, but nothing fit my requirements well. So I decided to make my own. This is how AnnaDB was born.

AnnaDB is an in-memory data store with disk persistence. It is written in Rust, a memory-safe compiled language. AnnaDB is fast and safe enough to serve as both the main data storage and the cache layer.

Features

Flexible object structure - simple primitives and complicated nested containers can be stored in AnnaDB
Relations - you can link any object to another, and AnnaDB will resolve this relation on finds, updates, and other operations.
Transactions - out of the box

Basics

I want to start with the basic concepts and examples of the syntax here and continue with the usage example.

Collections

AnnaDB stores objects in collections. Collections are analogous to tables in SQL databases.

Every object and sub-object (item of a vector or map) stored in AnnaDB has a link (id). This link consists of the collection name and a unique uuid4 value. One object can contain links to objects from any collection - AnnaDB will fetch and process them on all operations automatically, without additional commands (joins or lookups).
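To make the idea of links more concrete, here is a minimal Python sketch of a store that hands out collection-plus-uuid4 links and resolves them on reads. It is a toy model, not how AnnaDB is implemented: the Link and Store names are hypothetical, and for brevity only top-level objects get a link here, while AnnaDB also links sub-objects.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical stand-in for an AnnaDB link: collection name plus uuid4."""
    collection: str
    id: uuid.UUID

    def __str__(self) -> str:
        # Same textual shape as the ids shown in this article.
        return f"{self.collection}|{self.id}|"


class Store:
    """Toy in-memory store used only for illustration."""

    def __init__(self):
        self._objects = {}

    def insert(self, collection, value):
        """Store a value and return the link that identifies it."""
        link = Link(collection, uuid.uuid4())
        self._objects[link] = value
        return link

    def get(self, link):
        """Return the object with every nested link replaced by its target."""
        return self._resolve(self._objects[link])

    def _resolve(self, value):
        if isinstance(value, Link):
            return self._resolve(self._objects[value])
        if isinstance(value, dict):
            return {k: self._resolve(v) for k, v in value.items()}
        if isinstance(value, list):
            return [self._resolve(v) for v in value]
        return value


store = Store()
category = store.insert("categories", "sweets")
bar = store.insert("products", {"name": "Tony's", "price": 5.95, "category": category})
resolved = store.get(bar)
print(resolved)
```

The product stores only the link, so the reader of the product never needs a separate lookup step to see the category value.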

TySON

The AnnaDB query language uses the TySON format. The main difference from other data formats is that each item has a value and a prefix. The prefix can mark the data or query type (as it is used in AnnaDB) or carry any other information useful to the parser. This adds more flexibility to the data structure design - a developer can use as many custom data types as needed.

You can read more about the TySON format here

Data Types

There are primitive and container data types in AnnaDB.

Primitive data types are a set of basic types whose values cannot be broken down further. In TySON, primitives are represented as prefix|value| or as a prefix only. The prefix in AnnaDB shows the data type. For example, the string test will be represented as s|test|, where s is a prefix that marks the data as a string, and test is the actual value.

Container data types keep primitive and container objects using specific rules. There are only two container types in AnnaDB for now: maps and vectors.

Vectors are ordered sequences of elements of any type. Example: v[n|1|,n|2|,n|3|,]
Maps are associative arrays. Example: m{ s|bar|: s|baz|,}

More information about AnnaDB data types can be found in the documentation
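As an illustration of the prefix|value| idea, here is a small Python sketch that renders Python values in the shapes shown above. The to_tyson function is a hypothetical helper, not part of AnnaDB or its driver. It covers only strings, numbers, vectors and maps, and it does not escape a | character inside a value, since this article does not describe escaping.

```python
def to_tyson(value) -> str:
    """Render a Python value in the TySON shapes used in this article.

    Assumption: only str, int, float, list and dict are supported here.
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, str):
        return f"s|{value}|"
    if isinstance(value, (int, float)):
        return f"n|{value}|"
    if isinstance(value, list):
        # Every element is followed by a comma, as in the vector example.
        return "v[" + "".join(to_tyson(v) + "," for v in value) + "]"
    if isinstance(value, dict):
        items = "".join(f"{to_tyson(k)}:{to_tyson(v)}," for k, v in value.items())
        return "m{" + items + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(to_tyson("test"))
print(to_tyson([1, 2, 3]))
print(to_tyson({"name": "Tony's", "price": 5.95}))
```

Containers simply nest the same rule: each element is rendered with its own prefix, so the type information travels with every value.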

Query

A query in AnnaDB is a pipeline of steps that are applied in the order they were declared. The steps are wrapped into a vector with the prefix q - query.

collection|test|:q[
   find[
   ],
   sort[
      asc(value|num|),
   ],
   limit(n|5|),
];

If the pipeline has only one step, the q vector is not needed.

collection|test|:find[
   gt{
      value|num|:n|4|,
   },
];
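The pipeline idea can be sketched in plain Python: each step takes the output of the previous one, so the declared order matters. The step functions below (find_gt, sort_asc, limit) and run_pipeline are hypothetical names for this illustration and operate on a list of dicts, not on AnnaDB itself.

```python
from functools import reduce


def find_gt(field, threshold):
    """Step: keep objects whose field is greater than the threshold."""
    return lambda objs: [o for o in objs if o[field] > threshold]


def sort_asc(field):
    """Step: sort objects by the field in ascending order."""
    return lambda objs: sorted(objs, key=lambda o: o[field])


def limit(n):
    """Step: keep only the first n objects."""
    return lambda objs: objs[:n]


def run_pipeline(objects, steps):
    """Apply each step to the output of the previous one, in declared order."""
    return reduce(lambda acc, step: step(acc), steps, objects)


data = [{"num": n} for n in (7, 3, 9, 5, 1, 8, 6)]

sorted_then_limited = run_pipeline(data, [find_gt("num", 4), sort_asc("num"), limit(3)])
limited_then_sorted = run_pipeline(data, [find_gt("num", 4), limit(3), sort_asc("num")])

print(sorted_then_limited)  # [{'num': 5}, {'num': 6}, {'num': 7}]
print(limited_then_sorted)  # [{'num': 5}, {'num': 7}, {'num': 9}]
```

Swapping sort and limit changes the result, which is why the steps are kept in a vector rather than an unordered structure.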

Server

To run AnnaDB locally, run the following command in the terminal:

docker run --init -p 10001:10001 -t romanright/annadb:0.1.0

Client

The AnnaDB shell client is an interactive terminal application that connects to the DB instance, then validates and handles queries. It is well suited to experimenting with the query language or working with the data manually.

The client can be installed via pip

pip install annadb

Run

annadb --uri annadb://localhost:10001
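If the client cannot connect, it may help to check that the port published by the Docker command is reachable at all. The following Python sketch uses only the standard library; parse_uri and is_reachable are hypothetical helpers, and the check only confirms that something accepts TCP connections on that port, not that it speaks the AnnaDB protocol.

```python
import socket
from urllib.parse import urlsplit


def parse_uri(uri: str) -> tuple:
    """Split an annadb://host:port URI into (host, port)."""
    parts = urlsplit(uri)
    if parts.scheme != "annadb" or parts.hostname is None or parts.port is None:
        raise ValueError(f"expected annadb://host:port, got {uri!r}")
    return parts.hostname, parts.port


def is_reachable(uri: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the host and port can be opened."""
    try:
        with socket.create_connection(parse_uri(uri), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


if __name__ == "__main__":
    print(is_reachable("annadb://localhost:10001"))
```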

Usage example

You are now ready for the fun part of the article. Let's play with AnnaDB!

I'll create a database for the candy store to show the features.

Insert primitive

Let's start with categories. I'll represent categories as simple string objects. Let's insert the first one into the categories collection.

Request:

collection|categories|:insert[
    s|sweets|,
];

collection|categories| shows on which collection the query will be applied. In our case - categories.

insert[...] - is a query step. You can insert one or many objects using the insert operation.

s|sweets| - is the object to insert. In this case, it is a string primitive. The prefix s means that it is a string, and the | characters wrap the value of the primitive. Other primitive types can be found in the Data Types section.

Response:

result:ok[
    response{
        s|data|:ids[
            categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
        ],
        s|meta|:insert_meta{
            s|count|:n|1|,
        },
    },
];

If everything is ok, the result will have an ok[...] vector with responses for all the transaction pipelines. Each response contains data and meta information. In our case, there is only one response, with a vector of ids in data and the number of inserted objects in meta.

Insert container

Let's insert a more complicated object now - a chocolate bar. It will have fields:

name
price
category

For the category, I'll use the already created one.

Request:

collection|products|:insert[
    m{
        s|name|:s|Tony's|,
        s|price|:n|5.95|,
        s|category|:categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
    },
];

The query is similar to the previous one, but the object is not a primitive but a map. The value of the category field is a link that was received after the previous insert.

Response:

result:ok[
    response{
        s|data|:ids[
            products|17b12780-349c-4091-9bd2-7e08ad509ad0|,
        ],
        s|meta|:insert_meta{
            s|count|:n|1|,
        },
    },
];

The response is nearly the same as before - link in data and number of inserted objects in meta.

Get object

Let's retrieve the information about this chocolate bar now. I'll use the get operation for this, to retrieve the object by id.

Request:

collection|products|:get[
    products|17b12780-349c-4091-9bd2-7e08ad509ad0|,
];

This time I use the get[...] query step. Using this step you can retrieve one or many objects using object links.

Response:

result:ok[
    response{
        s|data|:objects{
            products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
                s|category|:s|sweets|,
                s|price|:n|5.95|,
                s|name|:s|Tony's|,
            },
        },
        s|meta|:get_meta{
            s|count|:n|1|,
        },
    },
];

In the response here you can see the objects{...} map, where keys are links to objects and values are the objects themselves. The objects{...} map preserves order - it returns objects in the same order as they were requested in the get step, or as they were sorted by the sort step.

The category was fetched automatically and the value was returned.

Let's insert another chocolate bar there to have more objects in the collection:

collection|products|:insert[
    m{
        s|name|:s|Mars|,
        s|price|:n|2|,
        s|category|:categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
    },
];

I use the same category id for this bar.

Modify primitive

Let's modify the category to make it more accurate.

Request:

collection|categories|:q[
    get[
        categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
    ],
    update[
        set{
            root:s|chocolate|,
        },
    ],
];

The query here consists of two steps: get the object by link, then modify it. The update[...] operation is a vector of update operators. Read more about the update.

Response:

result:ok[
    response{
        s|data|:ids[
            categories|d2e9fecd-8b3d-429d-9a9b-34810120a221|,
        ],
        s|meta|:update_meta{
            s|count|:n|1|,
        },
    },
];

The response of the update operation contains the ids of the updated objects as data and the number of the updated objects as meta.

Let's take a look at how this affected the chocolate objects.

Request:

collection|products|:find[
];

To find objects, I use the find[...] operation. It is a vector of find operators. If it is empty, all the collection objects will be returned.

Response:

result:ok[
    response{
        s|data|:objects{
            products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|:m{
                s|price|:n|2|,
                s|name|:s|Mars|,
                s|category|:s|chocolate|,
            },
            products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
                s|price|:n|5.95|,
                s|name|:s|Tony's|,
                s|category|:s|chocolate|,
            },
        },
        s|meta|:find_meta{
            s|count|:n|2|,
        },
    },
];

The category was changed for both products, as the category object was linked with these objects.

Modify container

Now I'll increase the price of the bars where it is less than 3.

Request:

collection|products|:q[
    find[
        lt{
            value|price|:n|3|,
        },
    ],
    update[
        inc{
            value|price|:n|2|,
        },
    ],
];

The find step can precede the update step as well. All the found objects will be updated. Read more about the find operation and operators here.

Response:

result:ok[
    response{
        s|data|:ids[
            products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|,
        ],
        s|meta|:update_meta{
            s|count|:n|1|,
        },
    },
];

The response is similar to the previous one.

Here is what all the products look like after the update:

result:ok[
    response{
        s|data|:objects{
            products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|:m{
                s|category|:s|chocolate|,
                s|name|:s|Mars|,
                s|price|:n|4|,
            },
            products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
                s|name|:s|Tony's|,
                s|price|:n|5.95|,
                s|category|:s|chocolate|,
            },
        },
        s|meta|:find_meta{
            s|count|:n|2|,
        },
    },
];
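The effect of this find-then-update query can be double-checked with a few lines of Python. This is only a model of the lt and inc semantics as they appear in the example above; the find_lt and inc functions and the short keys "mars" and "tonys" are hypothetical stand-ins for the real operators and links.

```python
products = {
    "mars": {"name": "Mars", "price": 2},
    "tonys": {"name": "Tony's", "price": 5.95},
}


def find_lt(objects, field, threshold):
    """Return the keys of objects whose field is below the threshold."""
    return [key for key, obj in objects.items() if obj[field] < threshold]


def inc(objects, keys, field, amount):
    """Add amount to the field of every selected object; return the keys touched."""
    for key in keys:
        objects[key][field] += amount
    return keys


# Equivalent of: find[lt{value|price|:n|3|}], update[inc{value|price|:n|2|}]
updated = inc(products, find_lt(products, "price", 3), "price", 2)

print(updated)                     # ['mars']
print(products["mars"]["price"])   # 4
print(products["tonys"]["price"])  # 5.95
```

Only Mars matches the lt condition (2 < 3), so its price becomes 2 + 2 = 4, while Tony's stays at 5.95.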

Sort objects

To sort objects, I'll use the sort operation against the price field.

Request:

collection|products|:q[
    find[
    ],
    sort[
        asc(value|price|),
    ],
];

The sort[...] operation is a vector of sort operators - asc and desc. Sort operators are modifiers that contain paths to the sorting value. The sort operation is not an independent step; it can only follow find-like operations that return objects. You can read more about sort here

Response:

result:ok[
    response{
        s|data|:objects{
            products|0b6ddf36-b8ba-487f-acd8-4dfee05d5177|:m{
                s|name|:s|Mars|,
                s|price|:n|4|,
                s|category|:s|chocolate|,
            },
            products|17b12780-349c-4091-9bd2-7e08ad509ad0|:m{
                s|category|:s|chocolate|,
                s|price|:n|5.95|,
                s|name|:s|Tony's|,
            },
        },
        s|meta|:find_meta{
            s|count|:n|2|,
        },
    },
];

Objects in the response are sorted by price now.

It is useful to use limit and offset operations together with sort. You can read about them in the documentation

Delete objects

After any find-like step, you can use the delete operation to delete all the found objects. Or it can be used independently to delete the whole collection.

Request:

collection|products|:q[
    find[
        gt{
            value|price|:n|5|,
        },
    ],
    delete,
];

The delete operation is a primitive without value.

Response:

result:ok[
    response{
        s|data|:ids[
            products|17b12780-349c-4091-9bd2-7e08ad509ad0|,
        ],
        s|meta|:update_meta{
            s|count|:n|1|,
        },
    },
];

The response contains affected ids in data and the number of deleted objects in meta.
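The two ways of using delete, after a find-like step or on its own, can be modelled in Python like this. The delete function below is a hypothetical illustration over a plain dict, not the AnnaDB implementation, and the short keys stand in for real links.

```python
def delete(collection, found=None):
    """Remove the found keys, or everything when no find-like step ran first."""
    keys = list(collection) if found is None else list(found)
    for key in keys:
        del collection[key]
    return keys


products = {
    "mars": {"name": "Mars", "price": 4},
    "tonys": {"name": "Tony's", "price": 5.95},
}

# Equivalent of: find[gt{value|price|:n|5|}], delete
expensive = [key for key, obj in products.items() if obj["price"] > 5]
removed = delete(products, expensive)
remaining = dict(products)  # snapshot of what is left after the first delete

# Equivalent of using delete independently: the whole collection goes away.
cleared = delete(products)

print(removed)    # ['tonys']
print(remaining)  # {'mars': {'name': 'Mars', 'price': 4}}
print(cleared)    # ['mars']
```

Note the difference between an empty find result and no find step at all: in this sketch an empty list deletes nothing, while omitting the argument clears the collection.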

Using from your app

AnnaDB has a Python driver. It has an internal query builder - you don't need to learn AnnaDB query syntax to work with it. But it supports raw querying too.

I'll add drivers for other languages soon. If you can help me with it, I'll be more than happy :)

Plans

This is a very early version of the database. It can already do things, and I use it in a few of my projects. But there are still many features to work on.

Drivers

I plan to add drivers to support the most popular languages, like JS, Rust, Go, and others. If you can help with this - please get in touch with me.

Rights management

This is probably the most important feature to implement. Authentication, authorization, roles, etc.

Performance increase

There are many performance-related things to improve now.

Query features

Projections
More find and update operators
Developer experience improvements

Data Types

I plan to add more data types, like geo points and graph vertices, to make AnnaDB more comfortable to work with in different fields.

Managed service

My big goal is to make a managed data store service. Hey, AWS, Google Cloud, MS Azure, I'm ready for collaborations! ;)

Top comments (4)

Christiaan Pretorius • Sep 13 '22

Hi Roman,
I wrote a similar thing years ago here github.com/tjizep/libspaces
You're welcome to use other ideas you find useful there in your own project.
I'm not really one for query languages for these types of databases that's why I use native bindings to dynamic languages. It does however support json path for instance.

It's great that more people are starting to realize the power of these kinds of databases though.

Thanks!
Keep it going

Roman Right • Sep 14 '22

Hi Christiaan,

Thank you for the link. It looks interesting :)

JoelBonetR 🥇 • Sep 14 '22

By NoSQL you mean NoSQL or non-relational?

Roman Right • Sep 14 '22

I mean NoSQL. AnnaDB supports relations.
