Supercharge your geolocalized DynamoDB Queries with Z-Order Indexing 🚀

#javascript #database #serverless #aws

TL;DR: Embark on our journey to enhance DynamoDB geospatial queries with custom Z-Order indexing implementation! Discover calculating Z-addresses, fine-tuning queries, and results. Although performance is equivalent to Mongo Atlas, maintaining your own Z-Order indexing is very challenging. We recommend using non DynamoDB off-the-shelf alternatives.

This article will guide you through our experimentation of a Z-Order index on DynamoDB, based on the excellent work by Zack Slayton in his article on Z-order indexing.

When it comes to handling complex, geographic queries in Amazon DynamoDB, there is no off the shelf solution. Z-Order indexing offers an efficient and powerful solution to this problem. It is particularly suited for querying a small zone around a geolocation.

Before diving in, let’s go through some of Z-Order core concepts.

🤖 All our implementations of the functions we are describing are available on this gist

Creating Z-Addresses for Your Records 🏷️

The first step in implementing a Z-Order index is calculating the Z-addresses for your records. A Z-address is a single number representing a tuple of field values. To calculate a Z-address, simply interleave the bits of each field in the record. Check out the examples provided in the base article to see how this is done.

Once you can generate a z-order index, apply it to your dataset and store it in DynamoDB. The SK will be the z-order Index.

Crafting Queries for Data Retrieval 🔍

With Z-addresses in hand, you can now build queries to retrieve data. This step was much more challenging than we thought.

Begin by defining a minimum and maximum value for the range of items you want to read. Then, create two records representing the lower and upper bounds of this range. Translate these records into Z-addresses to form the query space.

However, if your query range intersects with "seams" in the Z-order curve, you'll need to use two functions to avoid irrelevant regions in the search space: isRelevant and zDivide. These functions help narrow down the search space and minimize garbage data.

TypeScript Implementation of Z-Order Index 📚

You can find our implementation in this gist. We left it as a PoC and consider it to be far from prod ready

We parametrized the algorithm to be able to optimize the precision of our Z-Order indexes and the data outside of our search box.

Limitations and Trade-offs of Z-Order Indexing 🤔

The main limitation of Z-Order indexing is the accuracy of your queries. This is due to the conversion from latitude/longitude to an index

You can optimize the precision of your Z-Order indexes by tweaking the parameters of the algorithm.

Our implementation also doesn’t include more than 2D data, you will not be able to query on timestamp as an index for example.

Wrapping Up 🎁

Z-Order indexing significantly improves the efficiency of your DynamoDB queries when fetching geographic data. By implementing the Z-Order index and utilizing functions zDivide, you can minimize garbage data and optimize your database's performance.

Our feeling is that handling your own geo-index is challenging and hard to maintain. If it is not critical for your use case to handle it yourself, we recommend you use off the shelf solutions such as

MongoDB Atlas
Postgis Plugin on your Postgre Database

With the information provided in this article and the base knowledge from Zack Slayton's article on Z-Order indexing, you're now equipped to supercharge your DynamoDB queries with Z-Order indexing. Happy querying! 🚀🎉

Written with ❤️ by @Guillaume Duboc and @Valentin Beggi