TL;DR: Embark on our journey to enhance DynamoDB geospatial queries with custom Z-Order indexing implementation! Discover calculating Z-addresses, fine-tuning queries, and results. Although performance is equivalent to Mongo Atlas, maintaining your own Z-Order indexing is very challenging. We recommend using non DynamoDB off-the-shelf alternatives.
This article will guide you through our experimentation of a Z-Order index on DynamoDB, based on the excellent work by Zack Slayton in his article on Z-order indexing.
When it comes to handling complex, geographic queries in Amazon DynamoDB, there is no off the shelf solution. Z-Order indexing offers an efficient and powerful solution to this problem. It is particularly suited for querying a small zone around a geolocation.
Before diving in, letโs go through some of Z-Order core concepts.
๐ค All our implementations of the functions we are describing are available on this gist
Creating Z-Addresses for Your Records ๐ท๏ธ
The first step in implementing a Z-Order index is calculating the Z-addresses for your records. A Z-address is a single number representing a tuple of field values. To calculate a Z-address, simply interleave the bits of each field in the record. Check out the examples provided in the base article to see how this is done.
Once you can generate a z-order index, apply it to your dataset and store it in DynamoDB. The SK will be the z-order Index.
Crafting Queries for Data Retrieval ๐
With Z-addresses in hand, you can now build queries to retrieve data. This step was much more challenging than we thought.
Begin by defining a minimum and maximum value for the range of items you want to read. Then, create two records representing the lower and upper bounds of this range. Translate these records into Z-addresses to form the query space.
However, if your query range intersects with "seams" in the Z-order curve, you'll need to use two functions to avoid irrelevant regions in the search space: isRelevant
and zDivide
. These functions help narrow down the search space and minimize garbage data.
TypeScript Implementation of Z-Order Index ๐
You can find our implementation in this gist. We left it as a PoC and consider it to be far from prod ready
We parametrized the algorithm to be able to optimize the precision of our Z-Order indexes and the data outside of our search box.
Limitations and Trade-offs of Z-Order Indexing ๐ค
The main limitation of Z-Order indexing is the accuracy of your queries. This is due to the conversion from latitude/longitude to an index
You can optimize the precision of your Z-Order indexes by tweaking the parameters of the algorithm.
Our implementation also doesnโt include more than 2D data, you will not be able to query on timestamp as an index for example.
Wrapping Up ๐
Z-Order indexing significantly improves the efficiency of your DynamoDB queries when fetching geographic data. By implementing the Z-Order index and utilizing functions zDivide
, you can minimize garbage data and optimize your database's performance.
Our feeling is that handling your own geo-index is challenging and hard to maintain. If it is not critical for your use case to handle it yourself, we recommend you use off the shelf solutions such as
- MongoDB Atlas
- Postgis Plugin on your Postgre Database
With the information provided in this article and the base knowledge from Zack Slayton's article on Z-Order indexing, you're now equipped to supercharge your DynamoDB queries with Z-Order indexing. Happy querying! ๐๐
Written with โค๏ธย by @Guillaume Duboc and @Valentin Beggi
Top comments (0)