Version Your Database / Future Directions

#discuss #database #kotlin #java

Hi all,

As I want to release version 1.0.0 of SirixDB[1] soon but sadly lack an open-source community, I'd like to discuss here what you think is most important for its future direction.

In short, SirixDB keeps the history of each resource in a database in a huge index trie that is completely copy-on-write based, which means unchanged database pages are shared between revisions. SirixDB allows sophisticated time-travel queries and implements diffing algorithms. It natively stores XML and JSON in a binary format, but could store graphs or other kinds of data as well.
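
To make the page sharing concrete, here is a minimal Java sketch of copy-on-write revisions. It is not SirixDB's actual code: the names `CowResource`, `Page`, `commit` and `read` are made up for illustration, and for brevity each revision copies a flat page table, whereas a trie would presumably copy only the path from the root to the changed page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy copy-on-write store (hypothetical, not the SirixDB API). */
final class CowResource {
    /** An immutable page; changing a page always creates a new object. */
    static final class Page {
        final String content;

        Page(String content) {
            this.content = content;
        }
    }

    // revisions.get(i) is the page table (page key -> page) of revision i.
    private final List<Map<Integer, Page>> revisions = new ArrayList<>();

    CowResource() {
        revisions.add(Map.of()); // revision 0 is empty
    }

    /** Commits a new revision that replaces only the given pages. */
    int commit(Map<Integer, String> changes) {
        // Copy the references of the previous revision, not the pages themselves.
        Map<Integer, Page> next = new HashMap<>(revisions.get(revisions.size() - 1));
        changes.forEach((key, content) -> next.put(key, new Page(content)));
        revisions.add(Map.copyOf(next));
        return revisions.size() - 1;
    }

    /** Reads a page as of the given revision (a simple time-travel read). */
    Page read(int revision, int pageKey) {
        return revisions.get(revision).get(pageKey);
    }

    public static void main(String[] args) {
        CowResource resource = new CowResource();
        int first = resource.commit(Map.of(1, "a", 2, "b"));
        int second = resource.commit(Map.of(2, "b2"));

        // Page 1 was not changed, so both revisions point to the same object.
        System.out.println(resource.read(first, 1) == resource.read(second, 1));
        // Page 2 was changed, so the old revision still sees the old content.
        System.out.println(resource.read(first, 2).content);
        System.out.println(resource.read(second, 2).content);
    }
}
```

Old revisions stay readable because pages are never modified in place. A new revision only adds new pages and new references.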

Ideas for the future would be:

Horizontal scaling: writing through a single master, providing read-your-own-writes consistency, and replicating resources on a few cluster nodes, most probably using ZooKeeper and Apache BookKeeper with exactly-once delivery semantics.
Interactive visualizations of the differences between revisions of a resource. SirixDB currently stores tree-structured data, that is both XML and JSON, in a binary format, and diffing capabilities are already there. There are also some outdated visualizations[2] written in Processing, which I'd love to port to D3 for the web. Furthermore, a web interface would be nice.
Adding cost-based query optimizer rules and index-rewrite rules to improve query performance considerably.
Looking into how to cleverly delete old revisions (I have to look up how ZFS allows deletion of snapshots). For now, as a kind of ugly hack, a background process could for instance copy the most recent revision to a new resource. I guess it gets tricky because unchanged database pages are shared between revisions and record pages are even versioned. Thus, a page needs to be reconstructed from page fragments of different revisions, depending on the versioning algorithm used.
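
Why deleting old revisions is tricky can be shown with a small Java sketch. It assumes one possible versioning strategy, in which a revision writes either a full page or only a fragment with the changed records. All names (`FragmentedPage`, `Fragment`, `write`, `reconstruct`, `dependencies`) are hypothetical, and deleted records are not modeled. It is not SirixDB's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy record page stored as per-revision fragments (hypothetical names). */
final class FragmentedPage {
    /** Records written in one revision; full == true means a complete page. */
    static final class Fragment {
        final int revision;
        final boolean full;
        final Map<Integer, String> records;

        Fragment(int revision, boolean full, Map<Integer, String> records) {
            this.revision = revision;
            this.full = full;
            this.records = Map.copyOf(records);
        }
    }

    // Fragments in ascending revision order.
    private final List<Fragment> fragments = new ArrayList<>();

    void write(int revision, boolean full, Map<Integer, String> records) {
        fragments.add(new Fragment(revision, full, records));
    }

    /** Rebuilds the page as of the given revision, newest fragment first. */
    Map<Integer, String> reconstruct(int revision) {
        Map<Integer, String> page = new HashMap<>();
        for (int i = fragments.size() - 1; i >= 0; i--) {
            Fragment fragment = fragments.get(i);
            if (fragment.revision > revision) {
                continue; // written later, not visible in this revision
            }
            fragment.records.forEach(page::putIfAbsent); // newer values win
            if (fragment.full) {
                break; // older fragments are not needed
            }
        }
        return page;
    }

    /** Revisions whose fragments are needed to read the given revision. */
    List<Integer> dependencies(int revision) {
        List<Integer> needed = new ArrayList<>();
        for (int i = fragments.size() - 1; i >= 0; i--) {
            Fragment fragment = fragments.get(i);
            if (fragment.revision > revision) {
                continue;
            }
            needed.add(fragment.revision);
            if (fragment.full) {
                break;
            }
        }
        return needed;
    }

    public static void main(String[] args) {
        FragmentedPage page = new FragmentedPage();
        page.write(1, true, Map.of(1, "a", 2, "b")); // full page
        page.write(2, false, Map.of(2, "b2"));       // only the changed record
        page.write(3, false, Map.of(3, "c"));        // only the new record

        System.out.println(page.reconstruct(3));
        System.out.println(page.dependencies(3));
    }
}
```

In this sketch revision 3 still depends on the fragments of revisions 2 and 1, so dropping revision 1 would first require writing a new full page for a later revision.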

Besides, I want to finish the work on versioning the whole database, not just the resources in a database.

Until recently I thought I'd look into horizontal scaling, use GraalVM native images to provide super-fast startup times in Docker containers, work on writing to and reading from a BookKeeper cluster, and deploy everything to a Kubernetes cluster.

But maybe showcasing what's possible with beautiful interactive visualizations would get more attention, and I think it would be great for me to learn front-end stuff, too. It might also be more useful: given the complete lack of users, horizontal scaling is only really interesting from an engineering perspective ;-)

Kind regards and have a great weekend
Johannes

[1] https://sirix.io and https://github.com/sirixdb/sirix
[2] https://m.youtube.com/watch?feature=youtu.be&v=l9CXXBkl5vI
