See our article here
About a month ago, we posted about using SIMD instructions to make aggregation calculations faster.
Many comments suggested that we implement compensated summation (a.k.a. Kahan summation), since the naive method can produce inaccurate and unreliable results. We therefore spent some time integrating the Kahan and Neumaier summation algorithms. This post summarises a few things we learned along the way.
We expected Kahan summation to hurt performance badly, since it uses 4x as many floating-point operations per element as the naive approach. However, some comments also suggested using prefetch and co-routines to pull data from RAM into cache in parallel with other CPU instructions. Thanks to these suggestions we got phenomenal results, with Kahan sums nearly as fast as the naive approach.
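For readers unfamiliar with the two algorithms, here is a scalar sketch in Java. The class and method names are ours for illustration only; QuestDB's actual implementation is vectorized native code, not this plain loop. The comments in the Kahan loop show where the 4 operations per element (versus 1 for a naive sum) come from.

```java
// Scalar sketch of the two compensated-summation variants (illustrative
// names, not QuestDB's actual API).
final class CompensatedSums {

    // Kahan: carries a running compensation term c that captures the
    // low-order bits lost when a small value is added to a large sum.
    static double kahanSum(double[] values) {
        double sum = 0.0, c = 0.0;
        for (double x : values) {
            double y = x - c;        // 1: apply previous compensation
            double t = sum + y;      // 2: big + small loses low bits of y
            c = (t - sum) - y;       // 3, 4: algebraically zero, but in
                                     //       floating point it recovers
                                     //       the bits just lost
            sum = t;
        }
        return sum;
    }

    // Neumaier: like Kahan, but also handles the case where the next
    // term is larger in magnitude than the running sum.
    static double neumaierSum(double[] values) {
        double sum = 0.0, c = 0.0;
        for (double x : values) {
            double t = sum + x;
            if (Math.abs(sum) >= Math.abs(x)) {
                c += (sum - t) + x;  // low bits of x were lost
            } else {
                c += (x - t) + sum;  // low bits of sum were lost
            }
            sum = t;
        }
        return sum + c;              // correction applied once at the end
    }
}
```

The classic input [1.0, 1e100, 1.0, -1e100] shows why Neumaier matters: a naive sum and plain Kahan both return 0.0, while Neumaier returns the exact answer, 2.0.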
A lot of you also asked how this compares with ClickHouse. Since ClickHouse implements Kahan summation too, we ran a quick comparison. Here's what we got for summing 1bn doubles with nulls using the Kahan algorithm; the details of how this was done are in the post.
QuestDB: 68ms
ClickHouse: 139ms
Thanks for reading and please leave us a star if you find the project interesting!
Top comments (3)
I have been following QuestDB for some time now. I use it as an example of a well-written, minimal-dependency Java project.
It would be great if you could also link the commits/diffs for the Kahan and Neumaier summations in this post, so we can look into the changes required for such an undertaking.
Another idea for a blog post would be tips on how to vectorize Java code. AFAIK, either the JVM auto-vectorizes the code, or we need to call C++ code via JNI.
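For what it's worth, a loop shape that HotSpot's SuperWord pass can typically auto-vectorize is a simple counted loop with independent iterations (class name is ours, for illustration):

```java
final class VectorizableLoops {
    // Independent element-wise work like this is a good candidate for
    // HotSpot's SuperWord auto-vectorization, subject to JIT heuristics.
    static void addInto(double[] dst, double[] a, double[] b) {
        for (int i = 0; i < dst.length; i++) {
            dst[i] = a[i] + b[i];
        }
    }
}
```

A floating-point reduction like `sum += a[i]` is harder for the JIT, because vectorizing it reorders the additions and floating-point addition is not associative, which changes the result. That is one reason to hand-write SIMD code in C++ and call it via JNI, as the comment above suggests.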
Thank you for the kind comments. Here is the commit diff that added the Kahan and Neumaier vector summations.
Awesome one Nicolas!
You all are amazing!⚡️