Memory Management in CockroachDB

#cockroachdb #memory

If you've played with CockroachDB before, you may have seen some memory-related flags when starting Cockroach. Specifically, things called "cache" and "max-sql-memory":

cockroach start \
--certs-dir=certs \
--advertise-addr=<node1 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

We talk a little bit about these settings in our docs, but how are they actually used?

I put together a few diagrams to illustrate what we mean by cache and sql-memory, and I'll try to illustrate how they're used.

Above is a diagram of a node running in a CockroachDB cluster. You can see that there are two caches managed by the cockroach process called SQL Memory and Pebble Cache. What we refer to above as "cache" is this Pebble cache. Pebble is the name of the Key/Value storage engine that CockroachDB leverages to actually store data.

There are some other areas of memory available outside of the cockroach process, namely the Operating System's page cache.

Let's consider a read that happens against a CRDB cluster, and let's assume for this example that this is the first read that's happened against this node so there is nothing loaded into any caches.

Here are the steps that happen to satisfy this read query:

SQL SELECT query hits the SQL engine
SQL engine asks KV for data
KV asks the O/S for data
O/S reads from disk & loads files into the page cache (compressed -- all SSTables in CRDB are compressed)
O/S gives KV data
KV puts data in Pebble cache (uncompressed)
KV processes data and gives to SQL
SQL uses memory to process data (ordering, grouping, etc.)
SQL returns results to client

You can imagine that on a subsequent read, that some of these steps could be short-circuited if they got a cache hit -- namely, you might be able to avoid the read from disk and only hit the O/S Page cache; or, even better, if the Pebble cache has the data, you don't need to even talk to the O/S at all.

The write path for a CockroachDB node is simpler.

The write involves:

Mutation query (i.e., INSERT/UPDATE/DELETE) hits the SQL engine
SQL engine passes data to KV engine
KV writes data to SSTables in memory
KV calls fsync to flush memory to disk (compressed)
KV updates the Pebble cache (uncompressed)
KV acknowledges write to SQL engine
SQL returns acknowledgement to client

Hopefully, this blog clears up what "SQL Memory" and "Cache" mean in the context of CockroachDB. Happy SQL-ing!

DEV Community

Memory Management in CockroachDB

Top comments (0)

Read next

Advantages of Asymmetric Encryption with Random Public and Symmetric Private Keys

Introduction to Docker: A Comprehensive Guide to Containers, Images, and Modern Software Development

How Test-Driven Development (TDD) Enhances Code Refactoring and Maintains Software Quality

How to Fix Data Corruption in MariaDB after Improper Shutdown?