
Ronen Botzer for Aerospike

Posted on • Originally published at developer.aerospike.com

In-memory database improvements with Database 7

Welcome to Aerospike Database 7 and its initial server 7.0 release, which lays the foundation for major developer API additions in server 7.1 and beyond. Server 7.0 also overhauls in-memory namespaces in significant ways.

Aerospike releases have always been a continuum, with preceding minor releases providing prerequisite work upon which the next major release stands. For example, server 5.6 added a data structure for set indexes later used for secondary indexes (SI). Server 5.7 overhauled secondary index garbage collection and cut down SI memory consumption by 60%. Aerospike Database 6 was built on these essential improvements. Similarly, server 6.4 removed support for single-bin and data-in-index namespaces, freeing up primary index (PI) space needed for upcoming server 7.1 features.

Though major server releases have distinct themes, work on specific subsystems doesn’t end when our main focus shifts. Just as server versions 6.1 and 6.4 delivered significant throughput improvements to cross-datacenter replication (XDR) following the XDR rewrite theme of Aerospike Database 5, secondary index improvements will continue in Aerospike Database 7 releases.

Unified storage format revolutionizes in-memory namespaces

Previously, namespaces that persist data on SSD or Intel Optane™ Persistent Memory (PMem) shared one storage format, while in-memory namespaces used a different one. The overhaul of in-memory namespaces in server 7.0 consolidates all three storage engines onto the same efficient flat format. This results in multiple operational benefits.

Faster restarts for in-memory namespaces

In Aerospike Enterprise Edition (EE) and Standard Edition (SE), the new storage-engine memory places its data in shared memory (shmem) rather than volatile process memory. This means that in-memory namespaces can now fast restart (AKA warmstart) after clean shutdowns.

An in-memory namespace without persistence shares the fast restart capability mentioned above. More interestingly, cold restarts, during which the Aerospike daemon (asd) rebuilds its indexes, run much faster, because record data is read from shared memory rather than from a storage device.

An in-memory namespace with storage-backed persistence benefits from faster cold restarts when certain configuration parameters are adjusted (for example, partition-tree-sprigs). Only after a crash will Aerospike read from storage-backed persistence to repopulate record data into memory, along with rebuilding the indexes.

Adding to its existing capability of backing up index shmem segments to disk, the Aerospike Shared Memory Tool (ASMT) can now be used after Aerospike shuts down to back up in-memory namespace data ahead of restarting the host machine.

To summarize, in-memory namespaces without storage-backed persistence don’t lose their data on restarts of asd, and all in-memory namespaces benefit from faster restarts (both warmstart and faster coldstart).

Compression for in-memory namespaces

Now you can configure an in-memory namespace to use storage compression, such as ZStandard (zstd), LZ4, or Snappy. Customers already using storage compression in the persistence layer of an in-memory namespace (or data on SSD or PMem) will achieve the same compression ratio for their data, regardless of the storage engine, at the same CPU cost.

Stability and performance improvements for in-memory namespaces

As in-memory and on-device storage use the exact same storage format in server 7.0, an in-memory namespace is mirrored to its persistence layer. This means that its write-block defragmentation happens much faster in memory and no longer requires device reads. The same continuous write-block defrag mechanism eliminates heap fragmentation encountered in the former JEMalloc-based in-memory storage. Similarly, tomb raiding of durable-delete tombstones happens fully in memory and requires no device reads. This removes back-pressure generated by the persistence layer’s devices on in-memory namespace operations.

The unified format's single-pointer and contiguous record storage improve performance in two other ways. First, reading the entire record costs a single memory access, as opposed to the older scattered bins approach requiring multiple independent reads from RAM. Second, since write-blocks are mirrored to the persistence layer, write operations save the CPU previously consumed by separate serialization to a second storage format.

Capacity planning for in-memory namespaces

The first thing to note about in-memory namespaces with storage-backed persistence in server 7.0 is that the persistence layer is exactly the same as the in-memory storage. The two are mirrors of each other, which has a capacity planning implication. For previous versions, we recommended that persistent storage be a multiple of memory-size (the maximum memory allocation for the namespace), typically 4x. Starting with server 7.0, persistent storage needs to be at a 1:1 ratio to the memory you wish to dedicate to your in-memory namespace.
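As a rough illustration of that change in ratio, the sketch below compares the two sizing rules for a hypothetical 64 GiB in-memory namespace. The class and method names are invented for this example, and the 4x multiple is simply the typical pre-7.0 recommendation mentioned above.

```java
final class PersistenceSizing {
    static final long GIB = 1L << 30;

    // Pre-7.0 guidance: persistence sized as a multiple of memory-size (typically 4x).
    static long legacyPersistenceBytes(long memorySizeBytes, int multiple) {
        return memorySizeBytes * multiple;
    }

    // Server 7.0: the persistence layer mirrors in-memory data storage, so the ratio is 1:1.
    static long mirroredPersistenceBytes(long dataStorageBytes) {
        return dataStorageBytes;
    }

    public static void main(String[] args) {
        long namespaceMemory = 64 * GIB; // hypothetical namespace size
        System.out.println("pre-7.0 persistence (4x): "
            + legacyPersistenceBytes(namespaceMemory, 4) / GIB + " GiB");
        System.out.println("7.0 persistence (1:1): "
            + mirroredPersistenceBytes(namespaceMemory) / GIB + " GiB");
    }
}
```

For the same 64 GiB of in-memory data, the old rule asks for 256 GiB of persistent storage while the mirrored layout asks for 64 GiB.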

The second thing to be aware of is that in-memory data storage is static, and gets pre-allocated in server 7.0 rather than progressively growing and bound by the (now obsolete) memory-size configuration parameter.

Capacity planning for indexes has not changed. In server 7.0, indexes continue to start small and grow in increments defined by the configuration parameters index-stage-size and sindex-stage-size. Set indexes grow in 4KiB increments after an initial pre-allocation. After upgrading a cluster node, the namespace indexes consume the same amount of memory as before, out of the system memory not pre-allocated for namespace data storage.

We no longer have diverging capacity planning formulas for data in-memory versus data on persistent storage. All storage-engine configurations use the same storage format and a single formula in the capacity planning guide. Treating capacity planning of in-memory data storage the same as you would data storage on an SSD is helpful. In both cases, you are aiming to size pre-allocated storage.

The memory consumed by your indexes (and room for them to grow), plus the pre-allocated namespace data storage, should fit within your host machine’s RAM and within the previously declared memory-size. Read the special upgrade instructions for Server 7.0 for more details.
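A minimal sketch of that budget check follows, under stated assumptions: the record count, headroom, and RAM figures are hypothetical, the helper names are invented, and the 64 bytes per primary index entry is Aerospike's documented per-record index cost rather than something stated in this article. Secondary and set indexes are folded into the headroom figure for simplicity.

```java
final class MemoryBudget {
    static final long GIB = 1L << 30;
    // Size of one primary index entry per record (not stated in the article).
    static final long PRIMARY_INDEX_ENTRY_BYTES = 64;

    // Primary index memory for the records held on one node.
    static long primaryIndexBytes(long recordsOnNode) {
        return recordsOnNode * PRIMARY_INDEX_ENTRY_BYTES;
    }

    // True when pre-allocated data storage plus indexes plus growth headroom fit in host RAM.
    static boolean fits(long hostRamBytes, long dataStorageBytes,
                        long indexBytes, long indexHeadroomBytes) {
        return dataStorageBytes + indexBytes + indexHeadroomBytes <= hostRamBytes;
    }

    public static void main(String[] args) {
        long records = 500_000_000L;   // hypothetical records on this node
        long dataStorage = 64 * GIB;   // hypothetical pre-allocated namespace data storage
        long headroom = 8 * GIB;       // hypothetical room for index growth
        long index = primaryIndexBytes(records);

        System.out.printf("primary index: %.1f GiB%n", (double) index / GIB);
        System.out.println("fits in 128 GiB host: " + fits(128 * GIB, dataStorage, index, headroom));
        System.out.println("fits in 96 GiB host: " + fits(96 * GIB, dataStorage, index, headroom));
    }
}
```

The point of the check is that data storage is now a fixed, up-front cost, so only the index terms vary as the record count grows.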

Caveats

Support for single-bin namespaces was removed in server 6.4. Aerospike users without single-bin namespaces in their cluster may upgrade to server 7.0 through a regular rolling upgrade. Otherwise, please consult the special upgrade instructions for Server 6.4.

Due to the unified storage format, a write-block-size limit of 8MiB applies to in-memory namespaces (with or without persistence). Aerospike users who depend on the former 128MiB record size limit of in-memory namespaces without persistence will need to break up their records. Customers may choose to delay upgrading until server 7.1, which will enable an easier transition.

Configuration and monitoring

The newly released Aerospike Observability Stack 3.0 and Tools 10.0 support metrics and configuration for both Aerospike 6 and Aerospike 7 and are designed to ease your transition to server 7.0. If you are not familiar with Aerospike Observability and Management (O&M), we have a short video, blog, and webinar available on our site.

Configuration parameter and metric changes in detail

Namespace configuration, regardless of storage engine choice, is simpler in server 7.0.

Stop-writes and eviction thresholds are controlled by the storage-engine configuration parameters stop-writes-used-pct, stop-writes-avail-pct, and evict-used-pct, which are relative to the namespace data storage size. There is also a pair of thresholds relative to system memory: stop-writes-sys-memory-pct and evict-sys-memory-pct.

Data storage metrics have been simplified to data_used_bytes, data_total_bytes, data_used_pct, and data_avail_pct for all storage engine types.
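To show how the unified metrics line up with the storage-relative thresholds, here is a small sketch that parses a statistics string and compares it against threshold values. The sample string, its semicolon-separated name=value shape, the threshold numbers, and all class and method names are assumptions made for illustration. The comparison directions (used above a limit, available below a limit) and the treatment of 0 as "disabled" follow what the parameter names suggest and should be checked against the configuration reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class StorageThresholds {
    // The four unified data storage metrics named in the text.
    static final Set<String> DATA_METRICS =
        Set.of("data_used_bytes", "data_total_bytes", "data_used_pct", "data_avail_pct");

    // Keep only the four data metrics from a "name=value;name=value" string.
    static Map<String, Long> parse(String stats) {
        Map<String, Long> metrics = new HashMap<>();
        for (String pair : stats.split(";")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) continue; // skip malformed entries
            String name = pair.substring(0, eq).trim();
            if (DATA_METRICS.contains(name)) {
                metrics.put(name, Long.parseLong(pair.substring(eq + 1).trim()));
            }
        }
        return metrics;
    }

    // Assumed semantics: breach when used% is above its limit or avail% is below its limit.
    static boolean breachesStopWrites(Map<String, Long> m, long stopWritesUsedPct, long stopWritesAvailPct) {
        boolean used = stopWritesUsedPct > 0 && m.get("data_used_pct") > stopWritesUsedPct;
        boolean avail = stopWritesAvailPct > 0 && m.get("data_avail_pct") < stopWritesAvailPct;
        return used || avail;
    }

    // Assumed semantics: eviction applies once used% is above evict-used-pct.
    static boolean breachesEvict(Map<String, Long> m, long evictUsedPct) {
        return evictUsedPct > 0 && m.get("data_used_pct") > evictUsedPct;
    }

    public static void main(String[] args) {
        // Hypothetical sample; a real node reports many more statistics.
        String sample = "data_used_bytes=48000000000;data_total_bytes=64000000000;"
            + "data_used_pct=75;data_avail_pct=20;some_other_stat=abc";
        Map<String, Long> m = parse(sample);
        System.out.println("stop-writes breached (70/5): " + breachesStopWrites(m, 70, 5));
        System.out.println("evict breached (60): " + breachesEvict(m, 60));
    }
}
```

Because every storage engine reports the same four metrics, one check like this can serve memory, SSD, and PMem namespaces alike.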

Index metrics are also simpler. Set indexes have set_index_used_bytes. Primary and secondary indexes have index_used_bytes and sindex_used_bytes, whether they’re stored in shared memory, PMem, or an SSD. If they’re in persistent storage, they also have index_mounts_used_pct and sindex_mounts_used_pct relative to the mounts_budget.

Multi-tenancy

Many Aerospike customers deploy their database clusters as a multi-tenant service, with distinct users separated by sets within a namespace. Multi-tenancy leans on Aerospike enterprise features, such as scoped role-based access control (RBAC), rate quotas, and set quotas.

Server 7.0 makes multi-tenant deployment easier with several new features. The limit of 64K unique bin names per namespace was removed. Operators no longer need to advise developers to restrict how many bin names their applications write into the namespace. As a result, the bins info command and the available_bin_names namespace statistic were removed.

The limit on unique set names per namespace was raised from 1023 to 4095, allowing for set-level segregation of more tenants on the same Aerospike cluster.

Finally, an operator can now assign a unique set-level default-ttl as an override of the namespace default-ttl.

New developer API features

Server 7.0 adds the capability to index and query bytes data (BLOBs).

Application developers may now choose to persist key-ordered Map indexes, trading off extra storage for improved performance. The new MapPolicy option can be applied when creating a new Map bin or with the Map set_type operation.

MapPolicy(MapOrder order, int flags, boolean persistIndex)

Persisting Map indexes should only be used by an application once all the nodes in the Aerospike cluster have been upgraded to version >= 7.0.

Dropping support for old OS versions

Server 6.4 added support for Amazon Linux 2023 and Debian 12. As I previously warned, server 7.0 removes support and el7 builds for Red Hat Enterprise Linux 7 and its variants, including CentOS 7, Amazon Linux 2, and Oracle Linux 7. Similarly, server 7.0 will not be available on Debian 10.

Aerospike engineering takes performance seriously, and Aerospike users choose to build mission-critical applications on our database because it has the best cost-performance in its field. Performance is hindered by running new Aerospike server versions on ancient lower-performing OS kernels, such as the 3.10 Linux kernel packaged with RHEL 7. In a recent announcement, the Linux kernel team declared they will no longer offer six years of LTS support. This announcement validated our perspective on the stability and performance cost of running Aerospike on old kernels.

Both Debian 10 and RHEL 7 reach their end of life (EOL) in June 2024 (5 and 10 years from their initial release, respectively). Each new major and minor Aerospike server version gets two years of bug fixes and security vulnerability support. Going forward, new server versions will not be offered on OS distro versions scheduled to expire during this support period. Subsequent patch releases (hotfixes) will continue to be built and tested on the same OS distro versions as when they were first released.
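The rule above can be sketched as a simple date comparison. The two-year support period and the June 2024 EOL come from the text; the server release month, the second EOL date, and the class and method names are hypothetical, chosen only to exercise the logic.

```java
import java.time.YearMonth;

final class OsSupportWindow {
    static final int SUPPORT_YEARS = 2; // support period per major or minor server version

    // A distro version qualifies only if its EOL does not fall inside the server's support period.
    static boolean offeredOn(YearMonth serverRelease, YearMonth osEndOfLife) {
        YearMonth supportEnds = serverRelease.plusYears(SUPPORT_YEARS);
        return !osEndOfLife.isBefore(supportEnds);
    }

    public static void main(String[] args) {
        YearMonth release = YearMonth.of(2023, 11);      // hypothetical release month
        YearMonth eolFromText = YearMonth.of(2024, 6);   // Debian 10 and RHEL 7 EOL, per the text
        YearMonth laterEol = YearMonth.of(2028, 6);      // hypothetical distro with a later EOL

        System.out.println("distro with June 2024 EOL offered: " + offeredOn(release, eolFromText));
        System.out.println("distro with June 2028 EOL offered: " + offeredOn(release, laterEol));
    }
}
```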

Try Aerospike 7.0

Persistence, fast restart capability, and built-in compression all make the new Aerospike in-memory namespaces appealing for in-memory database use cases.

Consult the platform compatibility page and the minimum usable client versions table. For more details, read the Server 7.0 release notes. You can download Aerospike EE and run it in a single-node evaluation, or get started with a 60-day multi-node trial at our Try Now page.
