Introduction
Aerospike is a key-value database that makes the most of SSD/flash technology to offer best-in-class throughput and latency at petabyte scale.
Standard Aerospike usage keeps the primary key index in DRAM and the data on SSD. Although Aerospike's DRAM usage is very low at 64 bytes per object, users with very large numbers of objects (100bn+) might wish to consider all-flash mode, in which the primary key index is also placed on disk. More detail at all flash usage.
There are a number of non-trivial steps to go through to set up all flash. For that reason I've extended aerospike-ansible to allow automation of this process. This article walks through the automated process. It's envisaged that this will be useful for those evaluating the feature, or looking to get up and running with it quickly.
A working knowledge of aerospike-ansible is assumed. This introductory article may also be useful.
All Flash Calculations
To configure a system correctly for all flash, you need to know the number of partition-tree-sprigs appropriate for the object count you will have in your database. You can think of a partition tree sprig as a mini primary key index. Sprigs keep the primary key tree shallow, which allows a record's location to be looked up more rapidly. More detail at sprigs.
The sprig count matters for all flash because the system is sized so that each sprig fits inside a single disk block, minimising read and write overhead.
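As a rough sketch of the sizing idea, the snippet below estimates a sprig count from an object count. It assumes a rule of thumb of 4096 partitions per namespace, 64-byte index entries, 4 KiB index blocks kept about half full, and a sprig count that is a power of two. The function name and constants are illustrative, it ignores any minimum or maximum the server enforces, and the spreadsheet remains the reference.

```python
PARTITIONS = 4096        # Aerospike splits each namespace into 4096 partitions
INDEX_ENTRY_BYTES = 64   # per-object index size quoted in the introduction
BLOCK_BYTES = 4096       # assumed index block size (4 KiB)
FILL_FRACTION = 0.5      # assumed target: blocks about half full


def partition_tree_sprigs(unique_objects: int) -> int:
    """Smallest power of two that keeps each sprig at or below the target entry count."""
    entries_per_sprig = int(BLOCK_BYTES / INDEX_ENTRY_BYTES * FILL_FRACTION)  # 32
    needed = unique_objects / (PARTITIONS * entries_per_sprig)
    sprigs = 1
    while sprigs < needed:
        sprigs *= 2
    return sprigs


if __name__ == "__main__":
    # 100m unique objects, as in the example used in this article
    print(partition_tree_sprigs(100_000_000))  # 1024
```

Under these assumptions, 100m objects need roughly 763 sprigs per partition, which rounds up to 1024.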
You can find details of the calculation here, but to make life easier a spreadsheet can be found in aerospike-ansible at assets/all-flash-calculator.xlsx.
Populate the yellow cells - # of objects, replication factor and object size.
The spreadsheet will calculate required partition-tree-sprigs.
It will also determine the fraction of available disk space that should be given over to the primary key index, based on the object size. In the screenshot, we can see that for 100m records, replication factor 2, average record size 1024 bytes, the overhead per record is 172 bytes and the overall record footprint is 2220 bytes, so approx 1/13 of the disk space should be allocated to the index.
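The 1/13 figure is simply the ratio of the two numbers quoted above. A minimal sketch of that arithmetic follows; the function name is illustrative and the spreadsheet may round differently.

```python
def index_share(overhead_bytes: int, footprint_bytes: int) -> int:
    """Return N such that roughly 1/N of the disk space goes to the primary key index."""
    return round(footprint_bytes / overhead_bytes)


if __name__ == "__main__":
    # Figures from the example: 172 bytes of overhead in a 2220-byte footprint
    print(index_share(172, 2220))  # 13
```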
Using Aerospike-Ansible
In vars/cluster-config.yml
- Set partitions_per_device to the value given in the spreadsheet (13 in the example). The first partition on each device is used for the all flash index to ensure the correct index:data disk space ratio.
- Add partition_free_sprigs: YOUR_VALUE, where YOUR_VALUE would be 1024 for this example
You will also need to
- Set all_flash: true
- Set enterprise: true
- Provide a path to a valid Aerospike feature key using feature_key: /your/path/to/key. You must therefore be either a licensed Aerospike customer, or running an Aerospike trial.
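If you want a quick sanity check before running the playbook, something like the following could be used. It is a hypothetical helper, not part of aerospike-ansible, and it assumes the settings have already been loaded into a Python dict. The lower bound of 2 on partitions_per_device is my own inference from the first partition being reserved for the index.

```python
def check_all_flash_settings(cfg: dict) -> list:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    if cfg.get("all_flash") is not True:
        problems.append("all_flash must be true")
    if cfg.get("enterprise") is not True:
        problems.append("enterprise must be true")
    if not cfg.get("feature_key"):
        problems.append("feature_key must point to a feature key file")
    ppd = cfg.get("partitions_per_device")
    # Assumption: one partition holds the index, so at least one more is needed for data
    if not isinstance(ppd, int) or ppd < 2:
        problems.append("partitions_per_device must be an integer of at least 2")
    return problems


if __name__ == "__main__":
    example = {
        "all_flash": True,
        "enterprise": True,
        "feature_key": "/your/path/to/key",
        "partitions_per_device": 13,
    }
    print(check_all_flash_settings(example))  # []
```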
Having done that, run
ansible-playbook aws-setup-plus-aerospike-install.yml
You should check that the aggregate disk space across your cluster exceeds the amount recommended in the spreadsheet.
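A back-of-the-envelope version of that check is sketched below. It assumes the 2220-byte footprint from the example already covers both replicas (it equals 2 x 1024 + 172), and the cluster dimensions are hypothetical. It covers the raw footprint only; the spreadsheet's recommendation may include headroom, so treat the spreadsheet as the reference.

```python
def required_bytes(objects: int, footprint_bytes: int) -> int:
    """Raw disk footprint for all records, index overhead included."""
    return objects * footprint_bytes


def aggregate_bytes(nodes: int, devices_per_node: int, device_bytes: int) -> int:
    """Total raw device capacity across the cluster."""
    return nodes * devices_per_node * device_bytes


if __name__ == "__main__":
    need = required_bytes(100_000_000, 2220)  # 222 GB for the example
    # Hypothetical cluster: 3 nodes, 2 devices each, 100 GB per device
    have = aggregate_bytes(nodes=3, devices_per_node=2, device_bytes=100 * 10**9)
    print(need, have, have > need)
```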
Verification
Once the setup process is complete, log into one of your cluster nodes
./scripts/cluster-quick-ssh.sh
then start asadm (the admin tool) and run the info command.
The index type comes up as 'flash' as per the highlight.
Data Load
You can follow the instructions in benchmarking to quickly load some data into the new configuration.
As before, we can use asadm to examine the (highlighted) disk footprint of the primary key index for (in this case) 10m records (20m includes replicas).
Conclusion
The aerospike-ansible tooling makes it easy to set up all flash for Aerospike and benefit from the DRAM saving it offers.
Cover image Michał Mancewicz