In the Enterprise Database Security session I presented at Aerospike Summit 2020, I gave an overview of network security with Aerospike Enterprise.
To provide context, refer to the following diagram depicting an Aerospike deployment.
On the left we have developers building applications and back office jobs that will use an Aerospike database.
In the middle we have an Aerospike cluster managed by one or more administrative groups such as SREs, DevOps, DBAs, etc.
And on the right we have downstream systems which ingest and analyze security events and log data for use by information security teams.
The red arrows highlight where there is network connectivity between the Aerospike database and other systems or users as well as connectivity in between individual Aerospike nodes. This is where we need to apply network security.
First, let's look at firewall rules that adhere to the principle of least privilege: the firewall blocks all traffic by default, and rules are then opened up to allow network access only where it is needed.
For an Aerospike cluster there are 4 types of network traffic that need to be allowed.
First, every application node must be allowed to open a TCP connection to every Aerospike node on the service port. This is port 3000 by convention, but all Aerospike network settings can be configured.
The simpler and more common way to set up these firewall rules is to allow the CIDR range for the application network to open TCP connections to the CIDR range for the Aerospike network:
However, some security models might require each IP address to be explicitly allowed:
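As a rough illustration of the difference between the two approaches, the sketch below builds both rule sets as plain Python data. The `Rule` type, the function names, and every address are hypothetical; real rules would be written in whatever syntax your firewall or cloud security group uses.

```python
from itertools import product
from typing import NamedTuple

SERVICE_PORT = 3000  # Aerospike convention; the actual port is configurable


class Rule(NamedTuple):
    """One 'allow' rule in a default-deny firewall (hypothetical structure)."""
    source: str
    destination: str
    port: int
    protocol: str = "tcp"


def cidr_service_rule(app_cidr, aerospike_cidr, port=SERVICE_PORT):
    """CIDR approach: a single rule covers every application and Aerospike node."""
    return [Rule(app_cidr, aerospike_cidr, port)]


def explicit_service_rules(app_ips, aerospike_ips, port=SERVICE_PORT):
    """Explicit approach: one rule per (application node, Aerospike node) pair."""
    return [Rule(app, node, port) for app, node in product(app_ips, aerospike_ips)]


# Hypothetical addresses, for illustration only.
app_ips = ["10.0.1.10", "10.0.1.11"]
aerospike_ips = ["10.0.2.10", "10.0.2.11", "10.0.2.12"]

for rule in explicit_service_rules(app_ips, aerospike_ips):
    print(f"allow {rule.protocol} {rule.source} -> {rule.destination}:{rule.port}")
```

The explicit approach needs one rule per pair, so the rule count grows as the product of the two node counts, which is why the CIDR form is usually easier to maintain.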
The next type of network traffic to allow is the heartbeat connections between Aerospike nodes. This is the clustering protocol in mesh mode, which allows the Aerospike nodes to form a cluster.
Every Aerospike node must be allowed to open a TCP connection to every other Aerospike node on the heartbeat port which is 3001 by convention.
The common rule to allow the heartbeat connectivity in the Aerospike CIDR range:
Alternatively, the rules to allow explicit IP addresses for Aerospike nodes:
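To make the explicit form concrete, here is a minimal sketch that enumerates the node-to-node rules for a full mesh. The function name and the node addresses are hypothetical, and the output format is not tied to any particular firewall product.

```python
from itertools import permutations

HEARTBEAT_PORT = 3001  # Aerospike convention; the actual port is configurable


def mesh_rules(node_ips, port=HEARTBEAT_PORT):
    """Return (source, destination, port) for every ordered pair of distinct nodes."""
    return [(src, dst, port) for src, dst in permutations(node_ips, 2)]


# Hypothetical Aerospike node addresses.
nodes = ["10.0.2.10", "10.0.2.11", "10.0.2.12"]

for src, dst, port in mesh_rules(nodes):
    print(f"allow tcp {src} -> {dst}:{port}")
```

A cluster of n nodes needs n * (n - 1) explicit rules per port, so three nodes yield six rules and the list must be regenerated whenever a node is added or removed.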
The third type of network traffic to allow is the fabric connections between Aerospike nodes. This connectivity allows data to transfer between the nodes for replication and "migrations" (redistribution of data).
Every Aerospike node must be allowed to open a TCP connection to every other Aerospike node on the fabric port which is 3002 by convention.
The common rule to allow the fabric connectivity in the Aerospike CIDR range:
Alternatively, the rules to allow explicit IP addresses for Aerospike nodes:
Notice that the fabric rules are identical to the heartbeat rules except for the port. If the configured ports are sequential then the rules for heartbeat and fabric can be combined for firewalls that allow specifying port ranges:
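The "sequential ports" condition can be checked mechanically. The helper below is a small sketch (the function name is hypothetical) that collapses a list of ports into the fewest port-range strings, in the `low-high` form that many firewalls accept.

```python
def merge_port_ranges(ports):
    """Collapse ports into range strings, merging only consecutive port numbers."""
    merged = []  # list of [low, high] pairs
    for port in sorted(set(ports)):
        if merged and port == merged[-1][1] + 1:
            merged[-1][1] = port  # extend the current range
        else:
            merged.append([port, port])  # start a new range
    return [str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in merged]


# Conventional heartbeat (3001) and fabric (3002) ports combine into one range.
print(merge_port_ranges([3001, 3002]))

# Non-sequential ports cannot be combined and stay as separate entries.
print(merge_port_ranges([3001, 3003]))
```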
The fourth and final type of network traffic is only applicable for deployments which are using Cross Datacenter Replication (XDR) to replicate data between Aerospike clusters in different data centers or cloud regions.
XDR traffic uses the same service connections that applications use. That means that every Aerospike node in the XDR source cluster must be allowed to open a TCP connection to every Aerospike node in the XDR destination cluster on the service port, which is 3000 by convention.
The rule to allow traffic from the XDR source cluster to the XDR destination cluster using CIDR blocks:
Alternatively, the rules to allow explicit IP addresses:
Notice that the XDR rules are identical to the service rules except that the source is the XDR source cluster instead of the application nodes. The rules for service and XDR can be combined for firewalls that allow specifying multiple CIDR/IP ranges:
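One way to picture the combined rule is as a single default-deny check with several allowed source ranges. The sketch below uses Python's standard `ipaddress` module; all CIDR ranges and the function name are hypothetical and exist only to illustrate the logic a firewall would apply.

```python
import ipaddress

SERVICE_PORT = 3000  # Aerospike convention; the actual port is configurable

# Hypothetical networks: application nodes and the XDR source cluster.
ALLOWED_SOURCES = [
    ipaddress.ip_network("10.0.1.0/24"),  # application network
    ipaddress.ip_network("10.1.2.0/24"),  # XDR source cluster
]
AEROSPIKE_NETWORK = ipaddress.ip_network("10.0.2.0/24")  # destination cluster


def service_traffic_allowed(source_ip, dest_ip, port):
    """Default deny: allow only listed sources to reach the cluster's service port."""
    src = ipaddress.ip_address(source_ip)
    dst = ipaddress.ip_address(dest_ip)
    return (
        port == SERVICE_PORT
        and dst in AEROSPIKE_NETWORK
        and any(src in network for network in ALLOWED_SOURCES)
    )


print(service_traffic_allowed("10.0.1.5", "10.0.2.7", 3000))     # application node
print(service_traffic_allowed("10.1.2.9", "10.0.2.7", 3000))     # XDR source node
print(service_traffic_allowed("192.168.0.1", "10.0.2.7", 3000))  # unknown source
```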
The second part of securing the network is about using TLS to encrypt data in transit and ensure connections are only established with trusted machines on the network.
We just looked at the 4 types of network connectivity: service, heartbeat, fabric, and XDR. Aerospike can be configured to use TLS on each of those types of network connections independently.
The service connections support either standard TLS or mutual authentication TLS, also referred to as mTLS.
Both modes encrypt the data in transit. With standard TLS, only the Aerospike nodes authenticate themselves to the application nodes. With mutual TLS, the Aerospike nodes authenticate themselves to the application nodes, which in turn authenticate themselves to the Aerospike nodes, so it's a 2-way authentication.
With mutual TLS, a "bad actor" that somehow found its way into the network could not pretend to be an application node, nor pretend to be an Aerospike node, without possessing the correct private key.
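The difference between the two modes comes down to whether the server demands a certificate from the client. The sketch below shows that distinction with Python's standard `ssl` module. It is a generic TLS illustration, not Aerospike's configuration syntax, and the function names are hypothetical; certificate loading is left as comments because it requires real files.

```python
import ssl


def server_context(mutual: bool) -> ssl.SSLContext:
    """Build a server-side TLS context for standard or mutual authentication."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # A real server would load its own identity here:
    #   ctx.load_cert_chain(certfile=..., keyfile=...)
    if mutual:
        # Mutual TLS: the handshake fails unless the client presents a trusted cert.
        ctx.verify_mode = ssl.CERT_REQUIRED
        # A real server would also load the CA that signs client certificates:
        #   ctx.load_verify_locations(cafile=...)
    else:
        # Standard TLS: the server proves its identity; clients stay anonymous.
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def client_context() -> ssl.SSLContext:
    """Client side: in both modes the client verifies the server's certificate."""
    return ssl.create_default_context(ssl.Purpose.SERVER_AUTH)


print(server_context(mutual=False).verify_mode)
print(server_context(mutual=True).verify_mode)
```

In both modes the client verifies the server, which is why standard TLS already prevents an impostor from posing as a database node; mutual TLS adds the reverse check.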
When TLS is enabled on the fabric or heartbeat connections, they will always use what amounts to mutual authentication for those Aerospike-to-Aerospike connections. So once again, if a “bad actor” somehow breaches the private network, they could not pretend to be another Aerospike node in the cluster nor decrypt any of the data transferring between nodes without possessing the appropriate private key.
If you recall, XDR connectivity is actually just using the service connections. So with XDR, the nodes of the source cluster act as TLS clients, much like the application nodes, and the nodes of the destination cluster act as TLS servers.
All of these connection types are generally configured to use the same server certificate, since they terminate on the same servers. However, they can technically be configured to use separate certificates.
Additionally, every Aerospike node can be configured to use the same certificate, meaning the entire cluster shares that certificate, or every node can be set up with its own unique certificate.
So that gives us three dimensions to work with: standard vs. mutual TLS on the service connections, individual or shared certificates on each type of connection, and individual or shared certificates on each Aerospike server node.
Obviously, standard TLS with a single cluster-wide certificate is the simplest in terms of setup and management complexity. And if you’ve spent much time dealing with certificate lifecycle management, one certificate certainly sounds more pleasant to manage than dozens, hundreds, or thousands. And indeed it is.
But for organizations adopting more of a “zero trust” model, perhaps within environments dealing with highly sensitive data, on networks which the organization has deemed as untrusted such as the public cloud, unique certificates on each node may be required.
However, most enterprise use cases will fall somewhere in between these two extremes and the flexibility of Aerospike’s TLS configuration will allow it to be tailored to the specific needs of the organization, the environment, and the use case.
A cipher suite is a set of algorithms that are used in various phases during the TLS communication. The protocol allows for the client and server to negotiate which set of these algorithms both sides support.
Every TLS connection I described in the previous section can be configured as to which cipher suites are allowed and in what priority.
Without going too deep into the weeds about TLS cipher suites, let me just make two points about selecting TLS cipher suites to use with Aerospike Enterprise.
Unlike a public, internet-facing application, many Aerospike deployments are done in environments where the organization is in control of both the client and the server. That means that compatibility with public clients like web browsers is not a factor and the list of allowed cipher suites can be narrowed down to just the more current algorithms which provide the best security and performance.
At the time of this presentation, that is highly likely to mean a cipher suite using AES encryption, which has hardware acceleration built into modern CPUs, in Galois/Counter Mode (GCM), which also typically outperforms earlier block cipher modes.
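As a sketch of what narrowing the list looks like on a client you control, the example below uses Python's standard `ssl` module, which is also built on OpenSSL. The selector string `ECDHE+AESGCM` and the function names are illustrative choices, not a recommendation from the presentation, and this is not Aerospike's server configuration. Note that OpenSSL manages TLS 1.3 suites separately, so `set_ciphers` does not remove them; the helper filters them out of the listing.

```python
import ssl


def gcm_only_context() -> ssl.SSLContext:
    """Client context whose TLS 1.2 suites are limited to ECDHE with AES-GCM."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # OpenSSL cipher-string syntax: '+' means "suites matching both criteria".
    ctx.set_ciphers("ECDHE+AESGCM")
    return ctx


def tls12_cipher_names(ctx: ssl.SSLContext):
    """List enabled suite names, skipping TLS 1.3 suites (configured separately)."""
    return [c["name"] for c in ctx.get_ciphers() if c["protocol"] != "TLSv1.3"]


for name in tls12_cipher_names(gcm_only_context()):
    print(name)
```

The exact names printed depend on the OpenSSL build that Python is linked against.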
Aerospike uses OpenSSL, so cipher suites are configured using the OpenSSL notation. This is recognizable by the use of hyphens, as shown in the top line in the image. Other tools and libraries, such as Java, may use the IANA notation. This is recognizable by the use of underscores, as shown in the second line. This means that the cipher suite names in the Aerospike configuration may use a different notation than other sources you may be referencing.
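To show the two notations side by side, here is a small lookup table for a few common AES-GCM suites. The name pairs are the standard IANA and OpenSSL names for these suites, but the table is far from complete and the helper function is hypothetical; consult the OpenSSL ciphers documentation for the full mapping.

```python
# IANA name -> OpenSSL name for a handful of TLS 1.2 AES-GCM suites.
IANA_TO_OPENSSL = {
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256": "ECDHE-RSA-AES128-GCM-SHA256",
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384": "ECDHE-RSA-AES256-GCM-SHA384",
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256": "ECDHE-ECDSA-AES128-GCM-SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384": "ECDHE-ECDSA-AES256-GCM-SHA384",
}


def to_openssl(iana_name: str) -> str:
    """Translate an IANA suite name to OpenSSL notation, or fail loudly."""
    try:
        return IANA_TO_OPENSSL[iana_name]
    except KeyError:
        raise ValueError(f"no OpenSSL name known for {iana_name!r}") from None


print(to_openssl("TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384"))
```

Failing on an unknown name, rather than guessing a translation, avoids silently configuring a suite list that matches nothing.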