Spencer Shaw for IT Minds

Building a Custom Cloud Certificate Authority

Out on the wild internet, we sometimes want to build networks of mutual trust in which every entity trusts every other entity.

To do that, one of our best tools is asymmetric cryptography: public/private key pairs. The only problem is that by its nature, this technique requires some third-party arbiter that both entities have agreed to trust. This entity is called a Certificate Authority (CA), and it manages issuing, distributing, and revoking the certificates that we use to verify each other’s identities.

If you have ever called the help desk of one of the major SSL companies that function as Certificate Authorities (CAs) for the open web — DigiCert, Let’s Encrypt, GlobalSign, etc. — you know that it can feel slightly less streamlined than trying to order an entire raw salmon at a juice bar.

Certainly, the benefits of working with these widely-used and trusted companies can be worthwhile for many projects, but sometimes you might want to be able to just cut out the bureaucracy and manage your own certification. This solution is particularly appealing for situations where you want to allow secure, encrypted communication between a set of components in a distributed system that you’ve built, of which you control all the nodes.

So let’s say you decide to go it alone. You want to set up your own trust-network.

What Does a CA Do?

A Real CA in Action

There are several perfectly functional tutorials out there for how to use a tool like OpenSSL to set up the root and intermediate certificates and private keys that form the bedrock of your CA’s authority, but having these resources is just the start. You’ll need an application that can actually use them to perform the tasks required of a CA.

First, you’ll need to be able to generate certificates when clients ask for them.

Second, you’ll need to allow administrators to revoke certificates if they are compromised.

Third, you’ll need to allow clients to check the validity of certificates they’ve received from people they wish to communicate securely with (e.g. if Kevin sent Joe a certificate to prove his identity, Joe would want to check with the CA that the certificate was still valid, and not expired or revoked).

Note that this check happens through one of two mechanisms: CRL or OCSP. A CRL (Certificate Revocation List) is simply a signed list of all the certificates that have been revoked by the CA, which the client can download, cache, and scan themselves to check that a given certificate is not on the list. OCSP (Online Certificate Status Protocol) is a newer and more lightweight (but not universally supported) way to check a certificate’s validity: the client sends an OCSP request identifying the certificate by its issuer and serial number, and the CA responds in real time with the status of that certificate.

And last, you’ll need to be able to verify the identity of clients and administrators. In this implementation, we use a username/password system that clients use when asking for a certificate, and which administrators use when revoking certificates.

How Do I Build My CA?

So we want to build a CA ourselves that can do this. And let’s say we want to make it cloud-based, using Amazon Web Services.

With just a small suite of Lambda functions written in Python and using pyOpenSSL, we can build application logic for each of the features listed above, as well as some background administrative functions that hold it all together.

With a few DynamoDB tables, we can store the CA’s own certificates, the certificates it distributes to clients, the revocations it receives, and a list of usernames, hashed passwords, and roles used to authenticate clients and administrators.

With an S3 bucket, we can store a CRL file that can be updated and accessed by clients.

With a Secret in the AWS Secrets Manager, we can store the intermediate certificate’s private key, used to sign things with the authority of the CA. (The root certificate’s private key should be kept offline, locked away securely! If it’s compromised, the whole system is lost. If an intermediate certificate is compromised, the CA can in theory recover by creating a new intermediate certificate and revoking all the certificates issued by the compromised intermediate CA.)

And with the AWS Serverless Application Model (an extension of AWS CloudFormation that is custom-made for making Lambda deployment easier), we can deploy it all from a command line or, say, a GitLab runner.

The Details

Here’s an overview of the Lambda Functions I ended up with to guide you in your quest to build the perfect CA:

request_certificate

Arguments: CSR (Certificate Signing Request), username, password
Return: Certificate

This allows your clients to authenticate with you using a username and password, along with a CSR: essentially a client's request to be certified as a member of the trust network.

In the background, this function will need to contact the database to:

  • verify the username / password in a valid users table
  • store the new certificate in a certificates table

Security Tip:
Whenever you implement a password authentication system, never simply check whether the provided username exists in your system and return a quick “failed” response if it doesn't.
Instead, always run a fake hashing step that hashes a dummy password when the client sends an invalid username. This protects you from timing attacks, where an attacker deduces which usernames exist in your system by trying usernames until one takes noticeably longer than the others. The extra time comes from the slow process of hashing the provided password to check it against the stored value, which only happens for usernames that exist.
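As a rough illustration of that tip, here is a minimal sketch using only the Python standard library. The names `make_user_record` and `verify_user`, the in-memory `users` dict (standing in for the users table), and the PBKDF2 iteration count are all assumptions made for this example, not the original implementation.

```python
import hashlib
import hmac
import os

# Hypothetical work factor; tune it for your own latency budget.
PBKDF2_ITERATIONS = 200_000


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a deliberately slow, salted hash of the password."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


def make_user_record(password: str, role: str = "client") -> dict:
    """Build the record we would store in the (hypothetical) users table."""
    salt = os.urandom(16)
    return {"salt": salt, "hash": hash_password(password, salt), "role": role}


# Throwaway record that gets hashed against whenever the username is unknown.
_DUMMY_RECORD = make_user_record("dummy-password")


def verify_user(users: dict, username: str, password: str) -> bool:
    """Check credentials, doing the same hashing work for unknown usernames."""
    record = users.get(username)
    known = record is not None
    if not known:
        record = _DUMMY_RECORD
    candidate = hash_password(password, record["salt"])
    # compare_digest avoids leaking how many leading bytes matched.
    matches = hmac.compare_digest(candidate, record["hash"])
    return known and matches
```

Both branches pay for one full hash, so a response for a missing username should take roughly as long as one for a wrong password.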

revoke_certificate

Arguments: certificate’s serial number, revocation reason, admin username, password
Return: confirmation

This function lets an admin tell the CA that a certificate (referenced via serial number) has been revoked for one of the nine standard reasons (such as “privilegeWithdrawn”, “keyCompromise”, “CACompromise”).

In the background, this function will need to contact the database to:

  • verify the username / password in a valid users table, and ensure that it is marked as an admin role.
  • find the entry for this certificate and tag it as revoked for the given reason, also storing the revocation time alongside the entry.
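The second step could look something like the sketch below. It assumes the admin has already been authenticated, and it uses a plain dict in place of the certificates table; the field names (`revoked`, `reason`, `revoked_at`) and the `revoke` function are made up for illustration. The reason names are spelled the way this article and OpenSSL spell them (RFC 5280 writes `cACompromise` and `aACompromise`), and `removeFromCRL` is left out because it only applies to delta CRLs.

```python
from datetime import datetime, timezone

# Revocation reasons accepted by this sketch (see the note above on spelling).
REVOCATION_REASONS = frozenset({
    "unspecified", "keyCompromise", "CACompromise", "affiliationChanged",
    "superseded", "cessationOfOperation", "certificateHold",
    "privilegeWithdrawn", "AACompromise",
})


def revoke(certificates: dict, serial: str, reason: str, now=None) -> dict:
    """Tag the certificate entry as revoked, recording the reason and time."""
    if reason not in REVOCATION_REASONS:
        raise ValueError(f"unknown revocation reason: {reason}")
    entry = certificates.get(serial)
    if entry is None:
        raise KeyError(f"no certificate with serial {serial}")
    if entry.get("revoked"):
        # Already revoked: keep the original reason and timestamp.
        return entry
    entry["revoked"] = True
    entry["reason"] = reason
    entry["revoked_at"] = (now or datetime.now(timezone.utc)).isoformat()
    return entry
```

Making the call idempotent is a design choice for this sketch: a repeated request does not overwrite the first revocation time.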

ocsp_request

Arguments: An OCSP request (identifying a given certificate by its issuer and serial number).
Return: An OCSP response (containing the status of the requested certificate).

This function will need to parse the OCSP request, extract the serial number, and ask the database to return the status of the given certificate.

This is an unauthenticated method, so no need for usernames and passwords.
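The lookup at the heart of this function can be sketched as below. Parsing the request and building the signed OCSP response are deliberately left out; the `certificate_status` name and the dict standing in for the certificates table are assumptions for illustration. The three status values mirror the ones OCSP defines: good, revoked, and unknown.

```python
def certificate_status(certificates: dict, serial: str) -> dict:
    """Map a stored certificate entry to one of the three OCSP statuses."""
    entry = certificates.get(serial)
    if entry is None:
        # This CA has no record of the serial number.
        return {"status": "unknown"}
    if entry.get("revoked"):
        return {
            "status": "revoked",
            "reason": entry["reason"],
            "revoked_at": entry["revoked_at"],
        }
    return {"status": "good"}
```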

update_crl

Arguments: none
Return: none

Side effect: A file called something like my-ca.crl is written to an S3 bucket, at an endpoint known to your clients. The function creates it by scanning your revocation table and generating a CRL object using pyOpenSSL; the CRL includes the serial number, revocation reason, and revocation time of every revoked certificate.

This function should run on a schedule so that you regularly update your CRL with any new revocations that are added to your database via the revoke_certificate function.

You might be asking “why would I wait until the next scheduled update_crl to put a revocation into the CRL? Shouldn’t I do it immediately after receiving the revocation?”
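To make the scanning step concrete, here is a small sketch of gathering the data that goes into a CRL. Encoding and signing the CRL and uploading it to S3 are not shown; `build_crl_payload`, the dict standing in for the revocation table, and the 24-hour lifetime are hypothetical choices for this example.

```python
from datetime import datetime, timedelta, timezone


def build_crl_payload(certificates: dict, now: datetime,
                      lifetime: timedelta = timedelta(hours=24)) -> dict:
    """Collect every revoked entry plus the CRL's validity window."""
    revoked = [
        {
            "serial": serial,
            "reason": entry["reason"],
            "revoked_at": entry["revoked_at"],
        }
        for serial, entry in sorted(certificates.items())
        if entry.get("revoked")
    ]
    return {
        "this_update": now.isoformat(),
        # Clients may keep using this CRL until next_update.
        "next_update": (now + lifetime).isoformat(),
        "revoked": revoked,
    }
```

A natural pairing, under these assumptions, is to set the lifetime to match the schedule interval so a fresh CRL is always published before the old one expires.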

And of course, you can do that, but remember that your clients often cache CRLs, and each CRL carries a next-update time that tells the client when to ask for a new one. So depending on your exact implementation, there may be little point in updating the CRL early, since your clients most likely will not request it until the previous one has expired.
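The client behaviour described here can be sketched roughly as follows. `CrlCache` and its `fetch` callable are hypothetical; a real client would download and verify the CRL file rather than receive a ready-made set of serial numbers.

```python
from datetime import datetime, timedelta, timezone


class CrlCache:
    """Keeps the last CRL and only refetches once its next-update time passes."""

    def __init__(self, fetch):
        # fetch() is assumed to return (set_of_revoked_serials, next_update).
        self._fetch = fetch
        self._revoked = set()
        self._next_update = None

    def is_revoked(self, serial: str, now=None) -> bool:
        now = now or datetime.now(timezone.utc)
        if self._next_update is None or now >= self._next_update:
            self._revoked, self._next_update = self._fetch()
        return serial in self._revoked


# Usage with a stand-in fetcher that counts how often it is called.
calls = []
start = datetime(2024, 1, 1, tzinfo=timezone.utc)


def fake_fetch():
    calls.append(1)
    return {"0a"}, start + timedelta(hours=24)


cache = CrlCache(fake_fetch)
cache.is_revoked("0a", now=start)                        # first call downloads
cache.is_revoked("0b", now=start + timedelta(hours=1))   # served from the cache
cache.is_revoked("0b", now=start + timedelta(hours=25))  # expired, downloads again
```

In this sketch, a revocation published one hour after the first download would go unnoticed by this client until the cached CRL expires.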

get_crl

This is not really a Lambda function, but it’s listed here to remind us that when your clients want the newest CRL, they can request it from the endpoint where the update_crl Lambda deposits the CRL file in the S3 bucket.

Note that CRL files are not considered confidential, so there is no extra security here.

get_ca_certificates

Arguments: none
Return: Certificate Chain

This is an important but simple function. Basically, a client needs to have a copy of the CA’s root and intermediate certificate chains stored locally so it can use them to perform the SSL operations needed to verify things that are signed by the CA’s private keys.

These certificates are public, so there is once again no need for authentication on this function.
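A handler for this can be very small. The sketch below assumes the function sits behind an API Gateway proxy integration and, purely to stay self-contained, reads the PEM chain from a hypothetical `CA_CHAIN_PEM` environment variable rather than from the certificates table described earlier.

```python
import os


def handler(event, context):
    """Return the CA's public certificate chain as PEM text."""
    # Assumption: the chain is supplied via an environment variable.
    chain_pem = os.environ.get("CA_CHAIN_PEM", "")
    if not chain_pem:
        return {
            "statusCode": 500,
            "headers": {"Content-Type": "text/plain"},
            "body": "CA chain is not configured",
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/x-pem-file"},
        "body": chain_pem,
    }
```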

Extra Tips

You might be thinking that setting up all these usernames and passwords would be a hassle, but if you set up a single admin credential pair manually in your DynamoDB table, you can write an administrative helper Lambda function that lets you use that first account to create more.

I also found that it was nice to have a function for auto-populating the CA’s own certificates and private keys, to be run once after deployment to avoid having to manually keep track of keys. Obviously, make sure to never commit your private keys to version control!

The Joy of Cloud Deployment

And the best part is that if you use AWS SAM, CloudFormation, Terraform, etc., you can simply spin up as many instances of your Certificate Authority as you need to cover all the different trust-networks you want to create! All you need to do is hop into an SSL tool like OpenSSL, pop out some new root and intermediate key pairs, and you’re ready to go!
