This article was originally posted on Bizongo Engineering
If you're from the software engineering industry, I'm sure you are aware of how important Secure Sockets Layer (SSL) is as an extra layer of security on any application, especially if you deal with sensitive information such as user authentication, secret tokens and payments.
Since we at Bizongo, like all e-commerce companies, deal with such critical information on a daily basis, all our applications should communicate as securely as possible. This is where SSL enters the scenario, as illustrated in the image below or as an interactive web-app here.
While our public-facing storefronts and back-end APIs have it (along with curated firewall rules), our internal applications trailed behind in this regard. So we went about exploring possible solutions and discovered LetsEncrypt, a free Certificate Authority that issues fully compliant certificates trusted by browsers.
You might think procuring a LetsEncrypt SSL certificate would be all there is to it, but there's more. This is due to a few peculiar requirements of ours, listed below:
- Our internal environments across a few domains are all level-2 wild-cards, i.e. `*.qaX.domain.com` where X is 1, 2, 3 … n. Traditional wild-card SSL certificates don't cover these as they address only 1 level, e.g. `*.domain.com` (reference1 and reference2).
- We didn't want to end up installing a bunch of software on every server, which would be tedious to maintain, upgrade and debug, and would scale poorly.
- SSL certificates are issued for a limited duration (LetsEncrypt certificates are valid for 90 days) and therefore need periodic renewal. We wished to automate this for the ease of our entire engineering team and to avoid any issues caused by an expired certificate.
- All relevant services like nginx should reload to reflect the renewed SSL seamlessly.
- Finally, our solution should also have Email / Slack notifications for better observability.
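To make the first requirement concrete, here is a minimal sketch of the matching rule: a wild-card stands in for exactly one left-most label, so `*.domain.com` does not cover `app.qa1.domain.com`. The function name `wildcard_covers` and the host names are hypothetical and used purely for illustration.

```shell
#!/usr/bin/env bash
# Sketch: does a wild-card certificate name cover a given host name?
# Assumption: the pattern has the form "*.<suffix>". wildcard_covers is a
# hypothetical helper, not part of any tool mentioned in this article.
wildcard_covers() {
  local pattern=$1 host=$2 suffix label
  suffix=${pattern#\*.}              # "*.domain.com" -> "domain.com"
  case $host in
    *."$suffix") label=${host%."$suffix"} ;;   # part in front of the suffix
    *) return 1 ;;                             # different domain altogether
  esac
  case $label in
    ''|*.*) return 1 ;;   # empty, or more than one label: not covered
    *) return 0 ;;        # exactly one label: covered
  esac
}
```

For example, `wildcard_covers '*.domain.com' 'app.qa1.domain.com'` fails, while `wildcard_covers '*.qa1.domain.com' 'app.qa1.domain.com'` succeeds, which is why each QA environment needs its own wild-card.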
Before diving in, do note that a Google search will turn up plenty of articles detailing how to set up LetsEncrypt for installing and automating SSL renewals. However, most of them cover only bare domains and/or www, not wild-cards, so they didn't help us solve this problem. So we thought of writing one ourselves to benefit anyone with a similar use-case.
Enough said, let's get cracking!
To cover all the above points, we started writing a Shell Script in combination with the API of AWS, which hosts all our infrastructure. As LetsEncrypt adheres to strict compliance, anyone generating or renewing SSL for a domain must also be able to prove ownership of it. This is doable via HTTPS validation through a standalone web-server or, alternatively, DNS validation via publishing temporary TXT records.
We found the latter simpler as it didn't expose port 443 of our internal environments to the world or add the overhead of white-listing LetsEncrypt's IP addresses. DNS validation is also the only method LetsEncrypt accepts for wild-card certificates. Since the DNS records are needed only temporarily, they can be created for each renewal and deleted afterwards.
Below is what the records look like:
_acme-challenge.qa1.domain1.com 300 IN TXT "<random-hash-value-1>"
_acme-challenge.qa2.domain2.com 300 IN TXT "<random-hash-value-2>"
...
_acme-challenge.qaX.domainX.com 300 IN TXT "<random-hash-value-X>"
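As an illustration of how such a record could be created and removed through the AWS API, here is a rough sketch. It assumes the DNS zone lives in Route 53 and that the AWS CLI is installed and configured. The helper names and the hosted zone ID in the usage note are hypothetical placeholders, not our actual script.

```shell
#!/usr/bin/env bash
# Sketch under assumptions: DNS is hosted in Route 53. The helper names
# below are hypothetical.

# "*.qa1.domain1.com" -> "_acme-challenge.qa1.domain1.com"
challenge_name() {
  printf '_acme-challenge.%s\n' "${1#\*.}"
}

# Build a Route 53 change batch. $1 is UPSERT or DELETE, $2 the record
# name, $3 the challenge value (stored inside escaped double quotes).
txt_change_batch() {
  cat <<EOF
{
  "Changes": [{
    "Action": "$1",
    "ResourceRecordSet": {
      "Name": "$2",
      "Type": "TXT",
      "TTL": 300,
      "ResourceRecords": [{ "Value": "\"$3\"" }]
    }
  }]
}
EOF
}

# Submit the change. $1 is the hosted zone ID, $2 the change batch JSON.
apply_change() {
  aws route53 change-resource-record-sets \
    --hosted-zone-id "$1" --change-batch "$2"
}
```

A call could then look like `apply_change "$ZONE_ID" "$(txt_change_batch UPSERT "$(challenge_name '*.qa1.domain1.com')" "$TOKEN")"`, followed by the same call with `DELETE` once validation has finished. Route 53 expects a `DELETE` to carry the same name, type, TTL and value as the existing record.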
With that in mind, the initial iteration of our script checked for the LetsEncrypt package, i.e. certbot, on the target server before proceeding with the remaining logic. Here's where it became tricky: Certbot depends on Python libraries (either v2.7 or v3.4+) and needs sudo privileges. For the uninitiated, fiddling with these can break many parts of a Unix system (like the apt and yum package managers).
So we discovered this utility, which is a Shell Script working as an ACME protocol client for generating or renewing SSL certificates. We then decided to run our script on a single host (as opposed to all of them), transfer the certificates to the requisite servers, reload any relevant services and post a Slack status message.
This is what our final custom script logic looks like:
- Do an `openssl` call on our internal domains to check for existing SSL certificates and get their dates.
- If the expiry date is within 7 days of today, execute the remainder of the script; otherwise exit.
- If executed, generate all the aforementioned wild-card SSL certificates and `rsync` them to the target folders with correct permissions on our internal servers.
- Do a configuration test and reload `nginx` only if the above succeeds.
- Use `curl` to trigger a Slack web-hook and send an appropriate message to the configured channel.
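The steps above can be sketched roughly as follows. This is not our production script. It assumes GNU `date`, a Slack incoming web-hook URL in a hypothetical `SLACK_WEBHOOK_URL` variable, and `nginx` running on the same host for simplicity. The issuing and `rsync` step is left as a comment because it depends on the ACME client in use.

```shell
#!/usr/bin/env bash
# Sketch under assumptions: function names, RENEW_WINDOW_DAYS and
# SLACK_WEBHOOK_URL are hypothetical. Requires openssl, curl and GNU date.
RENEW_WINDOW_DAYS=7

# Expiry (notAfter) of the certificate served on host:443, in epoch seconds.
cert_expiry_epoch() {
  local host=$1 end
  end=$(openssl s_client -servername "$host" -connect "$host:443" \
          </dev/null 2>/dev/null \
        | openssl x509 -noout -enddate | cut -d= -f2)
  [ -n "$end" ] && date -d "$end" +%s
}

# Whole days between now ($2) and expiry ($1), both in epoch seconds.
days_left() { echo $(( ($1 - $2) / 86400 )); }

# Succeeds when the certificate expires within the renewal window.
renewal_due() { [ "$(days_left "$1" "$2")" -le "$RENEW_WINDOW_DAYS" ]; }

# Reload nginx only if the configuration test passes.
reload_nginx() { nginx -t && nginx -s reload; }

# JSON body for a Slack web-hook (escapes backslashes and double quotes;
# multi-line messages are not handled).
slack_payload() {
  local msg
  msg=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"text":"%s"}\n' "$msg"
}

notify_slack() {
  curl -fsS -X POST -H 'Content-type: application/json' \
       --data "$(slack_payload "$1")" "$SLACK_WEBHOOK_URL"
}

main() {
  local host=$1 expiry
  expiry=$(cert_expiry_epoch "$host") || return 1
  renewal_due "$expiry" "$(date +%s)" || return 0   # nothing to do yet
  # ... issue the wild-card certificates and rsync them here ...
  if reload_nginx; then
    notify_slack "SSL renewed for $host"
  else
    notify_slack "SSL renewed for $host but nginx reload failed"
    return 1
  fi
}
```

Scheduling something like `main app.qa1.domain.com` to run once a day would keep the check cheap: on most days the script exits right after the date comparison.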
We learned how LetsEncrypt works to provide an extra security layer and leveraged it to protect our internal environments as a completely scheduled, automated solution, saving us the hassle of manual certificate management. Our internal applications also communicate via encrypted channels, which is a crucial safeguard in today's era of data-snooping, phishing and digital theft.
We hope this post helps you deploy a similar solution and make your infrastructure more secure. Are there any other ideas or best practices you would add to make this even better? Please feel free to share in the comments below.