Some time ago, I was working on a complex greenfield project.
We had to design a secure infrastructure (in many respects), make sure that all data was encrypted at rest and in transit, and deploy a large number of services in AWS. While the Dev teams were building the applications, I focused on those requirements.
The main requirement was to design and implement a flat Mesh Network on AWS (with encrypted traffic). Every server deployed should have a point-to-point connection to every other peer in the network. On top of that, some servers hosted on Azure and GCP should be able to join the Mesh. And to add more complexity, external clients like laptops or mobile phones should be able to securely access specific servers/services in the Mesh.
After gathering all the information and speaking with a number of people to make sure that all requirements were properly documented and added to the backlog, it was time to start working on the PoC (Proof of Concept).
One of the tools that seemed like the right candidate was WireGuard.
WireGuard is a secure network tunnel, operating at layer 3, implemented as a kernel virtual network interface for Linux. It uses state-of-the-art cryptography and aims to be faster, simpler, leaner, and more useful than IPsec. WireGuard is currently under heavy development, but it might already be regarded as the most secure, easiest to use, and simplest VPN solution in the industry. That’s why a lot of VPN service providers have started using it.
After reading the documentation and running some tests, I decided to proceed with it. The process of setting up WireGuard as a VPN is straightforward. You install it, generate the required keys, create a wg0.conf file for each server, configure the relevant Security Groups on AWS, and your VPN is up and running quickly.
But in our case, we had to build a Mesh consisting of hundreds of servers, most of them part of Auto Scaling groups. As a result, we didn’t have the option to configure wg0.conf manually every time a server had to join the mesh.
WireGuard encrypts the connection using pairs of cryptographic keys. Each server needs to have its own private and public key, and then exchange public keys with the rest.
The wg0.conf file contains all the necessary configuration parameters for the WireGuard interface.
Here are some of the main parameters that can be configured in wg0.conf:
PrivateKey: This parameter defines the private key for the WireGuard interface. It is used to authenticate and encrypt traffic between peers.
ListenPort: This parameter defines the UDP port that WireGuard will listen on for incoming connections (51820 is the conventional choice; if the parameter is omitted, WireGuard picks a random port).
Address: This parameter defines the IP address and subnet mask for the WireGuard interface.
Peer: This section defines the configuration for a peer on the WireGuard network. It includes the public key of the peer, its allowed IPs (the IP ranges that are routed to that peer and accepted from it), and other options such as the endpoint (the peer’s reachable IP address and port).
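To make these parameters more concrete, here is a minimal sketch of a shell function that prints a wg0.conf with one interface and one peer. The function name `render_wg_conf` and all the values passed to it are hypothetical placeholders, not taken from the project; real keys would normally come from `wg genkey` and `wg pubkey`.

```shell
#!/bin/bash
# Print a minimal wg0.conf for one interface and one peer.
# Arguments (all placeholders in this sketch):
#   1: private key        2: interface address (CIDR)   3: listen port
#   4: peer public key    5: peer endpoint (ip:port)    6: peer allowed IPs
render_wg_conf() {
  local private_key="$1" address="$2" listen_port="$3"
  local peer_public_key="$4" peer_endpoint="$5" peer_allowed_ips="$6"
  cat <<EOF
[Interface]
PrivateKey = ${private_key}
Address = ${address}
ListenPort = ${listen_port}

[Peer]
PublicKey = ${peer_public_key}
Endpoint = ${peer_endpoint}
AllowedIPs = ${peer_allowed_ips}
EOF
}
```

Calling `render_wg_conf "<private-key>" 10.141.0.1/24 51820 "<peer-public-key>" 203.0.113.10:51820 10.141.0.2/32` would print a file with a single [Peer] section. In a full mesh, every server needs one such section for each of the other peers, which is exactly what becomes hard to maintain by hand.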
I started working on my Spike, and it was a challenge to find the best way to implement such a Mesh topology. First I used Terraform to deploy a number of EC2 instances in multiple AWS regions. Then I went through my list of candidates and started building and trying things:
- Terraform and Ansible: I successfully created a Mesh, but it was really difficult to manage new peers and automatically update wg0.conf when they joined. I came to the conclusion that it was fine for static setups but not for dynamic ones.
- Terraform, HashiCorp Vault and a ton of bash scripts: That looked promising, so let’s see how it works. When connecting nodes via WireGuard, each node has to know the public key and endpoint IP of all peers. In this scenario, nodes with proper authentication in Vault were allowed to publish their own data and also to read connection data from other peers. They could all read the meeting point data for our mesh (a data structure containing basic information about our mesh network), publish their own configuration to Vault, query Vault for other nodes known to the meeting point, and add a WireGuard peer for each of them. Although it worked, it was really complex to support and troubleshoot, especially after the handover.
Then I came across a tool called Netmaker. It was at an early stage of development but looked really promising. (Since then, I have tested all versions, including the current one, 0.18.5, which was released a few days ago with big improvements and fixes.)
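The last step of the Vault approach, adding a WireGuard peer for every other node, can be sketched roughly as below. The original scripts are not reproduced in this post, so the input format (one "name public_key endpoint mesh_ip" line per node) and the function name `peers_to_wg_commands` are assumptions for illustration. The function only prints the `wg set` commands rather than running them.

```shell
#!/bin/bash
# Read "name public_key endpoint mesh_ip" lines on stdin and print one
# `wg set` command per peer, skipping the local node.
# The line format is a hypothetical stand-in for the data read from Vault.
peers_to_wg_commands() {
  local self_name="$1" iface="${2:-wg0}"
  local name public_key endpoint mesh_ip
  while read -r name public_key endpoint mesh_ip; do
    # Skip blank lines and our own entry.
    [ -z "$name" ] && continue
    [ "$name" = "$self_name" ] && continue
    echo "wg set ${iface} peer ${public_key} endpoint ${endpoint} allowed-ips ${mesh_ip}/32"
  done
}
```

Every node has to repeat this whenever any other node joins or leaves, which gives an idea of why the approach was hard to support once the mesh grew.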
Netmaker is a platform for creating fast and secure virtual networks with WireGuard. It is a tool for creating and managing virtual overlay networks. If you have at least two machines with internet access that you need to connect with a secure tunnel or thousands of servers spread across multiple locations or cloud providers, Netmaker is the perfect “tool”. It connects machines securely, wherever they are.
Now, after this intro, let’s see how we can create a secure Mesh Network on AWS using Netmaker and WireGuard.
- Start by launching a VM with Ubuntu 20.04 or later and a public IP. (Ubuntu is the distribution currently supported.)
- Open ports 443, 80, and 51821-51830 (UDP) on the security group. You can make this range smaller, but keep in mind that you need to have a port for each network you create. (I am going to explain more about Networks later.)
- Run the following script:
sudo wget -qO /root/nm-quick-interactive.sh https://raw.githubusercontent.com/gravitl/netmaker/master/scripts/nm-quick-interactive.sh && sudo chmod +x /root/nm-quick-interactive.sh && sudo /root/nm-quick-interactive.sh
You need to answer a number of simple questions, and at the end you are going to be presented with the login URL.
After opening the URL, you are going to be asked to create a username and password, and when you log in this is what you are going to see.
The first thing we have to do afterwards is to create a Network and enter the IP range that our servers will use for secure cross-communication. (The WireGuard interface, wg0, is going to use an IP address from that range.)
Click the ‘Networks’ tile on the dashboard, or in the left navigation panel click ‘Networks’.
On the Networks screen, click on the ‘Create Network’ button.
Give your network a name, and then enter your preferred CIDR. Or click on the ‘Autofill’ button and then change the name and the CIDR generated by the autofill option.
Then proceed by creating the required keys. When done, we can see that there are multiple ways to add a peer to our Mesh Network.
Most of the hard work is done. Now it’s time to launch a few instances in AWS in multiple regions and spread them across Public and Private subnets. In our case almost all instances are in Private subnets, with the exception of the Netmaker server and the Azure instance.
I like to use Terraform with GitLab Runners for my test deployments, and for this demo I had about 10 EC2 instances up and running really fast (I was using spot instances to minimise costs). Just remember that you need to deploy a standalone (on-demand) EC2 instance for Netmaker.
All the Security Groups for the Nodes were configured to allow incoming traffic (UDP) to ports 51820–51830.
And with the help of User Data and the command shown below, we can configure the nodes to join the Mesh during the launch process.
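As a rough sketch, a rule like this could be added with the AWS CLI. The wrapper function `open_wireguard_ports`, the `AWS_CMD` dry-run switch, and the security group ID and CIDR used in the example call are assumptions for illustration; in this demo the Security Groups were managed with Terraform.

```shell
#!/bin/bash
# Allow inbound WireGuard traffic (UDP 51820-51830) on a security group.
# AWS_CMD defaults to the real AWS CLI; set AWS_CMD=echo to print the
# arguments instead of calling AWS (dry run).
open_wireguard_ports() {
  local group_id="$1" source_cidr="$2"
  ${AWS_CMD:-aws} ec2 authorize-security-group-ingress \
    --group-id "$group_id" \
    --protocol udp \
    --port 51820-51830 \
    --cidr "$source_cidr"
}
```

For example, `AWS_CMD=echo open_wireguard_ports sg-0123456789abcdef0 0.0.0.0/0` prints the call without changing anything. A narrower source CIDR is usually preferable when the peers' addresses are known.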
(You need to replace eyJzZXJxxxyxxxxxxxxxxxxxxxxcccccccccccvvvvvvv0000000== with your token.)
#!/bin/bash
# Add the WireGuard repo (EPEL 7 build) and install WireGuard
sudo curl -Lo /etc/yum.repos.d/wireguard.repo https://copr.fedorainfracloud.org/coprs/jdoss/wireguard/repo/epel-7/jdoss-wireguard-epel-7.repo
sudo yum install -y epel-release
sudo amazon-linux-extras install -y epel && sudo yum install -y wireguard-dkms wireguard-tools
# Add the Netmaker repo and install netclient
curl -sL 'https://rpm.netmaker.org/gpg.key' | sudo tee /tmp/gpg.key
curl -sL 'https://rpm.netmaker.org/netclient-repo' | sudo tee /etc/yum.repos.d/netclient.repo
sudo rpm --import /tmp/gpg.key
sudo yum check-update
sudo yum install -y netclient
# Join the mesh with the enrollment token
netclient register -t eyJzZXJxxxyxxxxxxxxxxxxxxxxcccccccccccvvvvvvv0000000==
After a few minutes we have our instances up and running, fully configured with WireGuard and Netclient (all of them have automatically joined our Mesh network).
Now let’s launch one more server but this time in… Azure
Time to check our Netmaker GUI and make sure that all nodes have joined. If they don’t show immediately, there is no need to worry. It could take up to 5 mins to show up. In our case all Nodes are now visible with a Healthy status.
At this point we have successfully deployed and configured a flat Mesh network, not only between AWS instances but also with a server in a different cloud provider. All traffic between them is encrypted in transit, by using Wireguard.
Let’s see what our Mesh Network looks like at this stage.
How about running some tests to confirm that everything is working as expected?
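One simple check, sketched here under my own assumptions, is to probe the mesh IP of every peer from one of the nodes. The function name `check_mesh_peers` and the `PROBE_CMD` override are hypothetical; by default the function sends a single ping with a 2-second timeout, using Linux `ping` flags.

```shell
#!/bin/bash
# Probe each mesh IP once and report which peers answer.
# PROBE_CMD defaults to one ping with a 2-second timeout (Linux ping flags);
# it can be overridden, e.g. for testing the script itself.
check_mesh_peers() {
  local ip failed=0
  for ip in "$@"; do
    if ${PROBE_CMD:-ping -c 1 -W 2} "$ip" >/dev/null 2>&1; then
      echo "OK $ip"
    else
      echo "FAIL $ip"
      failed=1
    fi
  done
  # Non-zero exit status if at least one peer did not answer.
  return "$failed"
}
```

Running something like `check_mesh_peers 10.141.0.1 10.141.0.2` on a node, with the addresses replaced by the ones shown in the Netmaker node list, gives a quick overview of which peers are reachable over the tunnel. On the node itself, `sudo wg show` also lists the peers and their latest handshakes.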
By default, Netmaker creates a “full mesh,” meaning every node in our network can talk to every other node. But there is a nice feature that you can use in order to enable/disable any peer-to-peer connection in the network.
The ACL feature can be accessed by either clicking on “ACLs” in the sidebar, or by clicking on a Node in the Node List.
There are cases where external clients need to access some services running on the nodes. That can be a mobile phone, a laptop/tablet or an IoT device.
We can achieve that by creating an Ingress. (And once connected to the Ingress, we can reach all servers in the network.)
The next step is to generate the client configs. Clients can then join our mesh, either by scanning a QR code or by importing the WireGuard config. (Please note that the WireGuard client must be installed on the mobile phone, laptop, etc.)
In our case I have downloaded the config to my laptop and connected using the WireGuard client.
For this demo, I have installed Apache on an AWS EC2 instance and on an Azure VM. As you can see, I can access both from my laptop, through a secure tunnel, using the 10.141.x.x IPs (the Mesh network CIDR).
This is just one use case for Netmaker and WireGuard: creating a secure Mesh Network on AWS. There are more, as you can see below, and we are going to discuss some of them in future posts.
- Automate the creation of a large WireGuard-based (Mesh) network
- Secure access to a home or office network
- Provide remote access to resources like an AWS VPC, or K8S cluster
- Create clusters that span environments
- Remotely access a cluster from an external source
- Remotely access an external source from a cluster
- Manage a secure mesh of IoT devices
Hope you found this post useful. Feel free to reach out to me with any questions.