Ederson Brilhante

Building labs using component-based architecture with Terraform and Ansible

I am currently a Site Reliability Engineer (SRE) on the observability team at Splunk, but when I worked on this solution I was part of the GDI (Get Data In) organisation at Splunk.

Now, let's talk about the problem.
Part of an engineer's job in GDI is building add-ons for Splunk. Add-ons, in a nutshell, are plugins that connect third-party data sources to the Splunk platform.

Every time we need to work on a new version of an add-on for a specific third party, we need to set up 2 labs: 1 for development purposes and the other built to QA specifications.

The GDI organisation owns many add-ons, so we rotate which team, and who within the team, works on each new version.
This is good for spreading knowledge, but we had problems keeping labs reliable and consistent across the dev cycle and across teams.

A big fraction of people's time went into manual work to set up the labs (manual configuration or writing new Bash/PowerShell scripts). On top of the time already spent in the development process, this manual work created a great deal of headache for the developers.

The teams agreed that some automation was needed to reduce the pain of creating labs and to avoid duplication and rework.

We came up with the idea of using infrastructure as code (IaC), which was nothing special: other companies were already doing it.

Because the teams are small and focused on developing add-ons, we needed an approach where teams could have customised labs without having to write IaC scripts.

Based on the design principles of React components, we came up with the idea of creating components that can be reused and plugged into other components. Each component would be a Terraform module, an Ansible playbook or an Ansible role.

To illustrate, let's use this example - build a lab with 3 different environments:

  • Environment A will have 4 Windows instances:

    • 1 Windows Server 2016 as Domain Controller.
    • 3 Windows Servers as member servers:
      • 1 Windows Server 2016 with Windows Event Collector:
        • Splunk Universal Forwarder
        • Collecting only Sysmon events from nodes
      • 1 Windows Server 2016 with Windows Event Forwarding
      • 1 Windows Server 2019 with Windows Event Forwarding
  • Environment B will have 7 Windows instances:

    • 1 Windows Server 2016 as Domain Controller.
    • 6 Windows Servers as member servers:
      • 1 Windows Server 2016 with Windows Event Collector (WEC A):
        • Splunk Universal Forwarder
        • Collecting only Application events from nodes
      • 1 Windows Server 2016 with Windows Event Forwarding, sending logs to WEC A
      • 1 Windows Server 2019 with Windows Event Forwarding, sending logs to WEC A
      • 1 Windows Server 2019 with Windows Event Collector (WEC B):
        • Splunk Universal Forwarder
        • Collecting only Security events from nodes
      • 1 Windows Server 2016 with Windows Event Forwarding, sending logs to WEC B
      • 1 Windows Server 2019 with Windows Event Forwarding, sending logs to WEC B
  • Environment C will have 3 Windows instances:

    • 1 Windows Server 2016 as Domain Controller with Windows Event Collector:
      • Splunk Universal Forwarder
      • Collecting only Security and Sysmon events from nodes
    • 2 Windows Servers as member servers:
      • 1 Windows Server 2016 with Windows Event Forwarding
      • 1 Windows Server 2019 with Windows Event Forwarding

Normally, using Terraform modules and Ansible playbooks, we could reproduce these environments.
We would need to create specific playbooks and Terraform configs for each environment.
And here comes the problem: spending time coding permutations of very similar configurations.

To avoid that, with our component-based architecture we only have to write a single config file describing which modules these labs need, without touching any Terraform script or Ansible playbook.

Architecture

The solution we built supports many kinds of lab configurations deployed in AWS.

Terraform scripts deploy the infrastructure, spinning up EC2 instances and other AWS resources. To provision software and system configuration inside each EC2 instance, Terraform calls the appropriate Ansible playbooks.

Playbooks are groups of roles. A role implements one specific piece of configuration, independently of the other roles.

Take the role windows_splunk_universal_forward as an example. This role downloads, installs and configures a Splunk Universal Forwarder instance on Windows. It is written to work on any Windows version.

Repo Structure:

   ├── ansible
   │   ├── all_roles
   │   │   └── distros
   │   │       ├── linux
   │   │       │   └── roles
   │   │       │       └── <new-linux-role>
   │   │       └── windows
   │   │           └── roles
   │   │               └── <new-windows-role>
   │   └── playbooks
   │       └── <new-playbook>
   └── terraform
       └── modules
           ├── distros
           │   └── <new-distro-type>
           └── environments
               └── <new-environment-type>

Terraform

Terraform is an open-source infrastructure as code software tool that provides a consistent CLI workflow to manage hundreds of cloud services. Terraform codifies cloud APIs into declarative configuration files.

For more information, see the official documentation.

Terraform Structure:

   terraform/
   ├── modules
   │   ├── constants
   │   ├── core
   │   ├── distros
   │   │   └── <distro-type>
   │   └── environments
   │       └── <environment-type>
   └── wire

What is an environment?

An environment is a predefined set of relations between nodes.
Each environment module lives in terraform/modules/environments and uses the modules in terraform/modules/distros to build those relations.

To illustrate, take this case as an example:

  • 1 Windows Domain Controller.
  • X number of Member Servers.

We have this hierarchy because we need to create the DC first and then pass some of its data, such as its IP, to the member servers:

```
# file: terraform/modules/environments/linux-standalone/main.tf

module "windows-domain-controller" {
    source = "../../distros/windows-server"
    ...
}

module "windows-server-member" {
    source = "../../distros/windows-server"
    ...
    windows_domain_controller = module.windows-domain-controller
    ...
}
```
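The ordering described above relies on Terraform's implicit dependencies: when one block reads a value from another, Terraform creates the referenced block first. The following is only a minimal sketch of that idea. It uses the built-in terraform_data resource (Terraform 1.4 or later) as a stand-in for the real distro modules, and every name and value in it (member_names, dc01, the IP) is hypothetical rather than taken from the demo code.

```hcl
terraform {
  required_version = ">= 1.4.0" # terraform_data is built in from 1.4 onwards
}

# Hypothetical input: names of the member servers to create.
variable "member_names" {
  type    = set(string)
  default = ["member01", "member02"]
}

# Stand-in for the domain controller distro module.
resource "terraform_data" "domain_controller" {
  input = {
    name       = "dc01"
    private_ip = "10.0.0.10" # placeholder value, not a real lab address
  }
}

# Stand-in for the member servers. Reading the DC's output creates an
# implicit dependency, so Terraform creates the DC before any member.
resource "terraform_data" "member" {
  for_each = var.member_names

  input = {
    name                 = each.key
    domain_controller_ip = terraform_data.domain_controller.output.private_ip
  }
}

output "members" {
  value = { for name, m in terraform_data.member : name => m.output }
}
```

Because the members only reference the domain controller, adding or removing a member name changes nothing about the DC itself.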

What is a distro?

A distro is a predefined kind of AMI with a specific setup and/or provisioning.
Each distro module lives in terraform/modules/distros and has a matching Ansible playbook to run the provisioning.

To illustrate, take these cases as examples:

  • Linux
  • Windows
  • Splunk
  • FreeBSD

Terraform example:

```
locals {
  ...
  # $PUBLIC_IP is a placeholder that is replaced once the instance IP is known.
  provisioning_command = "ansible-playbook -i $PUBLIC_IP /opt/automation/tools/ansible/playbooks/windows.yml --extra-vars='${local.extra_vars}'"
}

...

resource "aws_instance" "windows_server" {
  ...
}

resource "null_resource" "ansible" {
  # Re-run the provisioner whenever the rendered command changes.
  triggers = {
    command = replace(local.provisioning_command, "$PUBLIC_IP", "'${aws_instance.windows_server.public_ip},'")
  }

  # The trailing comma makes Ansible treat the IP as an inline inventory list.
  provisioner "local-exec" {
    command = replace(local.provisioning_command, "$PUBLIC_IP", "'${aws_instance.windows_server.public_ip},'")
  }
}
```
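The value of local.extra_vars is elided in the example above. One way such a value could be assembled, assuming the roles arrive as a map of booleans like enabled_roles in the config file, is to filter the map and serialise it with jsonencode, since ansible-playbook accepts a JSON string for --extra-vars. This is a sketch only: the variable name, the second role name and the enabled_roles key passed to Ansible are assumptions, not the demo's actual interface.

```hcl
# Hypothetical input, shaped like "enabled_roles" in the config file.
variable "enabled_roles" {
  type = map(bool)
  default = {
    windows_splunk_universal_forward = true
    windows_example_role             = false # hypothetical role name
  }
}

locals {
  # Keep only the roles that are switched on.
  active_roles = [for role, enabled in var.enabled_roles : role if enabled]

  # JSON survives the single quotes used around --extra-vars in the command.
  extra_vars = jsonencode({
    enabled_roles = local.active_roles
  })
}

output "extra_vars" {
  value = local.extra_vars
}
```

A playbook could then decide which roles to include based on the enabled_roles list it receives.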

Ansible

Ansible is an open-source software provisioning, configuration management, and application-deployment tool enabling infrastructure as code. It runs on many Unix-like systems, and can configure both Unix-like systems as well as Microsoft Windows.

For more information, see the official documentation.

Ansible Structure:

```
ansible/
├── all_roles
│   ├── distros
│   │   └── <distro-type>
│   │       └── roles
│   │           └── <distro-role>
└── playbooks
```

What is a distro type?

A distro type is a folder that centralises all the Ansible roles that can be executed on a specific kind of machine.

Take windows as an example: ansible/all_roles/distros/windows. This folder centralises all the Ansible roles that can be executed on Windows machines.

What is a distro role?

A distro role is a group of Ansible tasks that implement related configurations and together represent one piece of functionality.

To illustrate, take the list of tasks from the Splunk UF role:
- Downloads Splunk UF
- Installs the downloaded file
- Sets the default configuration
- Starts Splunk UF

Explaining the config file

Here you can find a complete config example:

config = {
  "myenv01" = {
    "type" = "windows_standalone"
    "nodes" = {
      "myvm01" = {
        "type" = "windows"
        "enabled_roles" = {
          "windows_funcionality01" = true
          "windows_funcionality02" = true
        }
        "os" = {
          "size"    = "t2.medium"
          "distro"  = "windows"
          "type"    = "windows"
          "version" = "2016"
        }
      }
    }
  }
  "myenv02" = {
    "type" = "linux_standalone"
    "nodes" = {
      "mylinux01" = {
        "type" = "linux"
        "enabled_roles" = {
          "linux_funcionality01" = true
          "linux_funcionality02" = true
        }
        "os" = {
          "size"    = "t2.medium"
          "distro"  = "ubuntu"
          "type"    = "linux"
          "version" = "20"
        }
      }
    }
  }
}

Under the hood this configuration will be translated to create 2 EC2 instances in AWS, and each instance will run playbooks with specific roles.
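How the demo code performs this translation is not shown here. As a rough sketch, nested for expressions can flatten such a config into one entry per node, which a module could then consume with for_each. The config is inlined as a local to keep the sketch self-contained, and the output attribute names (environment_type, distro_module, instance_type, roles) are hypothetical.

```hcl
locals {
  # Trimmed copy of the config above, inlined for self-containment.
  config = {
    "myenv01" = {
      "type" = "windows_standalone"
      "nodes" = {
        "myvm01" = {
          "type"          = "windows"
          "enabled_roles" = { "windows_funcionality01" = true, "windows_funcionality02" = true }
          "os"            = { "size" = "t2.medium", "distro" = "windows", "type" = "windows", "version" = "2016" }
        }
      }
    }
    "myenv02" = {
      "type" = "linux_standalone"
      "nodes" = {
        "mylinux01" = {
          "type"          = "linux"
          "enabled_roles" = { "linux_funcionality01" = true, "linux_funcionality02" = true }
          "os"            = { "size" = "t2.medium", "distro" = "ubuntu", "type" = "linux", "version" = "20" }
        }
      }
    }
  }

  # One entry per node, keyed "<environment>/<node>".
  nodes = merge([
    for env_name, env in local.config : {
      for node_name, node in env.nodes :
      "${env_name}/${node_name}" => {
        environment_type = env.type
        distro_module    = node.type
        instance_type    = node.os.size
        roles            = [for role, enabled in node.enabled_roles : role if enabled]
      }
    }
  ]...)
}

output "nodes" {
  value = local.nodes
}
```

With the two environments above, local.nodes holds two entries, one per EC2 instance to create.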

Block explanation:

  • Each block myenv0x represents how an environment will be deployed. Its type selects which predefined environment will be used.

  • Each block under nodes (myvm01, mylinux01) represents a VM that will be created. Its type selects which predefined distro will be used.

  • The block os has 4 properties that are used to create the proper EC2 instance:

    • size: the AWS instance type
    • type: the type of distro (windows, linux, etc.)
    • distro: the OS distro (ubuntu, debian, suse, windows, etc.)
    • version: the version of the OS distro

    With this information, Terraform knows which AWS AMI to use to spin up the EC2 instance.

  • The block enabled_roles lists the Ansible roles to execute on each instance.
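The AMI selection can be pictured as a lookup from the os block to an image filter. The sketch below is an assumption about how this could be done, not the demo's implementation: the ami_catalog map and its keys are hypothetical, while the owner values and name patterns are the publicly documented ones for Amazon's Windows images and Canonical's Ubuntu 20.04 images. It needs the AWS provider with credentials and a region configured (for example through the AWS_REGION environment variable).

```hcl
# Hypothetical input, shaped like the "os" block in the config file.
variable "os" {
  type = object({
    size    = string
    distro  = string
    type    = string
    version = string
  })
  default = {
    size    = "t2.medium"
    distro  = "ubuntu"
    type    = "linux"
    version = "20"
  }
}

locals {
  # Hypothetical catalog keyed by "<distro>-<version>".
  ami_catalog = {
    "windows-2016" = { owner = "amazon", name = "Windows_Server-2016-English-Full-Base-*" }
    "windows-2019" = { owner = "amazon", name = "Windows_Server-2019-English-Full-Base-*" }
    "ubuntu-20"    = { owner = "099720109477", name = "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*" }
  }

  # Fails at plan time if the distro/version pair is not in the catalog.
  ami = local.ami_catalog["${var.os.distro}-${var.os.version}"]
}

# Pick the most recent image matching the catalog entry.
data "aws_ami" "selected" {
  most_recent = true
  owners      = [local.ami.owner]

  filter {
    name   = "name"
    values = [local.ami.name]
  }
}

output "ami_id" {
  value = data.aws_ami.selected.id
}
```

Keeping the catalog in one place means supporting a new OS version is a one-line change rather than a new module.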

For more details about the code and implementation, see the fully functional code demo.
