Alex M. Schapelle for Otomato

Automating Custom ISO with `Cloud-Init`

Welcome back, gentle reader. Thank you for coming back to enrich the pool of thy knowledge. My name is Silent.Mobius and I am your faithful author. Before we go on, let us recap what we've done.

Story so far...

In the last article, we explored how to customize a Linux distribution, which was mostly done manually. That is usually useful when our ISO needs some pre-configuration or some added tools before being used for development, testing, releasing, deployment, operation or monitoring.
Although that manual work can be useful, in other cases there is a need for a type of automation that enables more dynamic decisions, such as drive partitioning, package installation, service configuration and many more.
For that purpose, almost every Linux distribution has its own automation tools: Red Hat has Kickstart and Debian has preseed, yet these tools either have documentation problems or are strictly useful only on their dedicated distributions.

Enter Cloud-Init

According to Cloud-init documentation:

Cloud-init is the industry standard multi-distribution method for cross-platform cloud instance initialization. It is supported across all major public cloud providers, provisioning systems for private cloud infrastructure, and bare-metal installations.

What is Cloud-Init in reality?

It is a quite simple but powerful way to configure machines. Let's assume you want to set up a machine that requires some settings pre-configured or packages installed. You can think of "create a user" or "install updates", but also "configure network" or "put a file there".

And how does this work?

Every time we boot the #ISO, on hardware or on a virtualization platform, the Cloud-init service runs and does the following:

  • Checks, validates and parses an existing Cloud-init configuration.
  • Applies that configuration through API calls, commands and config changes.

What environment is suited for Cloud-Init?

Based on the name, we can assume cloud. Chances are high that most hosting/cloud providers already support it. You can expect very sane support in at least:

  • nocloud : virtualization or bare-metal
  • aws
  • gcp
  • azure
  • and many others...

As a side note

In my personal use cases, I mostly used Cloud-init to customize Ubuntu 18.04 and 20.04 for automated installs on different types of hardware, so in this article I'd love to share some insights on the matter.

The Goal

  • To show basic Cloud-Init Setup
  • Write an automation file that will enable you to perform tasks
  • Provide tips and tricks to use Cloud-Init in automated CI/CD Pipeline

Let us init at the beginning...

Cloud-Init Setup

To start using Cloud-init, we need to have it installed on the base image of the Linux distribution that we are using. In our case, ubuntu-20.04.4, the latest stable Ubuntu release at the time of writing, already has Cloud-init installed, so we only need to tweak it in order to make it ready for #ISO installation.

Let's start by getting the tools and the #ISO file, and then disassembling it:

sudo apt update  && sudo apt install p7zip-full p7zip-rar genisoimage fakeroot xorriso isolinux binutils squashfs-tools 
curl -X GET -OL https://releases.ubuntu.com/20.04.4/ubuntu-20.04.4-live-server-amd64.iso 
7z x -y ubuntu-20.04.4-live-server-amd64.iso  -oiso

Once it is done, you should have a folder named iso, in which all the internals of the #ISO file are located.
We'll need to tell the Linux kernel to boot with a custom config file that also initializes Cloud-init, which is why we need to edit the kernel boot parameters in the main boot files of Ubuntu 20.04:

sed -i -e 's/---/ autoinstall  ---/g' iso/isolinux/txt.cfg
sed -i -e 's/---/ autoinstall  ---/g' iso/boot/grub/grub.cfg
sed -i -e 's/---/ autoinstall  ---/g' iso/boot/grub/loopback.cfg
sed -i -e 's,---, ds=nocloud;s=/cdrom/nocloud/  ---,g' iso/isolinux/txt.cfg
sed -i -e 's,---, ds=nocloud\\\;s=/cdrom/nocloud/  ---,g' iso/boot/grub/grub.cfg
sed -i -e 's,---, ds=nocloud\\\;s=/cdrom/nocloud/  ---,g' iso/boot/grub/loopback.cfg

Note: In older or newer versions, the file locations or even the file names might differ, so do take the time to investigate, and do not give up. I believe in you.
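To see what these substitutions do before touching the real boot files, here is a minimal sketch that applies them to a throwaway file. The sample kernel line is an assumption made for illustration; the real entries in grub.cfg may look different. It also assumes GNU sed and the GRUB variant of the command with three backslashes, which leaves `\;` in the file so that GRUB does not treat the semicolon as a command separator.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sample boot entry; not copied from a real grub.cfg.
workdir="$(mktemp -d)"
sample="$workdir/grub.cfg"
printf '%s\n' 'linux /casper/vmlinuz quiet ---' > "$sample"

# First pass: add the autoinstall parameter in front of the '---' separator.
sed -i -e 's/---/ autoinstall  ---/g' "$sample"

# Second pass: add the nocloud data source. In the sed replacement,
# '\\' becomes one literal backslash and '\;' becomes ';',
# so the file ends up containing 'ds=nocloud\;s=/cdrom/nocloud/'.
sed -i -e 's,---, ds=nocloud\\\;s=/cdrom/nocloud/  ---,g' "$sample"

# Keep the result in a variable and print it for inspection.
result="$(cat "$sample")"
printf '%s\n' "$result"

rm -rf "$workdir"
```

Running the same two passes on a copy of your own boot files is a cheap way to confirm that the `---` separator is really there in your #ISO version.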

If you look carefully at the last sed commands, the part inside the single quotes, ds=nocloud;s=/cdrom/nocloud/, selects the nocloud data source and provides the path to its folder. That folder does not exist yet, but we need to remember that during OS install the #ISO is mounted at /cdrom as the source of the installation, so we need to place the nocloud folder under the iso folder, which represents the root of our #ISO. nocloud should include two very significant files for Cloud-Init:

  • meta-data
  • user-data

We'll discuss these files and other parts of Cloud-Init in the next segment. Less talk - more work:

mkdir -p iso/nocloud
touch iso/nocloud/{meta-data,user-data}

Now that we have set up our environment, what is left is to build the automation we mentioned before.
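Before moving on, a quick sanity check of the extracted tree can save a failed boot later. The sketch below is only an illustration: the function name check_iso_tree is made up for this article, and the demo runs against a throwaway directory rather than the real iso folder.

```shell
#!/usr/bin/env bash
set -eu

# check_iso_tree ROOT (hypothetical helper): report missing nocloud files
# and boot configs that do not carry the autoinstall parameter.
check_iso_tree() {
  local root="$1" status=0 f
  for f in nocloud/meta-data nocloud/user-data; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
      status=1
    fi
  done
  # Boot files are only checked when present, since names differ between releases.
  for f in isolinux/txt.cfg boot/grub/grub.cfg boot/grub/loopback.cfg; do
    if [ -f "$root/$f" ] && ! grep -q 'autoinstall' "$root/$f"; then
      echo "no autoinstall parameter in: $f"
      status=1
    fi
  done
  return "$status"
}

# Demo on a throwaway tree that mimics the layout described above.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/nocloud" "$demo_root/boot/grub"
touch "$demo_root/nocloud/meta-data" "$demo_root/nocloud/user-data"
echo 'linux /casper/vmlinuz autoinstall ---' > "$demo_root/boot/grub/grub.cfg"

check_iso_tree "$demo_root" && echo "tree looks ready"
```

On the real tree the call would be check_iso_tree iso, run from the folder that contains the extracted iso directory.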

The Automation

Cloud-init uses the meta-data file to configure cloud-based API configurations and setups, and the user-data file to customize the behavior of the OS while it is being installed.
The syntax for both of these is YAML, and it follows a rather structured layout:


#cloud-config
autoinstall:
  version: 1
  early-commands:
  keyboard: 
  locale: 
  identity: 
  users:
  apt:
  ssh:
  network:
  storage:
  late-commands:
  user-data:
    write_files:
    timezone: 
    runcmd:
    bootcmd:

Yes, indeed. All that is written above is useless without explanation. Please bear in mind that, although all these modules are listed here, not all of them are required. Thus let us dive into the matter:

  • We start with the #cloud-config header. While most YAML-based files do not require a specific header, Cloud-init does require this one; otherwise the Cloud-init run will fail with a missing-header error.
  • We continue with the autoinstall tag, which notifies Cloud-init not to require user input.
  • The file goes on with version, which sets Cloud-init's API version; as of now it is still version 1.
  • We carry on with the early-commands tag, which enables us to run any commands or scripts that we would like to run before installation. What commands or scripts? In my case, I used a script that detects the drive type, whether it is a regular SATA drive or NVMe, and adjusts the Cloud-init configuration as needed.
    • Note that Cloud-init reloads its configuration after early-commands run, enabling us to change our config file to suit our needs.
  • To stay on course, we use the keyboard tag to match our device's keyboard layout, which might differ from mine to yours.
  • In order to sustain the course on the subject, the locale tag provides our system's language setup.
  • We prolong our course with identity, which enables us to configure the default user. It can have several config options:
    • username to configure the default username
    • password to set the default encrypted password
    • hostname to set our system's hostname
    • If it is not configured, and if no users tag is set, the default user and password, ubuntu/ubuntu, are inserted automatically.
  • To extend the discussion regarding the system users, let's examine the users tag: it enables us to add users to the system. Users are added after groups are added. Most of these configuration options will not be honored if the user already exists.
    • default: heading of the default user configuration
    • name for a unique username on the system
    • gecos to provide the user's real name or comment field in the /etc/passwd file
    • primary_group for the user's primary group. If not set, setup fails.
    • groups for additional groups, such as wheel or sudo
    • passwd for an encrypted password
    • plain_text_passwd for an unencrypted password
      • only one of the passwords is required
  • To pursue the system tools, we use apt either to install tools, if the network is configured, or to update the system.
  • The ssh tag provides management of the SSH keys to be inserted into the system. We can also require the SSH service to be installed, with install-server: true
  • To remain on subject, network provides network configuration, and it does so with several types of API, which can be found here
  • As we get closer to the end, we get to the storage tag, which enables us to configure storage. For some reason, the Cloud-Init docs do not provide deep, or any, understanding of partitioning on the nocloud provider. Luckily for us, I've done some small investigating and have learned that the storage configuration is built on a piece of software called curtin, whose documentation provides a very deep explanation of partitioning, LVM, RAID and so on.
  • To prolong the drama, we'll look at the late-commands tag, which enables Cloud-init to run commands or scripts after installation is complete.
  • Last but not least is the user-data tag, which does not provide much by itself, yet it has several sub-modules that add to the user's space things such as an initialization script, .bashrc, .profile and others. Let's check them out as well:
    • write_files: write, or append to, configuration files
    • timezone: set the timezone for the user, although we can do it on the system level as well
    • runcmd: run a command or a script on the first boot
    • bootcmd: run a command or a script on every boot
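The early-commands bullet above mentions a script that detects whether the machine has a SATA or an NVMe drive. The original script is not shown in this article, so the following is only a sketch of the idea under stated assumptions: the function names, the DISK_PATH placeholder and the choice of nvme0n1 and sda as the only candidates are all hypothetical. On a real install, the configuration file to edit from early-commands would be the installer's copy of the autoinstall config; here the demo uses temporary files instead.

```shell
#!/usr/bin/env bash
set -eu

# pick_install_disk [DEV_DIR] (hypothetical helper): prefer the first NVMe
# disk, fall back to the first SATA/SCSI disk, fail if neither exists.
pick_install_disk() {
  local dev_dir="${1:-/dev}"
  if [ -e "$dev_dir/nvme0n1" ]; then
    printf '%s\n' "/dev/nvme0n1"
  elif [ -e "$dev_dir/sda" ]; then
    printf '%s\n' "/dev/sda"
  else
    return 1
  fi
}

# apply_disk CONFIG DISK (hypothetical helper): replace the made-up
# DISK_PATH placeholder in CONFIG with the detected disk.
apply_disk() {
  sed -i "s,DISK_PATH,$2,g" "$1"
}

# Demo with a fake device directory and a one-line config fragment.
demo_dev="$(mktemp -d)"
demo_cfg="$(mktemp)"
touch "$demo_dev/sda"
printf 'path: DISK_PATH\n' > "$demo_cfg"

disk="$(pick_install_disk "$demo_dev")"
apply_disk "$demo_cfg" "$disk"
cat "$demo_cfg"
```

Passing the device directory as an argument keeps the detection logic testable on a build machine that has neither of the target drives.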

Too much to remember and to process, right?! So here is an example for you to use:

#cloud-config
autoinstall:
  version: 1
  identity: {hostname: HOSTNAME, password: "ENCRYPTED-PASSWORD", username: USERNAME}
  keyboard: 
    layout: us
  locale: en_US.UTF-8
  apt:
    disable_suites: [security]
  ssh:
    install-server: true
  network:
    version: 2
    ethernets:
      eno1:
        match:
          name: en*
        dhcp4: true
  storage:
    config:
      - {name: ubuntu-vg, devices: [partition-2], preserve: false, type: lvm_volgroup, id: lvm_volgroup-0}
      - {name: ubuntu-lv, volgroup: lvm_volgroup-0, size: 10GB, wipe: superblock, preserve: false, type: lvm_partition, id: lvm_part-0}
      - {fstype: ext4, volume: lvm_part-0, preserve: false, type: format, id: fmt-2 }
      - {path: /, device: fmt-2, type: mount, id: mnt-2}
      - {path: /boot, device: fmt-1, type: mount, id: mnt-1}
      - {path: /boot/efi, device: fmt-0, type: mount, id: mnt-0}
  late-commands:
    - rm -rf /target/etc/update-motd.d/[0-9]*
    - rm -rf /target/etc/cron.{hourly,d,daily,weekly,monthly}/*
  user-data:
    write_files:
      - path: /etc/ssh/sshd_config
        content: |
          LogLevel INFO
          AllowUsers USERNAME
        append: true
    timezone: Asia/Jerusalem
    runcmd:
      - [groupadd, -g, 998, docker]
      - [usermod, -aG, docker, USERNAME]
      - [reboot]

Although this is a small example, there are additional parts that can be either added or removed.
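The ENCRYPTED-PASSWORD placeholder in the example has to be filled with a crypt-style password hash. The article does not say which tool was used for that, so here is one possible sketch, assuming OpenSSL 1.1.1 or newer is available on the build machine. The function name hash_password and the sample password are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

# hash_password PLAINTEXT (hypothetical helper): print a SHA-512 crypt hash.
# The password is passed on stdin so it does not show up in the process list.
hash_password() {
  printf '%s\n' "$1" | openssl passwd -6 -stdin
}

# Sample value for illustration only; use your own secret handling in CI.
password_hash="$(hash_password 'change-me')"

# Print the line in the form expected by the identity section.
printf 'password: "%s"\n' "$password_hash"
```

The salt is random, so the hash differs on every run; only the `$6$` prefix, which marks SHA-512 crypt, stays the same.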

Tips and Tricks

Whenever it comes to building something with Cloud-init, the best place to start would always be the documentation of the tool. However, from my personal experience, the Cloud-init documentation is not organized very well, and in addition some of the tool's behavior is not always explained in depth. Thus here are some tips to ease your work with Cloud-init:

  • Read as many GitHub gists as possible. I was able to find a lot of examples that just worked, and it saved me a lot of time.
  • Validate the file before closing the #ISO; recent Cloud-init releases offer cloud-init schema --config-file user-data for this. Testing on bare metal while you are not sure whether the Cloud-init syntax is correct is not the way to go, and it becomes frustrating after the first few boots.
  • Build the YAML file part by part. There is no need to drop everything into the file at once and then try to debug why it all failed.
  • Do a manual install to get the YAML file that the installer generates afterwards. I struggled with configuring LVM, and did a manual install to get that generated file, which made my life easier. The file is usually stored at /var/log/installer/ under the name autoinstall-user-data
  • Do not use late-commands or runcmd for installing software. The strategy should be to have the network connected while setting up the #ISO on hardware and to use the apt tag to install whatever you need.
    • If the network is not something you can get during installation, download the packages manually, pack them as part of your #ISO, and only then install them with runcmd.
    • Note that if there are any dependency issues, like a missing library or package, the whole installation of the system will fail.
  • Do not use lxd for testing. Its usually does not provide
  • Create a shell script that detects values for your YAML file. I had issues detecting several values with early-commands, so when running the build on CI/CD with Jenkins, I ran a shell script to detect the values before the YAML file was inserted into the #ISO, and swapped in the needed values with sed. For example, I have used USERNAME in the example above, which can be swapped to silent-mobius with a sed command.
  • Do not hesitate to experiment.
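The placeholder swap from the tips above can be sketched as follows. This is not the script from my Jenkins pipeline, only a minimal illustration: the function name render_user_data, the template file name and the host name build-node-01 are hypothetical, and the template is a shortened version of the example user-data.

```shell
#!/usr/bin/env bash
set -eu

# render_user_data TEMPLATE OUTPUT USERNAME HOSTNAME (hypothetical helper):
# replace the USERNAME and HOSTNAME placeholders and write the result.
# The values must not contain '|' or '&', which are special to this sed call.
render_user_data() {
  sed -e "s|USERNAME|$3|g" -e "s|HOSTNAME|$4|g" "$1" > "$2"
}

workdir="$(mktemp -d)"

# Shortened template; the lowercase keys are untouched because sed is case-sensitive.
cat > "$workdir/user-data.tpl" <<'EOF'
#cloud-config
autoinstall:
  version: 1
  identity: {hostname: HOSTNAME, password: "ENCRYPTED-PASSWORD", username: USERNAME}
EOF

render_user_data "$workdir/user-data.tpl" "$workdir/user-data" silent-mobius build-node-01
cat "$workdir/user-data"
```

In a pipeline, the rendered file would then be copied to iso/nocloud/user-data before the #ISO is closed, with the values coming from job parameters or from a detection script.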

Conclusion

Dear gentle reader, my gratitude for reading this article till the end. I hope you have extended your knowledge to new limits and that this article was informative for you, and remember: Do Try To Have Fun.

Thank you
