Matt Adorjan

Posted on Aug 25, 2021

Trying out AWS Controllers for Kubernetes (ACK)

#aws #ack #kubernetes #eks

Introduction

AWS Controllers for Kubernetes (ACK) lets you create AWS resources using the same process you use to deploy other resources in Kubernetes. ACK uses the Kubernetes controller model to interact with AWS APIs. Once the ACK service controller for a specific service is deployed, you can use the custom resources defined by its Custom Resource Definitions (CRDs) to declare specific AWS resources in YAML manifests. You then send them to the Kubernetes API server and the specified AWS resources are provisioned within your AWS account.

When I initially heard the news about AWS Controllers for Kubernetes, I was immediately struck by a few previous challenges where having AWS resources tied directly to other Kubernetes deployments would have been immensely helpful. I think the key to using ACK is understanding when it makes sense to use it. Most organizations have mature processes for provisioning cloud infrastructure using tools like CloudFormation or Terraform. By introducing a new way of provisioning resources, you introduce an additional place where you now need to govern standards for naming, tagging, and security configuration, validate that the controller's permissions follow least privilege, and so on.

Use Cases

In my opinion, there are only a few reasons to use ACK to deploy resources and a longer list of situations where you probably want to avoid using ACK. Obviously, this is all very dependent on your organization and your use cases.

When to use ACK

Managing AWS resources which are tied to the lifecycle of Kubernetes resources - there are many use cases where you may want to maintain resources alongside your Kubernetes deployments, or specify them in your Helm Chart. For example, a use case that will be helpful on day 1 for me is the ability to create SQS queues alongside deployments in Kubernetes which are set to scale based on the new SQS queue's length.

When not to use ACK

- AWS resources are going to be primarily used outside of Kubernetes - this is not a replacement for an infrastructure as code solution like CloudFormation or Terraform.
- Data retention is required - this is a grey area, as you can definitely create data stores using ACK and there might be good cases for this. However, it is important to understand that accidentally deleting the AWS resource object in Kubernetes would delete both the AWS resource and all of its associated data.
- ACK controllers don't expose required parameters - if the controller doesn't expose specific parameters you need set on a resource, you should create the resource using another method to avoid double work. For example, if you need S3 Public Access to be turned off, that is not an available parameter, so ACK isn't a good option.
- You want a way to expose resource deployments to developers and they already know Kubernetes - a developer's familiarity with how Kubernetes resources work is not enough reason on its own. If the resources are not directly tied to a Kubernetes deployment, it doesn't make sense to manage those resources with ACK. It makes more sense for developers to learn CloudFormation or Terraform, or to expose resource creation to them using something like AWS Service Catalog.

Choosing a deployment model

ACK controllers can be deployed in one of two modes. ACK calls this setting installScope, and the options are either cluster or namespace. There are some pros and cons for each option.

Cluster:
- Pros: Only need to run one copy of each controller in the cluster, therefore reducing effort required to manage these components.
- Cons: Permissions for the IAM credentials used by each controller will need to be broad to accommodate all of the possible ways a resource needs to be created anywhere in the cluster.
Namespace:
- Pros: Can get finer grained with controller configuration and IAM permissions that are used when the controller creates resources.
- Cons: Need to manage each controller in all namespaces, which can add operational overhead (e.g. need to run upgrades across all namespaces instead of in just 1 place).
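
As a rough sketch of what choosing a scope could look like with the Helm chart, the snippet below writes a small values override that selects the namespace option. It assumes the chart exposes the setting as a top-level Helm value named installScope; the file name is hypothetical.

```shell
#!/bin/sh
# Sketch: write a Helm values override that selects the namespace install scope.
# Assumption: the ACK chart reads a top-level `installScope` value
# (cluster or namespace). The file name is illustrative only.
VALUES_FILE="${TMPDIR:-/tmp}/ack-values-namespace.yaml"

cat > "$VALUES_FILE" <<'EOF'
# Run this controller for a single namespace instead of the whole cluster
installScope: namespace
EOF

echo "Wrote $VALUES_FILE"
```

The file could then be passed to helm install with -f, once per namespace that needs its own controller, which is exactly the operational overhead listed under the namespace cons.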

I'm a big fan of the "shared service" model in multi-tenant clusters, and will likely use the cluster option here. I think as long as you have proper governance in place within the cluster, you can centrally manage all of your components once and make them available to all users in the cluster. See the Gatekeeper section later in this post for more information on an approach for governance.

Getting started with ACK's S3 Controller

I initially started following the Install instructions from the ACK docs, specifically using the Helm chart. Looking at the S3 controller Helm Chart in ECR, the latest version is v0.0.2.

export HELM_EXPERIMENTAL_OCI=1
export SERVICE=s3
export RELEASE_VERSION=v0.0.2
export CHART_EXPORT_PATH=/tmp/chart
export CHART_REPO=public.ecr.aws/aws-controllers-k8s/$SERVICE-chart
export CHART_REF=$CHART_REPO:$RELEASE_VERSION
export ACK_K8S_NAMESPACE=ack-system

mkdir -p $CHART_EXPORT_PATH

helm chart pull $CHART_REF
helm chart export $CHART_REF --destination $CHART_EXPORT_PATH

kubectl create namespace $ACK_K8S_NAMESPACE

helm install --namespace $ACK_K8S_NAMESPACE ack-$SERVICE-controller \
    $CHART_EXPORT_PATH/ack-$SERVICE-controller
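
A note for readers on newer Helm versions: the helm chart pull and helm chart export subcommands were experimental and were removed in Helm 3.7. A roughly equivalent install, assuming Helm 3.8 or later, references the OCI chart directly. The sketch below reuses the variable names from above and only prints the command by default; the run helper and the DRY_RUN switch are hypothetical conveniences, and the version string should be treated as a placeholder since newer chart tags may be published without the leading v.

```shell
#!/bin/sh
# Sketch for Helm 3.8+: install the ACK chart straight from the OCI registry.
# Variable names and values mirror the original commands above.
SERVICE=s3
RELEASE_VERSION=v0.0.2
ACK_K8S_NAMESPACE=ack-system
CHART_REPO="oci://public.ecr.aws/aws-controllers-k8s/${SERVICE}-chart"

# Hypothetical helper: print the command by default so the sketch is safe
# to run anywhere; set DRY_RUN=0 to actually invoke helm.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# --create-namespace stands in for the separate `kubectl create namespace` step.
run helm install "ack-${SERVICE}-controller" "$CHART_REPO" \
  --version "$RELEASE_VERSION" \
  --namespace "$ACK_K8S_NAMESPACE" \
  --create-namespace
```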

This worked overall, but when I started deploying a test S3 bucket, many of the spec fields listed in the docs generated errors and did not work. For example, my bucket manifest is below. v0.0.2 did not have any support for tagging, so applying this would create a bucket without tags and throw a "not implemented" error.

apiVersion: s3.services.k8s.aws/v1alpha1
kind: Bucket
metadata:
  name: test-s3-matt-bucket
spec:
  tagging:
    tagSet:
      - key: CostCenter
        value: Development
  name: test-s3-matt-bucket

After trying to understand what was going on, I realized that both the Helm Chart and the Docker image available in ECR were out of date. Looking at the s3-controller repository, changes were actively being made but there were no releases indicating what commit the v0.0.2 versions were tied to.

As a next step, I decided to build a newer Docker image and use the latest Helm chart from the repository to deploy the controller. This ended up being a bit more difficult than I anticipated. The ACK solution is designed to be very modular. The pro of this design is that it's very easy to add new services to ACK. The biggest con is that it is a bit confusing to figure out how everything fits together as someone new to the project.

After reading through the documentation a bit more and reviewing all of the different repositories in the aws-controllers-k8s org, I found the code-generator repository which contains the proper scripts to generate new Docker images for services based on the latest commits on a service repository.

To build a new S3 image:

1. Run the following to clone the code-generator repo and run the build script for s3:

git clone https://github.com/aws-controllers-k8s/code-generator.git
cd code-generator/scripts/
./build-controller-image.sh s3

2. Tag the newly generated image and push it to your own ECR repo:

docker tag <image-tag> public.ecr.aws/<repository>/s3-ack:<tag>
docker push public.ecr.aws/<repository>/s3-ack:<tag>

3. Run the following to clone the s3-controller repository:

git clone https://github.com/aws-controllers-k8s/s3-controller.git
cd s3-controller/helm/

4. Assuming you've already deployed the Helm chart earlier, upgrade the release using the same name as before, but point it at the locally cloned copy. The example below upgrades a Helm release called ack-s3-controller installed in namespace ack-system and sources the Helm chart from the current directory (.). It also sets the image.repository and image.tag values to the new location in ECR from step #2.

helm upgrade ack-s3-controller . --namespace ack-system \
     --set image.repository=public.ecr.aws/<repository>/s3-ack,image.tag=<tag>

From there, you should now be able to create a new S3 bucket leveraging the latest features available in the documentation. In my experience, as soon as I deployed the new version and used the S3 Bucket YAML file listed above, a new S3 bucket was deployed with the expected tags.

Suggestions

The ACK project documents its release and versioning process in depth. The documentation is very thorough and follows a lot of the standards you would expect for a solution of this nature. One of the items it makes very clear is that each controller is on its own release and maintenance cycle. This makes sense, given the number of different service controllers that will be needed.

All of this being said, my suggestions are:

- Follow a standardized release process for all of the controllers. Some controllers have releases defined in GitHub which match up with Docker images and Helm charts, some don't (as we saw with S3).
- If the service controller documentation is being updated on the Docs website, make sure there is an actual released Docker image and Helm chart available which allows the usage of what the Docs are referencing.
- Document the release versions in a centralized location on the Docs website. This page would be a great place to list the latest released version.

I don't think it's a huge ask to attach some simple release processes to each of the controller repositories which handle the above. I'm completely cognizant of the fact that these are all still in alpha and may not be ready for "stable" tags, but I think this simple change significantly lowers the barrier to entry and will allow others in the community to try out these controllers.

Governing resource creation by ACK with Gatekeeper

You may want to enforce guardrails on the AWS resources that ACK creates. One option is to customize the IAM policy attached to the controllers so that resources can only be created when they meet certain conditions. This works well, but different resources offer different conditions, meaning you may or may not be able to restrict creation based on the parameters you would like. Also, Access Denied error messages returned from AWS are not passed through to the end user, leaving them blind to issues caused by IAM restrictions.

In the world of Kubernetes, Gatekeeper can fill this gap for us when using ACK. Because AWS resources deployed with ACK are submitted to the Kubernetes API as standard JSON objects, we can write policies in Rego which are then enforced using Gatekeeper.

I will leave a full Gatekeeper tutorial for other posts that already exist, but will post a sample Gatekeeper Template and Constraint below.

Example Gatekeeper Template and Constraint - S3 Standards

template.yaml - this defines the Constraint Template. You can see we are checking that the S3 bucket name starts with a specific string and that a specific tag is present.

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: bucketdefaultrequirements
  annotations:
    description: Requires S3 buckets created by ACK match standards.
spec:
  crd:
    spec:
      names:
        kind: bucketDefaultRequirements
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          type: object
          properties:
            bucketStartsWith:
              type: string
            requiredTagKey:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package bucketdefaultrequirements

        violation[{"msg": msg}] {
          namingConventionStartsWith := input.parameters.bucketStartsWith

          value := input.review.object.spec.name

          # Check if the Bucket Name follows our naming convention
          not startswith(value, namingConventionStartsWith)

          # Construct an error message to return to the user.
          msg := sprintf("Bucket name does not follow proper format; found `%v`; needs to start with `%v`.", [value, namingConventionStartsWith])
        }

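        violation[{"msg": msg}] {
          requiredTagKey := input.parameters.requiredTagKey

          # Also fires when spec.tagging is absent: has_tag is then undefined,
          # so `not has_tag(...)` holds.
          not has_tag(requiredTagKey)
          msg := sprintf("%v tag is missing.", [requiredTagKey])
        }

        # True when any entry in the bucket's tagSet uses the given key.
        # Named has_tag to avoid clashing with Rego's built-in `contains`.
        has_tag(key) {
          input.review.object.spec.tagging.tagSet[_].key == key
        }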

constraint.yaml - this defines the actual constraint which uses the template created above. We can create a new constraint to be used across the entire cluster (like below) or we can create specific namespace scoped constraints for this rule.

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: bucketDefaultRequirements
metadata:
  name: s3-buckets-must-meet-base-requirements
spec:
  match:
    kinds:
      - apiGroups: ["s3.services.k8s.aws"]
        kinds: ["Bucket"]
  parameters:
    bucketStartsWith: "matt-s3-"
    requiredTagKey: "CostCenter"
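
For the namespace-scoped variant mentioned above, a minimal sketch could look like the following. It writes a second constraint that uses Gatekeeper's match.namespaces field to limit the rule to one namespace. The team-a namespace, the constraint name, and the team-a- prefix are all hypothetical.

```shell
#!/bin/sh
# Sketch: a namespace-scoped variant of the cluster-wide constraint above.
# Assumptions: the `team-a` namespace, the constraint name, and the
# `team-a-` bucket prefix are made up for illustration.
CONSTRAINT_FILE="${TMPDIR:-/tmp}/constraint-team-a.yaml"

cat > "$CONSTRAINT_FILE" <<'EOF'
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: bucketDefaultRequirements
metadata:
  name: team-a-s3-buckets-must-meet-requirements
spec:
  match:
    # Only evaluate Bucket objects created in this namespace
    namespaces: ["team-a"]
    kinds:
      - apiGroups: ["s3.services.k8s.aws"]
        kinds: ["Bucket"]
  parameters:
    bucketStartsWith: "team-a-"
    requiredTagKey: "CostCenter"
EOF

echo "Wrote $CONSTRAINT_FILE"
```

Applying that file with kubectl apply -f would let each tenant namespace carry its own naming prefix while reusing the same template.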

When I try to create an S3 bucket called test-s3-matt-bucket without a CostCenter tag, both rules in the constraint are violated and I get the following friendly message returned by the Kubernetes API:

error when creating "bucket.yaml": admission webhook "validation.gatekeeper.sh" denied the request: 
[s3-buckets-must-meet-base-requirements] CostCenter tag is missing.
[s3-buckets-must-meet-base-requirements] Bucket name does not follow proper format; found `test-s3-matt-bucket`; needs to start with `matt-s3-`.
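
For contrast, here is a sketch of a manifest that should satisfy both rules: it is the earlier bucket manifest with a name that starts with matt-s3- and the CostCenter tag in place. The bucket name and file path are hypothetical.

```shell
#!/bin/sh
# Sketch: write a Bucket manifest that meets the naming and tagging rules.
# Assumption: the bucket name and output path are illustrative only.
BUCKET_FILE="${TMPDIR:-/tmp}/bucket-compliant.yaml"

cat > "$BUCKET_FILE" <<'EOF'
apiVersion: s3.services.k8s.aws/v1alpha1
kind: Bucket
metadata:
  name: matt-s3-test-bucket
spec:
  # Starts with the prefix required by bucketStartsWith
  name: matt-s3-test-bucket
  tagging:
    tagSet:
      # Satisfies requiredTagKey
      - key: CostCenter
        value: Development
EOF

echo "Wrote $BUCKET_FILE"
```

Running kubectl apply --dry-run=server -f against the file should exercise the admission webhook without creating the bucket, assuming the Gatekeeper webhook in your cluster is configured to handle dry-run requests.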

In Conclusion

I hope that this has been helpful in some way! ACK is a great step forward for integrating AWS resources with the Kubernetes deployment lifecycle which many developers are already very familiar with. It's still early days, and like everything that comes out of AWS, I am positive customer obsession will continue driving the ACK project towards full API parity when deploying AWS resources in Kubernetes.

Top comments (2)

Jay Pipes • Aug 30 '21

Hi Matt!

Awesome article. Thank you for the honest and thoughtful feedback about ways we can improve the user experience in ACK. We will be taking all of your suggestions and creating Github issues for them (if there isn't already an issue logged about that particular suggestion).

Check out the latest S3 controller release (v0.0.3) which has fixes for the issues you ran into.

As you mention, it's still relatively early days for ACK. We're continually improving our automation and release processes and hope that you might try us out again in the near future in a followup article :)

Also, we welcome you to join us on Slack (myself and a number of other ACK contributors hang out on the Kubernetes Slack community's #provider-aws channel) and attend our weekly community Zoom meetings.

Details for the Zoom meeting can be found here:

github.com/aws-controllers-k8s/com...

All the best,
-jay

Matt Adorjan • Oct 6 '21

Hi @jaypipes ! I apologize that I completely didn't see this comment until now! I will take a look at the updates here and update my post. Definitely look forward to the Zoom meetings, too.

Thanks!
Matt