Shuichi

Easy Chaos Engineering with AWS Organizations SCP

I attached a permission-robbing service control policy (SCP) to an experiment organizational unit (OU), moved the target account into that OU, and observed how the workload behaved.

What is AWS Organizations?

AWS Organizations is an account management service that enables you to consolidate multiple AWS accounts into an "organization." It is very useful for managing multiple accounts.
You can integrate supported AWS services with the accounts that are members of an organization and control them centrally.

Chaos with SCP

The idea is to simulate AWS service failures for a target workload.
Attach a permission-robbing SCP to an experiment OU, move the target account into it, and observe how the workload behaves.
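The move-in and move-back steps could be scripted roughly as follows. This is a minimal sketch, not the procedure used in this post: the function name `account_in_experiment_ou` and the IDs in the usage comment are hypothetical, and `org_client` is assumed to be a boto3 Organizations client created in the management account.

```python
from contextlib import contextmanager


@contextmanager
def account_in_experiment_ou(org_client, account_id, experiment_ou_id):
    """Move an account into the experiment OU, then move it back on exit.

    org_client is assumed to be boto3.client("organizations").
    """
    # An account has exactly one parent (the root or an OU).
    parents = org_client.list_parents(ChildId=account_id)["Parents"]
    original_parent_id = parents[0]["Id"]

    org_client.move_account(
        AccountId=account_id,
        SourceParentId=original_parent_id,
        DestinationParentId=experiment_ou_id,
    )
    try:
        yield original_parent_id
    finally:
        # Always restore the account, even if the observation step raises.
        org_client.move_account(
            AccountId=account_id,
            SourceParentId=experiment_ou_id,
            DestinationParentId=original_parent_id,
        )


# Usage (hypothetical IDs):
#   import boto3
#   org = boto3.client("organizations")
#   with account_in_experiment_ou(org, "111122223333", "ou-exmp-experiment"):
#       ...  # send a test email and observe the workload
```

Wrapping the experiment in a context manager keeps the clean-up step from being forgotten, which matters because the account stays restricted for as long as it sits in the experiment OU.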

Target workload

This time, the target workload is an in-house email notification integration system, which forwards incoming emails to Slack.

The rough architecture is as follows.
[Figure: Target workload]

Permission-robbing SCP

I set the following policy as the SCP.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Deny",
      "Action": [
        "SNS:*",
        "SES:*",
        "S3:*",
        "Lambda:*",
        "Dynamodb:*",
        "SQS:*"
      ],
      "Resource": "*"
    }
  ]
}
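For repeatable experiments, a policy like the one above could be generated and attached from code. The following is a minimal sketch under some assumptions: the helper names `build_deny_scp` and `create_and_attach_scp`, the `Sid`, and the description text are hypothetical, and `org_client` is assumed to be a boto3 Organizations client in the management account. Service prefixes are written in lowercase here; IAM action names are not case-sensitive.

```python
import json

# Service prefixes denied in this experiment.
DENIED_SERVICES = ["sns", "ses", "s3", "lambda", "dynamodb", "sqs"]


def build_deny_scp(service_prefixes):
    """Build an SCP document that denies every action of the given services."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyExperimentServices",  # hypothetical Sid
                "Effect": "Deny",
                "Action": [f"{prefix}:*" for prefix in service_prefixes],
                "Resource": "*",
            }
        ],
    }


def create_and_attach_scp(org_client, name, policy_document, target_ou_id):
    """Create the SCP and attach it to the experiment OU; return the policy ID."""
    response = org_client.create_policy(
        Name=name,
        Description="Chaos experiment: deny selected services",
        Type="SERVICE_CONTROL_POLICY",
        Content=json.dumps(policy_document),
    )
    policy_id = response["Policy"]["PolicySummary"]["Id"]
    org_client.attach_policy(PolicyId=policy_id, TargetId=target_ou_id)
    return policy_id
```

Generating the document from a list of prefixes makes it easy to run narrower experiments, for example denying only `s3`.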

Let's take a guess

How would the workload behave if I moved its account into an OU with the permission-robbing SCP attached?

What part of the architecture would fail?

Experimental outcome

I sent an email that this workload should forward to Slack.
However, the notification never arrived in Slack.

I found that the (3) Converter failed when it tried to read the email data.

[ERROR] ClientError: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied Traceback (most recent call last):

The (1) SES -> (2) SNS -> (3) Lambda calls are connected by events and are not made through an IAM role, so they worked correctly.

After that, the Lambda function in (3) reads the email data from S3 through its IAM role, so it failed there (blocked by the Deny on "S3:*").

Incidentally, when I tried to check the Lambda function in the management console, it displayed an error ending in "... with an explicit deny in a service control policy" and I could not see the function. This is because "Lambda:*" is denied.

It reminded me that when a service fails, I may not be able to check the status of that service's resources either.
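The two error shapes seen in this experiment, the plain "Access Denied" from GetObject and the console message that mentions a service control policy, could be told apart in code to help identify the cause. This is a minimal sketch: the function name `classify_access_error` and its return labels are hypothetical, and it assumes the input has the shape of a botocore `ClientError.response` dictionary. Not every API includes the SCP wording in its message; the S3 error in the log above does not.

```python
# Wording taken from the console message observed in this experiment.
SCP_DENY_MARKER = "explicit deny in a service control policy"

ACCESS_DENIED_CODES = ("AccessDenied", "AccessDeniedException")


def classify_access_error(error_response):
    """Classify an AWS error response as SCP deny, generic deny, or other.

    error_response is assumed to look like botocore's ClientError.response:
    {"Error": {"Code": "...", "Message": "..."}}
    """
    error = error_response.get("Error", {})
    code = error.get("Code", "")
    message = error.get("Message", "")

    if code not in ACCESS_DENIED_CODES:
        return "other"
    if SCP_DENY_MARKER in message:
        return "denied-by-scp"
    # Access was denied, but the message does not say why.
    return "denied"


# Usage inside a Lambda handler (sketch):
#   from botocore.exceptions import ClientError
#   try:
#       s3.get_object(Bucket=bucket, Key=key)
#   except ClientError as exc:
#       print("access check:", classify_access_error(exc.response))
#       raise
```

Logging the classification before re-raising keeps the failure visible while giving whoever investigates a first hint about the cause.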

Conclusion

I tried easy chaos engineering with an AWS Organizations SCP.

Good points of this method

Easy to start, easy to clean up

For access that goes through an IAM role, you can simulate, to some extent, losing access to a specific service.

Bad points of this method

An SCP with strong restrictions is attached to the OU, so a mistake (such as moving the wrong account) can cause an incident.
However, restricting which account IDs the "Organizations:MoveAccount" permission applies to may reduce that risk.

An SCP can only deny IAM users and IAM roles, so the situations that can be simulated are limited.
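The idea of limiting the account IDs for "Organizations:MoveAccount" could be sketched as an IAM policy for the operator in the management account. This is an assumption-heavy sketch rather than a verified configuration: the function name `build_move_account_policy`, the `Sid`, and all IDs are hypothetical, and it assumes that listing account ARNs in `Resource` is enough to scope the action. Whether the source and destination OU ARNs must also be listed should be checked against the AWS documentation before relying on it.

```python
def build_move_account_policy(management_account_id, organization_id, allowed_account_ids):
    """Build an IAM policy that allows MoveAccount only for the listed accounts.

    Assumption: organizations:MoveAccount can be scoped by account ARN.
    """
    account_arns = [
        f"arn:aws:organizations::{management_account_id}:"
        f"account/{organization_id}/{account_id}"
        for account_id in allowed_account_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowMoveOnlyExperimentAccounts",  # hypothetical Sid
                "Effect": "Allow",
                "Action": "organizations:MoveAccount",
                "Resource": account_arns,
            }
        ],
    }


# Usage (hypothetical IDs):
#   import json
#   policy = build_move_account_policy("999999999999", "o-exampleorgid", ["111122223333"])
#   print(json.dumps(policy, indent=2))
```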

At the end

Can I detect the failure? Can I identify the cause?
I think "Easy Chaos Engineering with AWS Organizations SCP" is a good starting point for checking these questions.

This post is an English rewrite of a post I wrote in Japanese.