AWS Certified DevOps Engineer Professional: Content Summary and Important Notes

Nicolas El Khoury for AWS Community Builders

Introduction

The AWS DevOps Engineer Professional Certification is a certificate offered by Amazon Web Services (AWS), designed to test your proficiency in deploying, managing, and maintaining distributed applications on AWS using DevOps principles and practices.

In addition, it is a great way to verify your knowledge of industry best practices and enhance your profile and competencies in a competitive job market.

To increase your chances of passing the AWS DevOps exam, it is advised that you have a minimum of two years of experience in designing, setting up, managing, and running AWS environments. In addition, you should have hands-on experience with AWS services such as EC2, S3, RDS, Cloudwatch, CodePipeline, etc., and a good understanding of DevOps principles and practices.

Professional-level certificates are quite different from Associate-level ones: they are more difficult and require more professional experience in AWS, more maturity, and a deeper thought process. In brief, obtaining the AWS DevOps Professional certificate is not a walk in the park.

In addition to that, chances are that you are employed, with a busy life. Therefore, preparing for the certificate is not going to be your full-time job.

In light of all the above, I am writing this article to serve the purposes below:

  • Summarize the requirements needed to pass the exam.
  • Summarize Stephane's amazing Udemy course content.
  • List additional preparation steps.

This article is by no means a single source of truth that will guide you to pass the exam. Rather, it aims to be a summary and a memory refresher.

Certification Requirements

The AWS DevOps Engineer Professional certification exam evaluates your skills and knowledge in essential areas related to deploying, managing, and operating distributed application systems on the AWS platform by implementing DevOps principles and practices. The exam will test your understanding of several key topics:

SDLC Automation

Continuous delivery and deployment automate the building, testing, and deployment of software in a smooth and continuous way, leading to faster delivery of updates and better customer satisfaction. The exam may test your knowledge of AWS tools such as AWS CodePipeline, AWS CodeDeploy, and AWS Elastic Beanstalk.

Configuration Management and IaC

Infrastructure as code (IaC) is a practice that involves automating the deployment and management of infrastructure using code rather than manual processes. This approach enables teams to manage infrastructure more efficiently, reduce errors, and increase agility. The AWS DevOps Engineer Professional certification exam may test your understanding of IaC tools and services such as AWS CloudFormation.

Resilient Cloud Solutions

The Resilient Cloud Solutions domain assesses your ability to build and manage systems that are able to cope with potential failures or disasters, including creating backup and restore strategies, designing for disaster recovery, and managing scaling and elasticity on AWS.

Monitoring and Logging

The exam may assess your skills in designing and implementing effective logging and monitoring systems for AWS services and applications. You may be tested on your ability to identify and troubleshoot issues in these systems and set up alarms and notifications to ensure the efficient operation of the infrastructure.

Incident and Event Response

The exam evaluates your knowledge of incident management and response processes, such as identifying, categorizing, and resolving incidents and implementing and testing disaster recovery plans. You may also be tested on your ability to effectively communicate and collaborate with stakeholders during such incidents.

Security and Compliance

This domain evaluates your understanding of security and compliance best practices for AWS services and applications. This includes topics revolving around implementing security controls, managing access and authentication, and ensuring compliance with regulatory standards. You may also be tested on your knowledge of AWS security services such as AWS IAM and AWS KMS.

Course Summary and Important Notes

An important step in my preparation for the exam is Stephane Maarek's Udemy course. The course is highly interactive, well designed, and contains important information, explained in a simple and clear way.

Nonetheless, the course is (veeeeeery) long. Remembering, therefore, all the important information explained in the course may be quite difficult.

In light of the above, the next section of this article lists and explains most of these important concepts to remember. This article constitutes, in no way, a replacement for the course. Rather, it can be used to refresh your memory, only after having carefully studied the Udemy course.

Important Notes

CodeCommit

  • Using IAM, you can prevent certain users from pushing to the master branch by creating an explicit Deny policy.

  • You can set up notification rules and triggers to either SNS or Lambda. Examples include the creation or deletion of a repository, branch, Pull Request, etc. You can also configure such events using Cloudwatch Events.

CodeBuild

  • You can pass environment variables to CodeBuild in several ways (a sample buildspec is shown after this list):
    1. In the buildspec file as key-value pairs.
    2. In Codebuild using plain-text key-value pairs.
    3. In Codebuild or the buildspec file as a secret, using the Systems Manager Parameter Store.
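
A minimal buildspec.yml sketch illustrating options 1 and 3 (plaintext variables can also be set directly on the CodeBuild project); the variable names and parameter path are hypothetical:

version: 0.2
env:
  variables:                                  # plaintext key-value pairs
    STAGE: "dev"
  parameter-store:                            # fetched from the Systems Manager Parameter Store at build time
    DB_PASSWORD: "/my-app/dev/db-password"    # hypothetical parameter name
phases:
  build:
    commands:
      - echo "Building for stage $STAGE"
      - ./build.sh                            # hypothetical build script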

CodeDeploy

  • There are multiple deployment strategies for EC2 instances using CodeDeploy:
  1. In-place deployment: Deploy on the existing EC2 instances:
    a. AllAtOnce: Deploys on all the existing EC2 instances at the same time.
    b. HalfAtOnce: Deploys on half of the existing EC2 instances per batch.
    c. OneAtOnce: Deploys on one EC2 instance at a time.
    d. Custom Rules: You can specify your own custom rule.

  2. Blue/Green deployment: Provision new instances to deploy the new application version. This deployment requires a new Load Balancer. There are two ways to perform such a deployment:
    a. Manually provisioning instances.
    b. Automatically copying the Autoscaling group.

  • appspec.yml:

The appspec.yml file contains all the necessary information to be processed by CodeDeploy to perform a certain deployment. Below is a sample appspec.yml file:

version: 0.0
os: linux  # The Operating system
files:
  - source: /index.html        # The location of the file(s) to be copied
    destination: /var/www/html # The location to which the file(s) must be copied on the servers
hooks: # The list of hooks available for CodeDeploy
  ApplicationStop:
    - location: scripts/stop_servers.sh # The location of the script to run when this hook is triggered
      timeout: 300                      # The timeout for this hook (in seconds)
      runas: root                       # The user with which the script will be executed
  BeforeInstall:
  AfterInstall:
  ApplicationStart:
  ValidateService:
  • Unlike Codebuild, in order to collect Codedeploy logs and display them in Cloudwatch, the Cloudwatch logs agent must be installed on the machines.

  • Rollbacks can be done in two ways:

    1. Manually
    2. Automatic: When a deployment fails, or when a threshold is crossed, using specifically set alarms.
  • Registering an On-premise instance to CodeDeploy can be done in multiple ways:

    1. For a small number of instances: create an IAM user, install the CodeDeploy agent, and register the instance using the register-on-premises-instance API
    2. For a large number of instances: Use an IAM role and AWS STS to generate credentials
  • CodeDeploy offers different deployment strategies for Lambda functions:

    1. Canary: Traffic shifts in two increments. For example, pass 15% of traffic to the newly deployed version for the first 15 minutes post-deployment, then switch all the traffic to it afterward.
    2. Linear: Traffic shifts in equal increments. For example, add 10% of traffic to the newly deployed version every 5 minutes.
    3. AllAtOnce: Move all the traffic to the new version at once.

CodePipeline

  • There are two ways CodePipeline can detect source code changes:

    1. Amazon Cloudwatch Events: A change triggers an AWS Cloudwatch Event. This is the preferred way.
    2. AWS CodePipeline: Periodically check for changes.
  • Artifacts:

    1. Each stage uploads its artifacts to S3 so they can be used by later stages.
    2. We can use the same S3 bucket for multiple pipelines.
    3. Objects can be encrypted using AWS KMS or Customer Managed Keys.

Cloudformation

  • !Ref can be used to reference parameters or other resources.

  • Pseudo parameters are variables offered by AWS, for example AWS::AccountId, AWS::Region, etc.

  • Mappings are sets of fixed values defined in the template, for example:

Mappings:
  RegionMap:
    us-east-1:
      "32": "id-1"
      "64": "id-2"
    us-west-1:
      "32": "id-3"
      "64": "id-4"
Resources:
  MyEC2:
    Type: "AWS::EC2::Instance"
    Properties:
      ImageId: !FindInMap [ RegionMap, !Ref "AWS::Region", "32" ] # Returns the AMI ID based on the region in which the stack is created.
  • Outputs can be exported and used in other stacks
  • You cannot delete stacks whose outputs are referenced by other stacks.
  • !ImportValue is used to import an output
  • Condition functions: !And, !Equals, !If, !Not, !Or
  • !Ref:
    1. When used against a parameter, it returns the value of the parameter.
    2. When used against a resource, it returns the ID of the resource.
  • !GetAtt: Returns a specific attribute for any resource. For example:
Type: "AWS::EC2::Volume
Properties:
  AvailabilityZones:
    !GetAtt MyEC2.AvailabilityZone # Retrieves the Availability Zone attribute from the **MyEC2** resource
  • !FindInMap returns the value of a specific key in a map !FindInMap [ MapName, TopLevelKey, SecondLevelKey ]

  • !ImportValue retrieves the value of an output

  • !Join [ ":" , [ a, b, c ] ] --> "a:b:c" joins an array of strings using a delimiter

  • !Sub substitutes values

  • UserData can be passed as a property using the Base64 function. The output of the UserData script can be found in the /var/log/cloud-init-output.log file on Linux.

  • cfn-init is similar to UserData, with some differences. cfn-signal and WaitConditions are used after cfn-init to signal the status of the bootstrap scripts, as shown in the sketch below.
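
A minimal sketch combining UserData, cfn-init, cfn-signal, and a WaitCondition; the AMI ID and the installed package are placeholders:

Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Metadata:
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              httpd: []                        # example package installed by cfn-init
    Properties:
      ImageId: ami-12345678                    # placeholder AMI ID
      InstanceType: t3.micro
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          yum install -y aws-cfn-bootstrap     # CloudFormation helper scripts
          # Apply the AWS::CloudFormation::Init metadata defined above
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource MyInstance --region ${AWS::Region}
          # Signal the result of cfn-init to the wait condition
          /opt/aws/bin/cfn-signal -e $? '${WaitHandle}'
  WaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle
  WaitCondition:
    Type: AWS::CloudFormation::WaitCondition
    DependsOn: MyInstance
    Properties:
      Handle: !Ref WaitHandle
      Timeout: "300"                           # seconds to wait for the signal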

  • Troubleshooting steps in case the wait condition did not receive the required signal:

    1. Ensure the AWS Cloudformation helper scripts are installed.
    2. Verify that cfn-signal and cfn-init commands ran by checking the /var/log/cloud-init.log and /var/log/cfn-init.log files.
    3. Verify that the instance has internet connectivity.
    4. Note that such troubleshooting cannot be done if rollback on failure is enabled, because the instances are deleted during the rollback.
  • By default, Cloudformation deletes (rolls back) all the resources after a failure. You can disable rollbacks, but you must then make sure you delete the resources yourself.

  • Nested stacks are available. Always update the parent stack; the changes will propagate to the nested stacks.

  • Change sets allow you to preview the changes that will be applied between two versions of a Cloudformation stack, but they cannot tell you whether the update will succeed.

  • Cloudformation has several deletion policies for resources (see the snippet after this list):

    1. Retain: does not delete the resource when deleting the stack.
    2. Snapshot: takes a snapshot before deletion; supported by some resources, such as RDS.
    3. Delete.
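
A minimal sketch of deletion policies on two resources (the database properties are placeholders; credentials would normally come from Secrets Manager):

Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain            # keep the bucket when the stack is deleted
  MyDatabase:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot          # take a final snapshot instead of deleting the database
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
      MasterUsername: admin           # placeholder credentials
      MasterUserPassword: ChangeMe123 # placeholder credentials
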
  • You can protect a stack from being deleted by enabling termination protection.

  • Cloudformation parameters can be fetched from SSM parameters. It is a good practice to store global parameters, such as AMI IDs, in SSM. Cloudformation is able to detect changes to such parameters and perform the necessary updates.
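
A common pattern, for example, resolves the latest Amazon Linux 2 AMI ID from the public SSM parameter:

Parameters:
  LatestAmiId:
    Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
    Default: /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref LatestAmiId       # resolved from SSM at stack create/update time
      InstanceType: t3.micro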

  • DependsOn is used to create dependencies between resources.

  • Lambda functions can be deployed through Cloudformation in many ways:

    1. Write the function inline in the Cloudformation script (see the sketch after this list).
    2. Upload the code in a zipped file to S3 and reference it in the Cloudformation script.
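
A minimal sketch of the inline option, assuming a Python runtime and an execution role defined elsewhere in the template; larger packages should be uploaded to S3 instead:

Resources:
  MyFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.9
      Handler: index.handler                  # inline code is treated as a file named index
      Role: !GetAtt MyFunctionRole.Arn        # hypothetical execution role defined elsewhere
      Code:
        ZipFile: |
          def handler(event, context):
              # placeholder logic
              return {"status": "ok"}
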
  • Lambda changes can be detected by Cloudformation in many ways:

    1. Upload the code to a new bucket.
    2. Upload the code to a new key in the same bucket.
    3. Upload to a new versioned bucket and reference the version in the Cloudformation script.
  • Cloudformation cannot delete a bucket unless it is empty.

  • Use cases for Cloudformation custom resources (see the sketch after this list):

    1. Resources not yet covered by Cloudformation
    2. Adding an on-premise instance
    3. Emptying an S3 bucket before deletion
    4. Fetching an AMI ID
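
A minimal sketch of declaring a Lambda-backed custom resource used to empty a bucket on stack deletion; the backing Lambda function (EmptyBucketFunction) and the bucket (MyBucket) are hypothetical resources assumed to be defined elsewhere in the template:

Resources:
  EmptyBucketOnDelete:
    Type: Custom::EmptyBucket                         # handled by the Lambda referenced in ServiceToken
    Properties:
      ServiceToken: !GetAtt EmptyBucketFunction.Arn   # hypothetical Lambda that reacts to create/update/delete events
      BucketName: !Ref MyBucket                       # hypothetical bucket to empty before deletion
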
  • Cloudformation drift detection is a tool that checks whether resources created by Cloudformation have been modified outside of Cloudformation.

  • Cloudformation Status Codes:

    1. CREATE_IN_PROGRESS
    2. CREATE_COMPLETE
    3. CREATE_FAILED
    4. DELETE_IN_PROGRESS
    5. DELETE_COMPLETE
    6. DELETE_FAILED
    7. ROLLBACK_COMPLETE
    8. ROLLBACK_IN_PROGRESS
    9. ROLLBACK_FAILED
    10. UPDATE_COMPLETE
    11. UPDATE_COMPLETE_CLEANUP_IN_PROGRESS: the update is complete but old resources are still being cleaned up
    12. UPDATE_ROLLBACK_COMPLETE
    13. UPDATE_ROLLBACK_IN_PROGRESS
    14. UPDATE_ROLLBACK_FAILED
  • Potential causes for UPDATE_ROLLBACK_FAILED:

    1. Insufficient permissions
    2. Invalid credentials for Cloudformation
    3. A service limit was reached
    4. Changes done to the resources outside Cloudformation
    5. Resources not in a stable state yet
  • INSUFFICIENT_CAPABILITIES_EXCEPTION: Cloudformation requires the CAPABILITY_IAM capability to be acknowledged in order to create IAM resources.

  • A stack policy is a JSON document that specifies allow/deny rules for updates to the resources in a Cloudformation stack.
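
A minimal stack policy example that denies updates to a production database while allowing updates to everything else (the logical resource ID is hypothetical):

{
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "Update:*",
      "Principal": "*",
      "Resource": "LogicalResourceId/ProductionDatabase"
    },
    {
      "Effect": "Allow",
      "Action": "Update:*",
      "Principal": "*",
      "Resource": "*"
    }
  ]
}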

Elastic Beanstalk

Important CLI commands:

  • eb status: display information about the application.
  • eb logs: Display the application logs
  • eb deploy: Updates the application with a new version
  • eb terminate: Terminates the environment

  • There are two ways to modify elastic beanstalk configuration:

  1. Saved Configurations:
    a. eb config save <env name> --cfg <configuration name> --> saves a configuration of an environment locally.
    b. eb setenv KEY=VALUE --> Creates an environment variable in elastic beanstalk
    c. eb config put <configuration file name> --> Uploads a configuration file to elastic beanstalk saved config
    d. eb config <env name> --cfg <config name> applies a configuration to an elastic beanstalk environment

  2. YML config files under .ebextensions. After the configuration is added, it can be applied using the eb deploy command

Configuration precedence is as follows:

  1. Settings applied directly on an environment
  2. Saved configurations
  3. Configuration files (.ebextensions)
  4. default values
  • Using the .ebextensions files, we can upload configuration files with additional resources added to the environment (e.g., RDS, DynamoDB, etc).

  • A database or resource that must outlive the ElasticBeanstalk environment must be created externally, and referenced in ElasticBeanstalk through environment variables for example.

  • Commands and Container Commands for .ebextensions:

    1. commands: Execute commands on the EC2 instances. These run before the application and web server are set up.
    2. container_commands: Execute commands that affect the application. These run after the application and web server are set up, but before the application version is deployed. The leader_only flag runs the command on a single instance only. See the sketch below.
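
A minimal sketch of a .ebextensions config file using both sections; the file name and the commands themselves are placeholders:

# .ebextensions/01-example.config
commands:
  01_install_tools:
    command: "yum install -y jq"            # runs before the application and web server are set up
container_commands:
  01_migrate_db:
    command: "python manage.py migrate"     # runs after the app is staged, before it is deployed
    leader_only: true                       # run on a single instance only
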
  • Important ElasticBeanstalk features:

    1. When creating a webserver environment, there are two configuration presets:
      a. Low Cost: Creates a single instance with an Elastic IP. Good for testing.
      b. High Availability: Creates an ELB and an autoscaling group. Good for production.
  • Application versions are saved under the Application Versions section, limited to 1000 versions. We can create lifecycle policies to manage these versions.

  • Clone Environment is a quick way to create a new environment from an existing one.

  • Deployment Modes:

    1. AllAtOnce: Fastest way, but brings all the instances down.
    2. Rolling: Updates a subset of instances at a time.
    3. Rolling with Additional Batches: Similar to Rolling but creates new instances for each batch.
    4. Immutable: New instances are created in a temporary Autoscaling Group; once the new version is deployed and healthy, they replace the old instances.
    5. Blue/Green: Can be achieved by having two environments, and then either swapping the environment URLs or creating weighted records in Route 53.
  • Worker Environment: dedicated to long-running background jobs (e.g., video processing, sending emails, etc.). This environment creates an SQS queue by default. Cron jobs can be specified in the cron.yaml file, for example:
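
A minimal cron.yaml sketch; the task name and URL are placeholders for a path in the worker application:

version: 1
cron:
  - name: "nightly-report"            # unique task name
    url: "/tasks/nightly-report"      # endpoint the SQS daemon POSTs to on schedule
    schedule: "0 2 * * *"             # every day at 02:00 UTC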

Lambda

  • We can store environment variables in Lambda in 3 ways:

    1. Stored as plaintext.
    2. Stored as encrypted. Lambda needs enough permissions, and a KMS key to encrypt/decrypt the variable.
    3. Stored as parameters in the SSM Parameter Store. Lambda needs sufficient permissions to fetch the secret.
  • By default, Lambda uses the latest version, which is mutable. We can create versions, which are immutable, and each version has its own ARN. Aliases can be created to point to versions. In this way, we can preserve the same alias ARN while changing the underlying Lambda version.

  • Serverless Application Model (SAM) allows us to create, manage, and test lambda functions locally, as well as uploading the functions to AWS.

  • Using SAM, we can use CodeDeploy to continuously deploy new versions of our functions, for example:
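
A minimal SAM sketch that publishes an alias and lets CodeDeploy shift traffic gradually; the function name, code location, and alarm are placeholders:

Transform: AWS::Serverless-2016-10-31
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.handler
      Runtime: python3.9
      CodeUri: ./src                          # hypothetical code location
      AutoPublishAlias: live                  # publishes a new version and updates the alias on every deploy
      DeploymentPreference:
        Type: Canary10Percent15Minutes        # 10% of traffic for 15 minutes, then the rest
        Alarms:
          - !Ref MyErrorsAlarm                # hypothetical alarm that triggers an automatic rollback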

  • Step Functions is a workflow management service that coordinates work between several Lambda functions.

API Gateway

  • Two protocols available: REST and Websocket.
  • Endpoint type can be:

    1. Regional: In one region
    2. Edge Optimized: Served through CloudFront edge locations, for globally distributed clients
    3. Private: Accessed within a VPC internally
  • API Gateway + Lambda proxy: The gateway can point to an alias for a lambda function, which allows us to do canary or blue/green deployments.

  • API Gateway Stages: Changes must be deployed in order to take effect. Stages are used to divide between environments (dev, test, prod). Stages can be rolled back, and possess a deployment history.

  • Stage variables are like environment variables but for the API gateway. Use cases include:

    1. Configure HTTP endpoints with stages programmatically.
    2. Pass parameters to lambda through mapping templates.
  • Canary deployment can be achieved in two ways across the API gateway:

    1. From the canary deployment feature of the API gateway
    2. Linking the stage to a Lambda alias and performing the canary deployment on the Lambda functions.
  • API Gateway has a limit of 10,000 requests per second across all APIs in an AWS account.

  • You can add throttling or usage plans to limit API usage across many levels: lambda, stage, or API gateway levels.

  • You can also front a step function with API Gateway. The response would be the execution ARN, since API Gateway does not wait for the step function to finish.

ECS

  • Task Definition: JSON document that contains information on how to run a container (e.g., container image, port, memory limit, etc).

  • Tasks need task roles to interact with other AWS resources.

  • If the host port is specified to be "0", a random port will be assigned.

  • ECR is the AWS container registry. If unable to interact with it, check the IAM permissions.

  • Fargate is the serverless ECS service. Using Fargate, we only need to deal with containers.

  • You can run Elastic Beanstalk environments in container mode:

    1. Single Container Mode: One container per EC2.
    2. Multi Container Mode: Multiple containers per EC2.
  • ecsInstanceRole: the role attached to the EC2 instances, used to pull images and perform other management tasks.

  • ecsTaskRole: roles for the container to interact with other AWS services.

  • Fargate does not have ecsInstanceRoles.

  • For ECS classic, autoscaling policies for the instances and autoscaling policies for the tasks are two different things.

  • The containers in ECS can be configured to send logs to Cloudwatch from their task definition, for example:
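
A minimal CloudFormation sketch of a task definition that assigns a random host port and sends container logs to Cloudwatch; the image, role, and log group names are placeholders:

Resources:
  AppTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: my-app
      TaskRoleArn: !GetAtt AppTaskRole.Arn    # hypothetical role the containers use to call AWS services
      ContainerDefinitions:
        - Name: web
          Image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest  # placeholder image
          Memory: 512
          PortMappings:
            - ContainerPort: 80
              HostPort: 0                     # 0 means a random host port is assigned
          LogConfiguration:
            LogDriver: awslogs
            Options:
              awslogs-group: /ecs/my-app
              awslogs-region: us-east-1
              awslogs-stream-prefix: web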

  • The EC2 instances need the cloudwatch agent to be installed and configured on the VMs.

  • Cloudwatch metrics supports metrics for:

    1. the ECS cluster
    2. the ECS service (not per container).
    3. ContainerInsights (per container). This option must be enabled and costs additional money.
  • CodeDeploy can be used to do Blue/Green deployment on ECS.

OpsWorks Stacks

  • It is the AWS alternative to Chef.

  • There are 5 lifecycle events:

    1. Setup: After the instance has finished booting
    2. Configure: Triggered when an instance enters or leaves the layer, an EIP is added or removed, or a load balancer is attached or detached.
    3. deploy: deploys an app
    4. undeploy: undeploys an app
    5. shutdown: right before the instance is terminated

Each event can have its own recipe.

  • Autohealing feature stops and restarts EC2 instances. An instance is considered down if the OpsWorks agent on it cannot reach the service for a period of 5 minutes.

Cloudtrail

  • Logs every API call made to AWS
  • Logs can be sent to either S3 or Cloudwatch logs
  • Log files on S3 are encrypted by default using SSE-S3
  • Log files contain information such as: what call was made, who made it, against which resource, and what was the response

  • We can verify the integrity of Cloudtrail log files using the command: aws cloudtrail validate-logs.

  • We can aggregate Cloudtrail trails from multiple accounts:

    1. Configure a trail in each account to send logs to a centralized S3 bucket in one account. Modify the S3 bucket permissions to allow objects to be pushed from all these accounts.

Amazon Kinesis

  • Amazon Kinesis Limits:

    1. 1 MB/s or 1000 messages/s at write per shard, or else we will receive a "ProvisionedThroughputExceededException"
    2. 2 MB/s at read per shard across all consumers
    3. Data retention of 1 day by default. Can be extended to 7 days
  • Producers can be: Kinesis SDK, Cloudwatch logs, 3rd party

  • Consumers can be: Kinesis SDK, Firehose, Lambda

  • Kinesis Data Streams vs FireHose:

    1. Kinesis Data Streams: Requires custom code, realtime, users manage scaling using shards, data storage up to 7 days, used with lambda for realtime data
    2. Firehose: fully managed, data transformation with lambda, Near realtime (±60 seconds), automated scaling, No data storage
  • Serverless realtime analytics can be done using queries on Kinesis data streams using SQL. New streams can be created using these queries.

Cloudwatch

  • Cloudwatch standard monitoring provides one data point every 5 minutes. Detailed monitoring can be enabled and provides one data point per minute.

  • To add custom metrics, use the put-metric-data API. A metric can be of:

    1. Standard resolution: 1 minute granularity
    2. High resolution: 1 second granularity
  • The get-metric-statistics API can be used to get the data of a metric. We can automate this to export the metrics to S3.

  • Cloudwatch alarms monitor a single metric per alarm.

  • Alarm actions can be an SNS notification, an Autoscaling action, or an EC2 action.

  • Billing alarms can be created in the North Virginia (us-east-1) region only.

  • The unified Cloudwatch agent can be installed to collect logs and metrics from EC2 and on-premise instances.

  • You can create a metric filter on a log group and then create an alarm on the resulting metric, for example:
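
A minimal CloudFormation sketch that counts ERROR lines in a log group and alarms on the resulting metric; names, thresholds, and the SNS topic are placeholders:

Resources:
  ErrorMetricFilter:
    Type: AWS::Logs::MetricFilter
    Properties:
      LogGroupName: /my-app/application       # hypothetical log group
      FilterPattern: "ERROR"                  # count log events containing ERROR
      MetricTransformations:
        - MetricNamespace: MyApp
          MetricName: ErrorCount
          MetricValue: "1"
  ErrorAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      Namespace: MyApp
      MetricName: ErrorCount
      Statistic: Sum
      Period: 300
      EvaluationPeriods: 1
      Threshold: 5
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - !Ref AlertsTopic                    # hypothetical SNS topic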

  • You can export log data to S3; the bucket must have the appropriate permissions. This can be automated using Cloudwatch events and Lambda.

  • Realtime processing of logs can be done using subscriptions, and having the logs delivered to Kinesis Streams, Data Firehose, or lambda.

  • S3 events can send notifications to SNS, SQS, and lambda (object level only).

  • Cloudwatch events has bucket and object level events.

Amazon X-Ray

An AWS service that provides distributed tracing of requests and service maps.

Amazon ElasticSearch (ES)

AWS-managed Elasticsearch, Logstash, and Kibana.

AWS Systems Manager (SSM)

  • Manages EC2 and on-premise instances: patch management, maintenance, automation, etc.

  • Either the AMI has the SSM agent installed or we have to install it manually before registering an instance to SSM.

  • If SSM is not working, it may be a problem with either the agent or the permissions.

  • The ID of EC2 instances registered with SSM start with "i-", while those of on-premise instances start with "mi-".

  • To register an on-premise instance:

    1. Download the SSM agent on the instance.
    2. Create an activation key
    3. Register the instance using the CLI, activation ID and activation key.
  • SSM Run Command is used to run a command or apply a configuration across a fleet of machines.

  • SSM Parameter store: Stores key value pairs. It is better to store the name of a variable as a path, since we can query one or more parameters using paths.

  • SSM patch manager: Creates patch rules for different operating systems

  • SSM Inventory: Collects applications running on our instances.

  • SSM automations: Allows the automation of a lot of steps. For instance: Create a VM, patch it, create a new AMI, delete old VM.

AWS Config

  • Tracks configuration changes for resources in our account.
  • Config rules allow you to track the compliance of specific resources against a rule.
  • Multi-account and multi-region data can be aggregated into a single Config account.

AWS Service catalog

  • Create and manage a suite of products.
  • Each product is a cloudformation template.
  • Each set of products is assigned to a portfolio.
  • Each user can be assigned to a portfolio.
  • Users can only manage products in their catalogs.

AWS Inspector

Continuously scans EC2 and ECR for vulnerabilities

AWS Service Health dashboard

Displays the health of every AWS service in every region

AWS Personal Health dashboard

Health of services related to you. Notifications can be set using cloudwatch events.

AWS Trusted Advisor

Provides recommendations related to cost optimization, performance, security, fault tolerance, and service limits. You can refresh the recommendations:

  1. Using the refresh button in the console (Once every 5 minutes).
  2. Using the refresh-trusted-advisor-check API

AWS Guardduty

  • An intelligent threat detection system to protect AWS accounts. No need to install anything.
  • Performs checks on: Cloudtrail logs, VPC flow logs, DNS queries. Can be integrated with Lambda and Cloudwatch events.

AWS Secrets Manager

  • A service similar to the SSM Parameter Store, specialized in managing and rotating secrets. Secret rotation can be integrated with Lambda and managed databases.

AWS Cost Allocation tags

  • Can be either AWS generated or user-defined. These tags are used for budget and reports by tags.

Autoscaling Groups - revisited

  • Launch Configuration: Specifies metadata to be used when creating Autoscaling Groups.
  • Launch Template: Specifies metadata to be used by Autoscaling groups, EC2, and other options. Supports a mix of on-demand and spot instances. Overall, it is a better option than Launch configurations.

  • Autoscaling Group Suspended Processes: Processes to suspend (for troubleshooting purposes)

    1. Launch: Does not add new instances.
    2. Terminate: Does not remove instances.
    3. Healthchecks: No more healthchecks. The states of the machines are no longer changed.
    4. ReplaceUnhealthy: Bad instances are no longer replaced.
    5. AZRebalance: No longer rebalances instances across Availability Zones.
    6. Alarm Notifications: No longer reacts to alarms, including scaling alarms.
    7. Scheduled Actions
    8. Add to Load Balancer: Instances that are created are no longer added to the Target Group.
  • Scale-in protection can be enabled on specific instances of the Autoscaling group, which prevents those instances from being terminated during scale-in events.

  • Autoscaling Groups Termination Policies: Determine which instances are terminated first:

    1. Default: Selects the Availability Zone with the largest number of instances, then terminates the instance with the oldest launch configuration, and finally the instance closest to the next billing hour.
    2. Oldest Instance.
    3. OldestLaunchConfiguration
    4. NewestInstance
    5. NewestLaunchConfiguration
    6. ClosestToNextInstanceHour
    7. OldestLaunchTemplate
    8. AllocationStrategy
  • Integrating SQS with ASG: Autoscaling Groups can be integrated with SQS by specifying the number of SQS messages per instance and using it as a scaling policy for the ASG. To prevent instances that are still processing messages from being terminated, we can create a script that enables instance scale-in protection while the instance is processing messages and disables it when the instance is idle.

  • ASG Deployment Strategies:

    1. in-place: Deployment on the same VM.
    2. Rolling Update: Replaces instances in batches with the new version, within the same Autoscaling group.
    3. Replace: Creates a new autoscaling group.
    4. Blue/Green: Creates a new ASG and ALB. We might need to shift traffic using route 53.

DynamoDB

  • When creating a table, it is mandatory to either create a unique partition key or a composite key (partition key and sort key).

  • Local Secondary Indexes can be created at table creation only. They are composite keys formed of the same partition key as the table, but a different sort key.

  • Global Secondary Indexes have a primary key that is different from that of the table. They can be created and managed after table creation.

  • DAX clusters are a form of caching for DynamoDB.

  • DynamoDB Streams are used for realtime operations on the table.

  • To enable global tables, streams must be enabled and the table must be empty. Global tables are replica tables in different regions. A sample table definition with streams enabled is shown below.
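
A minimal CloudFormation sketch of a table with a composite key and streams enabled (a prerequisite for global tables); the table and attribute names are placeholders:

Resources:
  OrdersTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk                   # partition key
          AttributeType: S
        - AttributeName: sk                   # sort key
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
        - AttributeName: sk
          KeyType: RANGE
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES    # streams must be enabled for global tables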

Disaster Recovery

  • Recovery Point Objective (RPO): How much data can be lost, i.e., the time between the last successful backup and the disaster.

  • Recovery Time Objective (RTO): How much time it takes to recover.

  • Disaster Recovery Strategies:

    1. Backup and Restore: High RTO and RPO, but cost is low.
    2. Pilot Light: Small version of the application is always running in the cloud. Lower RTO and RPO and managed cost.
    3. Warm Standby: Full system is up and running but at minimum size. More costly, but even lower RTO and RPO.
    4. Multisite / Hot Site: Full production scale on AWS and on-premise.

Additional Information

Below is a list of information gathered from different sources online, including, but not limited to, AWS tutorials and posts, AWS Documentation, forums, etc:

  • AWS Trusted Advisor is integrated with Cloudwatch Metrics and Events. You can use Cloudwatch to monitor the results generated by Trusted Advisor, create alarms, and react to status changes.

  • CodePipeline can be integrated with Cloudformation in order to create continuous delivery pipelines that create and update stacks. Input parameters, parameter overrides, and mappings can be used to ensure a generic template whose inputs vary based on the environment.

  • The AWS Application Discovery Service helps plan a migration to the AWS Cloud from on-premise servers. Whenever we want to migrate applications from on-premise servers to AWS, it is best to use this service. The discovery service can be installed on the on-premise servers in two ways:

    1. Agentless discovery, which works in VMware environments only, through deploying a connector to the VMware vCenter.
    2. Agent-based discovery, through deploying the Application Discovery agent on the VM (Windows or Linux). The service collects static information, such as CPU, RAM, hostname, IP, etc. Finally, the service integrates with the AWS Migration Hub, which simplifies migration tracking.
  • AWS Config allows you to Evaluate AWS resources against desired settings, retrieve (current and historical) configuration of resources in the account, and relationship between resources. AWS Config aggregators allow the collection of compliance data from multiple regions, multiple accounts, or accounts within an organization. Therefore, when you need to retrieve compliance information across regions, accounts or organizations, use AWS Config rules with aggregators.

  • Lambda functions can be used to validate part of the deployment on ECS. In this case, CodeDeploy can be configured to use a load balancer with two target groups, for test and production environments for example. Tests should be performed in either BeforeAllowTestTraffic or AfterAllowTestTraffic hooks.

  • While Canary deployment is automatically supported by CodePipeline when deploying to Lambda, it cannot be done for applications in autoscaling groups. In order to do Canary deployments with autoscaling groups, we need two environments, with the traffic percentage controlled by Route 53, for example.

  • AWS proactively monitors popular repository websites for exposed AWS credentials. Once found, AWS generates an AWS_RISK_CREDENTIALS_EXPOSED event in Cloudwatch events, with which an administrator can interact. aws.health can be used as an event source.

  • Cloudtrail can log management events, for example logging in, creating resources, etc., and data events, such as object-level operations.

  • When performing updates in an Autoscaling group, you can do a replacing update by setting the AutoScalingReplacingUpdate policy with the WillReplace flag set to true, which creates a new autoscaling group, or a rolling update using the AutoScalingRollingUpdate policy, which only replaces EC2 instances within the same ASG. See the sketch below.
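
A minimal CloudFormation sketch of the two update policies on an Auto Scaling group (only one would normally be used); sizes and the launch configuration reference are placeholders:

Resources:
  WebServerGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    UpdatePolicy:
      AutoScalingRollingUpdate:               # replace instances in batches within the same ASG
        MinInstancesInService: 1
        MaxBatchSize: 2
        PauseTime: PT5M
      # Alternative: replace the entire ASG instead of rolling the instances
      # AutoScalingReplacingUpdate:
      #   WillReplace: true
    Properties:
      MinSize: "2"
      MaxSize: "4"
      LaunchConfigurationName: !Ref WebServerLaunchConfig   # hypothetical launch configuration
      AvailabilityZones: !GetAZs ""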

  • AWS Config is not capable of tracking any changes or maintenance initiated by AWS. AWS Health can do so.

  • Subscriptions can be used to access a real-time feed of logs and have it sent to other services for processing.

  • Amazon Macie is a security tool that uses ML to protect sensitive data in AWS (S3 in particular)

  • If you have a set of EC2 machines in an ASG and some of them are terminating with no clear cause, you can add a lifecycle hook to the ASG to move terminating instances into the Terminating:Wait state. You can then configure a Systems Manager Automation document to collect the logs of the machine.

Practice Exams

In addition to Stephane's course, a good approach would be to solve a few practice exams. This will familiarize you with the nature of questions to be expected on the exam.

As a matter of fact, studying the course alone is not a guarantee to pass the exam. The exam's questions are mostly use cases that require strong analytical skills, in addition to a good experience in providing DevOps solutions on AWS.

Jon Bonso's practice tests are a great way to further apply the knowledge you learned and better prepare for the exam. Two things make Jon's practice tests a must-have:

  • They are close to the AWS exams, giving you a great overview on what to expect.
  • There is a great explanation for each use case. Jon explains each question in great detail, along with the correct choice of answers.

In addition, AWS posts sample questions with explanations.

Finally, AWS SkillBuilder posts a free set of sample exam questions for its users.

Conclusion

In conclusion, the AWS DevOps exam is a great way of testing your DevOps skills on AWS. However, achieving this certificate is not a walk in the park, and requires a lot of experience, as well as preparation. Nonetheless, it is absolutely worth it!

Best of luck!
