
Gilad David Maayan


Incident Response for AWS MySQL

What is AWS MySQL?

AWS supports MySQL in various ways, including the fully-managed Amazon Relational Database Service (Amazon RDS). It simplifies configuration, operation, and scaling tasks, allowing you to easily operate a relational database, like MySQL, in the AWS cloud.

Amazon RDS supports DB instances that run several versions of MySQL. It provides flexible capacity for your database and can help manage common database administration tasks. Once you create an Amazon RDS for MySQL DB instance, you can perform various tasks, including:

  • Resize your DB instance.
  • Authorize connections to your DB instance.
  • Create Multi-AZ secondaries and read replicas.
  • Restore data from snapshots and backups.
  • Monitor your DB instance’s performance.
  • Use MySQL apps and utilities to access and store the DB instance’s data.
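
As a rough illustration of the first and third tasks above, the sketch below builds the keyword arguments that the boto3 RDS client methods `modify_db_instance` and `create_db_instance_read_replica` accept. The instance identifiers and the helper names are hypothetical, and the helpers only construct dictionaries, so nothing is sent to AWS.

```python
from typing import Any, Dict


def resize_params(instance_id: str, new_class: str, apply_now: bool = False) -> Dict[str, Any]:
    """Keyword arguments for the RDS ModifyDBInstance call (resize)."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": new_class,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_now,
    }


def read_replica_params(source_id: str, replica_id: str) -> Dict[str, Any]:
    """Keyword arguments for the RDS CreateDBInstanceReadReplica call."""
    return {
        "DBInstanceIdentifier": replica_id,        # name of the new replica
        "SourceDBInstanceIdentifier": source_id,   # instance to replicate from
    }
```

With boto3 installed and credentials configured, these could be passed on as, for example, `boto3.client("rds").modify_db_instance(**resize_params("orders-mysql", "db.m5.large"))`.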

What Is AWS Systems Manager Incident Manager?

Incident Manager is a feature of AWS Systems Manager that provides an incident response and management console. AWS Systems Manager helps you centralize operational data from various AWS services and automate tasks across AWS resources.

What is AWS Systems Manager?
AWS Systems Manager lets you create logical groups of resources, for example, applications, application stack layers, production, and development. It allows you to choose a resource group and then view various related metrics, including:

  • Recent API activity
  • Related notifications
  • Resource configuration changes
  • Operational alerts
  • Software inventory
  • Patch compliance status

AWS Systems Manager provides a centralized console to view and manage AWS resources. You can take action according to the specific operational needs of each resource group. This tool aims to provide you with comprehensive visibility and control over resources.

What is Incident Manager?
Incident Manager provides an incident management console for mitigating and recovering from incidents affecting applications hosted in the AWS cloud. An incident can be an unplanned interruption to services or an unexpected reduction in quality.

You can use other AWS services to extend the capabilities of Incident Manager. For example, you can add CloudWatch alarms and metrics, AWS Chatbot, and AWS CloudTrail to facilitate rapid incident response. You can also create response plans and let Amazon EventBridge events or CloudWatch alarms initiate them automatically.

Incident Manager lets you define runbooks and then use AWS Systems Manager automation to automate critical responses. These runbooks also provide detailed steps to first responders. Using various contact methods, Incident Manager can automatically engage the relevant stakeholders for each incident.

Incident Manager can automatically escalate through responders to ensure visibility and active participation during each incident, and it can use an AWS Chatbot chat channel so that responders can communicate and act on the incident. It also provides incident tracking features that let responders review incident details, follow runbooks, and create and remediate follow-up items.
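
To make the idea of a response plan more concrete, here is a minimal sketch that assembles the arguments for the `create_response_plan` operation of the boto3 `ssm-incidents` client. The helper name, the plan name, and every ARN a caller would pass in are hypothetical placeholders, and the sketch only builds a dictionary rather than calling AWS.

```python
from typing import Any, Dict, List, Optional


def response_plan_params(
    name: str,
    title: str,
    impact: int,
    engagement_arns: List[str],
    runbook_document: Optional[str] = None,
    runbook_role_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for the ssm-incidents CreateResponsePlan call."""
    if impact not in range(1, 6):
        raise ValueError("impact must be an integer from 1 (critical) to 5 (no impact)")
    params: Dict[str, Any] = {
        "name": name,
        # Defaults applied to every incident started from this plan.
        "incidentTemplate": {"title": title, "impact": impact},
        # ARNs of the contacts or escalation plans to engage automatically.
        "engagements": list(engagement_arns),
    }
    if runbook_document and runbook_role_arn:
        # Attach a Systems Manager Automation runbook to the plan.
        params["actions"] = [{
            "ssmAutomation": {
                "documentName": runbook_document,
                "roleArn": runbook_role_arn,
            }
        }]
    return params
```

The resulting dictionary could then be passed as `boto3.client("ssm-incidents").create_response_plan(**params)`. Keeping the runbook optional reflects that a plan can engage responders without automating any remediation.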

Securing Amazon RDS

Amazon offers various features you can use to manage information security for Amazon RDS resources and for databases hosted on RDS DB instances. Depending on the tasks users need to perform with RDS, you can use any of the following security methods:

  • Network access control: you can achieve the greatest network access control by running a DB instance within a virtual private cloud (VPC) via the Amazon VPC service.
  • User access control: AWS Identity and Access Management (IAM) policies enable you to assign permissions that allow only specific users to manage RDS resources. Resource management privileges include creating, describing, modifying, and deleting DB instances, modifying security groups, and tagging resources.
  • DB instance access control: security groups can help you restrict connectivity to databases on a DB instance to specific IP address ranges and EC2 instances.
  • Secure protocols: you can establish secure connections with DB instances running MySQL databases through the Transport Layer Security (TLS) or Secure Sockets Layer (SSL) protocols.
  • Data encryption: RDS encryption enables you to secure DB instances and snapshots at rest. It encrypts the data on the hosting server using the industry-standard AES-256 encryption algorithm.
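
The DB instance access control item can be sketched as follows. The helper below builds the arguments for the boto3 EC2 client method `authorize_security_group_ingress` so that only one IPv4 range may reach the default MySQL port. The helper name, the security group ID, and the refusal to accept a `/0` range are assumptions made for this illustration.

```python
import ipaddress
from typing import Any, Dict

MYSQL_PORT = 3306  # default MySQL port; an instance may be configured differently


def mysql_ingress_params(group_id: str, cidr: str, description: str = "") -> Dict[str, Any]:
    """Keyword arguments for EC2 AuthorizeSecurityGroupIngress, limited to MySQL."""
    # Raises ValueError for malformed input or a range with host bits set.
    network = ipaddress.IPv4Network(cidr)
    if network.prefixlen == 0:
        raise ValueError("refusing to open MySQL to every IPv4 address")
    ip_range: Dict[str, str] = {"CidrIp": str(network)}
    if description:
        ip_range["Description"] = description
    return {
        "GroupId": group_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": MYSQL_PORT,
            "ToPort": MYSQL_PORT,
            "IpRanges": [ip_range],
        }],
    }
```

Validating the range before building the request is a design choice for this sketch: it catches an accidental `0.0.0.0/0` rule locally instead of after the rule is already live.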

AWS Systems Manager Incident Lifecycle

AWS Systems Manager Incident Manager is an incident lifecycle management tool that helps you recover AWS-hosted applications from an incident, such as a cyber attack, as quickly as possible. It provides tools and best practices for every stage of the incident lifecycle.

Alerting and Engagement
Incident Manager supports the alerting and engagement phases of incident response by generating alerts about events in your application. This step starts before any event is detected and requires a deep understanding of the application. You can use Amazon CloudWatch metrics to monitor data about application performance that might indicate a cyber attack.
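
As one possible way to wire up alerting for an RDS MySQL instance, the sketch below builds the arguments for the boto3 CloudWatch client method `put_metric_alarm`, watching the `CPUUtilization` metric in the `AWS/RDS` namespace and listing a response plan ARN as the alarm action. The helper name, the alarm name, the 80 percent threshold, and the five-minute evaluation window are all assumptions chosen for illustration, not recommended values.

```python
from typing import Any, Dict


def rds_cpu_alarm_params(
    instance_id: str,
    response_plan_arn: str,
    threshold_percent: float = 80.0,
) -> Dict[str, Any]:
    """Keyword arguments for CloudWatch PutMetricAlarm on an RDS instance."""
    return {
        "AlarmName": f"{instance_id}-high-cpu",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 60,              # seconds per datapoint
        "EvaluationPeriods": 5,    # alarm after five consecutive breaching datapoints
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        # Entering the ALARM state starts an incident from this response plan.
        "AlarmActions": [response_plan_arn],
    }
```

The same shape works for other RDS metrics, such as `DatabaseConnections` or `FreeStorageSpace`, by changing `MetricName` and the threshold.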

Triage
When an alert comes in, first responders need to determine its potential impact. Incident Manager lets you classify an incident into five levels of impact:

  1. Critical impact: complete application failure that significantly affects all customers.
  2. High impact: partial application failure that affects many customers.
  3. Medium impact: incident that results in reduced service to customers.
  4. Low impact: incident that is felt by customers but has minimal impact.
  5. No impact: incident that does not currently affect customers, but urgent action is required to avoid impact.
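
The triage scale can be expressed in code as a small enumeration, since the `start_incident` operation of the boto3 `ssm-incidents` client takes `impact` as an integer from 1 to 5. The `classify` heuristic below is a hypothetical example of a team's triage rule and is not part of Incident Manager.

```python
from enum import IntEnum
from typing import Any, Dict


class Impact(IntEnum):
    CRITICAL = 1
    HIGH = 2
    MEDIUM = 3   # moderate: reduced service
    LOW = 4
    NO_IMPACT = 5


def classify(full_outage: bool, partial_outage: bool, degraded: bool, customers_notice: bool) -> Impact:
    """Hypothetical triage rule: the most severe matching condition wins."""
    if full_outage:
        return Impact.CRITICAL
    if partial_outage:
        return Impact.HIGH
    if degraded:
        return Impact.MEDIUM
    if customers_notice:
        return Impact.LOW
    return Impact.NO_IMPACT


def start_incident_params(response_plan_arn: str, title: str, impact: Impact) -> Dict[str, Any]:
    """Keyword arguments for the ssm-incidents StartIncident call."""
    return {
        "responsePlanArn": response_plan_arn,
        "title": title,
        "impact": int(impact),  # overrides the default impact of the response plan
    }
```

A first responder could call `classify(False, False, True, True)` for a slow but available database and pass the result to `start_incident_params`.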

Investigation and Mitigation
Incident Manager offers an incident details view with the following capabilities:

  • Runbooks: automated processes that can investigate the incident, automatically discover data, or try common solutions. A runbook also provides clear, repeatable steps to help the team mitigate this type of incident.
  • Timeline: shows the actions taken, both automated and manual. Each intervention by the team is automatically recorded with a timestamp and details.
  • Metrics: displays automatically selected metrics and metrics manually added by the team. These provide important information about application activity during an incident.
  • Chat: you can use chat channels to interact with incidents directly through AWS Chatbot. AWS Chatbot lets you use Incident Manager API operations in chat channels. You can resolve the incident directly from the chat channel, from any device.
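
Manual actions can also be added to the timeline programmatically. The sketch below builds arguments for the `create_timeline_event` operation of the boto3 `ssm-incidents` client. It assumes that `"Custom Event"` is the event type used for manually added entries and that `eventData` is a short plain-text description; both should be checked against the current API reference. The helper name and the ARN in the usage note are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def timeline_note_params(
    incident_record_arn: str,
    note: str,
    when: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Keyword arguments for the ssm-incidents CreateTimelineEvent call."""
    event_time = when or datetime.now(timezone.utc)
    if event_time.tzinfo is None:
        # Naive datetimes are ambiguous when responders work across time zones.
        raise ValueError("use a timezone-aware datetime")
    return {
        "incidentRecordArn": incident_record_arn,
        "eventTime": event_time,
        "eventType": "Custom Event",  # assumption: type for manually added events
        "eventData": note,            # assumption: short plain-text description
    }
```

For example, `timeline_note_params(record_arn, "Promoted read replica to primary")` records a manual mitigation step alongside the automatically captured events.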

Post-incident Analysis
After the event is over, Incident Manager enables post-incident analysis. This provides a structure for teams to develop new response methods and improve their process.

In this part of the process, Incident Manager helps teams discover and implement the following optimizations:

  • Changes needed to applications related to the incident: hardening applications and improving fault tolerance.
  • Incident response plan changes: incorporating lessons learned over time.
  • Changes to runbooks: changing the steps needed to address a specific type of incident.
  • Changes to alerting: after an incident, the team may have identified an important metric or threshold that could trigger an alert earlier.
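
The alerting review can be made concrete with a small replay. The sketch below takes metric samples recorded during an incident and computes how much earlier a candidate threshold would have fired than the current one. The function names and the sample format are assumptions for illustration, and the logic is a simplification of how CloudWatch evaluates alarms.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Sample = Tuple[datetime, float]  # (timestamp, metric value), in chronological order


def first_breach(samples: List[Sample], threshold: float, consecutive: int = 1) -> Optional[datetime]:
    """Time at which `consecutive` samples in a row first exceeded `threshold`."""
    run = 0
    for timestamp, value in samples:
        run = run + 1 if value > threshold else 0
        if run >= consecutive:
            return timestamp
    return None


def lead_time(samples: List[Sample], current: float, candidate: float) -> Optional[timedelta]:
    """How much earlier the candidate threshold would have alerted, if both fire."""
    old = first_breach(samples, current)
    new = first_breach(samples, candidate)
    if old is None or new is None:
        return None
    return old - new
```

A positive result suggests the candidate threshold would have alerted sooner. Replaying it against normal-day data as well helps check that it would not also fire when nothing is wrong.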

Conclusion

In this article, I explained the basics of hosted MySQL on AWS via the Relational Database Service (RDS), and showed how to manage incident response for MySQL using the AWS Systems Manager Incident Manager feature. I hope this will be useful as you improve the security posture of your cloud-hosted databases in AWS.
