Let's assume our worst nightmare has come true: one of your most important production environments, built on AWS EC2, has been hacked. What steps are required to return your application to normal behavior? What needs to be done for forensic analysis? What should you do to make sure it does not happen again? Have you prepared an incident response plan? In this blog, we'll answer these questions. Let's start together!
Preparation has to happen before any attack. It is one of the critical tasks of your cloud security assessment and incident response plan. You need to be sure that controls are in place to help you detect anomalies within your AWS EC2 environment.
Some examples of what to configure during preparation:
- Ensure your logging services are enabled on AWS, such as CloudTrail, VPC Flow Logs, and ELB access logs.
- Always think of encryption at rest and encryption in transit. Use AWS KMS encryption wherever possible (in an EC2 scenario, think AWS ELB, EFS, etc.)
- Review your attack surface and read the AWS Security Reference Architecture so that you follow security best practices in your environment.
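As a rough illustration of the first bullet, here is a minimal sketch that looks for obvious logging gaps. It assumes boto3-style `cloudtrail` and `ec2` clients are passed in, ignores pagination, and only counts VPC-level flow logs (subnet- or interface-level flow logs would be reported as gaps). The function name `logging_gaps` is made up for this example.

```python
def logging_gaps(cloudtrail, ec2):
    """Return human-readable gaps in CloudTrail and VPC Flow Logs coverage."""
    gaps = []

    # CloudTrail: at least one trail should exist and be actively logging.
    trails = cloudtrail.describe_trails().get("trailList", [])
    if not trails:
        gaps.append("no CloudTrail trail is configured")
    for trail in trails:
        status = cloudtrail.get_trail_status(Name=trail["TrailARN"])
        if not status.get("IsLogging"):
            gaps.append(f"trail {trail['Name']} is not logging")

    # VPC Flow Logs: every VPC should be the resource of at least one flow log.
    vpc_ids = {vpc["VpcId"] for vpc in ec2.describe_vpcs()["Vpcs"]}
    logged = {fl["ResourceId"] for fl in ec2.describe_flow_logs()["FlowLogs"]}
    for vpc_id in sorted(vpc_ids - logged):
        gaps.append(f"VPC {vpc_id} has no flow log")

    return gaps
```

With boto3 installed and credentials configured, it could be called as `logging_gaps(boto3.client("cloudtrail"), boto3.client("ec2"))`. An empty list means no gap was found by these two checks, not that logging is complete.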
Detection is only achievable if everything in Step 1 is in place. Otherwise, you may have been compromised for months, with attackers sitting in your environment and waiting for the right time to gain full access or exploit you.
Detection is critical, and we need to configure AWS security services to do it:
- Building on Step 1, enable logging and monitoring services to gain visibility into your possible attack surfaces and your activity.
- For anomaly detection, consider implementing notifications and alarms with Amazon EventBridge.
- Enable Amazon GuardDuty, which is a threat detection service.
- Enable AWS Config to track and analyze all the changes in your AWS environment.
- Use Amazon Detective to analyze the attack scenario. Which IP? What happened?
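To make the EventBridge and GuardDuty bullets more concrete, here is a minimal sketch that forwards GuardDuty findings to a notification target. It assumes a boto3-style `events` client is passed in. The rule name, the target ARN, and the severity threshold of 7 are placeholders chosen for illustration, not values from this post.

```python
import json


def guardduty_event_pattern(min_severity=7):
    """Event pattern matching GuardDuty findings at or above min_severity."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {"severity": [{"numeric": [">=", min_severity]}]},
    }


def create_alert_rule(events, rule_name, target_arn, min_severity=7):
    """Create (or update) the rule and point it at one target, e.g. an SNS topic."""
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(guardduty_event_pattern(min_severity)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "notify", "Arn": target_arn}],
    )
```

If the target is an SNS topic, the topic's access policy also has to allow EventBridge to publish to it; that part is not shown here.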
After Step 1 and Step 2, all you can do is detect the attack. You're still hacked. What should you do as a cloud security engineer?
You need to make some configuration changes once you understand that you're under attack. One of the most common reactions is to terminate the compromised instances immediately. You should not do that; we need the instance for investigation and forensic analysis. Automation is the key here because it is a simpler and quicker way to shut attackers out.
These are the steps to follow:
- Detach the instance from any Auto Scaling group and ELB target group immediately. You don't want your customers connecting to the compromised instance.
- Replace the security group that is attached to your instance. Create a new security group that includes a 0.0.0.0/0 ingress and egress rule, attach it to the compromised instance in place of the earlier one, and then delete its ingress and egress rules. You might think that removing all rules from the existing security group is a solution, but it is not a solution for tracked connections, because connections that are already tracked are not cut off right away. You should automate this process. Maybe you can create a Python function with AWS Lambda that can be invoked with an instance ID parameter.
- If you're using EC2 instance roles to access your AWS resources, the role issues temporary credentials for you. If you don't invalidate these, an attacker can still use them. To strip all permissions from the temporary credentials, attach an explicit deny policy to the EC2 role. If you're using hardcoded AWS credentials on your instance (you should not do this, by the way), you need to disable them. Deleting AWS credentials outright can cause production issues if the same credentials are used in other environments.
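The containment steps above could be automated roughly as follows. This is a minimal sketch, assuming boto3-style clients are passed in and the instance has a single network interface. The function names, the `isolation-...` group name, and the `ir-deny-older-sessions` policy name are made up for this example. The `aws:TokenIssueTime` condition is my own addition: it limits the deny to credentials issued before the cutoff, so a replacement instance using the same role can work again once it gets fresh credentials. Drop the `Condition` block for a blanket deny.

```python
import json
from datetime import datetime, timezone

# Allow-all rule over IPv4 (assumes the VPC does not use IPv6).
ALLOW_ALL = [{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]


def detach_from_traffic(autoscaling, elbv2, instance_id, asg_name, target_group_arn):
    """Stop customer traffic; the Auto Scaling group launches a replacement."""
    autoscaling.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=asg_name,
        ShouldDecrementDesiredCapacity=False,
    )
    elbv2.deregister_targets(
        TargetGroupArn=target_group_arn, Targets=[{"Id": instance_id}]
    )


def isolate_instance(ec2, instance_id):
    """Swap the instance onto a fresh security group, then strip its rules."""
    reservation = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"][0]
    vpc_id = reservation["Instances"][0]["VpcId"]
    sg_id = ec2.create_security_group(
        GroupName=f"isolation-{instance_id}",
        Description="Incident response isolation group",
        VpcId=vpc_id,
    )["GroupId"]
    # A new group already allows all IPv4 egress; add the matching ingress rule.
    ec2.authorize_security_group_ingress(GroupId=sg_id, IpPermissions=ALLOW_ALL)
    # Replace every group on the instance with the isolation group.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[sg_id])
    # Remove both rules so that no traffic is allowed any more.
    ec2.revoke_security_group_ingress(GroupId=sg_id, IpPermissions=ALLOW_ALL)
    ec2.revoke_security_group_egress(GroupId=sg_id, IpPermissions=ALLOW_ALL)
    return sg_id


def deny_existing_sessions(iam, role_name, now=None):
    """Attach an inline policy denying everything to credentials issued before now."""
    cutoff = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"DateLessThan": {"aws:TokenIssueTime": cutoff}},
        }],
    }
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="ir-deny-older-sessions",
        PolicyDocument=json.dumps(policy),
    )
    return policy
```

Wrapped in a Lambda handler that receives the instance ID, these three functions would cover the bullets above in order. Error handling and cleanup of the isolation group are left out to keep the sketch short.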
After completing the containment, take an EBS snapshot of the compromised EC2 instance immediately. Do not shut down the instance (we need information that would be lost during the shutdown process).
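Taking the snapshot can be scripted as well. Here is a minimal sketch, assuming a boto3-style `ec2` client is passed in; the `incident` tag key and the `case_id` parameter are placeholders for whatever naming your team uses.

```python
def snapshot_instance_volumes(ec2, instance_id, case_id):
    """Start a snapshot of every EBS volume attached to the instance."""
    volumes = ec2.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )["Volumes"]

    snapshot_ids = []
    for volume in volumes:
        snap = ec2.create_snapshot(
            VolumeId=volume["VolumeId"],
            Description=f"IR {case_id}: {instance_id} {volume['VolumeId']}",
            # Tag the snapshot so it can be found again during the investigation.
            TagSpecifications=[{
                "ResourceType": "snapshot",
                "Tags": [{"Key": "incident", "Value": case_id}],
            }],
        )
        snapshot_ids.append(snap["SnapshotId"])
    return snapshot_ids
```

`create_snapshot` only starts the snapshot. Before building anything from it, you would wait for completion, for example with `ec2.get_waiter("snapshot_completed")`.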
On top of that:
- Take a memory dump of all processes on the instance.
- Create a new EC2 instance and install all your forensic and analysis tools on it. Also create a second instance from the EBS snapshot that you took of the compromised instance. In its security group, allow only the IP of the forensic instance, and connect to the snapshot-based instance only from the forensic instance.
- If there are any log files on the server, extract all of them for detailed analysis. If not, use Amazon CloudWatch, CloudTrail, and other logging options.
- Analyze and list the IP addresses that attacked you. If these IP addresses belong to AWS, report the abuse to AWS.
- Use AWS Config details to review all your resource state changes.
- If you're using third-party tools, review their findings as well.
- Terminate the compromised instance after completing the EBS volume step.
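For the "list the IP addresses" bullet, VPC Flow Logs are one possible source. Here is a minimal sketch that counts accepted flows per source address. It assumes the records use the default flow log format (version 2, 14 space-separated fields) and have already been exported as text lines; `top_sources` is a name invented for this example.

```python
from collections import Counter


def top_sources(flow_log_lines, instance_ip, limit=10):
    """Count accepted flows per source address that targeted instance_ip."""
    counts = Counter()
    for line in flow_log_lines:
        fields = line.split()
        if len(fields) < 14 or fields[0] == "version":
            continue  # skip a header line or a truncated record
        # Default format: version account-id interface-id srcaddr dstaddr
        # srcport dstport protocol packets bytes start end action log-status
        srcaddr, dstaddr, action = fields[3], fields[4], fields[12]
        if dstaddr == instance_ip and action == "ACCEPT":
            counts[srcaddr] += 1
    return counts.most_common(limit)
```

The result is a list of `(source_ip, flow_count)` pairs, busiest first. It is a starting point for the investigation, not proof that an address is malicious.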
Recovery & Lessons Learned
In Step 4, you analyzed why the attack happened, how much it affected your application, and what to do if it happens again. So, it's recovery time with a healthy, non-compromised instance. You need to create your AWS EC2 instance again.
Infrastructure as Code options such as AWS CloudFormation can help you build a new server with the configuration that you've determined. Also, if you have an AMI from your backups (not compromised ones), you can create your EC2 instance from it. But please, do not repeat the misconfigurations that were in place before the incident happened.
After all these steps, monitor regularly to detect anything happening in your environment. In Step 5, you should learn your security issues, the vulnerabilities that led to the attack, and your attack surfaces. Maybe you opened an SSH port with a simple username/password combination, maybe you have a library that is affected by the Log4j vulnerability, or maybe you have an admin panel that is reachable from anywhere in the world. Who knows? You should know. You need to think about all aspects of security incidents and attack scenarios.
Before any of this happens, you should use all logging and monitoring options, and all security services wherever possible. Also, you need to create an incident response plan to answer the question "What should we do if it happens again?" In addition to this, you need to practice, practice, and practice. Set up "game days" to simulate your attack scenarios. This is not only a technical exercise; we also need to see how the team reacts to a security attack.
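One of the lessons mentioned above, an exposed SSH port, is easy to check for on a regular basis. Here is a minimal sketch that scans security group descriptions for rules open to the whole internet. It assumes its input is the `SecurityGroups` list returned by a boto3-style `describe_security_groups` call; the set of "risky" ports is my own choice for illustration.

```python
RISKY_PORTS = {22: "SSH", 3389: "RDP"}


def world_open_rules(security_groups, risky_ports=RISKY_PORTS):
    """Return (group_id, service) pairs for risky ports open to the internet."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            open_v4 = any(
                r.get("CidrIp") == "0.0.0.0/0" for r in perm.get("IpRanges", [])
            )
            open_v6 = any(
                r.get("CidrIpv6") == "::/0" for r in perm.get("Ipv6Ranges", [])
            )
            if not (open_v4 or open_v6):
                continue

            protocol = perm.get("IpProtocol")
            if protocol == "-1":
                # "All traffic" rules cover every port.
                covered = list(risky_ports)
            elif protocol == "tcp":
                low, high = perm.get("FromPort"), perm.get("ToPort")
                if low is None or high is None:
                    continue
                covered = [p for p in risky_ports if low <= p <= high]
            else:
                continue  # SSH and RDP are TCP services

            for port in covered:
                findings.append((group["GroupId"], risky_ports[port]))
    return findings
```

With boto3 it could be fed from `boto3.client("ec2").describe_security_groups()["SecurityGroups"]` and run on a schedule or as part of a game day.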
Thanks for reading! Stay safe in the cloud! 🤞 ⛅️