One of our customer-facing websites was recently hit by a huge DDoS attack, on the order of 80M-100M requests per hour. For context, our website usually receives just 30,000 requests per hour (about 500 per minute). That's a 3000x increase in traffic. Luckily, we were able to respond promptly and stopped the DDoS attack within 3 days. But those 3 days were the most stressful of my year (so far).
I created this field guide for developers so I can save all of you the firefighting and research I had to do on the fly during those 3 days. While this is by no means a comprehensive guide, it will give you the tools necessary to respond to a DDoS attack.
What's a DDoS attack?
At their core, DDoS attacks are distributed denial-of-service attacks. They aim to overwhelm your servers with so much fake traffic that your legitimate end users can't access your application. This results in downtime, and with your website down, your customers won't be able to buy anything, and your revenue grinds to a halt.
There are 3 types of DDoS attacks: application layer attacks, protocol attacks, and volumetric attacks. We will focus on application layer attacks in this article.
Starting Architecture
Let's start with this typical network architecture. We have an eCommerce website called jambyswags.com. Both its PHP backend and NuxtJS frontend applications are hosted on EC2. When a customer accesses the website, the request is routed to a load balancer that distributes traffic between 2 EC2 instances.
Confirm if you are being attacked
A DDoS attack usually starts with your website becoming unavailable to all of your users. First, check the CPU utilization of the EC2 instances running your backend and frontend applications. If CPU utilization is pegged near its limit across those instances, that is a sign you may be under a DDoS attack.
The next step is to go to your ALB's CloudWatch metrics and check your application's request count. In our case, our website typically takes in 500 requests per minute. Then, we suddenly saw 1M-1.5M requests per minute. That's up to a 3000x increase from our baseline, a big sign of a DDoS attack.
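As a rough illustration of that comparison, here is a minimal sketch that computes how far the observed request rate is from the baseline. The function names and the 10x alert threshold are assumptions made for this example, not values from CloudWatch or from our setup.

```python
def traffic_multiplier(baseline_per_minute: float, observed_per_minute: float) -> float:
    """Return how many times larger the observed request rate is than the baseline."""
    if baseline_per_minute <= 0:
        raise ValueError("baseline_per_minute must be positive")
    return observed_per_minute / baseline_per_minute


def looks_like_spike(baseline_per_minute: float, observed_per_minute: float,
                     threshold: float = 10.0) -> bool:
    """Flag the traffic when it exceeds the baseline by an assumed threshold factor."""
    return traffic_multiplier(baseline_per_minute, observed_per_minute) >= threshold


# Numbers from the article: 500 requests per minute baseline, 1.5M at the peak.
print(traffic_multiplier(500, 1_500_000))  # 3000.0
```

A spike on its own is not proof of an attack, so treat this only as the trigger to look closer at where the traffic comes from.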
The final step is to check whether the traffic is legitimate. You don't want to stop your legitimate users from going bananas shopping on your website, especially during a sale. One way to do this is to enable VPC Flow Logs for the ENI of the load balancer. From there, you can see the IP addresses connecting to your application. If most of the traffic comes from the same set of IP addresses, that's credible evidence that you are experiencing a DDoS attack.
Another indicator is if you're not having a promo or major ad push for that day but are experiencing a big jump in traffic.
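To check whether the traffic is concentrated in a small set of IP addresses, the flow log records can be tallied by source address. This is a minimal sketch that assumes the default (version 2) flow log format, where the source address is the fourth space-separated field. The sample lines and the function name are made up for illustration.

```python
from collections import Counter

# Default (version 2) VPC Flow Logs record layout:
# version account-id interface-id srcaddr dstaddr srcport dstport
# protocol packets bytes start end action log-status
SRCADDR_INDEX = 3
EXPECTED_FIELDS = 14


def top_source_ips(flow_log_lines, limit=10):
    """Count flow log records per source IP and return the most frequent ones."""
    counts = Counter()
    for line in flow_log_lines:
        fields = line.split()
        # Skip header rows, malformed rows, and NODATA/SKIPDATA rows ("-" as srcaddr).
        if len(fields) < EXPECTED_FIELDS or fields[0] == "version":
            continue
        if fields[SRCADDR_INDEX] == "-":
            continue
        counts[fields[SRCADDR_INDEX]] += 1
    return counts.most_common(limit)


# Hypothetical records using documentation-reserved IP ranges.
sample_lines = [
    "2 123456789012 eni-0abc1234 203.0.113.10 10.0.1.5 44321 443 6 10 840 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc1234 203.0.113.10 10.0.1.5 44322 443 6 12 900 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc1234 198.51.100.7 10.0.1.5 51000 443 6 3 180 1700000000 1700000060 ACCEPT OK",
]
print(top_source_ips(sample_lines, limit=1))
```

Note that this counts flow records, not HTTP requests, so it shows which addresses dominate the connections rather than exact request volumes.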
First Response: AWS WAF
Now that you are sure you have a DDoS attack on your hands, it's time to bring out the big guns. As a first response, it is essential to put AWS WAF in front of your load balancers. With AWS WAF, we create a web ACL that contains rules and rule groups. Rule groups can be managed by AWS or customized by you. A rule can be a regular rule, which checks requests based on their contents, or a rate-based rule, which caps the rate of requests coming in from each IP address.
Before the request goes to your ALB, the web ACL scans the requests based on the rule groups you add to it. For web applications, we recommend adding the following rule groups:
- AWS-AWSManagedRulesAdminProtectionRuleSet - checks requests for paths that may be trying to access your admin pages
- AWS-AWSManagedRulesAmazonIpReputationList and AWS-AWSManagedRulesAnonymousIpList - check whether requests come from suspicious IP addresses
- AWS-AWSManagedRulesCommonRuleSet - checks for common exploits, including those in the OWASP Top 10
- AWS-AWSManagedRulesKnownBadInputsRuleSet - checks for request patterns known to be invalid or malicious
- AWS-AWSManagedRulesSQLiRuleSet - checks against SQL injection
These rules check for suspicious IP addresses and potentially malicious request bodies. Meanwhile, Bot Control is a rule group that should come last in your web ACL. It has an additional cost of 1 USD per million requests but protects your application from malicious bots trying to access it.
- AWS-AWSManagedRulesBotControlRuleSet
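The ordering described above can be sketched as a list of rule definitions in the shape that the WAFv2 API (for example boto3's `create_web_acl`) accepts for its `Rules` parameter. This only builds the dictionaries and does not call AWS. The helper name `managed_rule` and the metric names are assumptions for this example.

```python
def managed_rule(group_name: str, priority: int) -> dict:
    """Build one web ACL rule that references an AWS managed rule group."""
    return {
        "Name": f"AWS-{group_name}",
        "Priority": priority,  # lower numbers are evaluated first
        "Statement": {
            "ManagedRuleGroupStatement": {"VendorName": "AWS", "Name": group_name}
        },
        # "None" keeps whatever action each rule in the group defines.
        "OverrideAction": {"None": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": f"AWS-{group_name}",
        },
    }


# Order follows the article: Bot Control goes last.
RULE_GROUP_ORDER = [
    "AWSManagedRulesAdminProtectionRuleSet",
    "AWSManagedRulesAmazonIpReputationList",
    "AWSManagedRulesAnonymousIpList",
    "AWSManagedRulesCommonRuleSet",
    "AWSManagedRulesKnownBadInputsRuleSet",
    "AWSManagedRulesSQLiRuleSet",
    "AWSManagedRulesBotControlRuleSet",
]

rules = [managed_rule(name, priority) for priority, name in enumerate(RULE_GROUP_ORDER)]
print([rule["Name"] for rule in rules])
```

Putting Bot Control last means the cheaper rule groups get a chance to block a request before it is evaluated, and billed, by Bot Control.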
Another thing you can do is add a rate-based rule. For example, you can block an IP address if the rate at which it accesses the site exceeds 500 requests per minute.
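A rate-based rule like this could look roughly as follows in the WAFv2 rule format. This is a sketch, not our production rule: the rule name and priority are made up, and it assumes the `EvaluationWindowSec` setting is available with the values 60, 120, 300, and 600 seconds (the default window is 300 seconds, so a per-minute limit needs it set explicitly).

```python
VALID_WINDOWS_SEC = (60, 120, 300, 600)  # assumed valid evaluation windows


def rate_limit_rule(limit: int, window_seconds: int = 60, priority: int = 100) -> dict:
    """Build a rule that blocks an IP once it exceeds `limit` requests per window."""
    if window_seconds not in VALID_WINDOWS_SEC:
        raise ValueError(f"window_seconds must be one of {VALID_WINDOWS_SEC}")
    return {
        "Name": "rate-limit-per-ip",  # hypothetical name
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": limit,
                "EvaluationWindowSec": window_seconds,
                "AggregateKeyType": "IP",  # count requests per source IP
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": "rate-limit-per-ip",
        },
    }


rule = rate_limit_rule(limit=500)
print(rule["Statement"]["RateBasedStatement"])
```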
The caveat with AWS WAF is that it can potentially break your application. For instance, the CommonRuleSet has a rule against request bodies exceeding a certain size. If your API accepts 7MB file uploads, it's probably going to hit this rule and your end users won't be able to upload files. The key here is to test it out first in staging to identify the rules that break your application. Then, you can change the action of those rules from "Block" to "Count" so that matches are only logged instead of rejected.
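Switching a single rule inside a managed rule group to Count can be expressed with a rule action override. The sketch below builds the statement for the CommonRuleSet and assumes the body size rule is the one named `SizeRestrictions_BODY`. Check the rule names that actually show up in your sampled requests before copying this.

```python
def common_rule_set_with_count_overrides(rule_names) -> dict:
    """Build a CommonRuleSet statement where the named rules only count matches."""
    return {
        "ManagedRuleGroupStatement": {
            "VendorName": "AWS",
            "Name": "AWSManagedRulesCommonRuleSet",
            "RuleActionOverrides": [
                {"Name": name, "ActionToUse": {"Count": {}}} for name in rule_names
            ],
        }
    }


# Assumed rule name for the request body size restriction.
statement = common_rule_set_with_count_overrides(["SizeRestrictions_BODY"])
print(statement["ManagedRuleGroupStatement"]["RuleActionOverrides"])
```

Every other rule in the group keeps its default action, so only the rule that breaks uploads is relaxed.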
First Response: Route 53 Geolocation
A quick and dirty way to limit DDoS attacks is to limit who can resolve your website's domain name (e.g., jambyswags.com) to the IP address of your load balancer. In our case, our customer's website serves audiences only in Singapore and the Philippines, while the DDoS attack came from servers in Asia, Europe, Canada, South America, and Africa. So it made sense to make our website resolvable only for people in Asia. This effectively stopped the attacks coming from outside Asia.
While Asia is still a big audience, the attacker is now limited in which servers they can use. It won't stop the attack, but it can help reduce its volume.
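A geolocation record of this kind could be described with a change batch in the shape that Route 53's `change_resource_record_sets` API accepts. This is a sketch that only builds the dictionary. The ALB DNS name and hosted zone ID below are placeholders, not real values.

```python
VALID_CONTINENT_CODES = {"AF", "AN", "AS", "EU", "OC", "NA", "SA"}


def geolocation_alias_change(record_name: str, continent_code: str,
                             alb_dns_name: str, alb_hosted_zone_id: str) -> dict:
    """Build a change batch that answers DNS queries only for one continent."""
    if continent_code not in VALID_CONTINENT_CODES:
        raise ValueError(f"unknown continent code: {continent_code}")
    return {
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "SetIdentifier": f"geo-{continent_code}",
                "GeoLocation": {"ContinentCode": continent_code},
                "AliasTarget": {
                    "HostedZoneId": alb_hosted_zone_id,  # the ALB's zone, not yours
                    "DNSName": alb_dns_name,
                    "EvaluateTargetHealth": False,
                },
            },
        }]
    }


change_batch = geolocation_alias_change(
    "jambyswags.com.", "AS",
    "my-alb-123456.ap-southeast-1.elb.amazonaws.com.",  # placeholder
    "Z0000000EXAMPLE",                                   # placeholder
)
print(change_batch["Changes"][0]["ResourceRecordSet"]["GeoLocation"])
```

The restriction comes from not creating a default geolocation record: queries from locations that match no record get no answer.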
To make it even more effective, try shifting your API to a different domain name. Attackers who target the old name then have to resolve the new one through DNS, where the geolocation restriction applies. For example, move your backend from "jambyswags.com" to "prod.backend.jambyswags.com".
First Response: File Abuse Teams
With the two actions above, your applications are better protected against DDoS attacks. However, you may see your AWS WAF costs spike. As of writing, AWS WAF charges 0.60 USD per million requests. In our case, we were attacked with 90 million requests per hour over 55 hours. That's 4.95 billion requests, or 2,970 USD. That's still a huge price to pay for protection.
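The arithmetic behind that figure can be checked with a few lines of Python. The function name is made up for this example, and the 0.60 USD rate is the per-request price quoted above, without the fixed monthly web ACL and rule charges.

```python
def waf_request_cost_usd(requests_per_hour: int, hours: int,
                         usd_per_million: float = 0.60):
    """Return (total requests, request-inspection cost in USD) for an attack."""
    total_requests = requests_per_hour * hours
    cost = total_requests / 1_000_000 * usd_per_million
    return total_requests, round(cost, 2)


total, cost = waf_request_cost_usd(90_000_000, 55)
print(f"{total:,} requests cost about {cost:,.2f} USD")
```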
One way to permanently stop these attacks is to report those IP addresses to the abuse teams of AWS, GCP, and Azure. This process usually takes a few days, but the sooner you file the ticket, the sooner it can be processed. To get the abusers' IP addresses, you can browse the Sampled requests section of your AWS WAF.
You will also have to provide application logs showing that these IP addresses are disrupting your website.
Second Response: Reduce Surface Area
Your application is protected, for now. While the intruder can't brute force your website anymore because of the rate limits of AWS WAF, they can still try to penetrate your application. One way to prevent this is by reducing the surface area exposed to attackers.
First, move your application servers to a private subnet. With this, they can no longer be accessed directly, only through the load balancer. If you need shell access to them, use AWS Systems Manager Session Manager via the AWS Console.
Second, audit the security group of each resource inside your VPC. Make sure only the ALB exposes a port to the world (0.0.0.0/0). Every other service should expose only the ports it needs, and only to the resources that need to access them. For instance, the backend EC2 should have port 80 open only to the ALB that forwards requests to it.
Third, create a database private subnet. It is a private subnet that has much less network access than the private subnet where your application is hosted. Its NACL rules are also tighter in that they open only a few ports inbound and outbound.
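The security group audit from the second step can be partly automated. The sketch below takes security group descriptions in the shape returned by EC2's `describe_security_groups` and lists ingress rules open to the world on any group other than the ALB's. The group IDs in the sample are made up.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def world_open_rules(security_groups, allowed_group_ids):
    """Return (group id, from port, to port) for world-open ingress rules."""
    findings = []
    for group in security_groups:
        if group["GroupId"] in allowed_group_ids:
            continue  # e.g. the ALB's security group is expected to be open
        for permission in group.get("IpPermissions", []):
            cidrs = [r["CidrIp"] for r in permission.get("IpRanges", [])]
            cidrs += [r["CidrIpv6"] for r in permission.get("Ipv6Ranges", [])]
            if any(cidr in OPEN_CIDRS for cidr in cidrs):
                findings.append(
                    (group["GroupId"], permission.get("FromPort"), permission.get("ToPort"))
                )
    return findings


# Hypothetical groups: the ALB is open on 443, the backend wrongly exposes SSH.
sample_groups = [
    {"GroupId": "sg-alb", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-backend", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [], "UserIdGroupPairs": [{"GroupId": "sg-alb"}]},
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
]
print(world_open_rules(sample_groups, allowed_group_ids={"sg-alb"}))
```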
Second Response: Use private communications for API to API comms
Sometimes one of your backend APIs will have to connect to another backend API in your network. Typically, your first backend API will send a request that traverses the open internet, back to your second backend's ALB, and to the second backend API. Aside from additional network costs you might incur, this link may be tagged as a DDoS attack by WAF, especially if this link is high volume.
To solve this, create an internal ALB that resides in your application private subnet. With this, your first backend API sends a request to the internal ALB within the same network and the traffic doesn't have to traverse the open internet.
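What makes the ALB internal is its scheme. As a sketch, these are the parameters one might pass to the ELBv2 `create_load_balancer` API. The function only builds the parameter dictionary, and the name, subnet IDs, and security group ID are placeholders.

```python
def internal_alb_params(name: str, private_subnet_ids, security_group_ids) -> dict:
    """Build create_load_balancer parameters for an internal Application Load Balancer."""
    if len(private_subnet_ids) < 2:
        # An ALB needs subnets in at least two Availability Zones.
        raise ValueError("provide at least two private subnets")
    return {
        "Name": name,
        "Subnets": list(private_subnet_ids),
        "SecurityGroups": list(security_group_ids),
        "Scheme": "internal",  # no public IP addresses, reachable only inside the VPC
        "Type": "application",
        "IpAddressType": "ipv4",
    }


params = internal_alb_params(
    "backend-internal-alb",                        # placeholder name
    ["subnet-0aaa1111", "subnet-0bbb2222"],        # placeholder private subnets
    ["sg-0ccc3333"],                               # placeholder security group
)
print(params["Scheme"])
```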
Third Response: Migrate FE to a Single Page Application
In our case, our FE application was a NuxtJS application deployed to EC2 and run with the nuxt serve command. With minimal adjustments, we were able to generate a static single-page application using the npm run generate command. We uploaded the generated dist folder to S3, hosted it as a static website, and connected it to CloudFront.
Depending on how your FE application is written, this may not be as easy to do or even possible. Some FE applications are baked into the backend API and require a full rewrite to become an SPA. Some FE applications are purposely hosted as Server Side Rendered (SSR) applications and are hard to migrate to an S3-CloudFront setup.
CloudFront offers additional DDoS protection because the first point of contact for your application becomes a CloudFront edge location, and CloudFront has DDoS protection built in. It is also much cheaper to serve requests via CloudFront since the content is cached nearer to the user.
Third Response: No-Cache CloudFront for the BE
With this technique, you will add a no-cache CloudFront distribution to your application. This way, your API's first point of contact will be an AWS edge location, and CloudFront can use its anti-DDoS features. Visit this AWS blog to learn more.
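As a sketch of what "no-cache" means in practice, the default cache behavior of such a distribution could use AWS's managed CachingDisabled cache policy together with the managed AllViewer origin request policy. The policy IDs below are the ones I know from the AWS documentation, so verify them before use. The origin ID is a placeholder.

```python
# Managed policy IDs (verify against the current CloudFront documentation).
CACHING_DISABLED_POLICY_ID = "4135ea2d-6df8-44a3-9df3-4b5a84be39ad"
ALL_VIEWER_ORIGIN_REQUEST_POLICY_ID = "216adef6-5c7f-47e4-b989-5492eafa07d3"

API_METHODS = ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"]


def no_cache_behavior(origin_id: str) -> dict:
    """Build a DefaultCacheBehavior that forwards every request to the origin."""
    return {
        "TargetOriginId": origin_id,
        "ViewerProtocolPolicy": "redirect-to-https",
        "AllowedMethods": {
            "Quantity": len(API_METHODS),
            "Items": API_METHODS,
            "CachedMethods": {"Quantity": 2, "Items": ["GET", "HEAD"]},
        },
        "CachePolicyId": CACHING_DISABLED_POLICY_ID,  # nothing is cached at the edge
        "OriginRequestPolicyId": ALL_VIEWER_ORIGIN_REQUEST_POLICY_ID,
        "Compress": True,
    }


behavior = no_cache_behavior("backend-alb-origin")  # placeholder origin ID
print(behavior["CachePolicyId"])
```

The API responses are never cached, but every request still enters AWS through an edge location first.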
Fourth Response: Consider AWS Shield Advanced
If the DDoS attack is sophisticated, the tactics we mentioned above may not be enough. Here are some common workarounds attackers can use:
- Use thousands of IP Addresses from Asia to rain down hell on your application (and evade rate-based rules)
- Determine your application's most expensive API endpoints and target them with a low-volume DDoS. For example, if your add-to-cart functionality takes 3s to load and uses SQL statements that take a lot of effort from your DB to fulfill, they can have maybe 100 IP addresses hit that endpoint with 100 requests each. You'll be down in no time.
- Overwhelm your WAF and CloudFront with millions and tens of millions of requests per minute such that even if your website doesn't go down, you will burn through your AWS budget.
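To see why the second scenario slips past a rate-based rule, here is the arithmetic with the numbers from the list above. The 500-request per-IP limit is the example limit used earlier in this article, and the function name is made up.

```python
def low_volume_attack_load(ip_count: int, requests_per_ip: int,
                           seconds_per_request: float, per_ip_rate_limit: int) -> dict:
    """Estimate the backend work of an attack that stays under the per-IP limit."""
    total_requests = ip_count * requests_per_ip
    return {
        "total_requests": total_requests,
        "backend_seconds": total_requests * seconds_per_request,
        # Each IP stays at or below the limit, so the rate-based rule never fires.
        "evades_per_ip_limit": requests_per_ip <= per_ip_rate_limit,
    }


load = low_volume_attack_load(ip_count=100, requests_per_ip=100,
                              seconds_per_request=3, per_ip_rate_limit=500)
print(load)
```

Only 10,000 requests are enough to queue up 30,000 seconds of database work, and no single IP address ever crosses the limit.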
For the first and third scenarios, you can count on AWS Shield Advanced. With this service, you will have a dedicated Shield Response Team who will proactively make changes to your AWS environment to protect your assets. You will also not be liable for any bill spike resulting from the provisioning of excess assets during a DDoS attack.
The downside is that AWS Shield Advanced costs 3,000 USD per month, with a 12-month commitment. Hence, this service makes more sense if you are in an enterprise with deep pockets and a lot of assets to protect.
Conclusion
I wouldn't wish a DDoS attack on your applications, but it's best to be prepared for the possibility. This article gave you 4 layers of responses you can use to keep your website protected.
Top comments (3)
This doesn't really make much difference. A SPA needs to talk to a backend. That means the backend endpoints need to be publicly accessible. Meaning those endpoints then become the next target and those are the ones that need to be secured. Serving SPA assets from an S3 bucket is just a stopgap.
Q4 is the season of increased security attacks, so this article is very timely. Thanks Jamby!
Thanks Kaye <3