Recently I helped my friend migrate her eShop from WordPress.com to AWS. As her business kept growing, a third-party eCommerce hosting platform no longer fulfilled her business needs. To satisfy those growing needs, I designed, transformed, enhanced, and migrated her WordPress eCommerce business to a high-availability architecture on AWS.
To make the infrastructure reusable and simplify the deployment, I used the AWS CDK. The Cloud Development Kit is an open-source framework for defining infrastructure as code using familiar programming languages. I wrote the infrastructure in TypeScript; when the CDK app is synthesized, a CloudFormation template is generated and deployed on AWS.
During the migration, I found that there isn't much documentation or information about hosting WordPress on AWS, which is why I wrote this blog post.
In this post, I won't explain much about what each AWS service does or what its benefits are; I will only talk about why I use it and how I configure it. This blog post is more like an extension of WordPress: Best Practices on AWS.
AWS provides the best practice document WordPress: Best Practices on AWS. After reading the document, I found that it still has a lot of room for improvement, and that it does not pay much attention to the security of the WordPress application.
Other than that, there are some interesting ideas I have implemented in this solution, like:
- How to secure the admin page (wp-admin)?
- How to ensure your internet-facing application load balancer only allows traffic from CloudFront?
❗❗❗Please ensure you have read through the Best Practice document before reading on.❗❗❗
❗❗❗Also, please ensure you have basic knowledge of AWS or another cloud provider.❗❗❗
The architecture can be split into multiple pillars:
The best practice document uses EC2 for hosting. I chose to containerize the WordPress application instead: because containers share the host OS kernel, a container is much lighter than an EC2 instance, starts faster, and uses much less memory.
To run the containerized application on AWS, I host WordPress using Fargate on ECS. With Fargate, I don't need to provision or manage any EC2 servers; I only need to allocate vCPUs and memory for the containers. Fargate Spot was also announced recently, and it can save up to 70% of the cost.
This architecture takes advantage of capacity providers on the ECS cluster, using the default capacity provider strategy to launch tasks on a mix of Fargate and Fargate Spot. The first 3 tasks in the service use standard Fargate, which provides a baseline for high availability. Of the remaining tasks, for every 3 tasks, 2 use regular Fargate and 1 uses Fargate Spot. This strategy provides high availability while also optimizing cost.
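To make the ratio concrete, here is a minimal TypeScript sketch that simulates how a base of 3 and a 2:1 weight would split tasks between the two capacity providers. It is not the CDK code of this project: the `distributeTasks` function, the `StrategyItem` type, and the tie-breaking rule are assumptions made for illustration, and the real ECS placement logic may round differently.

```typescript
// Hypothetical model of an ECS capacity provider strategy (illustration only).
interface StrategyItem {
  capacityProvider: string;
  base: number;   // tasks that always go to this provider first
  weight: number; // relative share of the tasks beyond the base
}

function distributeTasks(desiredCount: number, strategy: StrategyItem[]): Map<string, number> {
  const total = new Map<string, number>();
  const beyondBase = new Map<string, number>();
  for (const item of strategy) {
    total.set(item.capacityProvider, 0);
    beyondBase.set(item.capacityProvider, 0);
  }

  // 1. Satisfy the base first.
  let remaining = desiredCount;
  for (const item of strategy) {
    const placed = Math.min(item.base, remaining);
    total.set(item.capacityProvider, (total.get(item.capacityProvider) ?? 0) + placed);
    remaining -= placed;
  }

  // 2. Place the rest one by one on the provider that is furthest below its
  //    weighted share (assumed rule; ties go to the provider listed first).
  const weighted = strategy.filter((item) => item.weight > 0);
  while (remaining > 0 && weighted.length > 0) {
    const pick = weighted.reduce((best, item) => {
      const itemLoad = (beyondBase.get(item.capacityProvider) ?? 0) / item.weight;
      const bestLoad = (beyondBase.get(best.capacityProvider) ?? 0) / best.weight;
      return itemLoad < bestLoad ? item : best;
    });
    beyondBase.set(pick.capacityProvider, (beyondBase.get(pick.capacityProvider) ?? 0) + 1);
    total.set(pick.capacityProvider, (total.get(pick.capacityProvider) ?? 0) + 1);
    remaining -= 1;
  }
  return total;
}

// The strategy described above: base of 3 on Fargate, then 2:1 Fargate to Fargate Spot.
const strategy: StrategyItem[] = [
  { capacityProvider: "FARGATE", base: 3, weight: 2 },
  { capacityProvider: "FARGATE_SPOT", base: 0, weight: 1 },
];

const nineTasks = distributeTasks(9, strategy);
// In this model: 7 tasks on FARGATE and 2 on FARGATE_SPOT.
console.log(nineTasks.get("FARGATE"), nineTasks.get("FARGATE_SPOT"));
```

With 3 tasks or fewer, everything runs on regular Fargate, so a Spot interruption can never take the service below its baseline.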
To distribute incoming requests across the running containers, an application load balancer has been set up. I chose the least outstanding requests (LOR) algorithm instead of the default round-robin algorithm. Round robin does not consider the capacity or utilization of the target containers, which leads to over- or under-utilized targets when requests have long processing times. With the LOR algorithm, the application load balancer routes each request to the target with the fewest outstanding requests, which further reduces the response time and balances the utilization of the containers.
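The idea behind LOR can be sketched in a few lines of TypeScript. This only illustrates the selection rule, not how the load balancer is implemented; the `Target` type, the function names, and the tie-breaking (first target wins) are assumptions.

```typescript
// Hypothetical view of a target as the load balancer might track it.
interface Target {
  id: string;
  outstandingRequests: number; // requests sent to the target but not yet answered
}

// Pick the target with the fewest in-flight requests.
function pickLeastOutstanding(targets: Target[]): Target {
  if (targets.length === 0) {
    throw new Error("no registered targets");
  }
  return targets.reduce((best, candidate) =>
    candidate.outstandingRequests < best.outstandingRequests ? candidate : best
  );
}

// Route one request: choose a target and count the request as in flight.
function routeRequest(targets: Target[]): string {
  const target = pickLeastOutstanding(targets);
  target.outstandingRequests += 1;
  return target.id;
}

const targets: Target[] = [
  { id: "task-a", outstandingRequests: 5 }, // busy with slow requests
  { id: "task-b", outstandingRequests: 1 },
  { id: "task-c", outstandingRequests: 3 },
];

console.log(routeRequest(targets)); // task-b, the least loaded target
```

Round robin would have sent the next request to whichever target was next in line, even if that was the busy `task-a`.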
Each task has 3 containers: Nginx, PHP-FPM, and the X-Ray daemon. I chose Nginx instead of Apache because the W3 Total Cache plugin supports setting up page caching at the Nginx level instead of the application level. The X-Ray daemon is there for the AWS X-Ray plugin in WordPress, which collects performance data and information about each request and sends it to AWS X-Ray through the daemon.
WordPress is a stateful application: each time you change its configuration, files in the WordPress installation change. A Docker container, however, is ephemeral and loses all of its changes every time it restarts. If WordPress ran in such an environment as-is, its configuration would be lost on scale-in or scale-out events. To handle this, I use Amazon EFS to store the WordPress application. EFS is a shared file system, so whenever a WordPress file changes, the other running containers see the same file instead of an isolated local copy. I have also set up a lifecycle rule that moves files that haven't been accessed in the past 90 days to the Infrequent Access storage class, which lowers the cost.
A WordPress application requires a MySQL database. Amazon RDS is a managed database service, which means I don't need to take care of database setup, patching, and backups. RDS supports multiple database engines, and Aurora is one of them. Amazon Aurora is a relational database developed by AWS that is compatible with MySQL and PostgreSQL. According to AWS, it delivers up to five times the throughput of a standard MySQL database, with the security, availability, and reliability of commercial databases at 1/10th the cost.
Aurora has 2 types of database - Provisioned and Serverless. A provisioned cluster is a regular database cluster with a primary node and multiple read replicas. I chose Aurora Serverless to run the MySQL database because I am not sure what node size should be used. Aurora Serverless is a little different from the provisioned type and is similar to Fargate: I only need to set how many ACUs (Aurora capacity units) it needs on startup and the maximum ACUs it can scale to, and it natively auto-scales according to the CPU usage and the number of connections of the database. Aurora Serverless also supports auto-pause: when the database has had no activity for a period of time it is paused, and it comes back online when there is activity again. I have disabled auto-pause because once the database is paused, it takes at least 30 seconds to come back online.
But take note! An Aurora Serverless database is divided into 3 layers:
- Proxy
- Compute
- Storage
Although the proxy and storage layers are multi-AZ, the compute layer is SINGLE-AZ, which means that when your database goes down at the compute layer, it takes more time to come back online compared to a provisioned cluster.
IF YOU REQUIRE A HIGHLY AVAILABLE DATABASE, DO NOT USE AURORA SERVERLESS. PLEASE USE AN AURORA PROVISIONED CLUSTER INSTEAD.
I chose Memcached instead of Redis as the in-memory cache store, and I use Amazon ElastiCache to run the Memcached cluster. Amazon ElastiCache is a fully managed in-memory data store compatible with Redis and Memcached, which means I don't need to take care of setup and patching.
According to these 2 blog posts, Memcached has better performance than Redis:
- Should I use Memcached or Redis for WordPress caching?
- Redis & Memcached Cache for WordPress on VPS or Cloud
To reduce the response time and put the cache into Memcached from WordPress, I have installed W3 Total Cache in WordPress. This plugin can set up different caches, such as the Object Cache and the Fragment Cache, in Memcached. As mentioned above, I am using Nginx to run the WordPress application, and W3TC supports setting up the Page Cache at the Nginx level instead of in the WordPress application. This means that when a request comes in, Nginx first looks up the page cache in Memcached; on a cache hit, Nginx returns the cached page instead of forwarding the request to PHP-FPM, which further reduces the response time.
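The lookup order described above can be sketched as follows. This is a simplified model in TypeScript, not the Nginx or W3TC configuration: an in-memory `Map` stands in for Memcached, `renderWithPhp` is a hypothetical stand-in for PHP-FPM, and storing the rendered page on a miss is an assumption about how the cache gets filled.

```typescript
interface PageResponse {
  body: string;
  cacheHit: boolean;
}

// Builds a request handler that checks the page cache before calling the backend.
function createPageCacheFrontend(
  renderWithPhp: (path: string) => string
): (path: string) => PageResponse {
  const pageCache = new Map<string, string>(); // stand-in for Memcached

  return function handle(path: string): PageResponse {
    const cached = pageCache.get(path);
    if (cached !== undefined) {
      // Cache hit: answer directly, the backend is never called.
      return { body: cached, cacheHit: true };
    }
    // Cache miss: forward to the backend and keep the result for next time.
    const body = renderWithPhp(path);
    pageCache.set(path, body);
    return { body, cacheHit: false };
  };
}

let backendCalls = 0;
const handle = createPageCacheFrontend((path) => {
  backendCalls += 1;
  return "<html>page for " + path + "</html>";
});

handle("/shop");                // miss: rendered by the backend
const second = handle("/shop"); // hit: served from the cache
console.log(second.cacheHit, backendCalls); // true 1
```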
To further reduce the response time, a CloudFront distribution has been set up in front of the application load balancer. The distribution caches static content and responses at edge locations around the globe. When a user browses the WordPress application and the requested content is cached at the edge location, the distribution returns the cached copy instead of forwarding the request to the servers. Edge locations are usually close to the user, which provides lower latency and faster delivery.
There are 2 hosted zones in Route 53, one public and one private. The public hosted zone has multiple records, including an alias A record for the WordPress site and an alias A record for the static files, both pointing to CloudFront distributions.
The private hosted zone is associated with the VPC. In the private hosted zone, multiple A records were created for the hostnames of the different AWS resources, including the Aurora Serverless cluster, the EFS file system, the ElastiCache Memcached cluster, the Elasticsearch domain, and the private application load balancer. These records give the created AWS resources consistent and memorable names.
Unlike the usual subnet division, the VPC consists of 3 types of subnet: Public, Private, and Isolated. The first 2 types are the same as usual: outbound traffic in the Public subnets routes through the internet gateway, and outbound traffic in the Private subnets routes through the NAT gateway. Resources in the Isolated subnets have no internet access and only local routing, which keeps the resources inside isolated from the internet.
Also, a network access control list has been set up for the VPC to ensure that any incoming traffic is for browsing the WordPress application.
To encrypt data in transit, both the application load balancer and the CloudFront distribution are set up with TLS certificates issued by ACM. Thanks to the L2 constructs in CDK, creating these certificates in ACM also creates the DNS records needed for validation in the Route 53 hosted zone. To further require that users connect to the WordPress application over HTTPS, both the CloudFront distribution and the application load balancer redirect HTTP to HTTPS.
For security, all stored data is encrypted using AWS managed keys in AWS KMS. AWS managed keys are fully managed by AWS, including rotation. The Aurora Serverless cluster, the EFS file system, and the S3 buckets for static and logging files are all encrypted using different AWS managed keys in KMS.
To protect the WordPress application from common web exploits, a web ACL has been set up on the CloudFront distribution. I have assigned multiple AWS managed rule groups to this web ACL, including AmazonIpReputationList. Please note that some rules in some of the rule groups have been excluded; otherwise, they would block most of the requests to your WordPress application. With these rule groups, the web ACL provides general protection against a wide variety of common threats and vulnerabilities.
DDoS is one of the most common attacks on the internet. AWS Shield provides always-on detection and automatic inline mitigations that minimize application downtime and latency. AWS Shield Standard is automatically enabled for the CloudFront distribution and the Route 53 hosted zone and, according to AWS, protects against all known infrastructure (Layer 3 & 4) attacks.
AWS Config has been enabled to record and evaluate the configuration of the created resources, and multiple managed rules in AWS Config have been configured. The tag aws-config:cloudformation:stack-name is associated with every created resource during the CDK deployment, and every rule in AWS Config is scoped by this tag so that it only evaluates the tagged resources. If AWS Config records a configuration change that makes a resource non-compliant, it sends an email through SNS to notify the administrator.
In the following sections, I will talk about:
- How to ensure your internet-facing application load balancer only allows traffic from CloudFront?
- How to secure the admin page (wp-admin)?
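The CloudFront distribution adds a custom header to the requests it forwards, and the WAF web ACL associated with the application load balancer checks for it. The value of the custom header is defined during the CDK deployment and is the base64-encoded domain name of the public hosted zone. The value must be the same on the CloudFront distribution and the associated WAF web ACL; otherwise, all requests will be blocked.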
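A minimal TypeScript sketch of this check is shown below. It only models the decision, not the actual WAF rule: the header name `x-origin-verify`, the domain `example.com`, and the function names are hypothetical, and the small base64 encoder is written by hand only to keep the sketch free of dependencies (in Node.js, `Buffer.from(value).toString("base64")` would do the same).

```typescript
const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Dependency-free base64 encoder for ASCII input such as a domain name.
function base64EncodeAscii(input: string): string {
  let output = "";
  for (let i = 0; i < input.length; i += 3) {
    const hasSecond = i + 1 < input.length;
    const hasThird = i + 2 < input.length;
    const triple =
      (input.charCodeAt(i) << 16) |
      ((hasSecond ? input.charCodeAt(i + 1) : 0) << 8) |
      (hasThird ? input.charCodeAt(i + 2) : 0);
    output += BASE64_ALPHABET.charAt((triple >> 18) & 63);
    output += BASE64_ALPHABET.charAt((triple >> 12) & 63);
    output += hasSecond ? BASE64_ALPHABET.charAt((triple >> 6) & 63) : "=";
    output += hasThird ? BASE64_ALPHABET.charAt(triple & 63) : "=";
  }
  return output;
}

type WafAction = "ALLOW" | "BLOCK";

// Hypothetical header name; the real name is whatever the CDK stack configures.
const ORIGIN_HEADER = "x-origin-verify";

// Models the web ACL on the load balancer: allow only requests carrying the expected value.
function evaluateOriginHeader(
  headers: Record<string, string>,
  expectedValue: string
): WafAction {
  return headers[ORIGIN_HEADER] === expectedValue ? "ALLOW" : "BLOCK";
}

// Both CloudFront and the web ACL derive the value from the same domain name.
const expectedValue = base64EncodeAscii("example.com");

console.log(evaluateOriginHeader({ [ORIGIN_HEADER]: expectedValue }, expectedValue)); // ALLOW
console.log(evaluateOriginHeader({}, expectedValue)); // BLOCK, e.g. a request sent straight to the load balancer
```

Note that base64 is an encoding, not a secret: anyone who knows the domain name can compute the value, so the header mainly filters out traffic that bypasses CloudFront by accident or by scanning.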
Anyone who has ever used WordPress knows that wp-admin is the admin page. Left exposed, it is a major security weakness, because attackers can make unlimited brute-force attempts to crack the admin password.
Two methods are implemented in this solution to secure the admin page. The first is an IP address whitelist; the second is a client VPN used to access the private application load balancer.
For the first method, the IP address whitelist needs to be filled in before the CDK deployment. After deployment, an IP set is created in WAF. The WAF web ACL associated with the application load balancer checks that inbound requests come from the CloudFront distribution, and also validates the source IP address of any request whose path has the 'admin' prefix. Requests to the admin page from IP addresses that are not on the whitelist are blocked.
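The whitelist rule can be modelled roughly like this in TypeScript. It is a sketch of the decision logic only, not the WAF configuration: the `/wp-admin` path prefix, the function names, and the example addresses (taken from the documentation ranges) are assumptions, and only IPv4 CIDR entries are handled.

```typescript
// Convert a dotted IPv4 address to a number, e.g. "192.0.2.1" -> 3221225985.
function ipv4ToNumber(ip: string): number {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// True when the address falls inside the CIDR block, e.g. "203.0.113.0/24".
function isInCidr(ip: string, cidr: string): boolean {
  const parts = cidr.split("/");
  const base = parts[0] ?? "";
  const prefixLength = Number(parts[1] ?? "32");
  const blockSize = Math.pow(2, 32 - prefixLength);
  return (
    Math.floor(ipv4ToNumber(ip) / blockSize) ===
    Math.floor(ipv4ToNumber(base) / blockSize)
  );
}

type AdminRuleAction = "ALLOW" | "BLOCK";

// Only admin paths are restricted; everything else passes this rule.
function evaluateAdminRule(
  path: string,
  clientIp: string,
  allowedCidrs: string[]
): AdminRuleAction {
  if (!path.startsWith("/wp-admin")) {
    return "ALLOW";
  }
  return allowedCidrs.some((cidr) => isInCidr(clientIp, cidr)) ? "ALLOW" : "BLOCK";
}

// Hypothetical whitelist filled in before deployment.
const adminWhitelist = ["203.0.113.0/24", "192.0.2.10/32"];

console.log(evaluateAdminRule("/wp-admin/index.php", "203.0.113.7", adminWhitelist));  // ALLOW
console.log(evaluateAdminRule("/wp-admin/index.php", "198.51.100.9", adminWhitelist)); // BLOCK
console.log(evaluateAdminRule("/shop", "198.51.100.9", adminWhitelist));               // ALLOW
```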
This should be the easiest way to protect the WordPress admin page, but it requires the end user to have a static public IPv4 address. The whitelist needs to be updated whenever the IP address changes.
The second method should be the most secure, feasible, and flexible. AWS Client VPN is a managed client-based VPN service that allows end users to securely access private or restricted resources on AWS or on-premises, and it works with any OpenVPN-based client. An AWS Client VPN endpoint is created in the VPC; every end user connected to this VPN has internet access through the NAT gateway and access to the private resources. A private application load balancer has been created for connected VPN users, which allows them to securely access the WordPress application, including the admin page.
Although the Aurora Serverless cluster already has snapshots and backups in RDS, AWS Backup is set up to manage backups across the AWS services that WordPress uses, including the Aurora Serverless cluster and the EFS file system.
AWS Serverless WordPress
Please read the blog post for an introduction and explanation: Dev.to: Best Practices for Running WordPress on AWS using CDK
WordPress Plugin Used
- W3 Total Cache
- WP Offload Media Lite
- Multiple Domain
- HumanMade - AWS-XRay (Working on making it work...)
Deployment - (To be updated)
Before getting started
Please make sure you have/are
- Using bash
- Node.js installed
- NPM and Yarn installed
- Installed and running Docker
- Installed and configured AWS CLI
- Installed the latest version of AWS CDK CLI
Please note that this stack can only be deployed into us-east-1.
0. You should have a public hosted zone in Route 53
- Initialize the CDK project, run
- Deploy the CDK Toolkit stack on to the target region, run
cdk bootstrap aws://AWS_ACCOUNT_ID/AWS_REGION --profile AWS_PROFILE_NAME
- Copy config.sample.toml and rename the copy
- Run make easy-rsa-init gen-cert import-cert to generate the certificate for the Client VPN
- Modify the…