So a few weeks ago I was asked what my strategy was for accessing internal AWS resources such as S3, DynamoDB, etc., where access is possible over VPC endpoints as well as the internet. My first point of reference for them was the great map by Corey Quinn, Chief Cloud Economist at The Duckbill Group, which looks at the costs of moving data around AWS.
In my mind I also had thoughts around ease of use for developers, security and speed, in addition to cost, that led me to a single solution. However, when discussing it with the individual I realized that I'd never formally assessed my view. So that's what I hope to achieve here. I'll look at each option, compare it against the Well-Architected Framework and see how my initial thoughts hold up.
So what are the various options? I personally see 3 common patterns, and while there are probably more, these are the ones I'll focus on:
- Direct internet via Internet Gateway
- Internet access via NAT Gateway
- Access via VPC Endpoint
I'm also not going to look at centralized solutions. For me these have a significant impact on cost, security and operations, and unless there is a really justifiable reason I would not recommend them.
When calculating costs I will be basing calculations on transferring 500GB to/from AWS S3 in 4 different regions (N. Virginia, Ireland, London and Sydney). This should give enough of a comparison, as most costs are higher outside N. Virginia.
So this is the simplest solution, and 15 years ago it was the only option for EC2 to access anything. With the introduction of VPCs in 2009 this was still the simplest solution, and any other access had to be via an EC2-based proxy or NAT solution.
Direct Internet is also the cheapest solution at $5 per 500GB with no change in costs across different regions. As such there is no arguing the 2 stars for Cost Optimization.
Arguably it is the most reliable and performant of the solutions, as there are no additional devices in the network path to fail or impact traffic. However, it is subject to general internet congestion, and someone else's popular website could slow down API calls or transfers from AWS services. As such each of these pillars is given a single star in my view.
However, it is the least secure. All instances would have to be in a public subnet and have a public IP. This is a huge attack vector for the environment. I have heard in the past that, with no additional controls, EC2 instances with public IPs will be compromised in less than 5 minutes. This means that the management of instances becomes very onerous and increases the operational controls and effort required to maintain a secure and functioning environment.
In my view security is so fundamental that it carries more weight than the other pillars. If a solution is not secure, it doesn't really matter to some extent whether it is high performance and resilient. So in my view the lack of security drops 2 stars, along with dropping 2 stars for operational overheads.
This gives Direct Internet Access a score of 4 out of 10 stars.
In 2015, when NAT Gateways were introduced, they were seen as a major step for networking. Self-built NAT solutions that relied on alarms and user data scripts could be removed and a fully managed NAT solution could be dropped in. Although there was a small cost difference compared to a self-managed solution, the improvements in operations meant most large organizations migrated quickly to NAT Gateways.
This improved the security of access to the internet and AWS services dramatically by removing the need for a public IP. It also removed the self-managed instances some organizations were using. However, there is still some concern, as NAT Gateways do not offer any content filtering. Even so, NAT Gateway gets a respectable 1 star for the security pillar.
So what's the impact on cost? Based on the 5 scenarios, NAT Gateways come in 3rd with a cost of $70 a month for 500GB of data. While that is not a significant amount, if you are transferring a lot of data between your VPC and AWS services it could soon mount up, so it is something to be aware of. Another thing to be aware of is that NAT gateways are multi-AZ but have a single route-table entry, so in some scenarios traffic might traverse AZs. This is a 10% overhead on charges, and again, while not significant at low data volumes, if you are moving a petabyte of data that would be approximately $2000 extra. So for the cost optimization pillar NAT Gateway gets 1 star.
So what about reliability and performance? Although a managed service improves these areas over a self-managed solution, as with direct internet access, traffic to AWS services through a NAT gateway goes over the internet and can be subject to variance in performance. So again each of these pillars is given a single star.
Finally, operations. Well, in this area things are drastically improved. There is a reduction in the operational management of security, but there is still a lot that needs to be done to manage access to services. As such, for me NAT gateways get 1 star for Operational Excellence.
This gives Internet access via NAT Gateway a respectable score of 5 out of 10 stars.
2015 was a good year for networking, and 7 months before NAT Gateways were released AWS released the VPC Endpoint service. Initially only for S3, and covering 108 services as of 5th December 2021, it allows traffic to be routed over AWS' private network rather than the internet. As a managed service it simplified routing to S3 as well as a host of other functionality.
For example, in addition to private routing, 86 of the current 108 services supporting VPC endpoints also support VPC endpoint policies. This means that access to AWS services can be controlled at the endpoint level as well as the IAM level. For some services, such as S3, VPC endpoints can also be referenced in the resource policy, which restricts access to the resource to specific VPC endpoints. For me this increased security pushes the security pillar to 2 stars, as resource, principal and route can all have policies applied, ensuring the highest level of control.
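As a rough illustration of the resource-policy side of this, the sketch below builds an S3 bucket policy that denies any request not arriving through a given VPC endpoint, using the `aws:SourceVpce` condition key. The function name, bucket name and endpoint ID are placeholders of mine rather than values from a real environment. A blanket deny like this can also lock out console and administrative access, so treat it as a starting point only.

```python
import json


def bucket_policy_for_endpoint(bucket_name: str, vpce_id: str) -> dict:
    """Build a bucket policy that denies S3 access unless it arrives via vpce_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
                # The deny applies whenever the request did NOT use this endpoint.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    policy = bucket_policy_for_endpoint("example-bucket", "vpce-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```

The resulting JSON could be attached as the bucket policy, alongside whatever endpoint policy and IAM policies are already in place.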
For performance, as traffic is now routed over AWS' private network there is less impact from other customers or general internet load. In addition, each VPC Endpoint has 10Gbps capacity with bursting to 40Gbps. If we compare this to a NAT Gateway, which has a throughput of 5Gbps with bursting to 45Gbps, this is a significant improvement, as each service now has dedicated bandwidth which is not only more performant but more consistent. For me this pushes the performance and reliability pillars both to 2 stars.
For operations, whether traditional or DevOps, I see great benefits in VPC endpoints. They can be deployed along with a VPC to provide access to multiple AWS services and can remove the need for any internet access. In addition, if provisioned in the same pipeline as the resources being accessed, they can be referenced in security policies. There are then overheads for managing all these policies and ensuring they meet wider standards, and for that reason I think VPC endpoints drop a star, giving them 1 star in the operational excellence pillar.
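To give a feel for what provisioning an endpoint alongside a VPC might look like, here is a minimal sketch using boto3's `create_vpc_endpoint` call to create an S3 gateway endpoint. The helper names and the VPC and route table IDs are hypothetical, and in a real pipeline this would more likely live in CloudFormation, CDK or Terraform than in a script.

```python
def gateway_endpoint_params(vpc_id, region, route_table_ids, service="s3"):
    """Build the arguments for EC2 CreateVpcEndpoint for a gateway endpoint."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # Regional service names follow the com.amazonaws.<region>.<service> pattern.
        "ServiceName": f"com.amazonaws.{region}.{service}",
        # Routes to the service are added to these route tables.
        "RouteTableIds": list(route_table_ids),
    }


def create_gateway_endpoint(vpc_id, region, route_table_ids):
    """Create the endpoint and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    params = gateway_endpoint_params(vpc_id, region, route_table_ids)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]


if __name__ == "__main__":
    # Hypothetical IDs; printing the parameters makes no AWS call.
    print(gateway_endpoint_params("vpc-0abc1234", "eu-west-1", ["rtb-0def5678"]))
```

The returned endpoint ID is the value that would then be referenced in resource policies created in the same pipeline.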
So what's the impact on cost? Well, surprisingly it is cheaper than a NAT Gateway. For 500GB of data a month the average cost is $13. That is not a huge increase from the $5 for direct internet access and is significantly cheaper than the $70 for NAT Gateway. Taking into account the increased functionality and performance this is a fair trade-off, and for me it gives VPC endpoints 2 stars for Cost Optimization.
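To put the three figures side by side, the short sketch below turns the monthly costs quoted in this post into an effective cost per GB and a multiple of the direct internet cost. It only uses the $5, $70 and $13 figures for 500GB. The effective per-GB number blends any fixed hourly charges with per-GB charges, so it is an assumption to read it as a price that scales linearly to other volumes.

```python
MONTHLY_GB = 500

# Monthly cost in USD for 500GB, as quoted in this post.
quoted_cost = {
    "Direct internet": 5.0,
    "NAT Gateway": 70.0,
    "VPC Endpoint": 13.0,
}


def effective_cost_per_gb(costs, gb=MONTHLY_GB):
    """Blended USD per GB at the quoted volume (fixed and per-GB charges mixed)."""
    return {option: round(cost / gb, 4) for option, cost in costs.items()}


def cost_relative_to(costs, baseline="Direct internet"):
    """Each option's cost as a multiple of the baseline option's cost."""
    base = costs[baseline]
    return {option: round(cost / base, 2) for option, cost in costs.items()}


if __name__ == "__main__":
    per_gb = effective_cost_per_gb(quoted_cost)
    multiples = cost_relative_to(quoted_cost)
    for option in quoted_cost:
        print(f"{option}: ${per_gb[option]}/GB, {multiples[option]}x direct internet")
```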
All this gives access via VPC Endpoints a well deserved score of 9 out of 10 stars.
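For anyone who wants to check the totals or apply their own weighting, the scoring above can be tallied in a few lines. The per-pillar stars are the ones given in this post, with one assumption on my part: I have read the NAT Gateway security score as 1 star, which is what its total of 5 implies.

```python
PILLARS = ("security", "reliability", "performance", "cost", "operations")

# Stars out of 2 per Well-Architected pillar, as scored in this post.
scores = {
    "Direct internet": {
        "security": 0, "reliability": 1, "performance": 1, "cost": 2, "operations": 0,
    },
    "NAT Gateway": {
        # Security star assumed to be 1, consistent with the stated total of 5.
        "security": 1, "reliability": 1, "performance": 1, "cost": 1, "operations": 1,
    },
    "VPC Endpoint": {
        "security": 2, "reliability": 2, "performance": 2, "cost": 2, "operations": 1,
    },
}


def total_stars(option):
    """Sum the stars across all five pillars for one option."""
    return sum(scores[option][pillar] for pillar in PILLARS)


if __name__ == "__main__":
    # Print the options from highest to lowest total.
    for option in sorted(scores, key=total_stars, reverse=True):
        print(f"{option}: {total_stars(option)} out of {2 * len(PILLARS)} stars")
```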
So I am glad that my initial recommendation of VPC endpoints stood up to scrutiny, even if only my own. Many of my recommendations are based on using AWS services for a significant amount of time and seeing incremental changes. As such I don't always dig into why I am making a recommendation.
I've enjoyed writing this, so I might do a few more posts of this type. Please let me know if you found it useful or interesting, or if you disagree with some of the views.