Yan Cui for AWS Heroes

Posted on Sep 9, 2023 • Originally published at theburningmonk.com on Sep 9, 2023

Static IP for Lambda: ingress, egress and bypassing the dreaded NAT Gateway

#aws #lambda #serverless

Many vendors require your application to call their API from a static IP address, so that every request originates from an IP address on their allow-list.

In some cases, they even mandate that you use a static IP address for ingress traffic too, so that they can communicate with your system through a trusted IP address.

These practices are a throwback to the heyday of on-premises software and represent a hurdle for modern, cloud-native applications. Nonetheless, sometimes we are stuck with a vendor and have to do what they ask…

So for those who are building serverless applications and struggling with these archaic requirements, I hope this post offers you some solace.

Static IP for egress

The “proper” way to assign a static IP for egress traffic is to:

Put the Lambda function inside a private subnet in a VPC
Add a NAT Gateway to a public subnet in the VPC and route the private subnets’ internet-bound traffic through it. This grants internet access to the resources inside the private subnets.
Attach an Elastic IP to the NAT Gateway instance. This associates a static, public IP address to the NAT Gateway.
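The steps above can be sketched with boto3-style calls. This is a minimal sketch, not a complete setup: `route_private_subnet_via_nat` and its parameter names are hypothetical, `ec2` is assumed to be a boto3 EC2 client, and the VPC, subnets and route table are assumed to exist already.

```python
def route_private_subnet_via_nat(ec2, public_subnet_id, private_route_table_id):
    """Create a NAT Gateway with an Elastic IP and route a private subnet through it.

    `ec2` is assumed to be a boto3 EC2 client (or an object with the same methods).
    Returns the static public IP that egress traffic will appear to come from.
    """
    # Allocate the Elastic IP that becomes the static egress address.
    eip = ec2.allocate_address(Domain="vpc")

    # The NAT Gateway sits in a public subnet and holds the Elastic IP.
    nat = ec2.create_nat_gateway(
        SubnetId=public_subnet_id,
        AllocationId=eip["AllocationId"],
    )
    nat_id = nat["NatGateway"]["NatGatewayId"]

    # Wait until the NAT Gateway is available before routing traffic to it.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # Send all internet-bound traffic from the private subnet to the NAT Gateway.
    ec2.create_route(
        RouteTableId=private_route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )
    return eip["PublicIp"]
```

With a real client you would call it as `route_private_subnet_via_nat(boto3.client("ec2"), "subnet-...", "rtb-...")`, where the IDs are placeholders for your own resources. The returned IP is the one to give to the vendor.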

Now you can add the static IP address to the 3rd party service’s allow-list.

I also published a YouTube video about this topic.

The video also talked about the cost overhead for this setup, because a NAT Gateway has an uptime cost of ~$33/month even if you don’t use it.
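For a rough idea of where the ~$33 figure comes from, here is the arithmetic as a small sketch. The hourly rate of $0.045 is an assumption (it varies by region, so check the pricing page), and 730 is the average number of hours in a month.

```python
NAT_HOURLY_USD = 0.045  # assumed hourly price; varies by region
HOURS_PER_MONTH = 730   # average hours per month (8,760 hours / 12)


def nat_gateway_idle_cost(hourly_usd=NAT_HOURLY_USD, hours=HOURS_PER_MONTH):
    """Monthly cost of a NAT Gateway that processes no traffic at all."""
    return hourly_usd * hours


print(f"${nat_gateway_idle_cost():.2f}/month")
```

Data processing charges come on top of this, so the real bill is higher once traffic flows through the gateway.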

But in the YouTube comments, kl9560 gave me an ingenious idea on how to bypass NAT Gateway altogether and do away with the associated costs.

Bypassing the NAT Gateway for egress

Lambda functions run on bare-metal EC2 instances managed by the AWS Lambda team. These EC2 instances run inside an AWS-managed VPC. When you connect a function to your own VPC, a Hyperplane ENI is created to allow the function access to resources within your VPC.

You can read more about how private networking works with Lambda on this official documentation page.

Note this important paragraph:

“Lambda creates a Hyperplane ENI when you define a unique subnet plus security group combination for a VPC-enabled function in an account. Lambda reuses the Hyperplane ENI for other VPC-enabled functions in your account that use the same subnet and security group combination.”

This means that (caveats aside) we can identify the ENI for a Lambda function by its subnet and security group configurations.

Take this Lambda function, for example. This is its VPC configuration.

It’s placed within a pair of public subnets, with a route to the public internet via an Internet Gateway. Internet Gateway is free to use, but you still need to pay the data transfer costs.

This is NOT how you would normally configure a Lambda function in VPC. This setup does not grant the function access to the internet because it doesn’t have a public IP address.

But remember what I said earlier about finding the ENI for a function based on its subnet and security group settings?

If we go to the EC2 console and look under “Network Interfaces”, we can find the ENIs that the function would use, one for each subnet.

Now we can create two Elastic IPs and associate them with these two ENIs. It will give the function access to the internet without needing a NAT Gateway. The egress traffic would originate from one of the two Elastic IPs that are associated with the ENIs.
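The same lookup and association can be scripted instead of clicked through in the console. Below is a minimal sketch under a few assumptions: `ec2` is a boto3 EC2 client, Lambda's ENIs are reported with the interface type `lambda`, the function uses a single security group, and the result fits in one page of `describe_network_interfaces`. The function names are hypothetical.

```python
def find_lambda_enis(ec2, subnet_ids, security_group_id):
    """Return the IDs of Lambda-managed ENIs for a subnet + security group combo."""
    resp = ec2.describe_network_interfaces(
        Filters=[
            # Assumption: Hyperplane ENIs created by Lambda have this type.
            {"Name": "interface-type", "Values": ["lambda"]},
            {"Name": "subnet-id", "Values": list(subnet_ids)},
            {"Name": "group-id", "Values": [security_group_id]},
        ]
    )
    return [eni["NetworkInterfaceId"] for eni in resp["NetworkInterfaces"]]


def attach_elastic_ips(ec2, eni_ids):
    """Allocate one Elastic IP per ENI and associate it. Returns {eni_id: public_ip}."""
    public_ips = {}
    for eni_id in eni_ids:
        eip = ec2.allocate_address(Domain="vpc")
        ec2.associate_address(
            AllocationId=eip["AllocationId"],
            NetworkInterfaceId=eni_id,
        )
        public_ips[eni_id] = eip["PublicIp"]
    return public_ips
```

The addresses returned by `attach_elastic_ips` are the ones to add to the vendor's allow-list. Keep in mind that this relies on the same implementation detail as the manual steps, so it carries the same caveats.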

This approach removes the need for a NAT Gateway by taking advantage of (ahem… exploiting) a Lambda implementation detail.

As always, there are dangers when you depend on implementation details within a service and walk outside the well-trodden paths.

This approach breaks down in two ways:

ENIs are created for each unique combination of subnet and security group. If you change either configuration on a function, you get a new set of ENIs and have to re-associate your Elastic IPs with them. This switchover is not an atomic operation, so the function loses access to the internet until it completes. You can minimize this time window through automation, e.g. using a CloudFormation custom resource.
There’s a limit of 65,000 connections per ENI. When the connection limit is reached, Lambda creates a new ENI. In this really unlikely event, some instances of the function would use a different ENI, one without an Elastic IP, because they are the 65,001st or 65,002nd connection.
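Both failure modes show up the same way: an ENI that the function uses but that has no public IP attached. A small check like the one below could be part of whatever automation you build. It is a sketch that assumes its input is the `NetworkInterfaces` list from an EC2 `describe_network_interfaces` response, and the function name is hypothetical.

```python
def enis_missing_public_ip(network_interfaces):
    """Return the IDs of ENIs that have no public IP associated with them.

    `network_interfaces` is assumed to be the "NetworkInterfaces" list from
    an EC2 describe_network_interfaces response.
    """
    return [
        eni["NetworkInterfaceId"]
        for eni in network_interfaces
        # An ENI with an Elastic IP carries an "Association" with a "PublicIp".
        if not eni.get("Association", {}).get("PublicIp")
    ]
```

For example, passing one ENI with an `Association` and one without returns only the ID of the second, which is the ENI that still needs an Elastic IP.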

These are not necessarily showstoppers. There might also be security risks with this approach. I can’t think of any myself, so please let me know if you can!

Nonetheless, I don’t recommend using this approach in production. But for the lower environments, it can be a handy trick to save on those dreaded NAT Gateway costs.

Static IP for ingress

As stated at the start of this post, some vendors also require a static IP for callbacks/webhooks. This can be tricky because neither API Gateway nor the Application Load Balancer (ALB) provides a static IP. And Lambda doesn’t work with the Network Load Balancer (NLB).

Instead, you can use AWS Global Accelerator in front of the ALB.
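As a rough sketch of what that setup involves, the following creates an accelerator, a TCP listener and an endpoint group that points at an existing ALB. It assumes `ga` is a boto3 `globalaccelerator` client and that the ALB (with the Lambda function as its target) already exists. The function name and its parameters are hypothetical.

```python
import uuid


def put_accelerator_in_front_of_alb(ga, name, alb_arn, alb_region, port=443):
    """Create a Global Accelerator for an ALB and return its static IP addresses.

    `ga` is assumed to be a boto3 Global Accelerator client.
    """
    accelerator = ga.create_accelerator(
        Name=name,
        IpAddressType="IPV4",
        Enabled=True,
        IdempotencyToken=str(uuid.uuid4()),
    )["Accelerator"]

    # Accept TCP traffic on a single port and forward it to the endpoint group.
    listener = ga.create_listener(
        AcceleratorArn=accelerator["AcceleratorArn"],
        Protocol="TCP",
        PortRanges=[{"FromPort": port, "ToPort": port}],
        IdempotencyToken=str(uuid.uuid4()),
    )["Listener"]

    # The endpoint group ties the listener to the ALB in its region.
    ga.create_endpoint_group(
        ListenerArn=listener["ListenerArn"],
        EndpointGroupRegion=alb_region,
        EndpointConfigurations=[{"EndpointId": alb_arn}],
        IdempotencyToken=str(uuid.uuid4()),
    )

    # These are the static IPs the vendor can call.
    return [ip for ip_set in accelerator["IpSets"] for ip in ip_set["IpAddresses"]]
```

Note that the Global Accelerator API is served from the us-west-2 region, so a real client would be created with `boto3.client("globalaccelerator", region_name="us-west-2")` regardless of where the ALB lives.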

There are additional costs for Global Accelerator:

$0.025/hour
Data transfer premium fee on top of the regular data transfer out costs. This varies by destination, so please see the official pricing page for more details.

Despite the additional moving part and the associated costs, Global Accelerator is the easiest way to create static IPs for ingress traffic for Lambda.

I hope you’ve enjoyed this article. If you want to level up your serverless game, why not check out the Production-Ready Serverless workshop? I will teach you everything I know about building serverless applications. From structuring projects, testing, deployment and security, to monitoring and troubleshooting in production.

Oh, and when you sign up for one of the upcoming workshops right now, you will also get access to my “Testing Serverless Architectures” course at no extra charge.