The title sounds like a bunch of random nonsensical words strung together, but bear with me. AWS announced RDS Proxy in preview at re:Invent 2019 (it became generally available in 2020). It is a highly available proxy for (you guessed it) RDS that pools database connections for your serverless functions to improve performance and decrease load on your databases. In this post I’ll discuss the purpose of RDS Proxy and how to set it up for your Lambda functions.
Serverless has gained so much popularity in the last few years due to the scalable nature of running code in ephemeral functions. This is great because you only need to run code when you absolutely need to rather than having an always-on server sitting idle when it’s not receiving any traffic. However, one of the things that has taken some time to catch up is making database connections from a serverless function.
When running code on an always-on server that connects to a database (e.g. MySQL or Postgres), your application will usually handle database connection pooling automatically.
Connection pooling is when a certain number of database connections are kept alive, so that a request that needs to query the database doesn’t have to create a new connection every time. Instead, it borrows one of the existing connections from the connection pool. This saves CPU and memory on the database, since the only calls it has to serve are the ones that query data, rather than a constant stream of connections being opened and closed. In Lambda functions we don’t have that luxury, and this has been a problem for some time.
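To make the idea concrete, here's a minimal sketch of a pool in JavaScript. The `ConnectionPool` class and the `createConnection` factory are hypothetical names I made up for illustration, not part of any real driver. Real pools also handle timeouts, health checks, and queueing.

```javascript
// Minimal connection pool sketch. `createConnection` is a hypothetical
// factory standing in for a real driver call that opens a connection.
class ConnectionPool {
  constructor(createConnection, maxSize) {
    this.createConnection = createConnection;
    this.maxSize = maxSize;
    this.idle = [];  // open connections waiting to be borrowed
    this.total = 0;  // connections opened so far
  }

  // Borrow an idle connection, or open a new one if the pool has room.
  acquire() {
    if (this.idle.length > 0) return this.idle.pop();
    if (this.total >= this.maxSize) {
      throw new Error('pool exhausted');
    }
    this.total += 1;
    return this.createConnection();
  }

  // Return the connection to the pool instead of closing it.
  release(connection) {
    this.idle.push(connection);
  }
}

// Count how many times a "real" connection gets opened.
let opened = 0;
const pool = new ConnectionPool(() => ({ id: ++opened }), 2);

for (let i = 0; i < 5; i++) {
  const conn = pool.acquire(); // after the first pass this reuses the idle connection
  pool.release(conn);
}
console.log(opened); // 1
```

Five requests were served, but only one connection was ever opened. That is the saving a pool gives you.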
Since Lambda functions are stateless, meaning that state from one invocation isn’t guaranteed to carry over to any subsequent invocation, database connections can’t reliably be kept alive. In the simplest setup, every database connection is opened and closed during the runtime of the function. This has been a problem for many, and the only real way to mitigate it was to control the concurrency of your Lambdas to prevent them from flooding the database when they scale. There were some solutions that managed to reuse connections while the Lambda was still warm, but this was far from ideal and seemed more of a hack than a solution to a very clear problem in serverless architecture.
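The warm-reuse workaround mentioned above usually looks something like the sketch below. `openConnection` is a hypothetical stand-in for a real driver call (it only counts how often it runs), so treat this as an illustration of the pattern rather than production code.

```javascript
// Sketch of the "warm container" trick. `openConnection` is a hypothetical
// stand-in for a real driver call; it only counts how often it runs.
let connectionsOpened = 0;
function openConnection() {
  connectionsOpened += 1;
  return { query: (sql) => `result of ${sql}` };
}

// Declared outside the handler, so it survives between invocations
// for as long as Lambda keeps this execution environment warm.
let cachedConnection = null;

function handler(event) {
  if (cachedConnection === null) {
    cachedConnection = openConnection(); // only happens on a cold start
  }
  return cachedConnection.query(event.sql);
}

// Two invocations in the same warm environment share one connection.
handler({ sql: 'SELECT 1' });
handler({ sql: 'SELECT 2' });
console.log(connectionsOpened); // 1
```

The catch is that every concurrent execution environment still holds its own connection, so this does nothing to cap the total number of connections when your functions scale out.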
Introducing RDS Proxy
To solve this problem, AWS introduced RDS Proxy. When using RDS Proxy in front of your database, the Proxy will manage connection pools for you and allow your Lambda functions to borrow existing connections rather than create new ones. Then your Proxy will forward your request to RDS and return the result. Besides offering speed and scalability to your Lambda functions, Proxy also automatically fails over to standby databases if your primary database fails while preserving application connections. Proxy helps to mitigate database failures due to too many connections by allowing you to control the number of connections Proxy will allow. This means you could set it to only use 80% of the allowed maximum connections to prevent overloading your database. While it does help to scale, it is still limited by the maximum number of connections your database can allow.
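As a quick back-of-the-envelope example of that percentage cap, here's a small sketch. The function name and the `1000` figure are made up for illustration, and rounding down is my assumption rather than documented Proxy behaviour.

```javascript
// Upper bound on connections the proxy may open, given the database's
// max_connections setting and the percentage configured on the proxy.
// Rounding down is an assumption made for this sketch.
function proxyConnectionCap(maxConnections, maxConnectionsPercent) {
  return Math.floor((maxConnections * maxConnectionsPercent) / 100);
}

// Hypothetical database that allows 1000 connections, with the proxy set to 80%.
console.log(proxyConnectionCap(1000, 80)); // 800
```

The remaining 20% stays available for anything that connects to the database directly, such as migrations or admin tools.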
RDS Proxy can also add security to your databases. It uses database credentials stored in Secrets Manager to connect to the database, but your Lambda can use IAM authentication by using the AWS SDK to generate a signed password to connect to the Proxy.
This is a good summary of the authentication process that occurs when your application queries a database through RDS Proxy.
Credit: https://aws.amazon.com/blogs/compute/introducing-the-serverless-lamp-stack-part-2-relational-databases/
How to use RDS Proxy for MySQL RDS
To use RDS Proxy, you'll need a few core components:
- RDS Proxy (obviously)
- An RDS Proxy Target Group
- A compatible RDS database instance
- Database credentials in Secrets Manager
I'll be using CloudFormation templates to demonstrate the RDS Proxy configuration, but will assume you already have a database instance set up. Here is a very simple RDS Proxy configuration:
DbProxy:
  Type: AWS::RDS::DBProxy
  Properties:
    Auth:
      - AuthScheme: SECRETS # Proxy will use credentials in Secrets Manager to authenticate with the DB
        SecretArn: !Ref DbUserSecret
        IAMAuth: REQUIRED # Connections to the Proxy must use IAM auth
    DebugLogging: true
    DBProxyName: my-db-proxy
    EngineFamily: MYSQL
    IdleClientTimeout: 1800
    RequireTLS: true
    VpcSubnetIds: # RDS Proxy expects subnets in at least two Availability Zones
      - !Ref PrivateSubnet
    VpcSecurityGroupIds:
      - !Ref ProxySG
    RoleArn: !GetAtt ProxyRole.Arn
Something to note here is that in order for RDS Proxy to authenticate with your database, you need to reference credentials stored in Secrets Manager. This also means that you must create a database user that uses those credentials. What I would recommend is to make one database user for regular querying of the database with limited SELECT, UPDATE, INSERT, DELETE permissions, and another to run database migrations with permissions to change tables and such. The credentials for both of those users would need to be stored in Secrets Manager and referenced here in the Proxy configuration.
Here's a template for creating a secret in Secrets Manager that you can reference in the above config:
AppDbUser:
  Type: AWS::SecretsManager::Secret
  Properties:
    Name: app-db-user
    Description: Application user for serverless app
    GenerateSecretString:
      SecretStringTemplate: '{ "username": "appUser"}'
      GenerateStringKey: "password"
      PasswordLength: 30
      ExcludeCharacters: '"@/\' # MySQL doesn't like these characters
Next, you'll need to tell your proxy which database you'd like to forward requests to. To do this you need to create a Proxy Target Group:
ProxyTargetGroup:
  Type: AWS::RDS::DBProxyTargetGroup
  Properties:
    DBProxyName: !Ref DbProxy
    DBInstanceIdentifiers:
      - !Ref MasterDbInstance
    TargetGroupName: default
    ConnectionPoolConfigurationInfo:
      MaxConnectionsPercent: 50
      MaxIdleConnectionsPercent: 12
      ConnectionBorrowTimeout: 30
Target groups are what connect your Proxy to your databases, and they are also where the connection pool settings live. Note that CloudFormation currently requires TargetGroupName to be set to default.
Now, all you need to do is connect to the proxy endpoint the same way you would connect to a regular database instance. The only difference is that rather than using regular database credentials, you'll use the AWS SDK to generate a signed password:
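const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2

const { DB_HOST, DB_USERNAME, DB_NAME } = process.env;

const signer = new AWS.RDS.Signer({
  region: '[REGION]',
  hostname: DB_HOST, // The proxy endpoint
  port: 3306,
  username: DB_USERNAME
});

// Without a callback, getAuthToken returns the signed token synchronously
const token = signer.getAuthToken({
  username: DB_USERNAME
});

const connectionConfig = {
  host: DB_HOST,
  user: DB_USERNAME,
  database: DB_NAME,
  ssl: { rejectUnauthorized: false }, // skips certificate verification; tighten this for production
  password: token,
  authSwitchHandler: function ({ pluginName, pluginData }, cb) {
    // See here: https://dev.mysql.com/doc/internals/en/clear-text-authentication.html
    console.log("Setting new auth handler.");
    if (pluginName === 'mysql_clear_password') {
      // Clear-text auth sends the token as a NUL-terminated string
      cb(null, Buffer.from(token + '\0'));
    }
  }
};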
You can now use this connection config (or a similar version) in the database library of your choosing.
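As a rough sketch of what that might look like inside a Lambda handler: `makeHandler` is a hypothetical helper of mine, and `createConnection` is passed in so the example doesn't depend on a particular driver. I'm assuming a promise-style driver such as `mysql2/promise`, whose `query` resolves to `[rows, fields]`.

```javascript
// Sketch of a Lambda handler that opens a connection through the proxy,
// runs one query, and closes the connection again.
// `createConnection` is injected so the sketch does not depend on a driver;
// with mysql2 it could be `require('mysql2/promise').createConnection`.
function makeHandler(createConnection, connectionConfig) {
  return async function handler(event) {
    const connection = await createConnection(connectionConfig);
    try {
      // Promise-style drivers such as mysql2/promise resolve to [rows, fields]
      const [rows] = await connection.query('SELECT 1 AS ok');
      return { statusCode: 200, body: JSON.stringify(rows) };
    } finally {
      // Always close the client connection, even if the query throws
      await connection.end();
    }
  };
}

// Possible wiring in a real function (assumes mysql2 is installed):
// exports.handler = makeHandler(require('mysql2/promise').createConnection, connectionConfig);
```

Opening and closing a connection per invocation is far less painful here than against the database directly, because the Proxy is the one holding the pooled database connections.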
Your Lambdas are now ready to see very real decreases in their database connection times and you can now sleep soundly. This was a very brief and shallow introduction to RDS Proxy, but I hope I was able to demonstrate its advantages and clear up some of the gotchas I ran into when I first started using it. For anyone interested in a much deeper dive into RDS Proxy, here's a fantastic resource that I hope helps: