When I was working on the license key management solution for my application CloudPouch, I had to face the problem of deferred cancelation of paid subscriptions.
When a user cancels their subscription, the license key must keep working until the end of the current billing period. Since the use case is not complicated, I decided to solve it as simply as possible, using the available serverless services and following the principles of event-driven architecture.
Serverless scheduler
The topic of the serverless scheduler has been coming up fairly regularly for years. In my opinion, this is a recurring problem for which AWS should provide a managed solution. Discussions in the AWS Community Builders Slack channel have not produced any results, and we still need to implement it ourselves.
Fortunately, AWS provides several primitives that you can use while building your own scheduler. The most popular options are:
- DynamoDB TTL
- CRON in EventBridge (used to be in CloudWatch)
- Step Functions
Selection of a solution for a business problem
The solutions mentioned above differ significantly from each other.
First and foremost, they differ in accuracy, that is, the delay between the scheduled time and the actual time of invocation. For example, for DynamoDB TTL it can be up to 48 hours, while with CRON in EventBridge we can invoke a Lambda function every minute. Huge difference, right?
This is the most important, functional characteristic because it directly affects the implementation of business requirements. In numerous instances, it is difficult to imagine business stakeholders accepting a two-day hold-up in a response to a user action.
Other characteristics of these solutions are also important. For many, the maximum number of scheduled actions will be as important as accuracy. Another is how far into the future an action can be postponed. And, of course, whether the action is cyclical or one-off.
Going further, one cannot forget about the cost of running, and the level of complexity of the solution, directly affecting the time of implementation.
Taking all these characteristics into account, it quickly turns out that there are many options and the final solution depends on the requirements.
Business case
The CloudPouch tool is a desktop application for analysis and optimization of AWS cloud costs. It is available in the subscription model, for a small monthly or annual fee. Each customer can cancel their subscription at any time. In such a case, the license key must remain valid until the end of the current billing period, for which the fee has already been charged. Take a look at the t1 time point presented in the diagram.
The mechanism must work analogously for annual subscriptions.
Selection of the solution
Given the business requirements and the characteristics of AWS services, I decided to choose a solution that will be the easiest to implement and use. It was a classic architectural trade-off because the simplicity of implementation was obtained at the cost of accuracy.
Choosing the DynamoDB TTL (Time to Live) mechanism turned out to be the best in this case because:
- Accuracy (delay) is not the most important thing for me; in the worst-case scenario my customer will receive 2 extra days of subscription for free
- I do not need cyclical calls, as a particular license can be canceled only once
- It's simple to implement: the DynamoDB table itself and its stream are all you need
- It is in line with event-driven architecture: AWS will automatically trigger the scheduled action in the future (push instead of pull)
- It lets you easily check which licenses are to be canceled in the future: just view the elements in the DynamoDB table
- It is the cheapest, although at my scale every solution would be free
Implementation of the solution
The solution is very simple and consists of a Lambda function and a DynamoDB table with a stream.
In response to the cancelation of a subscription performed by the user, an event is sent to the eventBus in EventBridge. Thanks to the defined rules, this event is routed to the Lambda function Scheduler (in the real solution, other components consume this event as well). The Scheduler function saves information about the license to be canceled in the Scheduling table. This element ("record") has a basic structure: it consists of the information that identifies the license in the PaidLicenses table and the time when the cancelation is going to happen.
The cancelation date is saved in the Unix time format under the attribute specified in the configuration of the DynamoDB table. I called this attribute ttl; it is defined in the CloudFormation definition of the table, at line 10:
SchedulingTable:
  Type: AWS::DynamoDB::Table
  Properties:
    AttributeDefinitions:
      - AttributeName: PK
        AttributeType: S
    KeySchema:
      - AttributeName: PK
        KeyType: HASH
    TimeToLiveSpecification: # ttl definition
      AttributeName: ttl
      Enabled: true
    BillingMode: PAY_PER_REQUEST
    TableName: Scheduling
    StreamSpecification:
      StreamViewType: OLD_IMAGE
When using Node.js (JavaScript), pay attention to the ttl calculation. It must be provided in seconds, not milliseconds. Hence the division by 1000:
Math.floor(new Date(cancelationDateAsString).getTime() / 1000)
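To make the conversion harder to get wrong, it can be wrapped in a small helper. The sketch below is only an illustration: the function names, the `LICENSE#123` key, and the idea of using the license identifier as `PK` are assumptions, not the actual CloudPouch code. Only the `PK` and `ttl` attribute names come from the table definition above.

```javascript
// Hypothetical helper: converts a date string into a DynamoDB TTL value.
// DynamoDB TTL expects epoch time in seconds, while Date#getTime() returns milliseconds.
function toTtlSeconds(cancelationDateAsString) {
  const millis = new Date(cancelationDateAsString).getTime();
  if (Number.isNaN(millis)) {
    // Fail early instead of storing a malformed ttl value.
    throw new Error(`Invalid cancelation date: ${cancelationDateAsString}`);
  }
  return Math.floor(millis / 1000);
}

// Hypothetical shape of an element in the Scheduling table.
// PK and ttl match the table definition; using the license key as PK is an assumption.
function buildSchedulingItem(licenseKey, cancelationDateAsString) {
  return {
    PK: licenseKey,
    ttl: toTtlSeconds(cancelationDateAsString),
  };
}

const item = buildSchedulingItem('LICENSE#123', '2022-04-01T00:00:00Z');
console.log(item);
```

DynamoDB ignores elements whose TTL attribute is not a Number, so it is safer to fail early than to store a malformed value that would never expire.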
How does it work
The AWS DynamoDB service constantly monitors our table and when the ttl value is older than the current time, it will delete the element.
For the whole solution to make sense, we must react to the deletion events of elements. We do it with a stream, which triggers the DeactivatePaidLicense function. The payload sent to this function contains all the data of the element that was previously stored in the Scheduling table, thanks to which the function knows which license to cancel by making the appropriate update in the PaidLicenses table.
The connection between the stream and the Lambda function is defined in the serverless.yml:
functions:
  deactivatePaidLicense:
    handler: src/deactivatePaidLicense/function.handler
    description: Deactivate license upon Paddle event
    events:
      - stream:
          type: dynamodb
          arn: !GetAtt SchedulingTable.StreamArn
          maximumRetryAttempts: 1
          batchSize: 1
          filterPatterns:
            - eventName: [REMOVE]
I used a filter here, thanks to which the function is called only when an element is removed from the table. In this way, we move logic from our code to the configuration of the AWS infrastructure, which is of course a best practice.
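The function triggered by the stream could look roughly like the sketch below. It is not the real CloudPouch handler: `createHandler`, `extractExpiredElements`, the injected `deactivateLicense` callback, and the sample keys are all hypothetical. It also adds one check that is not part of the configuration above: a manual delete produces a REMOVE event too, while deletions performed by TTL are marked in the stream record with the DynamoDB service principal in `userIdentity`.

```javascript
// Minimal conversion of a DynamoDB-JSON image into a plain object.
// Handles only string (S) and number (N) attributes, which is an assumption
// about what an element of the Scheduling table contains.
function unmarshallSimple(image) {
  const result = {};
  for (const [name, value] of Object.entries(image || {})) {
    if ('S' in value) result[name] = value.S;
    else if ('N' in value) result[name] = Number(value.N);
  }
  return result;
}

// TTL deletions are performed by the DynamoDB service itself, which is
// visible in the userIdentity field of the stream record.
function isTtlDeletion(record) {
  return (
    record.eventName === 'REMOVE' &&
    record.userIdentity !== undefined &&
    record.userIdentity.type === 'Service' &&
    record.userIdentity.principalId === 'dynamodb.amazonaws.com'
  );
}

// Returns the previously stored elements (OLD_IMAGE) removed by TTL.
function extractExpiredElements(event) {
  return (event.Records || [])
    .filter(isTtlDeletion)
    .map((record) => unmarshallSimple(record.dynamodb.OldImage));
}

// Hypothetical factory: deactivateLicense is injected so that the update of
// the PaidLicenses table stays outside of this sketch.
function createHandler(deactivateLicense) {
  return async (event) => {
    for (const element of extractExpiredElements(event)) {
      await deactivateLicense(element.PK);
    }
  };
}

// Sample event: the first record is a TTL deletion, the second a manual delete.
const sampleEvent = {
  Records: [
    {
      eventName: 'REMOVE',
      userIdentity: { type: 'Service', principalId: 'dynamodb.amazonaws.com' },
      dynamodb: { OldImage: { PK: { S: 'LICENSE#123' }, ttl: { N: '1648771200' } } },
    },
    {
      eventName: 'REMOVE',
      dynamodb: { OldImage: { PK: { S: 'LICENSE#456' }, ttl: { N: '1648771200' } } },
    },
  ],
};

console.log(extractExpiredElements(sampleEvent));
```

Whether the extra `userIdentity` check is needed depends on whether anything else ever deletes elements from the table; in a table that only TTL cleans up, the REMOVE filter alone may be enough.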
Note: The Scheduling table is an independent table that only stores scheduled cancelations. I didn't use the single-table design approach here, so I don't have to worry about removals of other entity types.
Solution in action
My cursory tests have shown that DynamoDB in the us-east-1 region deletes the elements with a delay of 10 to 12 minutes after the time designated in the ttl attribute. It is much faster than the above-mentioned 48-hour limit, but still may not be acceptable for many solutions. In addition, I want to highlight that while those are typical delay times, we have no guarantee that they will always be like that.
My observations coincide with tests carried out by Yan Cui some time ago.
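A delay like this can be estimated from the stream record itself. The sketch below is just one possible approach, not a description of how the tests above were run. It assumes the record carries `ApproximateCreationDateTime` in epoch seconds, as in the stream events delivered to Lambda, and that the old image contains the `ttl` attribute. The function name and sample values are made up.

```javascript
// Returns the number of seconds between the scheduled time (ttl attribute)
// and the approximate moment DynamoDB removed the element.
// Assumption: ApproximateCreationDateTime is expressed in epoch seconds.
function deletionDelaySeconds(record) {
  const scheduledAt = Number(record.dynamodb.OldImage.ttl.N);
  const deletedAt = record.dynamodb.ApproximateCreationDateTime;
  return deletedAt - scheduledAt;
}

// Made-up sample record: removal happened 660 seconds after the ttl time.
const sampleRecord = {
  eventName: 'REMOVE',
  dynamodb: {
    ApproximateCreationDateTime: 1648771860,
    OldImage: { PK: { S: 'LICENSE#123' }, ttl: { N: '1648771200' } },
  },
};

const delayMinutes = deletionDelaySeconds(sampleRecord) / 60;
console.log(`Deleted ${delayMinutes} minutes after the scheduled time`);
```

Logging this value in the function triggered by the stream gives a running picture of the real delay in a given region and table.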
In summary, minimal time spent on implementation delivered a fully functional solution that meets my business needs and is easy to operate. And that's what I was aiming for.
CloudPouch
If you are curious about the CloudPouch application, which I've built, please use a free 7-day trial or just watch this short demo video (1:30).