Problem Statement
Recently, a misconfiguration in the source system caused the ETL our team had built to fail, and all the extracted data ended up in the retry mechanism because the ETL function was not expecting that kind of input. Not all of it could be discarded, though: some records were genuine and had customer impact, so simply deleting every retry record was not an option.
After some back-and-forth discussion, it was decided that all the retry records (~2500) would have to be checked manually. The ones that were not genuine would be deleted, and for the ones that needed to be retried, the retry count would be updated in the DynamoDB table. Failed records are retried in sequence for each type, and we were given a strict instruction that we couldn’t just “delete files”.
All this data was stored in AWS (DynamoDB and S3), and checking all 2500 files one by one would have been a tedious and cumbersome process. So, after using some Sheldon brain, we came up with a plan: since we had access, we would automate the work with a script written in TypeScript using the AWS SDK. When we wrote the script, we hit a blocker: we had to use specific roles and profiles protected by Multi-Factor Authentication (MFA), and the script was not allowed through. We eventually figured out a way to supply the MFA code from the script and automate the manual work, which this blog walks through step by step.
Pre-requisite for the solution to work
Now that you know the problem, you probably want the solution, but hold on! There are a bunch of items you need to have configured to make it work for you.
- AWS CLI: You need to have the AWS CLI installed on your machine. This is required to set up the profile that the AWS SDK will use. Here’s the link to install AWS CLI on your system.
- AWS Profile: After installing the AWS CLI, an AWS profile needs to be configured in the CLI so that the AWS SDK can use it to make API calls. Here’s a guide on how to configure profiles in AWS CLI.
- Note: Please make sure you have both the config and credentials files inside your ~/.aws directory, otherwise the AWS SDK will complain about not finding the config file.
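To catch the missing-file problem from the note before the SDK does, a small pre-flight check can help. The following is a minimal sketch, assuming the files live in the default ~/.aws location; `missingAwsFiles` is a hypothetical helper name, not something the AWS SDK provides.

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical helper (not part of the AWS SDK): returns the names of the
// expected AWS files that are missing from `awsDir`.
// Assumption: the files are in the default ~/.aws directory unless another
// directory is passed in.
function missingAwsFiles(awsDir: string = join(homedir(), ".aws")): string[] {
  return ["config", "credentials"].filter(
    name => !existsSync(join(awsDir, name))
  );
}
```

Calling `missingAwsFiles()` at the start of the script and stopping when the returned array is not empty gives a clearer error than a failure in the middle of the first API call.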
The solution to make our lives easier
Okay, now that you know both the problem and the pre-requisites to solve it, let’s dive into the code that’s doing all the magic.
First, we need to install the DynamoDB client packages, which will be used to make API calls to the table, along with the module that generates credentials. So, let’s do it:
yarn add @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb @aws-sdk/credential-providers
Cool, now you have everything required to make your life easier. So let’s start writing some TypeScript.
First, we need to generate the credentials to pass to the DynamoDB client. So, let’s write the function that returns them.
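import { fromIni } from "@aws-sdk/credential-providers";

function getCreds() {
  // fromIni returns a credential provider that reads the given AWS profile.
  // It is lazy: the files are read (and the MFA code requested) only when the
  // client first needs credentials, so any error surfaces on the first API
  // call rather than here. A try/catch around this call would never fire.
  return fromIni({
    profile: "aws-profile",
    // called with the MFA device serial when the profile requires MFA
    mfaCodeProvider: async serial => {
      const mfaCode = await prompt(`Type the AWS MFA code for ${serial}:`);
      return mfaCode;
    },
  });
}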
What’s happening here?
The fromIni function reads the credentials and config files stored in your ~/.aws directory, which hold your access and secret keys and other necessary configuration. When the profile requires MFA, it calls mfaCodeProvider to get your MFA code, uses it to obtain temporary credentials, and voila, you have the credentials to make the API calls you have access to.
Hmm, but how would you make the terminal ask you for the MFA code? For that, Node.js has a built-in module called readline. Using this, we will prompt for the MFA code. Let’s look at the prompt function:
import * as readline from "readline";

function prompt(query: string): Promise<string> {
  // first create the readline interface
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });
  // ask the question and resolve with whatever the user types
  return new Promise(resolve =>
    rl.question(`${query}\n`, ans => {
      rl.close();
      resolve(ans);
    })
  );
}
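A mistyped code only shows up as a failed API call later, so it can be worth validating the input before handing it to the SDK. Below is a minimal sketch, assuming a six-digit numeric code (the format AWS MFA devices produce). `askForMfaCode` and the `Ask` type are hypothetical names introduced here; `ask` stands in for the prompt function above.

```typescript
// Hypothetical type: any function that asks a question and resolves with the
// typed answer, such as the article's `prompt` helper.
type Ask = (query: string) => Promise<string>;

// Assumption: MFA codes are exactly six digits.
const MFA_CODE_PATTERN = /^\d{6}$/;

// Hypothetical helper: keeps asking until the answer looks like an MFA code,
// giving up after `maxAttempts` tries.
async function askForMfaCode(
  ask: Ask,
  serial: string,
  maxAttempts = 3
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const answer = await ask(
      `MFA code for ${serial} (attempt ${attempt}/${maxAttempts}):`
    );
    const code = answer.trim(); // tolerate stray spaces around the code
    if (MFA_CODE_PATTERN.test(code)) {
      return code;
    }
  }
  throw new Error(`No valid MFA code entered after ${maxAttempts} attempts`);
}
```

With this in place, the mfaCodeProvider callback could be written as `serial => askForMfaCode(prompt, serial)`. The check only confirms the format, not that the code is the right one for the device.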
Almost there. Let’s add the functions that make API calls to the DynamoDB table and update the records in bulk, which is not natively supported by DynamoDB.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import {
  ExecuteStatementCommand,
  DynamoDBDocumentClient,
  UpdateCommand,
} from "@aws-sdk/lib-dynamodb";

// Placeholder for the shape of your table items; only `id` is used here.
// Its type is an assumption: adjust it to match your table's key.
interface SomeSchema {
  id: string;
}

const updateOrder = async (
  order: SomeSchema,
  docClient: DynamoDBDocumentClient
): Promise<void> => {
  const updateCommand = new UpdateCommand({
    TableName: "some-table-name",
    Key: {
      id: order.id,
    },
    // update retryCount and bucketName for each unique record
    UpdateExpression: "set retryCount = :x, bucketName = :y",
    ExpressionAttributeValues: {
      ":x": 0,
      ":y": "someS3BucketName",
    },
  });
  await docClient.send(updateCommand);
};

async function fetchAndUpdateOrders(): Promise<void> {
  const client = new DynamoDBClient({
    credentials: getCreds(), // credential provider, resolved on first call
  });
  const docClient = DynamoDBDocumentClient.from(client);
  // Note: one ExecuteStatement call returns at most 1 MB of data. For larger
  // tables, keep calling with the returned NextToken.
  const command = new ExecuteStatementCommand({
    Statement: 'SELECT * FROM "some-table-name"',
  });
  const data = await docClient.send(command); // holds the retrieved items
  const orders = (data.Items ?? []) as Array<SomeSchema>;
  console.log(`Found a total of ${orders.length} orders`);
  await Promise.all(
    // loop through all the orders and update them
    orders.map(order => {
      console.log(`Updating order with id: ${order.id}`);
      return updateOrder(order, docClient);
    })
  );
}
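With around 2500 records, the SELECT above may not fit in a single response, because ExecuteStatement returns results in pages and hands back a NextToken when more remain. Here is a minimal sketch of collecting every page, assuming only the `Items` and `NextToken` response fields. `fetchAllPages` and `Page` are hypothetical names, and the page fetcher is passed in so the loop stays independent of the SDK.

```typescript
// Hypothetical type mirroring the two response fields the loop relies on.
interface Page<T> {
  Items?: T[];
  NextToken?: string;
}

// Hypothetical helper: keeps requesting pages until no NextToken comes back.
async function fetchAllPages<T>(
  fetchPage: (nextToken?: string) => Promise<Page<T>>
): Promise<T[]> {
  const items: T[] = [];
  let nextToken: string | undefined;
  do {
    const page = await fetchPage(nextToken);
    for (const item of page.Items ?? []) {
      items.push(item);
    }
    nextToken = page.NextToken;
  } while (nextToken);
  return items;
}

// Possible wiring with the article's docClient (not executed here):
//   const orders = await fetchAllPages<SomeSchema>(async token => {
//     const out = await docClient.send(
//       new ExecuteStatementCommand({
//         Statement: 'SELECT * FROM "some-table-name"',
//         NextToken: token,
//       })
//     );
//     return { Items: out.Items as SomeSchema[], NextToken: out.NextToken };
//   });
```

Passing the fetcher as a parameter also makes the loop easy to try out against fake pages before pointing it at a real table.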
That’s it. You have the solution to your problem. Now just call the fetchAndUpdateOrders function and your problem’s gone forever (until something else breaks again!).
fetchAndUpdateOrders()
  .then(() => console.log("Finished updating the records!"))
  .catch(err => {
    console.error("Something bad happened!");
    console.error(err);
  });
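One caveat with Promise.all is that it rejects on the first failed update without saying which other records succeeded, and it starts every request at once, which may run into throttling on a busy table. A possible alternative, sketched below with the hypothetical helper `runInBatches`, processes the records in fixed-size batches and collects the failures instead of stopping at the first one. The batch size is an assumption to tune for your table's capacity.

```typescript
// Hypothetical helper: runs `task` over `items` in batches of `batchSize`,
// waits for each batch to settle before starting the next, and returns the
// items whose task rejected together with the rejection reason.
async function runInBatches<T>(
  items: T[],
  batchSize: number,
  task: (item: T) => Promise<void>
): Promise<{ item: T; reason: unknown }[]> {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  const failures: { item: T; reason: unknown }[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // allSettled never rejects; it reports the outcome of every task
    const results = await Promise.allSettled(batch.map(item => task(item)));
    results.forEach((result, index) => {
      if (result.status === "rejected") {
        failures.push({ item: batch[index], reason: result.reason });
      }
    });
  }
  return failures;
}
```

In the script above, this could replace the Promise.all call with something like `runInBatches(orders, 25, order => updateOrder(order, docClient))`, after which the returned list tells you exactly which ids still need attention.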
At the end, you will have the whole solution looking like this:
import { fromIni } from "@aws-sdk/credential-providers";
import * as readline from "readline";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import {
  ExecuteStatementCommand,
  DynamoDBDocumentClient,
  UpdateCommand,
} from "@aws-sdk/lib-dynamodb";

// Placeholder for the shape of your table items; only `id` is used here.
// Its type is an assumption: adjust it to match your table's key.
interface SomeSchema {
  id: string;
}

function getCreds() {
  // fromIni returns a lazy credential provider for the given AWS profile;
  // errors surface on the first API call, not here
  return fromIni({
    profile: "aws-profile",
    // called with the MFA device serial when the profile requires MFA
    mfaCodeProvider: async serial => {
      const mfaCode = await prompt(`Type the AWS MFA code for ${serial}:`);
      return mfaCode;
    },
  });
}

function prompt(query: string): Promise<string> {
  // first create the readline interface
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });
  // ask the question and resolve with whatever the user types
  return new Promise(resolve =>
    rl.question(`${query}\n`, ans => {
      rl.close();
      resolve(ans);
    })
  );
}

const updateOrder = async (
  order: SomeSchema,
  docClient: DynamoDBDocumentClient
): Promise<void> => {
  const updateCommand = new UpdateCommand({
    TableName: "some-table-name",
    Key: {
      id: order.id,
    },
    // update retryCount and bucketName for each unique record
    UpdateExpression: "set retryCount = :x, bucketName = :y",
    ExpressionAttributeValues: {
      ":x": 0,
      ":y": "someS3BucketName",
    },
  });
  await docClient.send(updateCommand);
};

async function fetchAndUpdateOrders(): Promise<void> {
  const client = new DynamoDBClient({
    credentials: getCreds(), // credential provider, resolved on first call
  });
  const docClient = DynamoDBDocumentClient.from(client);
  // Note: one ExecuteStatement call returns at most 1 MB of data. For larger
  // tables, keep calling with the returned NextToken.
  const command = new ExecuteStatementCommand({
    Statement: 'SELECT * FROM "some-table-name"',
  });
  const data = await docClient.send(command); // holds the retrieved items
  const orders = (data.Items ?? []) as Array<SomeSchema>;
  console.log(`Found a total of ${orders.length} orders`);
  await Promise.all(
    // loop through all the orders and update them
    orders.map(order => {
      console.log(`Updating order with id: ${order.id}`);
      return updateOrder(order, docClient);
    })
  );
}

fetchAndUpdateOrders()
  .then(() => console.log("Finished updating the records!"))
  .catch(err => {
    console.error("Something bad happened!");
    console.error(err);
  });
Say hello to seamless bulk updates & goodbye to manual tweaks!