Ryan Nazareth for AWS Community Builders

Posted on Sep 14, 2022

AWS Fraud Detector for classifying fraudulent online registered accounts - Part 2

#api #aws #machinelearning #serverless

In the first part of this blog, we trained the fraud detector model and evaluated its performance. Now we need to deploy the model to make it active, and then use it to make predictions. We will also set up a REST API with API Gateway (Lambda integration) for making real-time predictions. The architecture diagram below shows the workflow. All the code snippets referenced here are available on GitHub.

In the AWS Fraud Detector console, choose the model we have trained from the Models page. Scroll to the top of the Version details page and choose Actions and Deploy model version. On the Deploy model version prompt that appears, choose Deploy version.

The Version details page shows a Status of 'Deploying'. When the model is ready, the Status changes to Active. Once the model is active, we need to associate it with the detector so it can be used for predictions. We also need to update the rule expressions, because the default detector version 1 created from CloudFormation uses the variable amt in the rule expressions, as seen in the screenshot below.

We need to change this to the model insight score, a new variable that is created once model training has completed. This variable is not available when the CloudFormation stack is created because the model has not been trained yet, so we needed an existing variable as a placeholder to keep the rule expressions valid and stop the stack from throwing an error. We can run the following custom script from the command line to update the detector rules and associate the new model with the detector.

python projects/fraud/deploy.py --update_rule 1 --model_version 1.0 --rules_version 2

This will carry out two steps:

Update the existing rule version, given by the --update_rule argument, with the correct expression. This creates a new rule version (incremented from the original version number).
Create a new detector version and associate it with the model version (--model_version argument) and the rules version (--rules_version argument), which should be set to the incremented rule version. This automatically increments the detector version to 2, as the existing version is 1.
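The two steps above can be sketched with boto3 as follows. This is a minimal illustration, not the actual contents of deploy.py: the score thresholds, the rule-to-outcome mapping and the helper names are assumptions made for the example, while the detector, model and rule names are taken from this post. The request builders are kept separate from the AWS calls so they can be inspected without credentials.

```python
from __future__ import annotations

from typing import Any

DETECTOR_ID = "fraud_detector_demo"    # detector name shown in the script output
MODEL_ID = "fraud_model"               # model name used in the teardown script
MODEL_TYPE = "ONLINE_FRAUD_INSIGHTS"   # model type used in the teardown script

# Assumption: thresholds and rule-to-outcome mapping are illustrative only.
RULES = {
    "investigate": ("$fraud_model_insightscore > 900", "high_risk"),
    "review": (
        "$fraud_model_insightscore <= 900 and $fraud_model_insightscore > 700",
        "medium_risk",
    ),
    "approve": ("$fraud_model_insightscore <= 700", "low_risk"),
}


def rule_update_requests(current_version: str) -> list[dict[str, Any]]:
    """Build one update_rule_version request per rule."""
    return [
        {
            "rule": {
                "detectorId": DETECTOR_ID,
                "ruleId": rule_id,
                "ruleVersion": current_version,
            },
            "expression": expression,
            "language": "DETECTORPL",
            "outcomes": [outcome],
        }
        for rule_id, (expression, outcome) in RULES.items()
    ]


def detector_version_request(model_version: str, rules_version: str) -> dict[str, Any]:
    """Build the create_detector_version request linking the rules and the model."""
    return {
        "detectorId": DETECTOR_ID,
        "rules": [
            {"detectorId": DETECTOR_ID, "ruleId": rule_id, "ruleVersion": rules_version}
            for rule_id in RULES
        ],
        "modelVersions": [
            {
                "modelId": MODEL_ID,
                "modelType": MODEL_TYPE,
                "modelVersionNumber": model_version,
            }
        ],
        "ruleExecutionMode": "FIRST_MATCHED",
    }


def deploy(update_rule: str, model_version: str, rules_version: str) -> dict[str, Any]:
    """Update the rules, then create the new detector version (needs AWS credentials)."""
    import boto3  # imported lazily so the builders above work without boto3 installed

    client = boto3.client("frauddetector")
    for request in rule_update_requests(update_rule):
        client.update_rule_version(**request)
    return client.create_detector_version(
        **detector_version_request(model_version, rules_version)
    )
```

Calling deploy("1", "1.0", "2") would mirror the command line arguments used above.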

If the script runs successfully, we should see the following output streamed to stdout.

26-06-2022 04:50:34 : INFO : deploy : main : 121 : Updating rule version 1
26-06-2022 04:50:34 : INFO : deploy : update_detector_rules : 71 : Updating Investigate rule ....
{'detectorId': 'fraud_detector_demo', 'ruleId': 'investigate', 'ruleVersion': '2'}

26-06-2022 04:50:35 : INFO : deploy : update_detector_rules : 80 : Updating review rule ....
{'detectorId': 'fraud_detector_demo', 'ruleId': 'review', 'ruleVersion': '2'}

26-06-2022 04:50:35 : INFO : deploy : update_detector_rules : 89 : Updating approve rule ....
{'detectorId': 'fraud_detector_demo', 'ruleId': 'approve', 'ruleVersion': '2'}

26-06-2022 04:50:35 : INFO : deploy : main : 123 : Deploying trained model version 1.0 to new detector version 
{'detectorId': 'fraud_detector_demo', 'detectorVersionId': '2', 'status': 'DRAFT', 'ResponseMetadata': {'RequestId': 'da37d973-2c43-4c56-93e5-f9b9bd132bb3', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sun, 26 Jun 2022 03:50:36 GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length': '77', 'connection': 'keep-alive', 'x-amzn-requestid': 'da37d973-2c43-4c56-93e5-f9b9bd132bb3'}, 'RetryAttempts': 0}}

The image below shows the model associated with the new detector version and the corrected rule expressions, which use the fraud_model_insightscore variable.

In the next section, we will set up API Gateway to create a REST API endpoint that serves HTTP requests, with Lambda in the backend.

Setting up API Gateway

In this section, we will walk through the steps to create the REST API.

Open the API Gateway console, select Create API and choose REST API as the type.
To create an empty API, select Create New API and then New API. In Settings, choose an API name such as FraudLambdaProxy and an optional Description. Then choose Create API.
Choose the root resource (/) just created and select Create Method from the Actions menu. Then select GET from the dropdown menu.
For Integration Type select Lambda Function and choose Use Lambda Proxy integration. The Lambda function should already have been created via CloudFormation in part 1. Select the Lambda Region where the function was created (us-east-1), select PredictFraudModel from the dropdown in the Lambda Function field, and then click Save.
In the Method Execution pane, choose Method Request.
In Settings, set Request Validator to Validate query string parameters and headers. Leave Authorization as None.
Expand the URL Query String Parameters dropdown, then choose Add query string.
Enter the following variables one by one, each as a separate name field. Mark all as required except for the flow_definition variable.

 amt, category, cc_num, city, city_pop, event_timestamp, first, flow_definition, gender, job, last, merchant, state, street, trans_num, zip

Go back to the Method Execution pane. It should look like the screenshot below.

Choose Integration Request.
Choose the Mapping Templates dropdown and then choose Add mapping template.
For the Content-Type field, enter application/json and then choose the check mark icon.
In the pop-up that appears, choose Yes to secure this integration.
For Request body passthrough, choose When there are no templates defined (recommended).
In the mapping template editor, copy and replace the existing script with the following code:

#if("$input.params('flow_definition')" != "")
#set( $my_default_value = "$input.params('flow_definition')")
#else
#set ($my_default_value = "ignore")
#end


{
  "variables": {
        "trans_num":"$input.params('trans_num')",
        "amt":"$input.params('amt')",
        "zip":"$input.params('zip')",
        "city":"$input.params('city')",
        "first":"$input.params('first')",
        "job":"$input.params('job')",
        "street":"$input.params('street')",
        "category":"$input.params('category')",
        "city_pop":"$input.params('city_pop')",
        "gender":"$input.params('gender')",
        "cc_num":"$input.params('cc_num')",
        "last":"$input.params('last')",
        "state":"$input.params('state')",
        "merchant":"$input.params('merchant')"
  },
  "EVENT_TIMESTAMP":"$input.params('event_timestamp')",
  "flow_definition":"$my_default_value"
}

Choose Save, and go back to the Method Execution pane. Click the Test button on the left.
In the Query Strings box, paste the following:

trans_num=6cee353a9d618adfbb12ecad9d427244&amt=245.97&zip=97383&city=Stayton&first=Erica&job=Engineer, biomedical&street=213 Girll Expressway&category=shopping_pos&city_pop=116001&gender=F&cc_num=180046165512893&last=Walker&state=OR&merchant=fraud_Macejkovic-Lesch&event_timestamp=2020-10-13T09:21:53.000Z&flow_definition=arn:aws:sagemaker:us-east-1:376337229415:flow-definition/fraud-detector-a2i-1656277295743

If successful, you should see the response and logs as in the screenshot below. You can also navigate to the CloudWatch log group for the Lambda function and check that the invocation ran successfully.
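To make the effect of the mapping template more concrete, the sketch below reproduces in plain Python the payload it builds from the query string. This is only an emulation for illustration, API Gateway evaluates the real template in VTL, and the function name build_payload is an assumption. Missing parameters become empty strings, as with $input.params, and flow_definition falls back to "ignore".

```python
from urllib.parse import parse_qs

# Variable names copied from the mapping template above.
VARIABLE_NAMES = [
    "trans_num", "amt", "zip", "city", "first", "job", "street",
    "category", "city_pop", "gender", "cc_num", "last", "state", "merchant",
]


def build_payload(query_string: str) -> dict:
    """Emulate the mapping template: nest the variables and default flow_definition."""
    # parse_qs returns a list per key; keep the first value of each parameter.
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    return {
        "variables": {name: params.get(name, "") for name in VARIABLE_NAMES},
        "EVENT_TIMESTAMP": params.get("event_timestamp", ""),
        # Same default as the #if / #else branch at the top of the template.
        "flow_definition": params.get("flow_definition") or "ignore",
    }


if __name__ == "__main__":
    example = "trans_num=6cee353a9d618adfbb12ecad9d427244&amt=245.97&state=OR"
    print(build_payload(example))
```

The resulting dictionary has the same three top-level keys (variables, EVENT_TIMESTAMP and flow_definition) that the Lambda function receives after the template is applied.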

We can then proceed to deploying the API. Go back to Resources, then choose Actions and Deploy API. Select New Stage as the Deployment Stage and name it dev. You should see the API endpoint to invoke on the console. Finally, make sure logging is set up so that errors in the REST API can be debugged, by following the instructions here. The setup should look like the image below. Note that when you add the IAM role in the API Gateway console, it should automatically add the log group in the format API-Gateway-Execution-Logs_apiId/stageName. The ARN for the log group ends with dev:*. Include the ARN only up to the stage name dev, as shown in the image below; otherwise the validation checks will fail.
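As a small illustration of the ARN trimming described above, the helper below drops the trailing :* from a log group ARN. The function name and the example ARN (placeholder account ID 123456789012 and API ID abc123) are hypothetical and only show the expected shape.

```python
def stage_log_group_arn(arn: str) -> str:
    """Return the log group ARN without the trailing ':*' wildcard, if present."""
    suffix = ":*"
    return arn[: -len(suffix)] if arn.endswith(suffix) else arn


if __name__ == "__main__":
    # Placeholder ARN for illustration; substitute your own account and API IDs.
    full_arn = (
        "arn:aws:logs:us-east-1:123456789012:"
        "log-group:API-Gateway-Execution-Logs_abc123/dev:*"
    )
    print(stage_log_group_arn(full_arn))
```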

To test the API's new endpoint, we can use Postman to send an API request. Create a Postman account and select GET from the list of request types. Since the GET method is configured on the / root resource, we can invoke the API endpoint https://d9d16i7hbc.execute-api.us-east-1.amazonaws.com/dev with the query string parameters appended at the end (in key=value format, separated by &). Paste the following URL in the box as in the screenshot below. You should see the parameters and values automatically parsed and populated in the KEY/VALUE rows below.
Click Send and you should see the response body at the bottom.

https://d9d16i7hbc.execute-api.us-east-1.amazonaws.com/dev?trans_num=6cee353a9d618adfbb12ecad9d427244&amt=245.97&zip=97383&city=Stayton&first=Erica&job='Engineer, biomedical'&street='213 Girll Expressway'&category=shopping_pos&city_pop=116001&gender=F&cc_num=180046165512893&last=Walker&state=OR&merchant=fraud_Macejkovic-Lesch&event_timestamp=2020-10-13T09:21:53.000Z
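The same request can also be sent from a script instead of Postman. The sketch below uses only the Python standard library and percent-encodes the values, so spaces and commas in fields such as job and street need no manual quoting. The helper names are assumptions, the endpoint is the example URL from this post (replace it with your own stage URL), and treating the response body as JSON is also an assumption.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Example endpoint from this post; replace with your own deployed stage URL.
BASE_URL = "https://d9d16i7hbc.execute-api.us-east-1.amazonaws.com/dev"

EVENT = {
    "trans_num": "6cee353a9d618adfbb12ecad9d427244",
    "amt": "245.97",
    "zip": "97383",
    "city": "Stayton",
    "first": "Erica",
    "job": "Engineer, biomedical",
    "street": "213 Girll Expressway",
    "category": "shopping_pos",
    "city_pop": "116001",
    "gender": "F",
    "cc_num": "180046165512893",
    "last": "Walker",
    "state": "OR",
    "merchant": "fraud_Macejkovic-Lesch",
    "event_timestamp": "2020-10-13T09:21:53.000Z",
}


def build_url(base_url: str, params: dict) -> str:
    """Append percent-encoded query string parameters to the endpoint."""
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


def predict(base_url: str, params: dict, timeout: float = 30.0) -> dict:
    """Send the GET request and decode the body (assumed here to be JSON)."""
    with urlopen(build_url(base_url, params), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(build_url(BASE_URL, EVENT))
```

Calling predict(BASE_URL, EVENT) performs the actual request, so it needs network access and a deployed API.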

You can check the log streams associated with the latest invocation in the CloudWatch log group for API Gateway. These show messages with the execution or access details of your request.

Note: If any changes are made to the API configuration or parameters, the API needs to be redeployed for the changes to take effect.

Generate Predictions

You can use a batch predictions job in Amazon Fraud Detector to get predictions for a set of events that do not require real-time scoring. For example, you may want to generate fraud predictions for a batch of events, such as payment fraud, account takeover or compromise, and free tier misuse, while performing an offline proof-of-concept.

You can also use batch predictions to evaluate the risk of events on an hourly, daily, or weekly basis depending on your business need. If you want to analyze fraud transactions after the fact, you can perform batch fraud predictions using Amazon Fraud Detector and store the results in an Amazon S3 bucket. Although beyond the scope of this example, we could also have used additional services such as Amazon Athena to help analyze the fraud prediction results (once delivered to S3) and Amazon QuickSight to visualise the results on a dashboard.

Copy the batch sample file delivered in the glue_transformed folder (following a successful Glue job run) to the batch_predict folder. This triggers a notification to the SQS queue, whose target is the Lambda function that starts the batch prediction job in Fraud Detector.

$ aws s3 cp s3://fraud-sample-data/glue_transformed/test/fraudTest.csv s3://fraud-sample-data/batch_predict/fraudTest.csv

We can monitor the batch prediction jobs in Fraud Detector. Once complete, we should see the output in S3. An example of a batch output is available here.
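The copy and the monitoring can also be done from Python with boto3. This is a rough sketch under a few assumptions: the helper names are made up for the example, the S3 URIs are the ones from the command above, and the AWS calls need valid credentials, so they are kept in separate functions from the pure helpers.

```python
from __future__ import annotations

from urllib.parse import urlparse

SOURCE_URI = "s3://fraud-sample-data/glue_transformed/test/fraudTest.csv"
DEST_URI = "s3://fraud-sample-data/batch_predict/fraudTest.csv"


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def summarize_jobs(response: dict) -> dict[str, str]:
    """Map job ID to status from a get_batch_prediction_jobs response."""
    return {job["jobId"]: job["status"] for job in response.get("batchPredictions", [])}


def copy_batch_file(source_uri: str = SOURCE_URI, dest_uri: str = DEST_URI) -> None:
    """Copy the sample file into the batch_predict folder (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3 installed

    src_bucket, src_key = split_s3_uri(source_uri)
    dst_bucket, dst_key = split_s3_uri(dest_uri)
    boto3.client("s3").copy_object(
        Bucket=dst_bucket,
        Key=dst_key,
        CopySource={"Bucket": src_bucket, "Key": src_key},
    )


def batch_job_statuses() -> dict[str, str]:
    """List batch prediction jobs and their current status (needs AWS credentials)."""
    import boto3

    response = boto3.client("frauddetector").get_batch_prediction_jobs()
    return summarize_jobs(response)
```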

In real-time mode, we make use of the API Gateway REST API created earlier, which is integrated with the Lambda function that makes the get_event_prediction API call to Fraud Detector. In this example we are using the same Lambda function for batch and real-time predictions. The code in the Lambda function checks the event payload for keys that are expected in a request from API Gateway (i.e. after the request is transformed via the mapping template in API Gateway). We have configured the mapping template to create a variables key, so if the payload has a 'variables' key, the function runs a real-time prediction. If the event payload has a 'Records' key, the event is coming from SQS and the function runs a batch prediction job.

Ideally, separate Lambda functions would be used for real-time and batch predictions, to make them easier to manage.
To run a real-time prediction, the API Gateway REST API has been configured to accept query string parameters and send the request to Lambda, as explained in the previous section.
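The key-based routing described above might look something like the following. This is a simplified sketch rather than the actual Lambda code from the repository: the function names, the use of trans_num as the event ID and the "unknown" entity ID are assumptions, and the batch branch only reports how many SQS records arrived instead of starting a job.

```python
DETECTOR_ID = "fraud_detector_demo"
EVENT_TYPE = "credit_card_transaction"
ENTITY_TYPE = "customer"


def route_event(event: dict) -> str:
    """Decide the prediction mode from the keys present in the event payload."""
    if "variables" in event:   # key created by the API Gateway mapping template
        return "realtime"
    if "Records" in event:     # key present in events delivered by SQS
        return "batch"
    raise ValueError("Unrecognised event payload")


def event_prediction_request(event: dict) -> dict:
    """Build the get_event_prediction arguments from the mapped payload."""
    variables = event["variables"]
    return {
        "detectorId": DETECTOR_ID,
        "eventId": variables["trans_num"],   # assumption: transaction number as event ID
        "eventTypeName": EVENT_TYPE,
        "eventTimestamp": event["EVENT_TIMESTAMP"],
        # Assumption: no entity ID is available in the payload, so "unknown" is passed.
        "entities": [{"entityType": ENTITY_TYPE, "entityId": "unknown"}],
        "eventVariables": variables,
    }


def lambda_handler(event: dict, context: object) -> dict:
    """Route the event to real-time or batch handling."""
    if route_event(event) == "realtime":
        import boto3  # available in the Lambda runtime; imported lazily for local testing

        client = boto3.client("frauddetector")
        return client.get_event_prediction(**event_prediction_request(event))
    # Placeholder: the real function would start a batch prediction job here.
    return {"mode": "batch", "records": len(event["Records"])}
```

Keeping the routing in its own function makes it straightforward to later split real-time and batch handling into separate Lambda functions.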

Teardown resources

The custom bash script below can be executed to tear down all the Fraud Detector resources, including the trained fraud model, the detector (including rules), and the event type (including outcomes, variables and labels).

#!/bin/bash

variables=( "trans_num" "amt" "city_pop" "street" "job" "cc_num" "gender" "merchant" "last" "category" "zip" "city" "state" "first")
labels=("legit" "fraud")
rules=("investigate" "review" "approve")
outcomes=("high_risk" "low_risk" "medium_risk")
event_type="credit_card_transaction"
entity_type="customer"
detector_name="fraud_detector_demo"
model_name=fraud_model

echo "Delete model versions"    
aws frauddetector  delete-model-version --model-id $model_name --model-type ONLINE_FRAUD_INSIGHTS --model-version-number 1.0
aws frauddetector  delete-model-version --model-id $model_name --model-type ONLINE_FRAUD_INSIGHTS --model-version-number 2.0

echo ""
echo "Delete model"
aws frauddetector  delete-model --model-id $model_name --model-type ONLINE_FRAUD_INSIGHTS


echo ""
echo "Deleting detector version id 1"
aws frauddetector delete-detector-version --detector-id $detector_name --detector-version-id 1

echo ""
for var in "${rules[@]}";
    do
        echo "Deleting rule $var"
        aws frauddetector  delete-rule --rule detectorId=$detector_name,ruleId=$var,ruleVersion=1
    done;

echo ""
echo "Deleting detector id $detector_name"
aws frauddetector delete-detector --detector-id $detector_name

echo ""
echo "Deleting event-type $event_type"
aws frauddetector delete-event-type --name $event_type

echo ""
echo "Deleting entity-type $entity_type"
aws frauddetector delete-entity-type --name $entity_type

echo ""
for var in "${variables[@]}";
    do
        echo "Deleting variable $var"
        aws frauddetector  delete-variable --name $var
    done;


echo ""
for var in "${labels[@]}";
    do
        echo "Deleting label $var"
        aws frauddetector  delete-label --name $var
    done;

echo ""
for var in "${outcomes[@]}";
    do
        echo "Deleting outcome $var"
        aws frauddetector  delete-outcome --name $var
    done;

echo ""
echo "Deleting cloud formation stack"
aws cloudformation delete-stack --stack-name FraudDetectorGlue