
Reduce Lambda invocations and code boilerplate by adding a JSON Schema Validator to your API Gateway

Davide de Paolis ・8 min read

If you are implementing a RESTful API, it is very likely that you need to validate the request parameters or the request body.

This can be done in multiple ways.

Manually validate everything in your handler

The first approach is having parsing and validation methods in your handler:

  • Do I have a property playerId in my body?
  • Is that property an integer?
  • Do I have a parameter game in my URL string?
  • Is this within a certain range of values?

If not, return an error for an invalid request.
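To make the amount of boilerplate concrete, here is a minimal sketch of such manual checks. The request shape, the list of allowed games, and the function names are all assumptions made up for illustration; a real Lambda event has more fields.

```typescript
// Hypothetical request shape; field names are illustrative, not the real Lambda event type.
interface GameRequest {
  body: string | null;
  pathParameters: { [name: string]: string } | null;
}

// Assumed list of allowed values for the `game` URL parameter.
const KNOWN_GAMES: string[] = ["AAA", "BBB", "CCC"];

// Collects every problem found; an empty array means the request is valid.
function validateGameRequest(request: GameRequest): string[] {
  let payload: { [key: string]: unknown };
  try {
    payload = JSON.parse(request.body || "{}");
  } catch (e) {
    return ["body is not valid JSON"];
  }
  if (typeof payload !== "object" || payload === null) {
    return ["body must be a JSON object"];
  }

  const errors: string[] = [];

  // Do I have a property playerId in my body? Is it an integer?
  const playerId = payload["playerId"];
  if (playerId === undefined) {
    errors.push("playerId is missing");
  } else if (typeof playerId !== "number" || !Number.isInteger(playerId)) {
    errors.push("playerId must be an integer");
  }

  // Do I have a parameter game in my URL? Is it one of the allowed values?
  const game = request.pathParameters ? request.pathParameters["game"] : undefined;
  if (game === undefined) {
    errors.push("game is missing");
  } else if (KNOWN_GAMES.indexOf(game) === -1) {
    errors.push("game must be one of " + KNOWN_GAMES.join(", "));
  }

  return errors;
}

// Only a valid request reaches the business logic.
function handleGameRequest(request: GameRequest): { statusCode: number; body: string } {
  const errors = validateGameRequest(request);
  if (errors.length > 0) {
    return { statusCode: 400, body: JSON.stringify({ errors: errors }) };
  }
  return { statusCode: 200, body: JSON.stringify({ ok: true }) };
}
```

Even for two fields, the checks take far more lines than the handler itself.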

You can imagine that, depending on the size of the payload and its constraints, that is a lot of boilerplate code to run before you even get to the juice of your handler.

And maybe that has to be done in a similar way in all your handlers.

But there is middleware for that!

True!

Replace with a Middleware

What is a middleware? It is a piece of code that is "magically" run for you as soon as the lambda handler is invoked, and again just before it returns.
This way you can define a bunch of validators to run before (and a bunch of error responses to be used after), and what is left in your handler is just the code necessary for your business logic.
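The idea can be sketched with a small hand-rolled wrapper. This is not Middy's API: the names `withMiddleware`, `SimpleMiddleware`, `requirePlayerId` and `greetPlayer` are hypothetical, and the handler is kept synchronous for brevity, whereas real Lambda handlers are usually async.

```typescript
// A handler takes an event and returns a response (kept synchronous for brevity).
type SimpleHandler<E, R> = (event: E) => R;

// Hypothetical middleware shape: hooks that run before and after the handler.
interface SimpleMiddleware<E, R> {
  before?: (event: E) => void;    // may throw to reject the request
  after?: (response: R) => R;     // may adjust the response
  onError?: (error: Error) => R;  // turns a thrown error into a response
}

function withMiddleware<E, R>(
  handler: SimpleHandler<E, R>,
  middlewares: SimpleMiddleware<E, R>[]
): SimpleHandler<E, R> {
  return (event: E): R => {
    try {
      for (const m of middlewares) {
        if (m.before) m.before(event);
      }
      let response = handler(event);
      // "after" hooks run in reverse order, like the layers of an onion
      for (const m of middlewares.slice().reverse()) {
        if (m.after) response = m.after(response);
      }
      return response;
    } catch (e) {
      const error = e instanceof Error ? e : new Error(String(e));
      // the first middleware that knows how to handle errors builds the response
      for (const m of middlewares) {
        if (m.onError) return m.onError(error);
      }
      throw error;
    }
  };
}

interface PlayerEvent { body: { playerId?: unknown } }
interface HttpResult { statusCode: number; body: string }

// Validation lives in the middleware, not in the handler.
const requirePlayerId: SimpleMiddleware<PlayerEvent, HttpResult> = {
  before: (event) => {
    if (typeof event.body.playerId !== "number") {
      throw new Error("playerId must be a number");
    }
  },
  onError: (error) => ({ statusCode: 400, body: error.message }),
};

// The handler itself contains only business logic.
const greetPlayer = withMiddleware<PlayerEvent, HttpResult>(
  (event) => ({ statusCode: 200, body: "hello player " + String(event.body.playerId) }),
  [requirePlayerId]
);
```

Note that the wrapped handler still runs inside the lambda, so an invalid request is rejected only after the function has been invoked.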

If you have ever used Express, you should already be accustomed to the concept of middleware. For Lambdas, a very interesting and useful solution is Middy, which comes with lots of useful middleware (from parsing, to caching, to handling warmups, to retrieving values from Secrets Manager and Parameter Store).

With such a middleware, the implementation of that boilerplate validation is definitely more structured, elegant, and reusable.

const schema = {
  required: ['playerId', 'game'],
  properties: {
    playerId: {
      type: 'integer'
    },
    game: {
      type: 'string'
    }
  }
}

handler.use(validator({
  inputSchema: schema
}))

Still, it does not solve one major issue: the costs arising from invoking lambdas with invalid requests.

Yes.

As soon as you receive a request in your API Gateway (which you pay for), it is forwarded to your lambda (which you also pay for); once you are there, though, you validate the body or parameters, realize the request is invalid, and return immediately with an error!

What a waste...

Of course the execution was quick and you prevented invalid requests from being passed down to queues or the database... but couldn't that be improved?

Is there a way to prevent a lambda from being invoked if we already know the request is invalid?

Of course there is: we can move validation directly to the level of the API Gateway!

API Gateway request validation with JSON Model Schema.

Yes, you can define a validation written in JSON and give it to the API Gateway so that, if the body doesn't match, the request is immediately bounced back.
You will still pay for the usage of API Gateway, but you won't have any unnecessary lambda invocations, your code in the lambda will be cleaner, and you can easily reuse those schemas for multiple similar endpoints.

There are lots of useful resources explaining that in detail.

Since in a couple of projects we decided to try out AWS CDK to describe our stacks (which is very nice, but we have not yet decided between it and the Serverless Framework...), here I'd like to explain how to add validation to your API using AWS CDK.

Add JSON validation with AWS CDK

Create your Validator

Create an instance of a RequestValidator and pass it the reference to your API Gateway:

const requestValidator = new apigateway.RequestValidator(this, "MyPayloadValidator", {
    restApi: api,  // <-- this is the reference to your API
    requestValidatorName: `myproj-${stage}-payload-validator`,
    validateRequestBody: true,
})

Create your Model:

const myModel = new apigateway.Model(this, "myValidationModel", {
    restApi: api,
    contentType: "application/json",  // <-- this is necessary, even though the docs mention it's an optional param (verify with the latest versions of CDK)
    description: "Payload used to validate existing player and game",
    schema: {
        type: JsonSchemaType.OBJECT,
        properties: {
            playerId: {
                type: JsonSchemaType.INTEGER
            },
            game: {
                description: "Codename of our games, usually 3 letters",
                type: JsonSchemaType.STRING,
                enum: ["AAA", "BBB", "CCC"]
            },
            lang: {
                description: "Language Code / Locale used by player, like it_IT or de_DE",
                type: JsonSchemaType.STRING
            },
        },
        required: [
            "game",
            "playerId",
            "lang"
        ]
    }
})

Here there are a couple of tricky things.

First, I started by checking what a JSON Schema looks like and created one like this:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "MyAwesomeValidation",
  "description": "Payload used to validate existing player and game",
  "type": "object",
  "properties": {
    "playerId": {
      "type": "integer"
    },
    "gameId": {
      "type": "string"
    }
  },
  "required": [
    "gameId",
    "playerId"
  ]
}

and I thought it was going to be possible to import it directly.

It turns out you have to convert that JSON into the CDK object mapping, and the documentation was not so clear about it.

Second, I was getting an error for contentType even though the documentation states that contentType is an optional parameter:

1 validation error detected: Value null at 'createModelInput.contentType' failed to satisfy constraint: Member must not be null

Usually, when I get stuck, I run cdk synth to check the CloudFormation template being generated and then check the CloudFormation documentation directly. There, I discovered that you need to pass "ContentType": "application/json", which would also have been quite obvious from the UI Console.


Something I do when I really can't figure out how to create a specific resource, whether with CDK, Serverless Framework or CloudFormation, is to do it manually in the UI console: going through all the hundreds of clicks and visually seeing all the options, steps and properties necessary is really helpful for understanding the process.

Assign Model and Validator to your resource

Every endpoint of your REST API, normally referred to as a Resource, can have specific methods (GET, POST, etc.). To have the payload or params of those methods validated, assign the references to your model and validator to that API Resource Method.

I thought this would be quite straightforward, but it is exactly where I found myself wasting quite some time.

The docs for MethodOptions state that requestModels is a map of the validation models:

The models which describe data structure of request payload

but what am I supposed to add as a key?

myEndpoint.addMethod("POST", myLambdaIntegration, {
    requestValidator: requestValidator,
    requestModels: { "????": myModel }  // <-- what is the key to be used here?
})

Again, checking the CloudFormation docs directly instead of the CDK ones gives a hint:

The resources that are used for the request's content type. Specify request models as key-value pairs (string-to-string mapping), with a content type as the key and a Model resource name as the value.

Even though the word resource here is a bit misleading, because it is not the API Gateway Resource (your endpoint) but more generically any AWS resource, the important part here is: a content type as the key.

After some googling I found this article on AWS:
How do I associate a model with my API in API Gateway? I then tried adding the model directly from the Console, and there I saw that the mapping was indeed done with the Content-Type, and the exact thing I had to specify as the key was, well, as stated, a content type...

requestModels: { "application/json": myModel }

Test your validator

Once I was finally able to deploy, and had double-checked in the console that everything was in its place and properly configured, I tried sending a request with the wrong payload from Postman.

...and it went through. The lambda was triggered and crashed / returned an error from its own validation.

Why!?!
what the heck

Trying from the test console ( API / Resources / your_method / Test ), the request was rejected as expected, and it was logging:

Request body does not match model schema for content type application/json: [object has missing required properties (["theRightOne"])]

After staring for a while at the Postman screen, I noticed that for some reason the Body Type and the Content-Type header were set to application/text!
After changing it to application/json, the request was rejected (as expected!).

But that means that if a client sent a request with the wrong content type, via fetch, axios or whatever, API Gateway would let it reach the lambda!
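On the client side, the safest habit is to set the header explicitly. A minimal sketch, assuming a hypothetical helper `buildJsonPost` that builds the options you would hand to fetch (which, given a plain string body and no explicit header, sends text/plain rather than application/json):

```typescript
// Options object in the shape accepted by fetch(url, options).
interface JsonRequestInit {
  method: string;
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical helper: always pairs the serialized body with the matching Content-Type.
function buildJsonPost(payload: unknown): JsonRequestInit {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Example payload using the fields of the model defined earlier.
const playerRequest = buildJsonPost({ playerId: 7, game: "AAA", lang: "it_IT" });
// e.g. fetch(apiUrl, playerRequest), where apiUrl is your own endpoint
```

This only helps with well-behaved clients, of course; it does not stop anyone else from sending a different content type.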

I also tried the option "passthroughBehavior": "never", but it seems that it does not work for Proxy Integration.

The solution, for now, is to add to the key mapping above the other content types for which we want the validation to be applied.
(Honestly though, I'd prefer to have the possibility to exclude something, or to reject everything but JSON... we'll see in the next CDK versions.)
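A minimal sketch of that workaround: a small helper that reuses one model for several content-type keys. The helper name, the placeholder object and the extra content type are assumptions for illustration; in a real stack you would pass the CDK Model instance, and you should verify in your own setup that validation is actually applied for each key you add.

```typescript
// Builds a requestModels-style map that points every listed content type at the same model.
function modelsForContentTypes<M>(model: M, contentTypes: string[]): { [contentType: string]: M } {
  const requestModels: { [contentType: string]: M } = {};
  for (const contentType of contentTypes) {
    requestModels[contentType] = model;
  }
  return requestModels;
}

// Stand-in for the CDK Model created earlier (hypothetical placeholder object).
const placeholderModel = { modelName: "myValidationModel" };

// The list of content types is an assumption; add the ones your clients may send.
const sharedRequestModels = modelsForContentTypes(placeholderModel, [
  "application/json",
  "text/plain",
]);
```

The resulting object has the same shape as the requestModels option passed to addMethod.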

So in the end this is what you need to add a JSON Schema Validator to your API Method:

// CDK v2 import path; with CDK v1 the package was "@aws-cdk/aws-apigateway"
import * as apigateway from "aws-cdk-lib/aws-apigateway"
import { JsonSchemaType } from "aws-cdk-lib/aws-apigateway"

// the code below goes inside your Stack constructor; myLambda is your Lambda function
const api = new apigateway.RestApi(this, "AwesomeApi", {
    restApiName: `my-awesome-api`,  // <--- this is the name of the gateway api in Console
    description: "api to do awesome stuff"
})
const resource = api.root.addResource("doStuff")  // this is the endpoint! (otherwise lambda is called directly under api root)
const awesomeIntegration = new apigateway.LambdaIntegration(myLambda)
const awesomeValidator = new apigateway.RequestValidator(this, "AwesomePayloadValidator", {
    restApi: api,
    requestValidatorName: `my-payload-validator`,
    validateRequestBody: true,
})
const awesomeModel = new apigateway.Model(this, "AwesomeValidationModel", {
    modelName: "myValidationModel",  // or e.g. `myproject-${stage}-validate-payload-model`
    restApi: api,
    contentType: "application/json",  // this is necessary, even though the docs mention it's an optional param
    description: "Payload used to validate your requests",
    schema: {
        type: JsonSchemaType.OBJECT,
        properties: {
            foo: {
                type: JsonSchemaType.STRING
            },
            bar: {
                type: JsonSchemaType.INTEGER,
                enum: [1, 2, 3]
            },
            fizz: {
                type: JsonSchemaType.STRING
            },
        },
        required: [
            "foo",
            "bar"
        ]
    }
})
// the validator and model names must match the constants defined above
resource.addMethod("POST", awesomeIntegration, {
    requestValidator: awesomeValidator,
    requestModels: { "application/json": awesomeModel }
})

With the Serverless Framework it is a bit less straightforward: you need a couple of plugins, but it's definitely doable.
https://www.npmjs.com/package/serverless-reqvalidator-plugin
https://stackoverflow.com/questions/49133294/request-validation-using-serverless-framework


So,

should you implement a Model Schema Validator for your API Gateway?

Sure!

Why?

To avoid duplication and awkward boilerplate validation inside your handlers and, most importantly, to have more control and prevent unnecessary (and expensive) invocations of your lambda.

Imagine that, for some reason, a bad update goes out in your frontend and the requests are sent with the wrong payload. You get a spike of requests, and all of them are bounced back because the payload is not valid. At least with an API Gateway request validation you can mitigate the costs on the server.

Or worse, imagine some malicious attack hitting your endpoint: you could at least mitigate it by preventing the invalid requests from reaching the lambda. (Of course, in a malicious attack it is very likely that a valid payload would be used, so it's even better if you add some security measures through throttling and ACL rules: have a look at AWS WAF.)

Hope it helps.
If you have tips or alternatives for the implementation with the CDK or Serverless feel free to add them to the comments.

Posted on May 15 '19 by:


Davide de Paolis

@dvddpl

Sport addicted, productivity obsessed, avid learner, travel enthusiast, expat, 2 kids. 🏂✈🚞🌍📷🖥🤘👨‍👩‍👦‍👦🚀 (Opinions are my own)
