Tom Milner for AWS Community Builders

Posted on May 21, 2021

Build a Serverless Data API with AppSync and DynamoDB

#serverless #graphql #aws #dynamodb

Introduction

This tutorial shows how to create a serverless data API using AWS DynamoDB as the backend data store and AWS AppSync to provide a GraphQL interface. All the steps and code samples are defined in YAML. I've used the AWS SAM syntax where it's supported and CloudFormation where it is not. SAM does not provide any syntax for AppSync, but you can still use it to build and deploy a template that contains plain CloudFormation. This means that while you don't get the shorthand that SAM provides for other services, you can continue to use the same SAM setup and commands. I think there is good scope for SAM to support AppSync in the same way it supports API Gateway and I hope that AWS adds it in the future.

You will need the AWS SAM CLI installed on your machine to complete this tutorial. If you don't have it, you can follow the instructions provided by AWS at the link below.

https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html

I have kept this example very simple but it is still powerful. The application allows the creation, update, deletion and selection of a JSON document through a GraphQL interface. The document is stored in a map datatype in DynamoDB and indexed with a partition key and sort key. With some clever thinking around how you construct these keys, it can have many applications; composite keys are one option. Each item (record) in DynamoDB can be up to 400 KB, so you can also store quite a lot of data per item.
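As one illustration of constructing keys, the sketch below joins labelled parts into composite pk1 and sk1 strings. The composite_key helper, the "#" separator and the EXAM/QUESTION labels are assumptions made for this example and are not part of the tutorial's template. The numeric part is zero-padded because string sort keys are ordered character by character rather than numerically.

```python
def composite_key(*parts: str, sep: str = "#") -> str:
    """Join labelled parts into one key string (hypothetical helper)."""
    for part in parts:
        if sep in part:
            # A separator inside a part would make the key ambiguous to split later.
            raise ValueError(f"key part {part!r} contains the separator {sep!r}")
    return sep.join(parts)


# Partition key groups all questions of one exam; sort key orders them.
pk1 = composite_key("EXAM", "DBS")
sk1 = composite_key("QUESTION", f"{3:04d}")

print(pk1, sk1)
```

With keys shaped like this, a query on a single pk1 value would return every question for that exam in sort key order.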

1) DynamoDB - create your table.

This table will be the backend data store for the API. APIName is a parameter passed into the template that is reused as both the table name and the API name. As DynamoDB is a NoSQL database, you don't need to define the schema upfront, only the primary key attributes. The attribute that stores the JSON will be added when the first item is written to the table.

  DynamoDBTable:
    Type: AWS::DynamoDB::Table
    Properties:
      TableName: !Ref APIName
      ProvisionedThroughput:
        WriteCapacityUnits: 5
        ReadCapacityUnits: 5
      AttributeDefinitions:
        - AttributeName: "pk1"
          AttributeType: "S"
        - AttributeName: "sk1"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "pk1"
          KeyType: "HASH"
        - AttributeName: "sk1"
          KeyType: "RANGE"

2) IAM

You need to create a number of AWS IAM Policies and Roles to allow AppSync to access both the DynamoDB table and log to CloudWatch.

2.1) AppSync to DynamoDB

This policy will allow the attached principal to read from and write to the DynamoDB table created above. The CloudFormation Sub function allows you to construct a single string from multiple inputs. If you wish to reference variables in the Sub function, they need to be wrapped in ${}. Keeping to the principle of least privilege, I have included only the actions needed by the resolvers in AppSync and no others.

  PolicyDynamoDB:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Path: /service-role/
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - dynamodb:Query
              - dynamodb:GetItem
              - dynamodb:PutItem
              - dynamodb:DeleteItem
            Resource:
              - !Sub arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${DynamoDBTable}

Next, create a new role that the AppSync service is allowed to assume via sts:AssumeRole, and attach the policy to it.

  RoleAppSyncDynamoDB:
    Type: AWS::IAM::Role
    Properties:
      ManagedPolicyArns:
        - !Ref PolicyDynamoDB
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - sts:AssumeRole
            Principal:
              Service:
                - appsync.amazonaws.com

2.2) Log to CloudWatch

To allow AppSync to access CloudWatch, you create a role that the AppSync service can assume via sts:AssumeRole and attach the provided AWSAppSyncPushToCloudWatchLogs policy to it. This is an AWS Managed Policy that you can use when creating a new role. Once the AppSync service assumes the role, it will have access to create log groups and streams and log events to CloudWatch.

  RoleAppSyncCloudWatch:
    Type: AWS::IAM::Role
    Properties:
      ManagedPolicyArns:
        - "arn:aws:iam::aws:policy/service-role/AWSAppSyncPushToCloudWatchLogs"
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - sts:AssumeRole
            Principal:
              Service:
                - appsync.amazonaws.com

That completes the IAM resources and the DynamoDB table.

3) AppSync

Now we can start on the AppSync resources. You need to create the following resources to have a fully working API:

API Header - Holder for the API components below, with details of name, authentication and monitoring level.
API Schema - This is where the API definition is modeled in the GraphQL schema definition language (SDL).
DataSource - A datasource is the component that provides the details about where the data is stored.
Resolvers - Resolvers link the parts of the schema with the matching data sources. They also provide any transformation necessary between the two using Apache Velocity Template Language (VTL).
API Key - In this tutorial, I am using an API Key to control access to the API. You can also use IAM or Cognito.

It does seem like a lot to specify and I think there is an opportunity for SAM to abstract some of this away as it does for API Gateway. Through the console, you can create a new AppSync API with a wizard by pointing it at a DynamoDB table, so what is missing is really just an equivalent in an IaC format. As of now, you have to specify each of these resources individually with CloudFormation.

3.1) API Header

This resource creates the API header using the passed-in parameter, the same one we used for the DynamoDB TableName property. The LogConfig section sets up CloudWatch logging using the role we created earlier. I have set it to the highest level of logging, but if this is too much you can reduce it by adjusting the ExcludeVerboseContent and FieldLogLevel properties. FieldLogLevel accepts NONE, ERROR or ALL; to reduce the amount of logs written, set it to ERROR or NONE.

  GraphQLApi:
    Type: AWS::AppSync::GraphQLApi
    Properties:
      Name: !Ref APIName
      AuthenticationType: API_KEY
      LogConfig:
        CloudWatchLogsRoleArn: !GetAtt RoleAppSyncCloudWatch.Arn
        ExcludeVerboseContent: FALSE
        FieldLogLevel: ALL

3.2) API Schema

The GraphQL schema is fundamental to all GraphQL platforms. It can be embedded directly in the YAML template or stored in S3 and referenced within the template. I have left it in the template for simplicity.
Within the schema, you specify the data types and the mutation and query interfaces. Queries are used for reading data out of the API and mutations are for manipulating the underlying data. Each of the mutations and the readData query operate on a single record. The readAllPKData query will return all items for a particular pk1 value. Depending on how you are indexing the JSON item, you could for example use this query to return all child records for a particular parent key.

  GraphQLApiSchema:
    Type: AWS::AppSync::GraphQLSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      Definition: |
        schema {
          query: Query
          mutation: Mutation
        }
        type Data {
          data: [AWSJSON]
          pk1: String
          sk1: String
        }
        type DataCollection {
          items: [Data]
          nextToken: String
        }
        input WriteDataInput {
          pk1: String!
          sk1: String!
          data: [AWSJSON]!
        }
        input UpdateDataInput {
          pk1: String!
          sk1: String!
          data: [AWSJSON]!
        }
        type Mutation {
          writeData(input: WriteDataInput!): Data
          updateData(input: UpdateDataInput!): Data
          deleteData(pk1: String!, sk1: String!): Data
        }
        type Query {
          readData(pk1: String!, sk1: String!): Data
          readAllPKData(pk1: String!): DataCollection
        }

3.3) DataSource

Once the API is created, we can attach the DynamoDB table as the data source for the resolvers. This tutorial only specifies one data source but the beauty of GraphQL is that you can have multiple data sources within the same API. AppSync also supports Aurora, Amazon Elasticsearch Service and Lambda as native data sources. You can also reach other AWS services via a Lambda data source.

  GraphQLDataSource:
    Type: AWS::AppSync::DataSource
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      Name: !Ref APIName
      Type: AMAZON_DYNAMODB
      ServiceRoleArn: !GetAtt RoleAppSyncDynamoDB.Arn
      DynamoDBConfig: 
        TableName: !Ref DynamoDBTable
        AwsRegion: !Sub ${AWS::Region}

3.4) Resolvers

Resolvers contain the logic mapping each query and mutation to an underlying data source with any transformation or logic needed. The name of the query or mutation is specified in the FieldName parameter. You can only attach one datasource per resolver. Resolvers use a scripting language called Apache Velocity Template Language (VTL) to encode any logic. The RequestMappingTemplate parameter specifies any transformation between the request and datasource. The ResponseMappingTemplate specifies any transformation between the datasource and response.

  AppSyncResolverReadData:
    Type: AWS::AppSync::Resolver
    DependsOn: GraphQLApiSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      TypeName: Query
      FieldName: readData
      DataSourceName: !GetAtt GraphQLDataSource.Name
      RequestMappingTemplate: >
        {
          "version": "2017-02-28",
          "operation": "GetItem",
          "key": {
            "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.pk1),
            "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.sk1)
          }
        }
      ResponseMappingTemplate: $util.toJson($context.result)

  AppSyncResolverReadAllPKData:
    Type: AWS::AppSync::Resolver
    DependsOn: GraphQLApiSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      TypeName: Query
      FieldName: readAllPKData
      DataSourceName: !GetAtt GraphQLDataSource.Name
      RequestMappingTemplate: >
        {
          "version": "2017-02-28",
          "operation": "Query",
          "query": {
            "expression": "pk1 = :pk1",
            "expressionValues": {
              ":pk1": $util.dynamodb.toDynamoDBJson($ctx.args.pk1)
            }
          }
        }
      ResponseMappingTemplate: $util.toJson($context.result)

  AppSyncResolverWriteData:
    Type: AWS::AppSync::Resolver
    DependsOn: GraphQLApiSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      TypeName: Mutation
      FieldName: writeData
      DataSourceName: !GetAtt GraphQLDataSource.Name
      RequestMappingTemplate: >
        {
          "version": "2017-02-28",
          "operation": "PutItem",
          "key": {
            "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.pk1),
            "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.sk1)
          },
          "attributeValues": $util.dynamodb.toMapValuesJson($ctx.args.input),
          "condition": {
            "expression": "attribute_not_exists(#pk1) AND attribute_not_exists(#sk1)",
            "expressionNames": {
              "#pk1": "pk1",
              "#sk1": "sk1"
            }
          }
        }
      ResponseMappingTemplate: $util.toJson($context.result)

  AppSyncResolverUpdateData:
    Type: AWS::AppSync::Resolver
    DependsOn: GraphQLApiSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      TypeName: Mutation
      FieldName: updateData
      DataSourceName: !GetAtt GraphQLDataSource.Name
      RequestMappingTemplate: >
        {
          "version": "2017-02-28",
          "operation": "PutItem",
          "key": {
            "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.pk1),
            "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.sk1)
          },
          "attributeValues": $util.dynamodb.toMapValuesJson($ctx.args.input)
        }
      ResponseMappingTemplate: $util.toJson($context.result)

  AppSyncResolverDeleteData:
    Type: AWS::AppSync::Resolver
    DependsOn: GraphQLApiSchema
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      TypeName: Mutation
      FieldName: deleteData
      DataSourceName: !GetAtt GraphQLDataSource.Name
      RequestMappingTemplate: >
        {
          "version": "2017-02-28",
          "operation": "DeleteItem",
          "key": {
            "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.pk1),
            "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.sk1)
          }
        }
      ResponseMappingTemplate: $util.toJson($context.result)

3.5) APIKey

Access to the API is controlled by an API Key. AppSync also supports access via AWS Cognito and IAM. The expiry of the key is controlled by a parameter and is expressed in Epoch time: you pass in the number of seconds since the Epoch that corresponds to the moment you want the key to expire.

  AppSyncAPIKey:
    Type: AWS::AppSync::ApiKey
    Properties:
      ApiId: !GetAtt GraphQLApi.ApiId
      Expires: !Ref APIKeyExpiration
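Working out the Epoch value by hand is error-prone, so here is a minimal sketch that computes it. The api_key_expiration helper is hypothetical, and the 1 to 365 day range check reflects my understanding of the AppSync limits rather than anything enforced by this template.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 24 * 60 * 60


def api_key_expiration(days_valid: int, now: Optional[float] = None) -> int:
    """Return the Epoch time, in whole seconds, `days_valid` days after `now`."""
    # Assumption: AppSync accepts expiry dates between 1 and 365 days ahead.
    if not 1 <= days_valid <= 365:
        raise ValueError("days_valid must be between 1 and 365")
    start = time.time() if now is None else now
    return int(start) + days_valid * SECONDS_PER_DAY


# Value to pass as the APIKeyExpiration parameter for a 30-day key.
print(api_key_expiration(30))
```

The printed number can then be supplied as the APIKeyExpiration parameter at deployment, for example through the --parameter-overrides option of sam deploy.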

4) Parameters

APIName and APIKeyExpiration must be supplied at the time of deployment. APIName is used as the name of both the API and the DynamoDB table.

Parameters:
  APIName:
    Type: String
  APIKeyExpiration:
    Type: Number

5) Output

To use the API, you'll need the API Key and GraphQL URL. These will be output at the end of the SAM deployment.

Outputs:
  APIKey:
    Description: API Key
    Value: !GetAtt AppSyncAPIKey.ApiKey

  GraphQL:
    Description: GraphQL URL
    Value: !GetAtt GraphQLApi.GraphQLUrl

6) Testing

Using the APIKey and URL output from the SAM template, you can call the API. I've formatted examples of the inputs below that should help you.

6.1) Mutations

Use this example to write data through the API. The input data is supplied as Query Variables, which are shown in the next section.

mutation ($WriteDataInput: WriteDataInput!, $UpdateDataInput: UpdateDataInput!) {
  writeData(input: $WriteDataInput) {
    pk1
    sk1
    data
  }
  updateData(input: $UpdateDataInput) {
    pk1
    sk1
    data
  }
  deleteData(pk1: "DBS", sk1: "6") {
    data
    pk1
    sk1
  }
}

6.2) Query Variables

Construct your input data using Query Variables.

{
  "UpdateDataInput": {
    "pk1": "DBS",
    "sk1": "3",
    "data": [
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Add read replicas to the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Redis cache in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"1\"},\"choice\":{\"S\":\"Put an Amazon SQS queue in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Memcached cache in front of the database.\"}}}"
    ]
  },
  "WriteDataInput": {
    "pk1": "DBS",
    "sk1": "4",
    "data": [
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Add read replicas to the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Redis cache in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"1\"},\"choice\":{\"S\":\"Put an Amazon SQS queue in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Memcached cache in front of the database.\"}}}"
    ]
  }
}
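Escaping the inner quotes by hand gets tedious quickly. As a rough sketch, the same variables can be generated with a short script that serializes each entry to a JSON string first. The choice and data_input helpers are hypothetical names introduced for this example; the keys and values mirror the sample data above.

```python
import json


def choice(text: str, correct: bool) -> str:
    """Encode one answer choice as a compact JSON string."""
    item = {"M": {"answer": {"S": "1" if correct else "0"}, "choice": {"S": text}}}
    return json.dumps(item, separators=(",", ":"))


def data_input(pk1: str, sk1: str, choices: list) -> dict:
    """Build one WriteDataInput or UpdateDataInput object."""
    return {"pk1": pk1, "sk1": sk1, "data": choices}


choices = [
    choice("Add read replicas to the database.", False),
    choice("Put an Elasticache Redis cache in front of the database.", False),
    choice("Put an Amazon SQS queue in front of the database.", True),
    choice("Put an Elasticache Memcached cache in front of the database.", False),
]

variables = {
    "UpdateDataInput": data_input("DBS", "3", choices),
    "WriteDataInput": data_input("DBS", "4", choices),
}

# Serializing the outer document escapes the quotes inside each entry.
print(json.dumps(variables, indent=2))
```

The printed document can be pasted into the Query Variables pane as-is.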

6.3) Query

Use these examples to read data via the API.

query {
  readData(pk1: "DBS", sk1: "1") {
    pk1
    sk1
    data
  }
  readAllPKData(pk1: "DBS") {
    nextToken
    items {
      data
      pk1
      sk1
    }
  }
}
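Outside the AppSync console, the same query can be sent as a plain HTTP POST with the API Key in the x-api-key header. The following is a minimal sketch using only the Python standard library. The build_request and run_query helpers are hypothetical, and the URL and key in the usage comment are placeholders for the GraphQL and APIKey values output by your own deployment.

```python
import json
import urllib.request
from typing import Optional

READ_DATA_QUERY = """
query {
  readData(pk1: "DBS", sk1: "1") {
    pk1
    sk1
    data
  }
}
"""


def build_request(url: str, api_key: str, query: str,
                  variables: Optional[dict] = None) -> urllib.request.Request:
    """Prepare a GraphQL POST request without sending it."""
    body = json.dumps({"query": query, "variables": variables or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def run_query(url: str, api_key: str, query: str,
              variables: Optional[dict] = None) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_request(url, api_key, query, variables)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (placeholders, replace with the outputs of your own stack):
# result = run_query("<GraphQL URL>", "<API Key>", READ_DATA_QUERY)
# print(result)
```

Keeping the request construction separate from the network call makes it easy to inspect the headers and body before anything is sent.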