Introduction
This tutorial shows how to create a serverless data API using AWS DynamoDB as the backend data store and AWS AppSync to provide a GraphQL interface. All the steps and code samples are defined in YAML. I've used the AWS SAM syntax where it's supported and CloudFormation where it's not. SAM does not provide any syntax for AppSync, but you can still use it to build and deploy a template that contains plain CloudFormation. This means that while you don't get the shorthand that SAM provides for other services, you can continue to use the same SAM setup and commands. I think there is good scope for SAM to support AppSync in the same way it supports API Gateway, and I hope that AWS adds it in the future.
You will need the AWS SAM CLI installed on your machine to complete this tutorial. If you don't have it, you can follow the installation instructions provided by AWS.
I have kept this example very simple but it is still powerful. The application allows you to create, update, delete and select a JSON document through a GraphQL interface. The document is stored in a map datatype in DynamoDB and indexed with a partition key and a sort key. With some clever thinking around how you construct these keys, it can have many applications. Each item (record) in DynamoDB can be up to 400 KB, so you can also store quite a lot of data per item.
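As a rough illustration of the key design idea, the sketch below builds a pk1/sk1 pair from a parent and a child identifier and estimates an item's size against the 400 KB limit. The QUIZ#/QUESTION# prefixes, the helper names and the size estimate are assumptions made for this example and are not part of the template. DynamoDB's real size accounting differs somewhat from a JSON byte count, and 400 KB is taken here to mean 400 * 1024 bytes.

```python
import json

# Assumption: 400 KB is interpreted as 400 * 1024 bytes.
MAX_ITEM_BYTES = 400 * 1024


def make_keys(parent_id: str, child_id: int) -> dict:
    """Build a hypothetical key pair: parent in pk1, zero-padded child in sk1."""
    return {"pk1": f"QUIZ#{parent_id}", "sk1": f"QUESTION#{child_id:04d}"}


def approx_size(item: dict) -> int:
    """Rough estimate only: UTF-8 length of the JSON-encoded item."""
    return len(json.dumps(item).encode("utf-8"))


keys = make_keys("DBS", 3)
item = {**keys, "data": [json.dumps({"choice": "Add read replicas", "answer": "0"})]}

print(keys)
print(approx_size(item) < MAX_ITEM_BYTES)
```

Zero-padding the sort key keeps string ordering aligned with numeric ordering, which matters because sk1 is a string attribute.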
1) DynamoDB - create your table.
This table will be the backend data store for the API. APIName is a parameter passed into the template that will be re-used as table name and API name. As DynamoDB is a NoSQL database, you don't need to define the schema upfront, only the primary key columns. The column to store the JSON will be added on writing the first item to the table.
DynamoDBTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: !Ref APIName
    ProvisionedThroughput:
      WriteCapacityUnits: 5
      ReadCapacityUnits: 5
    AttributeDefinitions:
      - AttributeName: "pk1"
        AttributeType: "S"
      - AttributeName: "sk1"
        AttributeType: "S"
    KeySchema:
      - AttributeName: "pk1"
        KeyType: "HASH"
      - AttributeName: "sk1"
        KeyType: "RANGE"
2) IAM
You need to create a number of AWS IAM Policies and Roles to allow AppSync to access both the DynamoDB table and log to CloudWatch.
2.1) AppSync to DynamoDB
This policy will allow the attached principal to query the DynamoDB table created above. The CloudFormation Sub function allows you to construct a single string from multiple inputs. If you wish to reference variables in the Sub function, they need to be wrapped in ${}. Keeping to the principle of least privilege, I have included only the actions needed by the resolvers in AppSync and no others.
PolicyDynamoDB:
  Type: AWS::IAM::ManagedPolicy
  Properties:
    Path: /service-role/
    PolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Action:
            - dynamodb:Query
            - dynamodb:GetItem
            - dynamodb:PutItem
            - dynamodb:DeleteItem
          Resource:
            - !Sub arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${DynamoDBTable}
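To make the substitution in the Resource line concrete, here is a small sketch that mimics what Sub does with the ${} placeholders. It is only an imitation for illustration: fn_sub is a hypothetical helper, and the region, account ID and table name are made-up values that CloudFormation would resolve itself at deploy time.

```python
import re


def fn_sub(template: str, values: dict) -> str:
    """Replace each ${Name} placeholder with its value, loosely mimicking Sub."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: values[m.group(1)], template)


# Hypothetical values; CloudFormation supplies the real ones at deploy time.
values = {
    "AWS::Region": "eu-west-1",
    "AWS::AccountId": "123456789012",
    # Referencing a DynamoDB table resource yields its table name.
    "DynamoDBTable": "my-data-api",
}

arn = fn_sub(
    "arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${DynamoDBTable}",
    values,
)
print(arn)
```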
Next, create a new role that the AppSync service can assume via sts:AssumeRole, and attach the policy created above to it.
RoleAppSyncDynamoDB:
  Type: AWS::IAM::Role
  Properties:
    ManagedPolicyArns:
      - !Ref PolicyDynamoDB
    AssumeRolePolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Action:
            - sts:AssumeRole
          Principal:
            Service:
              - appsync.amazonaws.com
2.2) Log to CloudWatch
To allow AppSync to access CloudWatch, you create a role that the AppSync service can assume via sts:AssumeRole and attach the provided AWSAppSyncPushToCloudWatchLogs policy to it. This is an AWS Managed Policy that you can use when creating a new role. Once the AppSync service assumes the role, it will have access to create log groups and streams and log events to CloudWatch.
RoleAppSyncCloudWatch:
  Type: AWS::IAM::Role
  Properties:
    ManagedPolicyArns:
      - "arn:aws:iam::aws:policy/service-role/AWSAppSyncPushToCloudWatchLogs"
    AssumeRolePolicyDocument:
      Version: 2012-10-17
      Statement:
        - Effect: Allow
          Action:
            - sts:AssumeRole
          Principal:
            Service:
              - appsync.amazonaws.com
That completes the IAM resources and the DynamoDB table.
3) AppSync
Now we can start on the AppSync resources. You need to create the following resources to have a fully working API:
- API Header - holder for the API components below with details of name, authentication and monitoring level.
- API Schema - This is where the API definition is modeled in a GraphQL schema definition language (SDL).
- DataSource - A datasource is the component that provides the details about where the data is stored.
- Resolvers - Resolvers link the parts of the schema with the matching data sources. They also provide any transformation necessary between the two using Apache Velocity Template Language (VTL).
- API Key - In this tutorial, I am using an API Key to control access to the API. You can also use IAM or Cognito.
It does seem like a lot to specify, and I think there is an opportunity for SAM to abstract some of this away as it does for API Gateway. Through the console, you can create a new AppSync API with a wizard by pointing it at a DynamoDB table, so what's missing is really just something similar in an IaC format. As of now, you have to specify each of these individually with CloudFormation.
3.1) API Header
This resource creates the API header using the APIName parameter, the same one we used for the DynamoDB TableName property. The LogConfig section sets up CloudWatch logging using the role we created earlier. I have set it to the highest level of logging, but if this is too much you can reduce it by adjusting the ExcludeVerboseContent and FieldLogLevel properties. FieldLogLevel accepts NONE, ERROR or ALL; setting it to ERROR or NONE reduces the amount of logs written.
GraphQLApi:
  Type: AWS::AppSync::GraphQLApi
  Properties:
    Name: !Ref APIName
    AuthenticationType: API_KEY
    LogConfig:
      CloudWatchLogsRoleArn: !GetAtt RoleAppSyncCloudWatch.Arn
      ExcludeVerboseContent: FALSE
      FieldLogLevel: ALL
3.2) API Schema
The GraphQL schema is fundamental to all GraphQL platforms. This can be embedded directly in the YAML template or stored in S3 and referenced within the template. I have left it in the template for simplicity.
Within the schema, you specify the data types and the mutation and query interfaces. Queries are used for reading data out of the API and mutations are for manipulating the underlying data. Each of the mutations, and the readData query, operates on a single record. The readAllPKData query will return all items for a particular pk1 value. Depending on how you are indexing the JSON item, you could for example use this query to return all child records for a particular parent key.
GraphQLApiSchema:
  Type: AWS::AppSync::GraphQLSchema
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    Definition: |
      schema {
        query: Query
        mutation: Mutation
      }
      type Data {
        data: [AWSJSON]
        pk1: String
        sk1: String
      }
      type DataCollection {
        items: [Data]
        nextToken: String
      }
      input WriteDataInput {
        pk1: String!
        sk1: String!
        data: [AWSJSON]!
      }
      input UpdateDataInput {
        pk1: String!
        sk1: String!
        data: [AWSJSON]!
      }
      type Mutation {
        writeData(input: WriteDataInput!): Data
        updateData(input: UpdateDataInput!): Data
        deleteData(pk1: String!, sk1: String!): Data
      }
      type Query {
        readData(pk1: String!, sk1: String!): Data
        readAllPKData(pk1: String!): DataCollection
      }
3.3) DataSource
Once the API is created, we can attach the DynamoDB table as the data source for the resolvers. This tutorial only specifies one data source, but the beauty of GraphQL is that you can have multiple data sources within the same API. AppSync also supports Aurora, Amazon Elasticsearch Service and Lambda as native data sources. You can also reference other AWS services via a Lambda data source.
GraphQLDataSource:
  Type: AWS::AppSync::DataSource
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    Name: !Ref APIName
    Type: AMAZON_DYNAMODB
    ServiceRoleArn: !GetAtt RoleAppSyncDynamoDB.Arn
    DynamoDBConfig:
      TableName: !Ref DynamoDBTable
      AwsRegion: !Sub ${AWS::Region}
3.4) Resolvers
Resolvers contain the logic mapping each query and mutation to an underlying data source with any transformation or logic needed. The name of the query or mutation is specified in the FieldName parameter. You can only attach one datasource per resolver. Resolvers use a scripting language called Apache Velocity Template Language (VTL) to encode any logic. The RequestMappingTemplate parameter specifies any transformation between the request and datasource. The ResponseMappingTemplate specifies any transformation between the datasource and response.
# Reads a single item by its full primary key (pk1 + sk1).
AppSyncResolverReadData:
  Type: AWS::AppSync::Resolver
  DependsOn: GraphQLApiSchema
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    TypeName: Query
    FieldName: readData
    DataSourceName: !GetAtt GraphQLDataSource.Name
    RequestMappingTemplate: >
      {
        "version": "2017-02-28",
        "operation": "GetItem",
        "key": {
          "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.pk1),
          "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.sk1)
        }
      }
    ResponseMappingTemplate: $util.toJson($context.result)
# Returns every item that shares the given partition key (pk1).
AppSyncResolverReadAllPKData:
  Type: AWS::AppSync::Resolver
  DependsOn: GraphQLApiSchema
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    TypeName: Query
    FieldName: readAllPKData
    DataSourceName: !GetAtt GraphQLDataSource.Name
    RequestMappingTemplate: >
      {
        "version": "2017-02-28",
        "operation": "Query",
        "query": {
          "expression": "pk1 = :pk1",
          "expressionValues": {
            ":pk1": $util.dynamodb.toDynamoDBJson($ctx.args.pk1)
          }
        }
      }
    ResponseMappingTemplate: $util.toJson($context.result)
# Creates an item; the condition makes the write fail if the key already exists.
AppSyncResolverWriteData:
  Type: AWS::AppSync::Resolver
  DependsOn: GraphQLApiSchema
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    TypeName: Mutation
    FieldName: writeData
    DataSourceName: !GetAtt GraphQLDataSource.Name
    RequestMappingTemplate: >
      {
        "version": "2017-02-28",
        "operation": "PutItem",
        "key": {
          "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.pk1),
          "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.sk1)
        },
        "attributeValues": $util.dynamodb.toMapValuesJson($ctx.args.input),
        "condition": {
          "expression": "attribute_not_exists(#pk1) AND attribute_not_exists(#sk1)",
          "expressionNames": {
            "#pk1": "pk1",
            "#sk1": "sk1"
          }
        }
      }
    ResponseMappingTemplate: $util.toJson($context.result)
# Overwrites the item stored under the given key (PutItem without a condition).
AppSyncResolverUpdateData:
  Type: AWS::AppSync::Resolver
  DependsOn: GraphQLApiSchema
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    TypeName: Mutation
    FieldName: updateData
    DataSourceName: !GetAtt GraphQLDataSource.Name
    RequestMappingTemplate: >
      {
        "version": "2017-02-28",
        "operation": "PutItem",
        "key": {
          "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.pk1),
          "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.input.sk1)
        },
        "attributeValues": $util.dynamodb.toMapValuesJson($ctx.args.input)
      }
    ResponseMappingTemplate: $util.toJson($context.result)
# Deletes a single item by its full primary key (pk1 + sk1).
AppSyncResolverDeleteData:
  Type: AWS::AppSync::Resolver
  DependsOn: GraphQLApiSchema
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    TypeName: Mutation
    FieldName: deleteData
    DataSourceName: !GetAtt GraphQLDataSource.Name
    RequestMappingTemplate: >
      {
        "version": "2017-02-28",
        "operation": "DeleteItem",
        "key": {
          "pk1": $util.dynamodb.toDynamoDBJson($ctx.args.pk1),
          "sk1": $util.dynamodb.toDynamoDBJson($ctx.args.sk1)
        }
      }
    ResponseMappingTemplate: $util.toJson($context.result)
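To get a feel for what a request mapping template evaluates to, the sketch below imitates the readData template in plain Python. It is not AppSync code: to_dynamodb_json and render_read_data_request are hypothetical stand-ins, and the assumption is that $util.dynamodb.toDynamoDBJson wraps a string argument in a typed "S" attribute.

```python
import json


def to_dynamodb_json(value: str) -> dict:
    """Stand-in for $util.dynamodb.toDynamoDBJson applied to a string."""
    return {"S": value}


def render_read_data_request(args: dict) -> dict:
    """Build the GetItem request the readData template is expected to produce."""
    return {
        "version": "2017-02-28",
        "operation": "GetItem",
        "key": {
            "pk1": to_dynamodb_json(args["pk1"]),
            "sk1": to_dynamodb_json(args["sk1"]),
        },
    }


# The arguments a client would pass to readData(pk1: "DBS", sk1: "1").
request = render_read_data_request({"pk1": "DBS", "sk1": "1"})
print(json.dumps(request, indent=2))
```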
3.5) APIKey
Access to the API is controlled by an API Key. AppSync also supports access via AWS Cognito and IAM. The key's expiry is controlled by the APIKeyExpiration parameter, which is expressed in Unix epoch time (seconds): you pass in the epoch timestamp at which you want the key to expire.
AppSyncAPIKey:
  Type: AWS::AppSync::ApiKey
  Properties:
    ApiId: !GetAtt GraphQLApi.ApiId
    Expires: !Ref APIKeyExpiration
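If you are unsure how to come up with the epoch value, here is a minimal sketch that computes one for a key expiring a given number of days from now. The function name and the 30-day example are assumptions for illustration. Rounding down to the hour reflects my understanding of how AppSync treats the value, so treat that detail as an assumption too.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def api_key_expiration(days: int, now: Optional[datetime] = None) -> int:
    """Return the epoch time (seconds) `days` from now, rounded down to the hour."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(days=days)
    # Integer-divide then multiply to drop the minutes and seconds.
    return int(expires.timestamp()) // 3600 * 3600


# Value to pass as the APIKeyExpiration parameter for a 30-day key.
print(api_key_expiration(30))
```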
4) Parameters
APIName and APIKeyExpiration must be supplied at the time of deployment. APIName is used to generate the name of the API and of the DynamoDB source table.
Parameters:
  APIName:
    Type: String
  APIKeyExpiration:
    Type: Number
5) Output
To use the API, you'll need the API Key and GraphQL URL. These will be output at the end of the SAM deployment.
Outputs:
  APIKey:
    Description: API Key
    Value: !GetAtt AppSyncAPIKey.ApiKey
  GraphQL:
    Description: GraphQL URL
    Value: !GetAtt GraphQLApi.GraphQLUrl
6) Testing
Using the APIKey and URL output from the SAM template, you can call the API. I've formatted examples of the inputs below that should help you.
6.1) Mutations
Use this example to write data to the API. Use Query Variables in the next section to specify input data.
mutation ($WriteDataInput: WriteDataInput!, $UpdateDataInput: UpdateDataInput!) {
  writeData(input: $WriteDataInput) {
    pk1
    sk1
    data
  }
  updateData(input: $UpdateDataInput) {
    pk1
    sk1
    data
  }
  deleteData(pk1: "DBS", sk1: "6") {
    data
    pk1
    sk1
  }
}
6.2) Query Variables
Construct your input data using Query Variables.
{
  "UpdateDataInput": {
    "pk1": "DBS",
    "sk1": "3",
    "data": [
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Add read replicas to the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Redis cache in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"1\"},\"choice\":{\"S\":\"Put an Amazon SQS queue in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Memcached cache in front of the database.\"}}}"
    ]
  },
  "WriteDataInput": {
    "pk1": "DBS",
    "sk1": "4",
    "data": [
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Add read replicas to the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Redis cache in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"1\"},\"choice\":{\"S\":\"Put an Amazon SQS queue in front of the database.\"}}}",
      "{\"M\":{\"answer\":{\"S\":\"0\"},\"choice\":{\"S\":\"Put an Elasticache Memcached cache in front of the database.\"}}}"
    ]
  }
}
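Escaping the entries of the data array by hand is error-prone, so you may prefer to generate them. The sketch below is one way to do that, assuming you want the same typed layout as the variables above. to_awsjson is a hypothetical helper name, and only the WriteDataInput half is built here to keep it short.

```python
import json


def to_awsjson(choice: str, is_answer: bool) -> str:
    """Serialise one choice as a compact JSON string for an AWSJSON field."""
    doc = {"M": {"answer": {"S": "1" if is_answer else "0"}, "choice": {"S": choice}}}
    # Compact separators reproduce the whitespace-free strings used above.
    return json.dumps(doc, separators=(",", ":"))


variables = {
    "WriteDataInput": {
        "pk1": "DBS",
        "sk1": "4",
        "data": [
            to_awsjson("Add read replicas to the database.", False),
            to_awsjson("Put an Amazon SQS queue in front of the database.", True),
        ],
    }
}

# json.dumps takes care of escaping the inner quotes.
print(json.dumps(variables, indent=2))
```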
6.3) Query
Use these examples to read data via the API.
query {
  readData(pk1: "DBS", sk1: "1") {
    pk1
    sk1
    data
  }
  readAllPKData(pk1: "DBS") {
    nextToken
    items {
      data
      pk1
      sk1
    }
  }
}
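If you would rather call the API from code than from a GraphQL client, the sketch below shows one way to build the HTTP request with only the Python standard library. The URL and key are placeholders (use the values output by your deployment), the function names are my own, and the x-api-key header is how I understand AppSync expects an API key to be passed.

```python
import json
import urllib.request


def build_graphql_request(url, api_key, query, variables=None):
    """Build a POST request carrying a GraphQL query and its variables."""
    payload = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values: replace with the GraphQL URL and API Key from the outputs.
GRAPHQL_URL = "https://example.appsync-api.eu-west-1.amazonaws.com/graphql"
API_KEY = "da2-examplekey"
QUERY = 'query { readData(pk1: "DBS", sk1: "1") { pk1 sk1 data } }'

request = build_graphql_request(GRAPHQL_URL, API_KEY, QUERY)
print(request.get_method(), request.full_url)
# result = send(request)  # uncomment once real values are filled in
```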
7) Conclusion
You can find the full template.yaml in this GitHub repo:
https://github.com/thomasmilner/serverlessdataapi
I would like to credit https://twitter.com/sbstjn?s=20 and his repo as an excellent reference in helping me put this together.
https://github.com/sbstjn/appsync-example-dynamodb
Please reach out with any comments or questions you may have. I'm always happy to help.