Here is a quick post to show how to run DynamoDB locally if you want to test without connecting to the cloud. As I already mentioned in a previous blog post, DynamoDB Local stores the table items in a SQLite database. Yes, a NoSQL database stored in a SQL one... this says a lot about the power of SQL.
I'll run DynamoDB Local in a Docker container and define aliases to access it with the AWS CLI and SQLite:
# Start DynamoDB local with a SQLite file (not in memory)
docker run --rm -d --name dynamodb -p 8000:8000 amazon/dynamodb-local -jar DynamoDBLocal.jar -sharedDb -dbPath /home/dynamodblocal
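# the container logs show the configuration DynamoDB Local started with
docker logs dynamodb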
# alias to run `sqlite3` on this file
alias sql='docker exec -it dynamodb \
sqlite3 /home/dynamodblocal/shared-local-instance.db \
'
# alias to run the AWS CLI in a container linked to the DynamoDB one, with the current
# directory exposed as /aws (the container's home directory); DynamoDB Local accepts any
# region and credentials, hence the xx placeholders
alias aws='docker run --rm -it --link dynamodb:dynamodb -v $PWD:/aws \
-e AWS_DEFAULT_REGION=xx -e AWS_ACCESS_KEY_ID=xx -e AWS_SECRET_ACCESS_KEY=xx \
public.ecr.aws/aws-cli/aws-cli --endpoint-url http://dynamodb:8000 \
'
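As a quick smoke test, listing the tables on this fresh instance should return an empty list:
aws dynamodb list-tables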
Create table
I create a table from the AWS CLI create-table example and query the internal SQLite database:
aws dynamodb create-table \
--table-name MusicCollection \
--attribute-definitions AttributeName=Artist,AttributeType=S AttributeName=SongTitle,AttributeType=S \
--key-schema AttributeName=Artist,KeyType=HASH AttributeName=SongTitle,KeyType=RANGE \
--provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
--tags Key=Owner,Value=blueTeam
# `dm` is DynamoDB Local's catalog table, with one row per DynamoDB table
sql -line -echo "select * from dm;"
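DynamoDB Local creates the table synchronously, but the usual waiter also works here if you want to script this the same way as against the real service:
aws dynamodb wait table-exists --table-name MusicCollection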
Insert Items
I insert some data from the batch-write-item example:
cat > request-items.json <<'JSON'
{
  "MusicCollection": [
    {
      "PutRequest": {
        "Item": {
          "Artist": {"S": "No One You Know"},
          "SongTitle": {"S": "Call Me Today"},
          "AlbumTitle": {"S": "Somewhat Famous"}
        }
      }
    },
    {
      "PutRequest": {
        "Item": {
          "Artist": {"S": "Acme Band"},
          "SongTitle": {"S": "Happy Day"},
          "AlbumTitle": {"S": "Songs About Life"}
        }
      }
    },
    {
      "PutRequest": {
        "Item": {
          "Artist": {"S": "No One You Know"},
          "SongTitle": {"S": "Scared of My Shadow"},
          "AlbumTitle": {"S": "Blue Sky Blues"}
        }
      }
    }
  ]
}
JSON
aws dynamodb batch-write-item \
--request-items file://request-items.json \
--return-consumed-capacity INDEXES \
--return-item-collection-metrics SIZE
Here is what is stored in the SQLite table:
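# the DynamoDB table maps to a SQLite table of the same name; the exact
# columns (the key values like rangeKey, plus the serialized item) may
# vary between DynamoDB Local versions, so select * is the easiest look
sql -line -echo "select * from MusicCollection;"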
Transact Write
Here is an update of 'Happy Day' and a delete of 'Call Me Today', as in the transact-write-items example:
cat > transact-items.json <<'JSON'
[
  {
    "Update": {
      "Key": {
        "Artist": {"S": "Acme Band"},
        "SongTitle": {"S": "Happy Day"}
      },
      "UpdateExpression": "SET AlbumTitle = :newval",
      "ExpressionAttributeValues": {
        ":newval": {"S": "Updated Album Title"}
      },
      "TableName": "MusicCollection",
      "ConditionExpression": "attribute_not_exists(Rating)"
    }
  },
  {
    "Delete": {
      "Key": {
        "Artist": {"S": "No One You Know"},
        "SongTitle": {"S": "Call Me Today"}
      },
      "TableName": "MusicCollection",
      "ConditionExpression": "attribute_not_exists(Rating)"
    }
  }
]
JSON
aws dynamodb transact-write-items \
--transact-items file://transact-items.json \
--return-consumed-capacity TOTAL \
--return-item-collection-metrics SIZE
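Rather than scanning, a get-item with the same key as in the transaction can verify the update:
aws dynamodb get-item --table-name MusicCollection \
 --key '{"Artist": {"S": "Acme Band"}, "SongTitle": {"S": "Happy Day"}}'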
Scan
I have 2 items remaining: 'Scared of My Shadow', which has not been touched, and 'Happy Day', where the album title has been updated to 'Updated Album Title':
aws dynamodb scan --table-name MusicCollection --output table
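A count-only scan is a quick check (it should return a Count of 2 here):
aws dynamodb scan --table-name MusicCollection --select COUNT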
Testing failures
I put back the initial data with 'Call Me Today', 'Happy Day' and 'Scared of My Shadow':
aws dynamodb batch-write-item --request-items file://request-items.json
I have these 3 items:
aws dynamodb scan --table-name MusicCollection --output text | grep ALBUMTITLE
ALBUMTITLE Songs About Life
ALBUMTITLE Somewhat Famous
ALBUMTITLE Blue Sky Blues
To simulate something that can go wrong, I lock the row for 'Call Me Today' (the 'Somewhat Famous' album), which is the one the Transact Write should delete:
sql
begin transaction;
select rangeKey from MusicCollection;
-- leave the transaction open: the uncommitted update holds SQLite's write lock
-- (SQLite locks the whole database file, not just the row)
update MusicCollection set rangeKey='x' where rangeKey like '%Today%';
select rangeKey from MusicCollection;
I try my Transact Write:
aws dynamodb transact-write-items --transact-items file://transact-items.json
It fails with:
An error occurred (InternalFailure) when calling the TransactWriteItems operation (reached max retries: 2): The request processing has failed because of an unknown error, exception or failure.
and all is back to normal:
aws dynamodb scan --table-name MusicCollection --output text | grep ALBUMTITLE
ALBUMTITLE Songs About Life
ALBUMTITLE Somewhat Famous
ALBUMTITLE Blue Sky Blues
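To leave the environment clean, I end the blocking transaction in the sqlite3 session, otherwise DynamoDB Local will keep failing on writes:
-- in the still-open sqlite3 session
rollback;
.quit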
This is a simple test of atomicity. But this runs on software that is different from the real DynamoDB.
Can we test race conditions?
Why did I do that? The Transact API has been the subject of some debate:
Alex DeBrie @alexbdebrie
I rumbled w/ @houlihan_rick about the DynamoDB Transact API yesterday. Following up w/ thoughts here.
I really like the DDB Txn API! I think it's well designed:
✅ Enables real use cases
✅ Communicates that transactions have a cost
✅ Prevents the most problematic transactions
twitter.com/alexbdebrie/st…
20:35 - 23 Jan 2023

Alex DeBrie @alexbdebrie
@houlihan_rick @aptrishu @nathankpeck Strong disagree but will gather my thoughts on this!
DynamoDB is a closed-source managed service, with no way to look at the internals. The behavior I see looks good, but how can we be sure?
By "sure", I mean:
- that it works as designed and documented, without any bug
- that my understanding of this documentation is correct
- that a simple test case can be reproduced later
If you read my blog, you know that this is my way of learning and explaining. Many times I come back to a past blog post and copy-paste the simple test to see if something has changed with a new version.
So, how do we do the same with proprietary software, running on a platform we don't have access to, and for which there's no documentation about the internals?
You can read the documentation and trust it, like Alex:
Alex DeBrie @alexbdebrie
@FranckPachot @houlihan_rick @aptrishu @nathankpeck It’s what the docs say the behavior is, so I’d trust it. Agree it’d be hard to test. I still don’t know a concrete case where it would hurt you
12:14 PM - 24 Jan 2023
You can stress test and see what happens, like Rick:
Rick Houlihan @houlihan_rick
@alexbdebrie @FranckPachot @aptrishu @nathankpeck Increasing the number of threads and objects touched will eventually create a workload where the vast majority of transactions initiated will simply fail and rollback. Fail or no fail the user will pay.
Transact API is not suitable for highly concurrent workloads.
12:27 PM - 24 Jan 2023
Or I can try it on DynamoDB Local, as suggested by Pete:
Pete Naylor @pj_naylor
@FranckPachot Have you tried the TX APIs with DDB Local?
07:08 AM - 25 Jan 2023
So... that's what I did. But now I have to think about possible ways to reproduce race conditions in a small test case. I do that with open-source software, like PostgreSQL or YugabyteDB. I even did it for proprietary software like Oracle Database, because you can download it and run the same software that they use for their managed services. But for AWS services that are not running an open-source product, this is impossible.
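Black-box tests are still possible, though. As a sketch of where to start, here is how I could fire the same Transact Write from several concurrent processes, reusing the docker run command from the alias but without -it so it can run in the background. How DynamoDB Local behaves under this contention (serializing, canceling, or retrying the conflicting transactions) is exactly what such a test would have to reveal, and nothing guarantees that the real service behaves the same way:
# run 5 conflicting Transact Writes concurrently (assumes the dynamodb
# container and the transact-items.json file from above)
for i in 1 2 3 4 5; do
 docker run --rm --link dynamodb:dynamodb -v $PWD:/aws \
  -e AWS_DEFAULT_REGION=xx -e AWS_ACCESS_KEY_ID=xx -e AWS_SECRET_ACCESS_KEY=xx \
  public.ecr.aws/aws-cli/aws-cli --endpoint-url http://dynamodb:8000 \
  dynamodb transact-write-items --transact-items file://transact-items.json &
done
wait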
Of course, there's also experience. You can trust Alex, Rick, and Pete, as they have been troubleshooting AWS customers' problems. But from my experience, I've never learned a lot about the internals when troubleshooting production, because there's no time to go into the details. On the contrary, I've learned a lot when reproducing those issues in a lab, building test cases for support requests, preparing demos, leading training workshops, or simply investigating out of curiosity. You know a topic when you can explain it, and you get the fundamentals when you can demo it.