Author: Evan Weaver
Date: December 9, 2020
Fauna and DynamoDB are both serverless databases, but their design goals, architecture, and use cases are very different. In this post, I will give an overview of both systems, discuss where they shine and where they don’t, and explain how various engineering and product decisions have created fundamentally different value propositions for database users.
DynamoDB’s design philosophy: availability and predictability
AWS DynamoDB was developed in response to the success of Apache Cassandra. Cassandra was open sourced, and later abandoned, by Facebook in 2008. My team at Twitter contributed extensively to it alongside the team from Rackspace that eventually became DataStax.
However, in an odd twist of history, Cassandra itself was inspired by a 2007 paper from Amazon about a different, internal database called Dynamo—an eventually-consistent key-value store that was used for high-availability shopping cart storage. Amazon cared a lot about shopping carts long before they had a Web Services business. Within Amazon, the Dynamo paper, and thus the roots of DynamoDB, predate any concept of offering a database product to external customers.
DynamoDB and Cassandra both focused on two things: high availability and low latency. To achieve this, their initial releases sacrificed everything else one might value from traditional operational databases like PostgreSQL or even MongoDB: transactionality, schema, database normalization or document modeling, indexes, foreign keys, even the idea of a query planner itself. DynamoDB did improve on the original Dynamo architecture by making single-key writes serializable and dropping the baroque CRDT reconciliation scheme, and on Cassandra by having a somewhat more humane API.
DynamoDB’s architecture and pricing
DynamoDB’s architecture essentially puts a web server in front of a collection of B-tree partitions (think BDB databases) into which documents are consistently hashed. Documents are columnar, but do not have a schema.
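To make the routing concrete, here is a minimal sketch of consistent hashing in Python. It illustrates the general technique only, not DynamoDB’s actual implementation; the partition names, the number of ring points, and the use of MD5 are arbitrary choices for the example.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring: each partition owns several points on a
    ring of hash values, and a document key is routed to the first
    partition point at or after the key's own hash (wrapping around)."""

    def __init__(self, partitions, points_per_partition=8):
        self.ring = []
        for p in partitions:
            for i in range(points_per_partition):
                self.ring.append((self._hash(f"{p}:{i}"), p))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def partition_for(self, key):
        # First ring point at or after the key's hash, wrapping at the end.
        i = bisect.bisect(self.ring, (self._hash(key),))
        return self.ring[i % len(self.ring)][1]

ring = ConsistentHashRing(["p0", "p1", "p2"])
```

The same key always hashes to the same partition, and adding or removing a partition only remaps the keys adjacent to its ring points, which is why consistent hashing suits systems that split and merge partitions over time.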
Within a DynamoDB region, each data partition is replicated three times. Durability is guaranteed by requiring synchronous majority commits on writes. Consistency is only enforced within a single partition, which in practice, means a single document, because partition boundaries cannot be directly managed. Writes always go through a leader replica first; reads can come from any replica in eventually-consistent mode, or the leader replica in strongly consistent mode.
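The read and write paths described above can be sketched as a toy model. This is a deliberate simplification (real partitions, failure handling, and replica selection are far more involved), but it shows why an eventually-consistent read can miss a just-committed write:

```python
import random

class Partition:
    """Toy leader-based partition: one leader, two followers.
    A write is durable once a majority (leader plus one follower)
    has applied it; the third follower may lag behind."""

    def __init__(self):
        self.leader = {}
        self.followers = [{}, {}]

    def write(self, key, value):
        # Leader applies first; one follower acks synchronously (majority).
        self.leader[key] = value
        self.followers[0][key] = value
        # followers[1] lags and has not applied the write yet.

    def catch_up(self):
        # Asynchronous replication eventually brings the laggard current.
        self.followers[1] = dict(self.leader)

    def read(self, key, consistent=False):
        if consistent:
            return self.leader.get(key)  # strongly consistent: leader only
        replica = random.choice([self.leader] + self.followers)
        return replica.get(key)          # eventually consistent: may be stale

p = Partition()
p.write("cart", "item-1")
assert p.read("cart", consistent=True) == "item-1"  # leader always current
assert p.followers[1].get("cart") is None           # laggard hasn't seen it
```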
Although DynamoDB has recently added some new features like secondary indexes and multi-key transactions, their limitations reflect the iron law of DynamoDB: “everything is a table”:
- Tables, of course, are tables.
- Replication to other regions is implemented by creating additional tables that asynchronously apply changes from a per-replica, row-based changelog.
- Secondary indexes are implemented by asynchronously projecting data into additional tables; they are not serializable and not transactional.
- Transactionality is implemented via a multi-phase lock—presumably DynamoDB keeps a hidden lock table, which is directly reflected in the additional costs for transactionality. DynamoDB transactions are not ACID (they are not fully isolated or serializable) and cannot effectively substitute for relational transactions. Transaction state is not visible to replicas or even to secondary indexes within the same replica.
As you may predict from the above, the DynamoDB literature is absolutely packed with examples of “single-table design” using aggressive NoSQL-style denormalization. Using the more complex features is generally discouraged. It makes sense that DynamoDB’s pricing is also designed around single-table, eventually-consistent usage, even though in replicated and indexed scenarios individual queries must interact with multiple tables, often multiple times.
Additional challenges lie in the query model itself. Unlike Fauna’s query language FQL or SQL, DynamoDB’s API does not support dependent reads or intra-query computation. Fauna does, which lets developers encapsulate complex business logic in transactions without any consistency, latency, or availability penalty.
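To illustrate what the lack of dependent reads costs in practice, here is a sketch using hypothetical in-memory stand-ins for two DynamoDB tables (the table contents and key names are made up). The second read depends on the result of the first, so the client pays two sequential round trips, and no single consistent snapshot spans them; a language with dependent reads could run both inside one transaction in one request.

```python
# Hypothetical in-memory stand-ins for two DynamoDB tables.
ORDERS = {"o1": {"id": "o1", "customer_id": "c7"}}
CUSTOMERS = {"c7": {"id": "c7", "name": "Ada"}}

round_trips = 0

def get_item(table, key):
    """Stand-in for a GetItem call; each invocation is a network round trip."""
    global round_trips
    round_trips += 1
    return table[key]

# Dependent read: the second request cannot even be issued until the
# first one has returned to the client.
order = get_item(ORDERS, "o1")
customer = get_item(CUSTOMERS, order["customer_id"])
```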
DynamoDB works best for the use cases for which it was originally designed—scenarios where data can be organized by hand to match a constrained set of predetermined query patterns; where low latency from a single region is enough; and where multi-document updates are the exception, not the rule. Examples include lock storage, a durable cache in front of a different, less scalable database like an RDBMS, or transient data like the original shopping cart use case.
Fauna’s design philosophy: a productivity journey
Fauna, on the other hand, was inspired by our experience at Twitter delivering a global real-time consumer internet service and API. Our team has extensively used and contributed to MySQL, Cassandra, Memcache, Redis, and many other popular data systems. Rather than focus on helping people optimize workloads that are already at scale, we wanted to help people develop functionality quickly and scale it easily over time.
We wanted to make it possible for any development team to iterate on their application along the journey from small to large without having to become database experts and spend their time on caching, denormalization, replication, architectural rewrites, and everything else that distracts from building a successful software product.
Fauna’s architecture and pricing
To further this goal, Fauna uses a unique architecture that guarantees low latency and transactional consistency across all replicas and indexes, even with global replication. It also offers a unique query language that preserves key relational concepts like ACID transactions, foreign keys, unique constraints, and stored procedures, while enabling modern non-relational concepts like document-oriented modeling, declarative procedural indexing, and a standards-based GraphQL API.
If everything is a table in DynamoDB, in Fauna, everything is a transaction:
- All queries are expressed as atomic transactions.
- Transactions are made durable in a partitioned, replicated, strongly-consistent statement-based log.
- Data replicas apply transaction statements from the log in deterministic order, guaranteeing ACID properties without additional coordination.
- These properties apply to everything, including secondary indexes and other read and write transactions.
- Read-only transactions achieve lower latency than writes by skipping the log, but with additional tricks, remain fully consistent.
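The list above can be sketched in a few lines of Python. This illustrates deterministic log application in general, not Fauna’s actual implementation (which partitions and replicates the log itself): because every replica applies the same transactions in the same order, replicas converge without coordinating with each other.

```python
log = []  # totally ordered transaction log (flattened to one list here)

def submit(statements):
    """A transaction is a list of (key, value) writes; appending it to the
    log at an agreed position is what makes it durable."""
    log.append(statements)

class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0  # index of the next log entry to apply

    def apply_log(self):
        # Apply each pending transaction atomically, in log order.
        for txn in log[self.applied:]:
            for key, value in txn:
                self.data[key] = value
            self.applied += 1

submit([("a", 1), ("b", 1)])
submit([("a", 2)])

us, eu = Replica(), Replica()
us.apply_log()
eu.apply_log()
# Both replicas converge to identical state with no cross-replica messages.
```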
Unlike DynamoDB, Fauna shines in the same areas the SQL RDBMS does: modeling messy real-world interaction patterns that start simply but must evolve and scale over time. Unlike SQL, Fauna’s API and security model are designed for the modern era of mobile, browser, edge, and serverless applications.
Like DynamoDB, and unlike the RDBMS, Fauna transparently manages operational concerns like replication, data consistency, and high availability. However, a major difference from DynamoDB is the scalability model. DynamoDB scales by predictively splitting and merging partitions based on observed throughput and storage capacity. By definition, this works well for predictable workloads, and less well for unpredictable ones, because autoscaling changes take time.
Fauna, on the other hand, scales dynamically. As an API, all resources including compute and storage are potentially available to all users at any time. Similar to operating system multithreading, Fauna is continuously scheduling, running, and pausing queries across all users of the service. Resource consumption is tracked and billed, and our team scales the capacity of each region in aggregate, not on a per-user basis.
Naturally, this design has a different cost structure than something like DynamoDB. For example, there is no way to create an unreplicated Fauna database or to disable transactions. Like DynamoDB, Fauna has metered pricing that scales with the resources your workload actually consumes. But unlike DynamoDB, you are not charged per low-level read and write operation, per replica, per index, because our base case is DynamoDB’s outlier case: the normalized, indexed data model, with the transactional, multi-region access pattern.
Higher levels of abstraction exist to deliver higher levels of productivity. Fauna offers a much higher level of abstraction than DynamoDB, and our pricing reflects that as well—it includes by default everything that DynamoDB does not. At Fauna we want to provide a database with the highest possible level of abstraction, so that you don’t have to worry about any of the low level concerns at all.
What is an API worth?
Almost all other databases aside from DynamoDB and Fauna are delivered as managed cloud infrastructure, and billed on a provisioned basis that directly reflects the vendor’s costs and those costs alone. Serverless infrastructure is relatively new—S3 is perhaps the first service with a serverless billing model to reach widespread adoption—and serverless databases are even newer. The serverless model in DynamoDB is a retrofit. It is essentially still a provisioned system with the addition of predictive autoscaling.
Instead, serverlessness to date has mainly been restricted to vertically-integrated, single-purpose APIs. These APIs have been monetized indirectly like Twitter, billed per-action like Twilio, or billed as a percentage of the value exchanged via the API between third parties—like Stripe.
Serverless infrastructure, as we all know, is actually made from servers. It has a more complex accounting challenge than vertically-integrated APIs, and is constrained by:
- Variance in resource utilization per request
- Variance in request volume over time
- Variance in request locality
- Underlying static costs
The multi-tenancy of serverless infrastructure creates a fundamentally better customer experience. Who wants to pay for capacity they aren’t using? Who wants to have their application degrade because they didn’t buy enough capacity in advance? It’s also a better vendor experience, since no vendor wants to waste infrastructure, and it can be more environmentally friendly.
However, the vendor’s aggregate price across all customers must cover the static infrastructure costs, which are tightly coupled and resistant to change. (As a practical matter, a vendor can’t upgrade and downgrade CPUs, memory, disks, and networks independently of each other on demand, even when using managed cloud services.) The aggregate price must also correlate with the business value recognized, and it must be appropriately apportioned based on the realization of that value for each individual customer over time.
Compared to simply marking up the incremental cost of a server, this pricing problem is hard. Let’s discuss the solutions that DynamoDB and Fauna have found.
DynamoDB pricing explained
After careful analysis and testing, we believe the following rules correctly summarize DynamoDB pricing, specifically on-demand billing in most US AWS regions:
Read operations assume a data size of 4K or less; each additional 4K costs an additional operation. Write operations assume a data size of 1K or less. Notably, index writes count as entirely separate write operations; they are not included in the document’s 1K.
As a rule, read costs scale with the number of documents multiplied by the consistency level. Write costs scale with the number of documents, multiplied by the number of indexes, multiplied again by the number of replicas, and multiplied again by the consistency level.
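These rules can be expressed as a small calculator. The factors below reflect my reading of DynamoDB’s on-demand billing (an eventually consistent read is billed as half a read request unit, a transactional operation as double, and each index and replica write is billed separately); treat the exact constants as assumptions to verify against the current AWS pricing page.

```python
import math

def dynamo_read_ops(doc_bytes, docs=1, consistency=0.5):
    """Read request units: ceil(size / 4 KB) per document, scaled by the
    consistency level (0.5 eventual, 1.0 strong, 2.0 transactional)."""
    return docs * math.ceil(doc_bytes / 4096) * consistency

def dynamo_write_ops(doc_bytes, docs=1, indexes=0, replicas=1,
                     transactional=False):
    """Write request units: ceil(size / 1 KB) per document, with each index
    write billed separately, multiplied across replica regions, and doubled
    if the write is transactional."""
    factor = 2 if transactional else 1
    return docs * math.ceil(doc_bytes / 1024) * (1 + indexes) * replicas * factor

# A 1 KB document written with 3 indexes to 3 replica regions:
# dynamo_write_ops(1024, indexes=3, replicas=3)  ->  12 write units
```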
This pricing is clear and straightforward in the use cases that DynamoDB was designed for. However, if you add in usage of newer features like global replication, indexes, and transactions, the pricing becomes more opaque, and it can become very difficult to predict costs in advance.
Fauna pricing explained
Fauna’s pricing can be summarized with these rules:
Read operations assume a data size of 4K or less; each additional 4K costs an additional operation. Write operations assume a data size of 1K or less. Unlike DynamoDB, index writes are charged by size only, not by both size and number of indexes. If a document write and its indexes fit within the 1K limit, there will be no additional charge for the indexes. Since index data is usually small, many indexes can be updated in just a few write operations, greatly reducing costs. Finally, there is no separate charge for replication; in Fauna, data is data.
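Under the rule above, a simplified write-cost model (my reading, not an official formula) charges by the combined size of the document and its index entries:

```python
import math

def fauna_write_ops(doc_bytes, index_bytes=0):
    """Write ops: index entries are charged by total size, not per index,
    and there is no separate replication charge. A document plus index
    entries that fit within 1 KB cost a single write op."""
    return math.ceil((doc_bytes + index_bytes) / 1024)

# A 700-byte document with 300 bytes of index entries is one write op,
# regardless of how many indexes those entries belong to.
```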
DynamoDB doesn’t support computation of any kind within a query, but Fauna does, so Fauna charges separately for compute costs. In DynamoDB, that computation must instead be done application-side in a compute environment like AWS Lambda, which has its own cost.
Fauna meters this computation as its own operation type, the compute operation.
The efficiency variance of the possible compute stacks that would substitute for Fauna compute is so high (Ruby in AWS Lambda vs. C++ on EC2?) that it is not possible to make any general comparison. Nevertheless, doing data-dependent computation co-located with the data itself is usually a good thing, so Fauna shows well in most real-world scenarios.
However, note that every API call costs at least one compute operation. In Fauna, it’s best to write sophisticated queries that do as much work as possible in a single request, rather than treat it like a key-value store. This minimizes costs, maximizes performance, and guarantees transactional correctness.
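The effect of batching on the per-request compute minimum can be quantified with a toy cost model. The one-compute-op-per-call floor is from the text; the assumption that each item also costs one data operation is a simplification for illustration.

```python
import math

def total_ops(items, batch_size, ops_per_item=1):
    """Total billed operations: one compute op per API request (the floor),
    plus the data operations for the items themselves."""
    requests = math.ceil(items / batch_size)
    return requests + items * ops_per_item

# One million single-item requests pay the compute floor a million times;
# batching 50 items per request amortizes it 50x.
unbatched = total_ops(1_000_000, batch_size=1)   # 2,000,000 ops
batched = total_ops(1_000_000, batch_size=50)    # 1,020,000 ops
```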
Because it’s higher level, Fauna’s pricing is much more straightforward. The downside, of course, is that if you don’t want the higher-level functionality, it may be more expensive than DynamoDB. You get what you pay for in serverless!
Let’s take the most basic possible example to start. Imagine a hit counter on a webpage from 1998. It doesn’t have to be recent, it doesn’t have to be fast, but it does have to increase over time. We will do one million reads and one million writes to a single document, at the default consistency level and with the default replication configuration.
In DynamoDB, this gives us an eventually consistent read and a consistent write, in one region. Because of the read default, clients will not even be able to read their own writes: somebody could refresh our webpage and see the count stay the same, which doesn’t make sense. But they probably won’t notice, and after all, we don’t want to spend much money counting hits.
In Fauna, we get more than we asked for. We get global replication, providing higher availability and lower latency to clients. We also get consistency across keys, but since our visitors aren’t in the habit of comparing counts across pages in real time, probably nobody will notice the difference. However, clients will always read their own writes, which is good: refreshing the page will always make the hit count go up.
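For reference, the hit counter in DynamoDB is a single atomic counter update. The dicts below are the parameters you would pass to boto3’s `Table.update_item` and `Table.get_item`; the table attributes are made up for the example, and note that `ConsistentRead` defaults to `False`, which is exactly the read-your-writes gap described above.

```python
# Parameters for boto3's Table.update_item: an atomic counter increment.
update_params = {
    "Key": {"page": "index.html"},
    "UpdateExpression": "ADD hits :inc",
    "ExpressionAttributeValues": {":inc": 1},
}

# Parameters for boto3's Table.get_item: the default read is eventually
# consistent, so it may not observe the increment above.
read_params = {
    "Key": {"page": "index.html"},
    "ConsistentRead": False,  # the default; True would read from the leader
}
```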
Our costs for one million reads and writes of this trivial workload are:
Advantage: DynamoDB, by 80%.
This seems a bit high for Fauna, but if we batch operations in groups of 50 per request, the compute cost is amortized:
DynamoDB remains cheaper, by 51%.
Let’s do something useful now, like the original Dynamo use case, the globally replicated shopping cart. Unlike the implementation described in the paper, we are not going to store entire carts as single records, nor will we rely on complex CRDT schemes to merge conflicting carts. We have better technology now and we can model carts as collections of items in a more natural, relational way.
To make sure users can always access their cart quickly, we will configure DynamoDB with two additional replicas. We will also add three indexes in both DynamoDB and Fauna, so that we can sort the items three different ways, regardless of how many items there are.
And we will use consistent reads for DynamoDB, so that customers can’t see missing items in the default view. However, DynamoDB indexes are never consistent. So for DynamoDB, we need to make duplicate consistent reads of the items by their primary key after reading them via the index, to verify that they are still really in the cart.
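The duplicate-read pattern can be sketched with in-memory stand-ins (the cart contents and index name are hypothetical). Because the index lags the base table, after querying it we re-read each returned key with a consistent read and drop anything that has since left the cart:

```python
# Hypothetical stand-ins: the base table keyed by (cart_id, sku), and an
# eventually consistent index that still lists a just-removed item.
CART_ITEMS = {("cart1", "sku2"): {"sku": "sku2", "qty": 1}}
BY_PRICE_INDEX = [("cart1", "sku1"), ("cart1", "sku2")]  # sku1 is stale

def view_cart(cart_id):
    # 1 index query, which may return stale keys...
    keys = [k for k in BY_PRICE_INDEX if k[0] == cart_id]
    items = []
    for key in keys:
        # ...plus N duplicate consistent reads against the base table
        # (ConsistentRead=True in DynamoDB terms) to verify each item.
        item = CART_ITEMS.get(key)
        if item is not None:
            items.append(item)
    return items
```

Every cart view thus pays for the index read and a consistent base-table read per item, which is where the doubled read cost in the comparison below comes from.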
For one million carts where we add a single item and then view the cart, our costs are:
The “tables” have turned. Fauna is cheaper by 66%.
Because our index values are small and fit within the 1K write operation limit, there is no separate charge for indexing. We also do not need to pay extra for multi-region replication.
Adding and viewing multiple items per request would benefit Fauna’s model even more, due to compute cost amortization. Let’s add 5 items and view the result in each request:
Now Fauna is 76% cheaper than DynamoDB.
Finally, let’s imagine we have something more like a typical SaaS application, for example, a CRM. We have an accounts table with 20 secondary indexes defined for all the possible sort fields (DynamoDB’s maximum—Fauna has no limit). We also have an activity table with 10 indexes, and a users table with 5 indexes. Viewing just the default account screen queries 7 indexes and 25 documents. A typical activity update transactionally updates 3 documents at a time with 10 dependency checks and modifies all 35 indexes.
And of course, we have replicated this data globally to two additional regions in DynamoDB. We will also do consistency checks on all data returned from indexes.
In Fauna, we do not need to configure replication, or do any additional consistency checking. And we benefit greatly from Fauna’s index write coalescing.
With that in mind, one million account screen views and one million activity updates will cost us:
Advantage Fauna, fully 92% cheaper.
Even if we assume that Fauna will require multiple write operations per document because of all the indexes, the result does not materially change. Fauna’s query pattern could also be improved by using unique constraints instead of dependency reads, which would reduce costs further.
When less isn’t more
One misconception we see is that DynamoDB is cheap at scale and Fauna is expensive, or that DynamoDB is fast and Fauna is slow. This is untrue, although understandable given that many people evaluate a simple use case and move on.
Because DynamoDB is designed for the simple use case rather than the complex, it always wins this superficial comparison. Along with the pricing model, there are various other pain points in DynamoDB like manual partitioning and the eventually-consistent DDL that reflect its roots as a lower-level system.
This situation is similar to, but less malignant than, the MongoDB unsafe commit scandal, where writes appeared faster than the laws of physics allow, because the MongoDB client acknowledged them before they were actually written. Indeed, not doing something is always faster and cheaper than doing it—but if the customer does need to do it, they must take on the implementation burden of supporting what should be a database-level concern in a broken, ad-hoc way in their application.
For example, if you configure a dozen indexes for a global deployment in DynamoDB, you will find your write and storage costs have multiplied by an order of magnitude compared to a single-region, unindexed table. Make them transactional writes or start doing dependent reads and they go up even more. On the other hand, if you try to use Fauna as an application-adjacent durable cache at scale, you may find you are paying for data replication and transactional consistency that you don’t actually need.
It is more accurate to say that both DynamoDB and Fauna are fast and cheap when used for their correct purposes, and expensive when used incorrectly. This seems like a universal rule, but it actually isn’t. Most databases, even in the managed cloud, are disproportionately expensive for intermittent or variable workloads, which real-world workloads always are. This is the benefit of the serverless model: an order of magnitude less waste for both the customer and the vendor.
At Fauna, we recognize that DynamoDB has pushed the envelope in distributed cloud databases and we are grateful for that. We share the same greater mission. At the same time, we know we as an industry can do better than all existing databases for mission-critical operational workloads, whether key-value, document-oriented, or based on SQL.
I hope this post has given you a clearer understanding of the motivations, architectures, and business value of both DynamoDB and Fauna. And further, I hope this understanding helps you make more informed decisions about which tool is right for which job.