"And how's that working out for you?"
You get used to the awe, horror, skepticism, and (occasionally) curiosity whenever someone finds out that your startup is built on top of DynamoDB. I get it. Starting with Dynamo means you're either supremely confident in your business model, domain model, and the data-access patterns that result--or utterly unconcerned about paying as you go for the batteries that weren't included. Schema design is agony. We sank far too much development time into JOIN-ing data and reconstructing other basic functionality. But as we learned to live with it, we also got a very flat performance curve and a decent handle on what works and what doesn't.
Live and learn
If we could go back and do it all again, two decisions made late in the game would have enormously improved our quality of life at the start.
Decision one: ditching our ORM. DynamoDB is not a relational database, and any library that tries to nudge it in that direction is a fast road to the wrong access patterns and a lousy developer experience. Life got much better when Ashwin Bhat started tearing up the edges of that inappropriate abstraction in favor of the DynamoDB DocumentClient and a successful single-table design.
The other key change was to borrow heavily from effects-based programming for non-trivial state updates. At the risk of ruining the punchline, most web apps are variations on the same four-part theme: load data, prepare changes, apply changes, repeat. Expressing the middle two stages in terms of "effects", even in a crude, homebuilt form, had a profound impact on our application as a whole, unlocking performance, improving visibility, and decreasing cycle times for new development.
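Sketched as TypeScript shapes (purely illustrative; none of these names exist in a real codebase), the theme looks something like this:
// the four-part theme as generic shapes; "repeat" is just calling it again
type Load<Ctx> = () => Promise<Ctx>;
type Prepare<Ctx, Eff> = (context: Ctx) => Eff[];
type Apply<Eff> = (effects: Eff[]) => Promise<void>;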
An example
Before we get to the good stuff, consider a web service responsible for managing a team's roster. It probably includes a way to add a new user to one or more teams, using a method like this one:
async function addUserToTeams(teamIds: TeamId[], userId: UserId) {
  await Promise.all(teamIds.map(async function (teamId) {
    const members = await Teams.getMembers(teamId);
    if (!members.includes(userId)) {
      await Teams.addMember(teamId, userId);
    }
  }));
}
This method is neat and concise, but it comes with at least a few, ahem, concerns:
- The business logic around adding users who are already part of the team is poorly defined. Right now we simply skip the operation while other changes proceed in parallel--we make no attempt to roll back, or to notify the caller about a partial failure.
- Whatever the business logic ought to be, interspersing data access throughout the implementation will make it harder to test.
- Likewise, any exception-handling logic must either live at the data layer (limiting coordination across parallel requests) or be implemented as a one-off inside this method.
All of these are solvable problems, however, and teasing out the load-prepare-apply pipeline implicit in the naïve implementation is a good place to start.
Changes as effects
As a first step, let's separate computing the changes to make (the business logic) from how they're sent to the database. We'll then glue them together using a list of effects.
type MembershipEffect<T> = {
  type: T,
  teamId: TeamId,
  userId: UserId,
}

type Effect =
  | MembershipEffect<'ADD_MEMBER'>
  | MembershipEffect<'DEL_MEMBER'>;
// etc
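That "// etc" is load-bearing: new behavior usually starts life as a new member of this union. For example, the SET_ACCESS effect mentioned near the end of this post might look like this (the role payload is an illustrative guess, not an actual schema):
// illustrative: a richer effect carrying extra payload, added to the union
type SetAccessEffect = MembershipEffect<'SET_ACCESS'> & {
  role: string, // guessed payload; the real shape depends on the domain
};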
Next, the business logic. Instead of writing directly to the datastore as we were before, we'll now produce an 'ADD_MEMBER' effect representing a user who needs to be added.
async function prepareAddUserToTeams(teamIds: TeamId[], userId: UserId) {
  const effects: Effect[] = [];
  await Promise.all(teamIds.map(async function (teamId) {
    const members = await Teams.getMembers(teamId);
    if (!members.includes(userId)) {
      effects.push({ type: 'ADD_MEMBER', teamId, userId });
    }
  }));
  return effects;
}
At this point we've addressed one of the shortcomings of the original implementation (separating out a fallible DB write) while also gaining the ability to peek at the prepared effects before some or all of them are written. More on that in a moment.
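As a first taste of that peek, nothing stops a caller (or a test) from inspecting the prepared effects before committing to anything:
// hypothetical preflight: look before we write
const effects = await prepareAddUserToTeams(teamIds, userId);
console.log(`${effects.length} membership(s) would be written`, effects);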
Effect processing
Now that we're producing effects, we need a way to apply them.
async function applyMemberships(effects: Effect[]) {
  await Promise.all(effects.map(function (effect) {
    switch (effect.type) {
      case 'ADD_MEMBER':
        return Teams.addMember(effect.teamId, effect.userId);
      default:
        throw new Error(`Not implemented: "${effect.type}"`);
    }
  }));
}
applyMemberships is a minimal effect processor, nothing more. It doesn't care where the effects came from. It only cares that some upstream logic coughed them up, and--now that they're here--that they get applied to the application state. And since it's a small, standalone function, it's easily extended. That could mean providing a general-purpose rollback strategy when effect processing fails, or providing alternative "commit" strategies to ensure data is persisted via an appropriate API.
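As a sketch of that rollback idea (the invert helper and the one-at-a-time commit loop are assumptions for illustration, not production advice):
// one possible rollback strategy: undo whatever was applied before the failure
function invert(effect: Effect): Effect {
  switch (effect.type) {
    case 'ADD_MEMBER':
      return { ...effect, type: 'DEL_MEMBER' };
    case 'DEL_MEMBER':
      return { ...effect, type: 'ADD_MEMBER' };
  }
}

async function applyWithRollback(effects: Effect[]) {
  const applied: Effect[] = [];
  try {
    for (const effect of effects) {
      await applyMemberships([effect]); // one at a time, so we know what succeeded
      applied.push(effect);
    }
  } catch (err) {
    await applyMemberships(applied.reverse().map(invert));
    throw err;
  }
}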
With DynamoDB, for instance, a single call to TransactWriteItems (if we needed transactional checks or guarantees) or BatchWriteItem (the rest of the time) will be both safer and cheaper than adding team members in separate PutItem requests. Making the switch is a small change to applyMemberships: just batch up the memberships and write them together:
async function applyMemberships(effects: Effect[]) {
  // Teams.asPutMembershipItem is assumed to return a { PutRequest: { Item } }
  // entry suitable for the DocumentClient's batchWrite
  const putItems = effects.flatMap(function (effect) {
    if (effect.type === 'ADD_MEMBER') {
      return [Teams.asPutMembershipItem(effect)];
    }
    return [];
  });
  // in real life we would chunk large-n batches (sketched below)
  await docClient.batchWrite({
    RequestItems: { [TEAMS_TABLE]: putItems }, // TEAMS_TABLE: the single table's name
  }).promise();
}
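That chunking comment deserves a sketch of its own: BatchWriteItem accepts at most 25 request items per call, so a real implementation slices the batch first (and would also retry any UnprocessedItems the API returns):
// split a large batch into DynamoDB-sized pieces
function chunk<T>(items: T[], size = 25): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// then, inside applyMemberships:
for (const batch of chunk(putItems)) {
  await docClient.batchWrite({ RequestItems: { [TEAMS_TABLE]: batch } }).promise();
}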
Crucially, we can make this adjustment without changing the business logic! The same rules still apply; applyMemberships only cares about how effects are interpreted. If we need to change how data are written, we just do it. If we need to trigger a welcome email or notify a billing service, we can add an additional effect--or, if it's inextricably linked to the ADD_MEMBER effect, tweak the processor to make sure it happens.
Logic and context
We can refactor prepareAddUserToTeams just as freely. For instance, we might finish decoupling business logic and data access by preloading memberships (or any other relevant context).
type Context = {
  membersByTeamId: {
    [teamId: TeamId]: UserId[],
  },
}

async function loadMembershipContext(teamIds: TeamId[]) {
  const teamMembers = await Teams.batchGetMembers(teamIds);
  const pairs = teamIds.map(function (teamId, i) {
    const members = teamMembers[i];
    return [teamId, members] as [TeamId, UserId[]];
  });
  return {
    membersByTeamId: Object.fromEntries(pairs),
  };
}
With all of the I/O handled externally, the logic around adding new members condenses into an almost-trivial (and blissfully pure!) function.
function prepareAddUserToTeams(context: Context, userId: UserId) {
  const effects: Effect[] = [];
  const entries = Object.entries(context.membersByTeamId);
  for (const [teamId, members] of entries) {
    if (!members.includes(userId)) {
      effects.push({ type: 'ADD_MEMBER', teamId, userId });
    }
  }
  return effects;
}
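That purity pays off immediately in tests: no mocks, no fake DynamoDB, just data in and effects out. Here's a sketch using Node's built-in test runner, assuming TeamId and UserId are plain string aliases:
import assert from 'node:assert';
import test from 'node:test';

test('skips teams the user already belongs to', function () {
  const context: Context = {
    membersByTeamId: {
      'team-a': ['user-1'],
      'team-b': [],
    },
  };
  assert.deepStrictEqual(prepareAddUserToTeams(context, 'user-1'), [
    { type: 'ADD_MEMBER', teamId: 'team-b', userId: 'user-1' },
  ]);
});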
Pipes all the way down
The somewhat naive implementation we started with has gotten considerably more verbose. In return, decoupling the load-prepare-apply stages has yielded reusable solutions for two-thirds of any change related to the team's membership roster. Implement the business logic and you're off!
Using the load and apply building blocks, we can now collapse addUserToTeams down into something much more concise:
async function addUserToTeams(teamIds: TeamId[], userId: UserId) {
  const context = await loadMembershipContext(teamIds);
  const effects = prepareAddUserToTeams(context, userId);
  await applyMemberships(effects);
}
If you're familiar with Node's Stream API (or pretty much any functional programming language) you'll recognize a pipeline in the making. Here's how it would look if the Hack-style pipelines currently favored in the (still-very-unsettled) TC39 pipeline operator proposal were adopted:
async function addUserToTeams(teamIds, userId) {
  await loadMembershipContext(teamIds)
    |> prepareAddUserToTeams(%, userId)
    |> await applyMemberships(%);
}
Besides increasing testability, reusability, and confidence in each step, we can now add additional business logic independent of the data layer. For example, the same building blocks easily recombine into a new API method for populating an entire team's roster:
async function populateTeam(teamId: TeamId, userIds: UserId[]) {
  const context = await loadMembershipContext([teamId]);
  const effects = userIds.flatMap(userId =>
    prepareAddUserToTeams(context, userId)
  );
  await applyMemberships(effects);
}
After updating applyMemberships to handle DEL_MEMBER and SET_ACCESS effects (with some attention required around de-duplication and ordering), we could go on to implement an entire roster-management application with only minimal regard for where and how it's persisted.
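The de-duplication half is the fiddly part: a batch that both adds and removes the same membership needs a policy. One simple policy (an assumption here, not a universal rule) is that the last effect for each (teamId, userId) pair wins:
// assumed policy: the last effect for each membership pair wins
function dedupe(effects: Effect[]): Effect[] {
  const byKey = new Map<string, Effect>();
  for (const effect of effects) {
    byKey.set(`${effect.teamId}:${effect.userId}`, effect);
  }
  return [...byKey.values()];
}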
Back in the real world
An effect-based approach adds undeniable indirection and (local) complexity. For a method as simple as addUserToTeams, our naive first implementation may be the right way to go. As users, side effects, or the surface area of our membership API increase, however, effects provide a fairly straightforward way to manage them. They might:
- provide preflight mechanisms for data migrations (by inspecting a migration's effects before applying it; see the sketch after this list)
- simplify development of batch operations (e.g. archiving or deleting many related nodes in a graph) by encouraging proven, reusable load/apply methods
- simplify complex business processes (e.g. computing and synchronizing billing details owned by a mix of 1st- and 3rd-party systems) by isolating the computation of effects from their application.
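The preflight idea in the first bullet barely needs machinery. Because effects are plain data, a dry run is little more than a formatter; this harness is hypothetical:
// hypothetical preflight: report what a migration would do without applying it
function dryRun(effects: Effect[]) {
  for (const effect of effects) {
    console.log(`${effect.type}: ${effect.userId} -> ${effect.teamId}`);
  }
}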
Ultimately, using effects as the basis of the load-prepare-apply pipeline at the heart of most state changes isn't the big step it seems. Yes, it takes time to verify mostly-independent parts and reconstitute them into a working whole. But once built, and once trusted, those parts tremendously accelerate future development.
Cover image by Iker Urteaga via unsplash