In the previous part of the series I ended the article with a question:
- Is it possible to improve copying performance when batching updates?
We saw that immutability is a great way of avoiding side-effects. So why (and when) would someone not use it?
Use Case Example
Imagine our UI Application contains:
- A Set of email strings.
- Functions for registering/unregistering emails. They receive an array of email strings, and then update the Set accordingly.
let emailSet = new Set([
'1stEmail@mail.com',
'2ndEmail@mail.com',
'3rdEmail@mail.com',
//...
]);
const registerEmails = (list: string[]) => {
list.forEach(email => {
emailSet = new Set(emailSet).add(email)
})
}
const unregisterEmails = (list: string[]) => {
list.forEach(email => {
emailSet = new Set(emailSet).delete(email)
})
}
💡 Feel free to check my article about sets for an explanation about the concept.
Both functions update emailSet
immutably: they always create a new copy of the Set with new Set()
constructor, and then mutate only the newest version. This presents some challenges:
Constraint - Cloning can be Expensive
When cloning a Set, each item will be copied into the new Set, so the total time spent cloning is proportional to the Set size: O(size(Set))
. This is the main reason why we should avoid cloning as much as possible, while still avoiding side-effects in the original Set.
Problem #1 - Cloning an Unchanged Set
An unnecessary clone will be performed if the function:
- Registers an already existing email
- Unregisters an inexistent email
This is easy to fix: we can update the functions to perform "selective cloning" (only change the Set if there is a real modification).
const registerEmails = (list: string[]) => {
list.forEach(email => {
/* Check if email not registered before cloning */
if (!emailSet.has(email)) {
emailSet = new Set(emailSet).add(email)
}
})
}
const unregisterEmails = (list: string[]) => {
list.forEach(email => {
/* Check if email registered before cloning */
if (emailSet.has(email) {
emailSet = new Set(emailSet).delete(email)
}
})
}
💡 Client-side frameworks (E.g.: Angular, React, etc) usually rely on ===
test to detect component changes. Forcing a useless clone wastes time both during the cloning process, as well as in the framework internal diff checks.
Problem #2 - Not Batching Mutations
Our code is still not performant under certain circumstances. If we receive a list of 10 emails to register/unregister, our Set could be cloned 10 times inside our forEach
loop.
registerEmails([
'a@mail.com', // New email, clone Set
'b@mail.com', // New email, clone Set
'c@mail.com', // New email, clone Set
//... (New email, clone Set x7)
])
Is it possible to keep the immutability benefits while also speeding up code execution? We still want a new Set, but this much cloning is not desired in this case.
Batching
The solution to the problem above is called batching. From outside of the batching context, everything looks immutable (no side-effects), while on the inside it uses mutability when possible.
The batcher wraps a target object (in our case, the Set) and provides an API for changing it that follows the rules:
-
Delay cloning target until absolutely necessary (call
willChange()
). - After object is cloned, we can mutate it subsequently as many times as required (mutate
currentValue
).
Let's use the batcher in the registerEmails
function as an example:
const registerEmails = (list: string[]) => {
/* Create the batcher context for emailSet */
let batcher = prepareBatcher(emailSet);
list.forEach(email => {
/* Use batcher currentValue property to refer to Set */
if (!batcher.currentValue.has(email)) {
/* Let batcher know a change is about to happen */
batcher.willChange();
/* We can mutate currentValue (Set) directly now */
batcher.currentValue.add(email)
/* Update our emailSet variable */
emailSet = batcher.currentValue;
}
})
}
Composable Batchers
The previous code is performant, but code reusability can be present in our batching architecture. Here's one way to implement it:
Functions receives a batcher (which wraps the object to be modified) as argument instead of the object directly.
The function performs desired changes using the Batcher API.
In the end, the function returns the batcher (NOT the object).
Let's refactor the previous code snippet into more reusable functions:
/* This can be reused for any Set */
const add = <T>(batcher: Batcher<Set<T>>, item: T) => {
if (!batcher.currentValue.has(item)) {
batcher.willChange();
batcher.currentValue.add(item);
}
return batcher;
}
/* This can be reused for any Set */
const remove = <T>(batcher: Batcher<Set<T>>, item: T) => {
if (batcher.currentValue.has(item)) {
batcher.willChange();
batcher.currentValue.delete(item);
}
return batcher;
}
And now we can import the functions into our project:
const registerEmails = (batcher: Batcher<Set<string>>, list: string[]) => {
list.forEach(email => {
add(batcher, email);
});
return batcher;
}
const unregisterEmails = (batcher: Batcher<Set<string>>, list: string[]) => {
list.forEach(email => {
remove(batcher, email);
});
return batcher;
}
/* Call registerEmails */
let batcher = prepareBatcher(emailSet);
registerEmails(batcher, [...]);
emailSet = batcher.currentValue;
We can keep creating higher-level procedures:
const complexOperation = (batcher: Batcher<Set<string>>) => {
/* Apply operations */
registerEmails(batcher, [...]);
unregisterEmails(batcher, [...]);
unregisterEmails(batcher, [...]);
registerEmails(batcher, [...]);
return batcher;
}
let batcher = prepareBatcher(emailSet);
/* Call the function */
complexOperation(batcher);
/* Update variable */
emailSet = batcher.currentValue;
- The cloning still happens at most once! If we had no optimizations, there could have been
length(array)
clones (withsize(Set)
item copies) for each register/unregister call insidecomplexOperation
. - The code is modular and reusable, all we have to do is call
prepareBatcher(emailSet)
and provide it to the function. - The reference equality still stands for the object if no changes have been made.
Proof of Concept
I recently came up with a proof of concept for the Batcher Architecture. You can check the console.log
s in the CodeSandbox example below:
The source code can be found at
Stackomate / data-structures
Data-structures for Typescript projects.
For now, one can use add
, remove
and filter
methods. New operations will be available soon.
Top comments (2)
Hey Rafael,
Great article! I'm happy there are others who are using and writing about the functional programming paradigm.
Hey Justin, thank you for your comment, I'm glad you liked it! Definitely, functional programming is a great paradigm for improving coding and avoiding side-effects :)