DEV Community

Rafael Calpena
Rafael Calpena

Posted on

Batching Operations - When Pure Immutability Is Not Worth It

In the previous part of the series I ended the article with a question:

  • Is it possible to improve copying performance when batching updates?

We saw that immutability is a great way of avoiding side-effects. So why (and when) would someone not use it?

Use Case Example

Imagine our UI Application contains:

  • A Set of email strings.
  • Functions for registering/unregistering emails. They receive an array of email strings, and then update the Set accordingly.
let emailSet = new Set([
    '1stEmail@mail.com', 
    '2ndEmail@mail.com',
    '3rdEmail@mail.com',
    //...
]);

const registerEmails = (list: string[]) => {
    list.forEach(email => {
        emailSet = new Set(emailSet).add(email)
    })
}

const unregisterEmails = (list: string[]) => {
    list.forEach(email => {
        emailSet = new Set(emailSet).delete(email)
    })
}

💡 Feel free to check my article about sets for an explanation about the concept.

Both functions update emailSet immutably: they always create a new copy of the Set with new Set() constructor, and then mutate only the newest version. This presents some challenges:

Constraint - Cloning can be Expensive

When cloning a Set, each item will be copied into the new Set, so the total time spent cloning is proportional to the Set size: O(size(Set)). This is the main reason why we should avoid cloning as much as possible, while still avoiding side-effects in the original Set.

Problem #1 - Cloning an Unchanged Set

An unnecessary clone will be performed if the function:

  • Registers an already existing email
  • Unregisters an inexistent email

This is easy to fix: we can update the functions to perform "selective cloning" (only change the Set if there is a real modification).

const registerEmails = (list: string[]) => {
    list.forEach(email => {
        /* Check if email not registered before cloning */
        if (!emailSet.has(email)) {
            emailSet = new Set(emailSet).add(email)
        }
    })
}

const unregisterEmails = (list: string[]) => {
    list.forEach(email => {
        /* Check if email registered before cloning */
        if (emailSet.has(email) {
            emailSet = new Set(emailSet).delete(email)
        }
    })
}

💡 Client-side frameworks (E.g.: Angular, React, etc) usually rely on === test to detect component changes. Forcing a useless clone wastes time both during the cloning process, as well as in the framework internal diff checks.

Problem #2 - Not Batching Mutations

Our code is still not performant under certain circumstances. If we receive a list of 10 emails to register/unregister, our Set could be cloned 10 times inside our forEach loop.

registerEmails([
    'a@mail.com', // New email, clone Set
    'b@mail.com', // New email, clone Set
    'c@mail.com', // New email, clone Set
    //... (New email, clone Set x7)
])

Is it possible to keep the immutability benefits while also speeding up code execution? We still want a new Set, but this much cloning is not desired in this case.

Batching

The solution to the problem above is called batching. From outside of the batching context, everything looks immutable (no side-effects), while on the inside it uses mutability when possible.

The batcher wraps a target object (in our case, the Set) and provides an API for changing it that follows the rules:

  • Delay cloning target until absolutely necessary (call willChange()).
  • After object is cloned, we can mutate it subsequently as many times as required (mutate currentValue).

Let's use the batcher in the registerEmails function as an example:

const registerEmails = (list: string[]) => {
    /* Create the batcher context for emailSet */
    let batcher = prepareBatcher(emailSet);

    list.forEach(email => {
        /* Use batcher currentValue property to refer to Set */
        if (!batcher.currentValue.has(email)) {
            /* Let batcher know a change is about to happen */
            batcher.willChange();
            /* We can mutate currentValue (Set) directly now */
            batcher.currentValue.add(email)
            /* Update our emailSet variable */
            emailSet = batcher.currentValue;
        }
    })
}


Composable Batchers

The previous code is performant, but code reusability can be present in our batching architecture. Here's one way to implement it:

  • Functions receives a batcher (which wraps the object to be modified) as argument instead of the object directly.

  • The function performs desired changes using the Batcher API.

  • In the end, the function returns the batcher (NOT the object).

Let's refactor the previous code snippet into more reusable functions:

/* This can be reused for any Set */
const add = <T>(batcher: Batcher<Set<T>>, item: T) => {
    if (!batcher.currentValue.has(item)) {
        batcher.willChange();
        batcher.currentValue.add(item);
    }
    return batcher;
}

/* This can be reused for any Set */
const remove = <T>(batcher: Batcher<Set<T>>, item: T) => {
    if (batcher.currentValue.has(item)) {
        batcher.willChange();
        batcher.currentValue.delete(item);
    }
    return batcher;
}

And now we can import the functions into our project:

const registerEmails = (batcher: Batcher<Set<string>>, list: string[]) => {
    list.forEach(email => {
        add(batcher, email);
    });
    return batcher;
}

const unregisterEmails = (batcher: Batcher<Set<string>>, list: string[]) => {
    list.forEach(email => {
        remove(batcher, email);
    });
    return batcher;
}

/* Call registerEmails */
let batcher = prepareBatcher(emailSet);
registerEmails(batcher, [...]);
emailSet = batcher.currentValue;

We can keep creating higher-level procedures:

const complexOperation = (batcher: Batcher<Set<string>>) => {
    /* Apply operations */
    registerEmails(batcher, [...]);
    unregisterEmails(batcher, [...]);
    unregisterEmails(batcher, [...]);
    registerEmails(batcher, [...]);
    return batcher;
}

let batcher = prepareBatcher(emailSet);
/* Call the function */
complexOperation(batcher);
/* Update variable */
emailSet = batcher.currentValue;
  • The cloning still happens at most once! If we had no optimizations, there could have been length(array) clones (with size(Set) item copies) for each register/unregister call inside complexOperation.
  • The code is modular and reusable, all we have to do is call prepareBatcher(emailSet) and provide it to the function.
  • The reference equality still stands for the object if no changes have been made.

Proof of Concept

I recently came up with a proof of concept for the Batcher Architecture. You can check the console.logs in the CodeSandbox example below:

The source code can be found at

GitHub logo Stackomate / data-structures

Data-structures for Typescript projects.

For now, one can use add, remove and filter methods. New operations will be available soon.

Top comments (2)

Collapse
 
jcsh profile image
Justin Ho

Hey Rafael,

Great article! I'm happy there are others who are using and writing about the functional programming paradigm.

Collapse
 
rafaelcalpena profile image
Rafael Calpena

Hey Justin, thank you for your comment, I'm glad you liked it! Definitely, functional programming is a great paradigm for improving coding and avoiding side-effects :)