Mark Erikson

Posted on Jan 1, 2018 • Originally published at blog.isquaredsoftware.com on Dec 23, 2017

Idiomatic Redux: Using Reselect Selectors for Encapsulation and Performance

#react #redux #javascript

An overview of why and how to use Reselect with React and Redux

Note: This post was originally published on my blog at blog.isquaredsoftware.com, and is part of my "Idiomatic Redux" blog series on good Redux usage practices.

Intro

In a good Redux architecture, you are encouraged to keep your store state minimal, and derive data from the state as needed. As part of that process, we recommend that you use "selector functions" in your application, and use the Reselect library to help create those selectors. Here's a deeper look at why this is a good idea, and how to correctly use Reselect.

Basics of Selectors

A "selector function" is simply any function that accepts the Redux store state (or part of the state) as an argument, and returns data that is based on that state. Selectors don't have to be written using a special library, and it doesn't matter whether you write them as arrow functions or the function keyword. For example, these are all selectors:

const selectEntities = state => state.entities;

function selectItemIds(state) {
    return state.items.map(item => item.id);
}

const selectSomeSpecificField = state => state.some.deeply.nested.field;

function selectItemsWhoseNamesStartWith(items, namePrefix) {
     const filteredItems = items.filter(item => item.name.startsWith(namePrefix));
     return filteredItems;
}

You can call your selector functions whatever you want, but it's common to prefix them with select or get, or end the name with Selector, like selectFoo, getFoo, or fooSelector (see this Twitter poll on naming selectors for discussion).

The first reason to use selector functions is for encapsulation and reusability. Let's say that one of your mapState functions looks like this:

const mapState = (state) => {
    const data = state.some.deeply.nested.field;

    return {data};
}

That's a totally legal statement. But, imagine that you've got several components that need to access that field. What happens if you need to make a change to where that piece of state lives? You would now have to go change every mapState function that references that value. So, in the same way that we recommend using action creators to encapsulate details of creating actions, we recommend using selectors to encapsulate the knowledge of where a given piece of state lives. Ideally, only your reducer functions and selectors should know the exact state structure, so if you change where some state lives, you would only need to update those two pieces of logic.

One common description of selectors is that they're like "queries into your state". You don't care about exactly how the query came up with the data you needed, just that you asked for the data and got back a result.

Reselect Usage and Memoization

The next reason to use selectors is to improve performance. Performance optimization generally involves doing work faster, or finding ways to do less work. For a React-Redux app, selectors can help us do less work in a couple different ways.

Let's imagine that we have a component that requires a very expensive filtering/sorting/transformation step for the data it needs. To start with, its mapState function looks like this:

const mapState = (state) => {
    const {someData} = state;

    const filteredData = expensiveFiltering(someData);
    const sortedData = expensiveSorting(filteredData);
    const transformedData = expensiveTransformation(sortedData);

    return {data : transformedData};
}

Right now, that expensive logic will re-run for every dispatched action that results in a state update, even if the store state that was changed was in a part of the state tree that this component doesn't care about.

What we really want is to only re-run these expensive steps if state.someData has actually changed. This is where the idea of "memoization" comes in.

Memoization is a form of caching. It involves tracking inputs to a function, and storing the inputs and the results for later reference. If a function is called with the same inputs as before, the function can skip doing the actual work, and return the same result it generated the last time it received those input values.

The Reselect library provides a way to create memoized selector functions. Reselect's createSelector function accepts one or more "input selector" functions, and an "output selector" function, and returns a new selector function for you to use.

createSelector can accept multiple input selectors, which can be provided as separate arguments or as an array. The results from all the input selectors are provided as separate arguments to the output selector:

const selectA = state => state.a;
const selectB = state => state.b;
const selectC = state => state.c;

const selectABC = createSelector(
    [selectA, selectB, selectC],
    (a, b, c) => {
        // do something with a, b, and c, and return a result
        return a + b + c;
    }
);

// Call the selector function and get a result
const abc = selectABC(state);

// could also be written as separate arguments, and works exactly the same
const selectABC2 = createSelector(
    selectA, selectB, selectC,
    (a, b, c) => {
        // do something with a, b, and c, and return a result
        return a + b + c;
    }
);

When you call the selector, Reselect will run your input selectors with all of the arguments you gave, and looks at the returned values. If any of the results are === different than before, it will re-run the output selector, and pass in those results as the arguments. If all of the results are the same as the last time, it will skip re-running the output selector, and just return the cached final result from before.

In typical Reselect usage, you write your top-level "input selectors" as plain functions, and use createSelector to create memoized selectors that look up nested values:

const state = {
    a : {
        first : 5
    },
    b : 10
};

const selectA = state => state.a;
const selectB = state => state.b;

const selectA1 = createSelector(
    [selectA],
    a => a.first
);

const selectResult = createSelector(
    [selectA1, selectB],
    (a1, b) => {
        console.log("Output selector running");
        return a1 + b;
    }
);

const result = selectResult(state);
// Log: "Output selector running"
console.log(result);
// 15

const secondResult = selectResult(state);
// No log output
console.log(secondResult);
// 15

Note that the second time we called selectResult, the "output selector" didn't execute. Because the results of selectA1 and selectB were the same as the first call, selectResult was able to return the memoized result from the first call.

It's important to note that by default, Reselect only memoizes the most recent set of parameters. That means that if you call a selector repeatedly with different inputs, it will still return a result, but it will have to keep re-running the output selector to produce the result:

const a = someSelector(state, 1); // first call, not memoized
const b = someSelector(state, 1); // same inputs, memoized
const c = someSelector(state, 2); // different inputs, not memoized
const d = someSelector(state, 1); // different inputs from last time, not memoized

Also, you can pass multiple arguments into a selector. Reselect will call all of the input selectors with those exact inputs:

const selectItems = state => state.items;  
const selectItemId = (state, itemId) => itemId;  

const selectItemById = createSelector(  
    [selectItems, selectItemId],  
    (items, itemId) => items[itemId]  
);  

const item = selectItemById(state, 42);

/*
Internally, Reselect does something like this:

const firstArg = selectItems(state, 42);  
const secondArg = selectItemId(state, 42);  

const result = outputSelector(firstArg, secondArg);  
return result;  
*/

Because of this, it's important that all of the "input selectors" you provide should accept the same types of parameters. Otherwise, the selectors will break.

const selectItems = state => state.items;  

// expects a number as the second argument
const selectItemId = (state, itemId) => itemId;  

// expects an object as the second argument
const selectOtherField (state, someObject) => someObject.someField;  

const selectItemById = createSelector(  
    [selectItems, selectItemId, selectOtherField],  
    (items, itemId, someField) => items[itemId]  
);

In this example, selectItemId expects that its second argument will be some simple value, while selectOtherField expects that the second argument is an object. If you call selectItemById(state, 42), selectOtherField will break because it's trying to access 42.someField.

You can (and probably should) use selector functions anywhere in your application that you access the state tree. That includes mapState functions, thunks, sagas, observables, middleware, and even reducers.

Selector functions are frequently co-located with reducers, since they both know about the state shape. However, it's up to you where you put your selector functions and how you organize them.

Optimizing Performance With Reselect

Let's go back to the "expensive mapState" example from earlier. We really want to only execute that expensive logic when state.someData has changed. Putting the logic inside a memoized selector will do that.

const selectSomeData = state => state.someData;

const selectFilteredSortedTransformedData = createSelector(
    selectSomeData,
    (someData) => {
         const filteredData = expensiveFiltering(someData);
         const sortedData = expensiveSorting(filteredData);
         const transformedData = expensiveTransformation(sortedData);

         return transformedData;
    }
)

const mapState = (state) => {
    const transformedData = selectFilteredSortedTransformedData (state);

    return {data : transformedData};
}

This is a big performance improvement, for two reasons.

First, now the expensive transformation only occurs if state.someData is different. That means if we dispatch an action that updates state.somethingElse, we won't do any real work in this mapState function.

Second, the React-Redux connect function determines if your real component should re-render based on the contents of the objects you return from mapState, using "shallow equality" comparisons. If any of the fields returned are === different than the last time, then connect will re-render your component. That means that you should avoid creating new references in a mapState function unless needed. Array functions like concat(), map(), and filter() always return new array references, and so does the object spread operator. By using memoized selectors, we can return the same references if the data hasn't changed, and thus skip re-rendering the real component.

Advanced Optimizations with React-Redux

There's a specific performance issue that can occur when you use memoized selectors with a component that can be rendered multiple times.

Let's say that we have this component definition:

const mapState = (state, ownProps) => {
    const item = selectItemForThisComponent(state, ownProps.itemId);

    return {item};
}

const SomeComponent = (props) => <div>Name: {props.item.name}</div>;

export default connect(mapState)(SomeComponent);

// later
<SomeComponent itemId={1} />
<SomeComponent itemId={2} />

In this example, SomeComponent is passing ownProps.itemId as a parameter to the selector. When we render multiple instances of <SomeComponent>, each of those instances are sharing the same instance of the selectItemForThisComponent function. That means that when an action is dispatched, each separate instance of <SomeComponent> will separately call the function, like:

// first instance
selectItemForThisComponent(state, 1);
// second instance
selectItemForThisComponent(state, 2);

As described earlier, Reselect only memoizes on the most recent inputs (ie, it has a cache size of 1). That means that selectItemForThisComponent will never memoize correctly, because it's never being called with the same inputs back-to-back.

This code will still run and work, but it's not fully optimized. For the absolute best performance, we need a separate copy of selectItemForThisComponent for each instance of <SomeComponent>.

The React-Redux connect function supports a special "factory function" syntax for mapState and mapDispatch functions, which can be used to create unique instances of selector functions for each component instance.

If the first call to a mapState or mapDispatch function returns a function instead of an object, connect will use that returned function as the real mapState or mapDispatch function. This gives you the ability to create component-instance-specific selectors inside the closure:

const makeUniqueSelectorInstance = () => createSelector(
    [selectItems, selectItemId],
    (items, itemId) => items[itemId]
);    

const makeMapState = (state) => {
    const selectItemForThisComponent = makeUniqueSelectorInstance();

    return function realMapState(state, ownProps) {
        const item = selectItemForThisComponent(state, ownProps.itemId);

        return {item};
    }
};

export default connect(makeMapState)(SomeComponent);

Both component 1 and component 2 will get their own unique copies of selectItemForThisComponent, and each copy will get called with consistently repeatable inputs, allowing proper memoization.

Final Thoughts

Like other common Redux usage patterns, you are not required to use selector functions in a Redux app. If you want to write deeply nested state lookups directly in your mapState functions or thunks, you can. Similarly, you don't have to use the Reselect library to create selectors - you can just write plain functions if you want.

Having said that, you are encouraged to use selector functions, and to use the Reselect library for memoized selectors. There's also many other options for creating selectors, including using functional programming utility libraries like lodash/fp and Ramda, and other alternatives to Reselect. There's also utility libraries that build on Reselect to handle specific use cases.