DEV Community

Cover image for Algorithms for Pagination (part one of two)
Tracy Gilmore
Tracy Gilmore

Posted on • Updated on

Algorithms for Pagination (part one of two)

The majority of applications these days seam to employ web technologies, not excluding business systems. Unlike web sites, where "Content is King", web applications have a different priority (or priorities.) With business systems helping the end-user achieve their tasks in as efficient, error-free and secure way as possible is essential.

Instead of providing a viewer with 'content', business application have to provide users with functionality and data in order to extract information, often to support decision making. When data is presented in a tabular form there will eventually come a point when the question arises, how pagination is to be performed. The options being client-side within the web browser, or server-side with only a small number of records (a page of data) being sent to the client at a time.

Some key factors in the decision include:

  • The complexity of the record structure (and therefor its size)
  • The profile and contention of network traffic when transporting the data between the client and server.
  • The volatility of the dataset and the importance for the presentation to be up to date.

Additional considerations

Pagination is seldom the only function end-users require of tabulated data. Other facility that are regularly requested include:

  • Searching: Locating records containing a search term in any one of a selected group of columns.
  • Filtering: Reducing the data set by excluding rows that have a specific value match a stipulated column/property, or multiple.
  • Sorting: Arranging the data rows according to an order of one or more columns/properties.

These facilities need to be applied in a specific order to achieve the desired presentation of the data. If we extract a page of data before sorting, filtering or searching the results are unlikely to be as expected. We might also have to consider supporting "pre-canned" filter criteria for special circumstances.

In this first of two posts we will be considering how the functions can be applied in both client- and server-side pagination processes. We will commence by considering the general process and prepare a worked example to get a simple solution. In the second post we will extend the worked example to include some common additional features such as searching, filtering and sorting. The algorithms for performing this trio can be developed using techniques from the Functional Programming school of coding styles and making use of some of JavaScript's new(ish) features.

Case study

This article is based on personal experience of dealing with a performance issue we encountered. Initially we had screens that downloaded the entire dataset and presented a simple list a page at a time. No sorting or filtering and only simple searching, but all performed satisfactorily within the browser.

Performance started to degrade when a more complicated dataset comprised of over 100,000 rows was downloaded. The browser started to become unresponsive, making client-side functions unusable. At the same time we noticed the combination of searching, sorting and filtering occasionally produced unexpected results; they we not working in concert.

To address the degradation in performance we first supplied server-side pagination through a communication protocol. An initial request (query) is sent to the server along with details of an initial contract (page number, page size, search columns, filter column values and sort criteria). Whenever the user changes any of the contract values an updated contract is sent to the server. In response the client (web browser) receives a number of rows along with the number of pages in the processes result set.

For parity we then implemented the low-level functionality required to perform client-side pagination using the same contract/protocol. By conforming to the protocol both sides, if the server is Node.JS (or another JS-based server) the same low-level functions could be applied.

Defining the functions

Pagination
Inputs: Page size and Page Number
Output: Page content and Number of pages available.

Sorting
Inputs: Zero or many columns (in priority order) with the data type and sort direction.
Output: Rows arranged according to the combined sort order.

Filtering
Inputs: Zero or many columns along with a list of filter values.
Output: Only the rows that comply with the filter conditions. Filtering (in this example) applies an or condition to the values in the same column and an and condition between columns.

Searching
Inputs: A list of columns to which the search term should be applied, and the search term itself.
Output: Only those rows containing the search term in one of the stipulated search columns.


Worked example

For learning purposes we will source some data through a public API at whiskyhunter that provides us with a list of over 270 whisky distilleries.

If you want to follow along, here is a the development environment we can use.

A few things to know about using Node.Js:

  1. (As of version 18) Node has yet to full implement the Fetch API but it is available. It was introduced in Node version 17.6 as an experimental feature. To use the feature we needed to call Node using the following command-line switch --experimental-fetch. With Node version 18+ we no longer need the switch but a warning may still appear.
  2. The Fetch API is asynchronous but Node does not yet support top-level await outside of an ECMA Script module. Enabling ES modules would require another command-line switch and more code. However, there is a relatively simple work around as we shall see.
  3. We will be using the data retrieval mechanism in subsequent examples using the JavaScript Module system. To enable this we need to configure a package.json file which is easy.
    • In the same folder where you will be creating the example JS files, open a command prompt and enter the following command npm init -y. This will create the initial package.json file using default values.
    • Edit the package.json file and add the following property "type": "module", remembering to update the JSON accordingly. For example:
{
  "name": "pagination",
  "version": "1.0.0",
  "author": "Tracy Gilmore",
  "license": "ISC",
  "type": "module"
}
Enter fullscreen mode Exit fullscreen mode

Retrieving and processing the source data

Before we view the code to obtain the data we will be using there is another point to note. The source data is a little more complicated than we need so the function we create to 'fetch' the data will also pre-process it.

const X_CSRF_TOKEN = // Get from https://whiskyhunter.net/api/

// 0-retrieve-data.js

async function list() {
  const listJson = await fetch(
    'https://whiskyhunter.net/api/distilleries_info/',
    {
      method: 'GET',
      'Content-Type': 'application/json',
      'X-CSRFToken': X_CSRF_TOKEN,
    }
  );

  const listData = await listJson.json();

  return listData.map(
    ({ 
      name, 
      country, 
      whiskybase_whiskies, 
      whiskybase_votes, 
      whiskybase_rating
    }) => ({
      name,
      country,
      products: +whiskybase_whiskies,
      votes: +whiskybase_votes,
      rating: +whiskybase_rating,
    })
  );
}
Enter fullscreen mode Exit fullscreen mode

The above code defines an async function called list that uses the Fetch API to request the source data, converts it from a JSON string into an array of objects (rows/records). At the same time as converting the JSON we extract the properties we are interested in from each object, convert the numeric values from strings to Numbers and return a new array of objects (rows).

We can use the following function to call the list function and present the array it return as a table in the console. Note we have to wrap the call to console.table in an asynchronous IIFE (Immediately Invoked Function Expression) to work around the lack of top-level await support.

// 0-retrieve-data.js continued

(async function demonstration() {
  console.table(await list());
})();
Enter fullscreen mode Exit fullscreen mode

Running the code in '0-retrieve-data.js' produces the following output.

Output of 0-retrieve-data.js

Now we have demonstrated how we will be retrieving the data we will make a small change to the '0-retrieve-data.js' file to make it a module we can use going forward.

  1. Comment out (disable) the IIFE containing the demonstration function.
  2. Add the following instruction to the bottom of the file to expose the list function.
// 0-retrieve-data.js revised

export default list;
Enter fullscreen mode Exit fullscreen mode

How it is going to work

At the centre of the pagination process is a paginator function that is created by a GeneratePaginator function. Both functions are called with a 'pagination contract' (more on that later) but the generation function is also called with either (in our case) a complete data set or the parameters for a query to retrieve a data set.

For client-side pagination the GeneratePaginator function is expected to be called only once. The function it returns (the actual paginator) is expected to be called repeatedly each time the pagination contract is updated. The paginator returns two things:

  1. The total number of pages in the pre-processed data set. The 'pre-processes' data set is the result of filtering and searching the source data but before sorting and pagination is applied.
  2. The set of records in the processed page.

Paginator

For server-side pagination the contract could be sent from the client (page) to the server but the generator does not create the paginator function. Instead the server-side GeneratePaginator function manages the pagination process and expects updates to the pagination contract.

Pagination contract(s)

Pagination contracts are data objects used to pass parameters to the paginators functions, directly for client-side, via a network request for server-side pagination. There are two versions, 'initial' and 'updated':

  • 'initial' is used to prepare the GeneratePaginator by supplying parameters that are not expected to change going forward. Contracts can be empty but can also contain:
    • A list of columns/properties that searchable.
    • Initial page size and number (defaulted to 1).

Pagination Contracts

  • 'updated' is used to inform the paginator function of changes made by the user in how they want to change the presentation. The contract can contain:
    • The search term, if defined (not empty).
    • A list of columns to apply filtering, along with either:
      • The name of a pre-canned filter (which is outside the scope of this post),
      • A list of filter values,
      • A filter operation (such as 'greater than') along with parameters.
    • A list of sortable columns along with its data type and the sort order (direction).
    • Updated page size and number.

Pagination process

As indicated earlier in this post, the process of pagination, along with performing the additional functions, has to be performed in a specific sequence in order to produce the desired and predictable results.

Pagination process

Once the original source data has been retrieved (or potentially refreshed) there are three stages to be performed to produce the 'page' of data ready for presentation.

Searching and Filtering

Technically Searching and Filtering are similar operations. For each candidate row we evaluate the data in the context of the search term or filter criteria to assess if the row is to be included in or excluded from (a binary choice) the output.

Sorting

There are many sorting algorithms but, as is so often the case, it is better to utilise those mechanisms provided natively (built in to the language) than implement something on top.

JavaScript has a very simple and effective sorting method built into the Array object but it is limited to a single property. We will expand on the functionality to support multi-level sorting.

Paginating

We will dig deeper into the first two stages in the next post but for now we will tackle the simpler stage of 'paginating', although it is not without its complexities.

In its simplest form this stage isolates the rows within the range of the selected page. The output includes the rows of the page and the number of pages available.

More complicated forms (which is outside the scope of this post) consider factors such as:

  • How volatility is the data (how quickly does it change) and how important it is for the user to receive updates immediately to ensure data accuracy.
  • The availability of network capacity and the need to support data caching for an improved user experience.

Implementing paginate

Of the three stages pagination is actually the simplest (in its basic form) to implement. At the top of the example code we will begin by importing the data retrieval function list() from the '0-retrieve-data.js' module.

// 1-initial-pagination.js

import list from './0-retrieve-data.js';
Enter fullscreen mode Exit fullscreen mode

As described above we next define the GeneratePaginator function that takes in the source data and initial contract, which at this point will only contain the page parameters (number and size), and returns the Paginator function.

// 1-initial-pagination.js continued

function GeneratePaginator(
  data,
  contract = {
    page: { number: 1, size: 20 },
  }
) {
  let currentContract = contract;
  return (newContract = { page: { number: 1, size: 20 } }) =>
    {
      currentContract.page = {
        ...currentContract.page,
        ...newContract.page
      };
      const {
        number: pageNumber,
        size: pageSize
      } = currentContract.page;
      const dataset = pageNumber * pageSize;
      return {
        page: data.slice(dataset - pageSize, dataset),
        pages: Math.ceil(data.length / pageSize),
      };
    };
}

(async function () {
  const paginator = GeneratePaginator(await list());

  // Test call 1: Using default parameters
  console.table(paginator().page);

  // Test call 2: Using custom parameters
  console.table(paginator({ page: { number: 3, size: 5 } }).page);

  // Test call 3: Updating custom parameters
  const pagedData = paginator({ page: { number: 4 } });
  console.log('Total pages in dataset:', pagedData.pages);
  console.table(pagedData.page);

  // Test call 4: Accessing the last (partial) page
  console.table(paginator({ page: { number: 55 } }).page);
})();
Enter fullscreen mode Exit fullscreen mode

In the second half of the above example we use an IIFE to first create the Paginator instance 'paginator', populating it with our test data. We then use the console.table to make a series of test calls and present the results, as shown below. Notice we get two values back for each call:

  • page contains the subset of the source data containing the rows to be presented.
  • pages contains a number representing how many pages required to present the entire data set.

Test call 1: Using default parameters

By default the paginator will select the first page of data with a size of 20 rows, as defined by the initial contract.
Using default parameters

Test call 2: Using custom parameters

We can update the contract to stipulate a different page number and size.
Using custom parameters

Test call 3: Updating the custom parameters

We can also revise the contract, specifying a new page size or, as demonstrated below, an alternative page number.
Updating the custom parameters

Test call 4: Accessing the last (partial) page

When we request the last available page, we might not get as many rows as we stipulated for the page size, but we will get all the rows in the last page.
Accessing the last (partial) page

With 274 rows of data we would get the same results for the following parameters:

Page
Number 6 7 10 11 16 19 28 31 46 55
Size 54 45 30 27 18 15 10 9 6 5

In the above code we assume the page number will be within the range 1 to the number of pages defined by the page size. But what if somehow a page number less than 1 is requested, or more likely, a page greater than the number of pages available?

Extending paginate

What if we resize the page but the page number is still within range? ideally we should revise the page number to align with the previously presented data. We need to enhance the GeneratePaginator function as follows.

// 2-enhanced-pagination.js

function GeneratePaginator(
  data,
  contract = {
    page: { number: 1, size: 20 },
  }
) {
  let currentContract = contract;
  return (newContract = { page: { number: 1, size: 20 } }) => {

// [1]
    const currentPageSize = (newContract.page?.number
      ? newContract : currentContract
    ).page.size;

    currentContract.page = {
      ...currentContract.page,
      ...newContract.page
    };
    let { number: pageNumber, size: pageSize } =
      currentContract.page;

// [2]
    pageNumber = Math.ceil(
      ((pageNumber - 1) * currentPageSize + 1) / pageSize);
    const pages = Math.ceil(data.length / pageSize);

// [3]
    if (pageNumber < 1) pageNumber = 1;
    if (pageNumber > pages) pageNumber = pages;

    const dataset = pageNumber * pageSize;
    return {
      page: data.slice(dataset - pageSize, dataset),
      pages,
    };
  };
}
Enter fullscreen mode Exit fullscreen mode
  1. We need to preserve the previous pageSize for comparison but only if none is supplied in the call.
  2. Next we need to make sure that the pageNumber is in range so, where possible, the top of the page remains the same irrespective of pageSize by adjusting the pageNumber; this is less jarring to the eye of the user.
  3. We also need to check the upper and lower ranges for the new pageNumber to ensure it is in range.

Let's see how point 2 works with the following test.

// 2-enhanced-pagination.js revised

(async function () {
  const paginator = GeneratePaginator(await list());

  // Test call 5a: Page resizing - initial
  let pagedData = paginator({
    page: { number: 3, size: 10 },
  });
  console.log('Total pages in dataset:', pagedData.pages);
  console.table(pagedData.page);

  // Test call 5b: Page resizing - resized
  pagedData = paginator({ page: { size: 5 } });
  console.log('Total pages in dataset:', pagedData.pages);
  console.table(pagedData.page);
})();
Enter fullscreen mode Exit fullscreen mode

At 5a we prepare the test case by setting the initial page parameters to number 3 and size 10, which will present rows 21 to 30, with 28 pages available.

Initialising page resizing

When in 5b we resize the page to 5 rows the number of pages available increase to 55 but we still present rows 21 to 25.

Update page resizing


In part two we will dig deeper into the ancillary functions (search, filter and sort) and see how functional programming (FP) techniques can help create efficient solutions.

Top comments (0)