Typically in an application with a database, you might have more records than you can fit on a page or in a single result set from a query. When you or your users want to retrieve the next page of results, two common options for paginating data include:
- Offset Pagination
- Cursor Pagination
Offset Pagination
When retrieving data with offset pagination, you would typically allow clients to supply two additional parameters in their query: an offset, and a limit.
An offset is simply the number of records you wish to skip before selecting records. This gets slower as the number of records increases because the database still has to read up to the offset number of rows to know where it should start selecting data. This is often described as O(n)
complexity, meaning it's generally the worst-case scenario. Additionally, in datasets that change frequently as is typical of large databses with frequent writes, the window of results will often be inaccurate across different pages in that you will either miss results entirely or see duplicates because results have now been added to the previous page.
If we want to get the first page of the newest posts from a database, the query might look like this:
Post
|> order_by(inserted_at: :desc)
|> limit(20)
Then, when we want the second page of results, we can include an offset:
Post
|> order_by(inserted_at: :desc)
|> limit(20)
|> offset(20)
While you could get away with this method initially, and it's definitely worth doing first - as the number of records increases you can consider alternatives to make reading much faster and more accurate.
Cursor Pagination
This is where cursor based pagination comes in. A cursor is a unique identifier for a specific record, which acts as a pointer to the next record we want to start querying from to get the next page of results. With using a cursor, we remove the need to read rows that we have already seen by using a WHERE
clause in our query (making it faster to read data as it's constant i.e. O(1)
time complexity) and we address the issue of inaccurate results by always reading after a specific row rather than relying on the position of records to remain the same.
Using our previous example, but this time implementing pagination with a cursor:
Post
|> order_by(inserted_at: :desc)
|> limit(20)
|> where([p], p.id < ^cursor)
In order for us to use a cursor, we need to return the results from the first page, in addition to the cursor for the last item in our result set. Using a cursor in this way is fine for moving forward in the result set, but by changing the fetching direction, you add complexity to how you retrieve records.
Conclusion
Cursor pagination is most often used for real-time data due to the frequency new records are added and because when reading data you often see the latest results first. There different scenarios in which offset and cursor pagination make the most sense so it will depend on the data itself and how often new records are added. When querying static data, the performance cost alone may not be enough for you to use a cursor, as the added complexity that comes with it may be more than you need.
Top comments (5)
Thank you, this has cleared my doubt. Can you attach the implementation for them in SQL. It would be great.
Implementation of Offset Pagination (very basic)
SELECT * from ourTable LIMIT offsetValue , limitValue;
offsetValue specifies number of records to skip.
limitValues specifies number of records to return.
Implementation of CursorPagination (very basic):
SELECT * from ourTable where id<=cursor order by id desc LIMIT limitValue
cursor needs to be returned by the client
The SQL you have is very close:
Offset
select * from posts LIMIT 20 offset 20;
If you were to paginate every 20 results, then offset would increase/decrease by 20 each subsequent query.
Cursor
select * from posts where id < 40 LIMIT 20;
The cursor is
id
here, but it could be any column, even a dedicatedcursor
.Hope that helps.
for the cursor query
how can we guarantee which 20 to pick?
like lets say that there are 40 records with id 1 to 40,
if I am doing "prev" with id "40" I should get id 20 to 39.
With that be the case here?
One question: How can you retrieve the latest record with cursor pagination? In offset pagination, we just need to use order by id or created_at. But in cursor pagination, it won't easy like that since we use
p.id < ^cursor
Hey Jack..Thankyou for this super useful article on pagination.
I have a question here - according to you which type of pagination should be considered in the server which pulls the data for clients in the API call vs a job which need to read and update the records?
Can you please throw some light in these lines ??