João Antunes

Posted on Jul 26, 2018

Pagination in an API: page number vs start index

#discuss #api #pagination

Imagine you're developing an API and have an endpoint that lists some entities. You want it to be paged because there could be a lot of entities. Which would you use?

1. Page number and page size
2. Start index and limit

I usually go with the first for no specific reason: it was what was used at my first job, so I got used to it and kept using it.

Does anyone have strong opinions on one being better than the other, or is this just a "do whatever makes you happy" kind of topic?

Note: I know a pagination token approach (if this is the correct name for it) is probably better and more scalable, but in this case I'm only considering these "old school" approaches to the problem.

Top comments (19)

Quentin Sonrel • Jul 26 '18

"Imagine you're developing an API and have an endpoint that lists some entities"

I'm currently doing more than just imagining 😀

I'm developing an API with that exact use case. My solution is more like number 2. The entities I use have a date so I use the date and a limit to paginate.

João Antunes • Jul 26 '18

Nice!

I would say that's an upgraded version of number 2, as you're using something that can be indexed in the database rather than a row position that is usually computed at query time.

One question though: if you fetch by date and then limit, how do you get the remaining entities for that date? Or is this not relevant in your case?

Quentin Sonrel • Jul 26 '18

For now I'd say it's not relevant because it's actually not dates but datetimes (including seconds), so most entities should have a unique datetime, so remainders after applying the limit are quite unlikely to happen. This API is still in early stages of development and is not fully tested so I might have missed some issues though 😅

João Antunes • Jul 26 '18 • Edited

Got it, so you could just use the datetime of the last one, and get the rest from there 👍
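To make the idea concrete, here is a minimal sketch of paging by "datetime of the last item plus a limit". It is only an illustration: the `ENTITIES` list, the `created_at` field and the `fetch_after` function are hypothetical names, and an in-memory list stands in for what would normally be an indexed database query.

```python
from datetime import datetime
from typing import List, Optional

# Hypothetical in-memory data; a real API would query an indexed column instead.
ENTITIES = [
    {"id": 1, "created_at": datetime(2018, 7, 26, 10, 0, 0)},
    {"id": 2, "created_at": datetime(2018, 7, 26, 10, 0, 5)},
    {"id": 3, "created_at": datetime(2018, 7, 26, 10, 1, 0)},
    {"id": 4, "created_at": datetime(2018, 7, 26, 10, 2, 30)},
    {"id": 5, "created_at": datetime(2018, 7, 26, 10, 3, 0)},
]


def fetch_after(after: Optional[datetime], limit: int) -> List[dict]:
    """Return up to `limit` entities created strictly after `after`, oldest first."""
    ordered = sorted(ENTITIES, key=lambda e: e["created_at"])
    if after is not None:
        ordered = [e for e in ordered if e["created_at"] > after]
    return ordered[:limit]


# First request has no starting point; the next one continues from the
# datetime of the last entity the client received.
first = fetch_after(None, 2)
second = fetch_after(first[-1]["created_at"], 2)
```

One caveat with this sketch: because the comparison is strict, entities that share the exact datetime of the last returned item would be skipped. Adding a tiebreaker such as the id to the ordering and the comparison is one way to handle that.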

Quentin Sonrel • Jul 26 '18

Exactly :)

Rafal Pienkowski • Jul 26 '18

I'd choose the first option. In both approaches, I'd advise adding the total count of entities in the collection to the response. It helps when building a grid.

BTW you can check how it's implemented in the OData specification.

João Antunes • Jul 26 '18

Yes, on the output side the total item count is really important (unless we're working with an "infinite" scrolling list probably).
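As a rough illustration of returning the total count next to the items, the sketch below wraps a page in a small envelope. The function name `page_response` and the field names (`items`, `totalCount`, `totalPages`) are assumptions made for the example, not something taken from OData or any particular framework.

```python
from typing import Any, Dict, List


def page_response(all_items: List[Any], page_number: int, page_size: int) -> Dict[str, Any]:
    """Build a response envelope for a 1-based page, including the total item count."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be at least 1")
    start = (page_number - 1) * page_size
    total = len(all_items)
    return {
        "items": all_items[start:start + page_size],
        "pageNumber": page_number,
        "pageSize": page_size,
        "totalCount": total,
        # Ceiling division, so a partially filled last page still counts.
        "totalPages": -(-total // page_size),
    }


response = page_response(list(range(25)), page_number=3, page_size=10)
```

With the total count available, a grid can work out how many page links to render without an extra request.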

When implementing in .NET with, for example, EF, the easiest option is start index and item count, as it maps directly to Skip() and Take(), but it might not be the easiest for the client.

Rafal Pienkowski • Jul 26 '18

In .NET there is a NuGet package that translates an OData query into a LINQ expression 😊

João Antunes • Jul 26 '18

Never really used OData that much, but that's useful 😉

mehdibenhemdene • Nov 4 '19

Whether you should go for the first or the second approach really depends on your needs. Let's take an example. If you implement a page number / page size solution, it may not always be reliable if the data is changing, meaning you can get some "gaps" in the results whenever count / pageCount changes. If your data changes that often, you should probably use a start / limit approach in which the start is a cursor pointing at the last item seen rather than a positional index (AKA cursor based pagination), which is adopted by both the Facebook and Twitter APIs.
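The "gap" can be reproduced with a few lines. This is a toy sketch with a made-up list of items and a hypothetical `get_page` helper, showing what happens when an item is deleted between two page requests.

```python
from typing import List


def get_page(items: List[str], page_number: int, page_size: int) -> List[str]:
    """Translate a 1-based page number into a slice of the data as it is right now."""
    start = (page_number - 1) * page_size
    return items[start:start + page_size]


items = ["a", "b", "c", "d", "e", "f"]

page1 = get_page(items, 1, 3)  # client reads the first page
items.remove("b")              # someone deletes an item in the meantime
page2 = get_page(items, 2, 3)  # client reads the second page

seen = page1 + page2
missed = [x for x in items if x not in seen]  # still exists, but was never returned
```

Here `d` slides from the second page into the first after the deletion, so the client never receives it. Note that a purely positional start index of 3 would land on the same shifted slice; it is anchoring the next request to the last item seen that avoids the gap.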

If my response doesn't make that much sense to you, consider taking a look at this article:

sitepoint.com/paginating-real-time...

Hope you find this useful!

João Antunes • Nov 4 '19

Agreed, cursor based pagination is certainly a great alternative, particularly in cases like the ones you mentioned.

Regarding the two "approaches" I mentioned in the post though, they are much more similar to each other. The main difference is one of developer perception; it doesn't really affect the end result the way the choice between them and a cursor based approach does.

For an example of what I was thinking when I wrote this post:
Imagine an endpoint from which we want to get 10 items after skipping the first 10. With the two approaches:

Page number and page size -> https://some-api/items?pageNumber=2&pageSize=10
Start index and limit -> https://some-api/items?startIndex=10&limit=10

So my question was more regarding what people prefer between these two.
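The two URLs above map onto each other with simple arithmetic. A minimal sketch, assuming 1-based page numbers and 0-based start indexes as in the example (the function names `to_offset` and `to_page` are made up for illustration):

```python
from typing import Tuple


def to_offset(page_number: int, page_size: int) -> Tuple[int, int]:
    """Convert (pageNumber, pageSize) into (startIndex, limit)."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be at least 1")
    return (page_number - 1) * page_size, page_size


def to_page(start_index: int, limit: int) -> Tuple[int, int]:
    """Convert (startIndex, limit) into (pageNumber, pageSize).

    Only possible when start_index falls exactly on a page boundary.
    """
    if start_index < 0 or limit < 1:
        raise ValueError("start_index must be >= 0 and limit at least 1")
    if start_index % limit != 0:
        raise ValueError("start_index does not align to a page boundary")
    return start_index // limit + 1, limit


offset_form = to_offset(2, 10)  # pageNumber=2&pageSize=10
page_form = to_page(10, 10)     # startIndex=10&limit=10
```

The second conversion shows the one real difference in expressiveness: a start index can point anywhere, while a page number can only address multiples of the page size.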

mehdibenhemdene • Nov 5 '19

Well, in this case I'd have to say that I really prefer the first method (page number and page size) because it's closer to what the user is seeing and requesting. That makes it easier to render on the client side, instead of having to calculate the startIndex each time, e.g. when we want to display the page number in the UI.

rhymes • Jul 26 '18

I don't think there's much of a difference between the two. A page number is basically a mnemonic aid that a paginator translates into an offset, with the page size as the limit. APIs are for people, so I would go with pages; it's also at a somewhat higher level.

João Antunes • Jul 26 '18

Yes, I agree. The end result is exactly the same; it's just about making the API easier to understand and, depending on how the information is displayed, maybe sparing the client some math.

Xing Wang • Jul 26 '18

Check out this article written by my co-worker.

moesif.com/blog/technical/api-desi...

I think it summarized all the considerations really well.

João Antunes • Jul 27 '18

Nice article.

It doesn't touch on the page number vs start index (or offset) question I mentioned, but it adds other interesting points.

Alain Van Hout • Jul 26 '18

Either is fine, I'd say. The key thing is to be consistent.

If I wanted to maximize convenience, I'd perhaps support both, with one being the primary form that is actually used and the other being automatically converted to it.
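One way this could look, as a sketch only: the endpoint accepts either pair of query parameters and normalizes them to a start index and limit before touching the data. The parameter names follow the ones used elsewhere in this thread; `resolve_paging` and `DEFAULT_LIMIT` are hypothetical, and rejecting mixed styles is just one possible policy.

```python
from typing import Mapping, Tuple

DEFAULT_LIMIT = 20  # hypothetical default page size


def resolve_paging(query: Mapping[str, str]) -> Tuple[int, int]:
    """Normalize either parameter style to the primary (start_index, limit) form."""
    has_page = "pageNumber" in query or "pageSize" in query
    has_index = "startIndex" in query or "limit" in query
    if has_page and has_index:
        # Mixing styles is ambiguous, so this sketch refuses it.
        raise ValueError("use either pageNumber/pageSize or startIndex/limit, not both")
    if has_page:
        page_number = int(query.get("pageNumber", 1))
        page_size = int(query.get("pageSize", DEFAULT_LIMIT))
        if page_number < 1 or page_size < 1:
            raise ValueError("pageNumber and pageSize must be at least 1")
        return (page_number - 1) * page_size, page_size
    start_index = int(query.get("startIndex", 0))
    limit = int(query.get("limit", DEFAULT_LIMIT))
    if start_index < 0 or limit < 1:
        raise ValueError("startIndex must be >= 0 and limit at least 1")
    return start_index, limit


from_pages = resolve_paging({"pageNumber": "2", "pageSize": "10"})
from_index = resolve_paging({"startIndex": "10", "limit": "10"})
```

Both calls resolve to the same start index and limit, so everything behind the endpoint only ever deals with one form.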

João Antunes • Jul 26 '18

Yeah, consistency is key.

Never thought of supporting both for the same endpoint, but that's not a bad idea: it allows the client to use whatever is easier for a given scenario.

João Antunes • Jul 30 '18

Yes, on the client side I agree it's simpler (at least if the client uses a table style view with page numbers for navigation).

Regarding the payload size, I would think there isn't much difference, as the client could end up specifying a large page size, the same way it could send a large limit for the index based alternative. Either way, the API should have safeguards in place if it is not meant to support a "give me everything" scenario.
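A safeguard of that kind can be as small as capping the requested size. A minimal sketch, where `MAX_LIMIT` and `clamp_limit` are hypothetical names and 100 is an arbitrary cap:

```python
MAX_LIMIT = 100  # hypothetical upper bound; pick whatever the API can afford


def clamp_limit(requested: int, max_limit: int = MAX_LIMIT) -> int:
    """Cap the requested page size (or limit) so one request can't ask for everything."""
    if requested < 1:
        raise ValueError("limit must be at least 1")
    return min(requested, max_limit)


normal = clamp_limit(10)
greedy = clamp_limit(1_000_000)
```

Silently capping is only one option; returning an error for oversized values is another, and it makes the cap more visible to the client.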
