DEV Community

Parulian Sinaga
Parulian Sinaga

Posted on

Ruby cursor pagination (2)

Following this post, I would like to make it open source. So, people around the world can share with me for improvement and feature suggestion. Then, I will explain here my thought.

"If this project should become open source, then I need to make it become general for all case, configurable, and people can extend functionality in dynamic way", this is what I thinking at first. So, I list down for possibility code that make it not general.

Hardcoded primary_key

We hardcoded id as primary_key as default cursor when order not specified. This become a problem if this project will be used by project that have uuid or other column name as primary key. For this first problem, can be fixed by using active record attribute method primary_key to get primary_key column name

# from
...where('id > ?', cursor)

# to
...where(arel_table[primary_key].gt(cursor))

Limitation of order column

Since in our case we only have one order value for every page, we hardcoded order method also. Other people may have case when data need to order by name, quanty, etc for example.

# from
...where('(name, id) > (?, ?)', *cursor).order(:name, :id)

# to
# sorted_columns = { :name => :asc, :id => :asc }
# cursor = { :name => 'Shampoo', :id => 23 }
...where(
  Arel::Nodes::GreaterThan.new(
    Arel::Nodes::Grouping.new(cursor.keys.map{ |col| arel_table[col] }),
    Arel::Nodes::Grouping.new(
      cursor.values.map do |col|
        Arel::Nodes.build_quoted(col)
      end
    )
  )
).order(sorted_columns)

Above code will generate this sql in our case.

-- row values support PostgreSQL (since 8.4)
select * 
from products 
where (title, id) > ('Shampoo', 23)
order by name asc, id asc

-- equal to
select *
from products
where (title > 'Shampoo'
  or (title = 'Shampoo' and id > 23))
order by title asc, id asc

Encode and Decode Json

Usually cursor values encode and decode into Base64. In our case, we store cursor values as a Hash ex: next_cursor = {created_at:'2020-07-11T00:49:06Z'}, parse it to json and then encode to Base64. But, lately we found the issue when we collect data after '2020-07-11T00:49:06Z'.

[
  [
    # should not include in collection
    #<Product:0x00007fd53c33fc70 id: 5, name: "Shampo0", created_at: 2020-07-11 00:49:06 UTC, updated_at: 2020-07-11 00:49:06 UTC>,

    #<Product:0x00007fd53c33f950 id: 6, name: "Tooth brush", created_at: 2020-07-11 00:49:07 UTC, updated_at: 2020-07-11 00:49:06 UTC>,
    #<Product:0x00007fd53c33f590 id: 7, name: "Lifeboy Shoap", created_at: 2020-07-11 00:49:08 UTC, updated_at: 2020-07-11 00:49:06 UTC>,
    #<Product:0x00007fd53c33f3b0 id: 8, name: "Handsanitizer efe", created_at: 2020-07-11 00:49:09 UTC, updated_at: 2020-07-11 00:49:06 UTC>,
    #<Product:0x00007fd53c33eaa0 id: 9, name: "Pantene", created_at: 2020-07-11 00:49:10 UTC, updated_at: 2020-07-11 00:49:06 UTC>
  ],
  {next_cursor: ''}
]

We forgot that created_at store store micro/nano seconds ex: 2020-07-11T00:49:06.234632Z. So, we need to add micro/nano seconds into cursor values next_cursor = {created_at:'2020-07-11T00:49:06.234632Z'}.
After thingking further, I prefer store cursor values in string so that after encoded, it will be shorter than json ex: next_cursor = 'created_at:2020-07-11T00:49:06.234632Z'.

Alt Text

All of this code available here

Top comments (0)