As you might know, the default pagination mechanism provided in PostgreSQL is not very efficient, actually they warn against OFFSET
when you need to go deep inside pagination.
So, what alternative to OFFSET
do we have?, one option might be by design, let's say we will build a modern application with infinite scrolling that works in a similar fashion as Instagram, this implies a simple solution.
We could just return a group of records, let's say 10, and sort them by any field, in this case we could create a rank field which contains some number that prioritize the search according to some factors, like a premium account or any other we can decide, anyway we're going to ignore the algorithm for the rank field and assume we already have it.
We want another rank, this is a PostgreSQL function called ts_rank, we're using full text search and we need to order our rows according to the frequency of their matching lexemes.
However we have a problem, because of the previous requirement (the full text search rank) our queries might be slow, so we need to find a solution.
RUM index
We have something that might help, there is an external PostgreSQL module that creates a custom index that allows us to use full text search with ranking with faster results, it's called Rum index, the problem is that we will have slow inserts, but it's ok, because we need fast reads.
The first thing we need to do is install the extension and load it in PostgreSQL, in ubuntu we can do it this way:
sudo apt install postgresql-10-rum
sudo su - postgres
psql -d mystore
CREATE EXTENSION rum;
Then we need to edit our last migration to change GIN
by RUM
and add the function rum_tsvector_ops
:
migrations/2019-06-25-112356_add_tsvector_column_to_products/up.sql
:
ALTER TABLE products ADD COLUMN text_searchable_product_col tsvector NOT NULL;
UPDATE products SET text_searchable_product_col =
to_tsvector('english', name || ' ' || coalesce(description, ''));
CREATE INDEX textsearch_idx ON products USING RUM (text_searchable_product_col rum_tsvector_ops);
CREATE TRIGGER tsvectorupdateproducts BEFORE INSERT OR UPDATE
ON products FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(text_searchable_product_col, 'pg_catalog.english', name, description);
Now, we can redo our migration:
diesel migration redo
Be careful to check the schema file, we add a new field type called TsVector
and need to load it from diesel_full_text_search
crate, every time we perform a migration schema.rb reloads removing the use instructions.
RUM index support
Our crate called diesel_full_text_search
doesn't support the RUM operator: <=>
, so we can use my fork for that:
Cargo.toml
:
diesel_full_text_search = { version = "1.2.1", git = "https://github.com/werner/diesel_full_text_search" }
We said that we would be using a rank for product let's add it.
diesel migration generate add_rank_to_products
diesel migration generate add_rank_index
migrations/2019-07-01-114531_add_rank_to_products/up.sql
:
ALTER TABLE products ADD COLUMN product_rank NUMERIC DEFAULT 0.0;
migrations/2019-07-01-192138_add_rank_index/up.sql
:
CREATE INDEX products_rank_idx ON products (product_rank);
Now we can modify the model.
src/models/product.rs
:
impl ProductList {
pub fn list(connection: &PgConnection, search: &str, rank: f64) -> Self {
use diesel::RunQueryDsl;
use diesel::ExpressionMethods;
use diesel::QueryDsl;
use diesel::pg::Pg;
use crate::schema::products::dsl::*;
use crate::schema;
use diesel_full_text_search::{plainto_tsquery, TsRumExtensions, TsVectorExtensions};
let mut query = schema::products::table.into_boxed::<Pg>();
if !search.is_empty() {
query = query
.filter(text_searchable_product_col.matches(plainto_tsquery(search)))
.order((product_rank.desc(),
text_searchable_product_col.distance(plainto_tsquery(search))));
} else {
query = query.order(product_rank.desc());
}
let result = query
.select(PRODUCT_COLUMNS)
.filter(product_rank.le(rank))
.limit(10)
.load::<Product>(connection)
.expect("Error loading products");
ProductList(result)
}
}
We added the corresponding order by
to take care of the RUM functionality, if no search is provided we just do a regular sort by our product rank.
We use a filter to perform a where
clause when we need to sort the results, then from the front end we would receive the corresponding rank to return the products.
Finally we edit the handler:
src/handlers/products.rs
:
#[derive(Deserialize)]
pub struct ProductPagination {
pub rank: f64
}
pub fn index(_user: LoggedUser,
pool: web::Data<PgPool>,
product_search: web::Query<ProductSearch>,
pagination: web::Query<ProductPagination>) -> Result<HttpResponse> {
let pg_pool = pg_pool_handler(pool)?;
let search = &product_search.search;
Ok(HttpResponse::Ok().json(ProductList::list(&pg_pool, search, pagination.rank)))
}
Check the source code here
Top comments (6)
How to make admin can do everything and user can access the only function that we allow.
I think that functionality (Authorization library) is not available for Rust, maybe I can perform a little research to know how to implement it.
Rust has authz now, see: github.com/casbin/casbin-rs
Thank you!!
Rust has authz now, see: github.com/casbin/casbin-rs
Thank you for database relationship.