DEV Community

Cover image for Postgres Is Underrated—It Handles More than You Think

Postgres Is Underrated—It Handles More than You Think

Jason Skowronski on October 09, 2019

Thinking about scaling beyond your Postgres cluster and adding another data store like Redis or Elasticsearch? Before adopting a more complex infra...
Collapse
 
thojest profile image
thojest

Hi, this is a really nice article. The funny thing is, it comes at the right time. We use Postgres and are currently considering to add Redis behind our rust backend with websockets to enable push notifications for turtle (turtle.community), but are not quite sure if this is the right way.

Does anyone have some inspiration or experience on that topic?

Collapse
 
zchtodd profile image
zchtodd

I'm not sure if it fits your use case perfectly, but take a look at NOTIFY, introduced in PG 9.0 -- it's basically a push notification mechanism built into PostgreSQL.

Collapse
 
thojest profile image
thojest

thx for the info, will have a look!

Collapse
 
iamkarshe profile image
Utkarsh Kumar Raut

Really nice project, Turtle.

Collapse
 
markpieszak profile image
Mark Pieszak

I'm loving your articles Jason, great job again !
It's hard to find in-depth and super high-quality articles like these in this day and age :)

Collapse
 
mostlyjason profile image
Jason Skowronski

This is an amazing complement! Mind if I use this quote on my site?

Collapse
 
markpieszak profile image
Mark Pieszak

Of course! My pleasure :)

Anytime, keep up the great work!

Collapse
 
brianjohnsonsr profile image
Brian Johnson

I think another really important PG plugin to add is TimeScaleDB -- the ability to add efficient time series data collection/query to Postgres is awesome.

Collapse
 
davedecahedron profile image
David Howell • Edited

This is a great article!

You mention “Use multi-column indexes sparingly“, which is generally good advice, however I would qualify that by saying avoid redundant indexes. It’s important to know that multicolumn (btree) indexes have a specific ordering to the columns. If you don’t filter or join a table using one or more of the columns in that order then it can’t use the index.

For example an index on columns a,b. If you only filter on column b then this index can’t be used. It can be used if you filter by just a, or both a and b. Following on from that, if you frequently filter by both a and b then this is a good index to have. If you also have both a single column index on a, and a multicolumn index on a,b then in that specific case the single column index is the redundant one.

Collapse
 
sonnk profile image
Nguyen Kim Son

Great article Jason! It's really important to limit the number of technologies used in a project and not follow trends blindly! This is quite a similar analogy to the microservice vs monolith debate.

Collapse
 
rhamedy profile image
Rafiullah Hamedy

Great write up Jason. I wasn't aware of PL/Python until now. Definitely a big fun of the JSON data types and have never used hstore but, thank you for sharing the use case.

There is also Postgres-XL that seems to offer a lot and only 5 years old compared to 3 decade-old PostgreSQL.

Collapse
 
robconery profile image
Rob Conery

Hey Jason great post! Quick thing on the full text example you have: if you use plianto_tsquery it will split the words for you and add the AND. If you use phraseto_tsquery it will apply a positional argument (<->) instead of the &, which is great if you're looking for a name or place. The new websearch_to_tsquery (in PG 11) is a great general purpose query builder as well.

Collapse
 
ccleary00 profile image
Corey Cleary

I wasn't aware of tsquery, thanks for pointing that out! - I'm working on a project right now where full on Elasticsearch is probably overkill, but needed something more robust than just doing LIKE

Collapse
 
scottw profile image
Scott Watermasysk

100% especially on search. Search in Postgres is very underrated.

Collapse
 
amatosg profile image
Alejandro • Edited

This article was definitive, it was the last straw. I have been thinking about switching my Java application to postgres from mysql and now I'm in the middle of the process using pgloader and everything has worked as expected.

Yet I have an unresolved question: does it make sense to have a 2nd level cache in Hibernate when postgresql already has caching? thanks!

Collapse
 
mostlyjason profile image
Jason Skowronski

Thanks I'm glad enjoyed it! The Hibernate second level cache lives on your application server, whereas the postgresql cache lives on the database of course. This matters because it's faster to retrieve or update data already stored on your application server. It reduces network and database load by removing duplicate queries and batching writes. This is great for applications with many reads and infrequent writes, or cases where eventual consistency on writes is acceptable.

Collapse
 
steph_baltus profile image
Stephanie Baltus

Hey, many thanks for this article, I'm working with Postgres for a while now, but still, I've learned a lot of great stuff here !

I just wanted to add that Postgres also offers HyperLogLog as a data type, just by adding an extension (see this great article for more details.)

Collapse
 
fultonbrowne profile image
Fulton Browne

Great article, I use postgres on heroku for database stuff all the time, I love it and I recommend it all the time.

Collapse
 
aurelmegn profile image
Aurel

Thanks for this post,
I wrote about using postgres to setup a distributed database. Here is the link for those who are interested dev.to/sh1ftsh/setting-up-distribu...

Collapse
 
thejoezack profile image
Joe Zack

Thanks for the post, I learned a lot of new things! I'd also like to mention that OLAP style queries can be rough with lots of data.

Collapse
 
davedecahedron profile image
David Howell

I am also thinking about this. RDBMS generally are good at most types of workloads but mixed workloads like OLTP and OLAP on the same system will interfere with each other.

How do people do embedded reporting AND transactions in modern web apps?

NoSQL solutions like ElasticSearch are mentioned but they seem more appropriate for search. Data warehouse solutions like Snowflake, BigQuery, Redshift are good for internal analytics and reporting but they just don’t have the concurrency to support direct queries from customer facing apps.

How else to do this without complex data pipelines or complex infrastructure involving Kafka?

Collapse
 
thejoezack profile image
Joe Zack

I do it with a complex data pipeline and infrastructure involving Kafka!

I've got my eye on Apache Druid, though I haven't spent any real time (sorry, pun totally intended) with it.

Thread Thread
 
davedecahedron profile image
David Howell

I’ve looked at Druid, also considering MemSQL , ClickHouse and others

Collapse
 
davedecahedron profile image
David Howell

Is “forking” count some special operation or just a nice word in place of swearing?

Regarding (exact) count, pretty much every system has trouble doing this quickly and on most cases you really don’t need an exact count. This is especially true for medium to large data.

Table/index statistics that are kept up to date will give a good approximation.

HyperLogLog was one option mentioned which will give good-enough approximations, another approach is log-normal histograms. I don’t think this challenge is unique to PG.

Collapse
 
cwreacejr profile image
Charles Reace

Don't forget Materialized Views: can be an awesome way to optimize searches on things that you might be tempted to dump into some NoSQL text-based search tool.

Collapse
 
idoshamun profile image
Ido Shamun

Wow! An awesome review of Postgres important features.
I can't stress out more how devs sometimes just follow the hype without thinking. When someone suggests a NoSQL data store, I always ask her/him to convince me why SQL doesn't work here.

Collapse
 
okbrown profile image
Orlando Brown

Where have I read this article before? Feels like either this is a clone or the other is a clone. Brb...

Collapse
 
okbrown profile image
Orlando Brown

Found it, it's the same guy. Fair play!
infoq.com/articles/postgres-handle...

Thanks for the other one, it was very useful.

Collapse
 
mostlyjason profile image
Jason Skowronski

Maybe InfoQ or Hackernoon? I posted it there as well :)

Collapse
 
fil2fip profile image
fil2fip

Nice article ! Thank you.

I think there'e a typo in the check strings length example (b- vs b=).

Collapse
 
mostlyjason profile image
Jason Skowronski

Thanks updated!

Collapse
 
thojest profile image
thojest

thx a lot :)

Collapse
 
kelsonpw profile image
Kelson Warner

The first paragraph is repeated twice

Collapse
 
mostlyjason profile image
Jason Skowronski

Doh! Thanks I fixed it :)

Collapse
 
zoechi profile image
Günter Zöchbauer

Nice article!

Nit - there is one redundant "are" in "Extensions are to Postgres are"

Collapse
 
mostlyjason profile image
Jason Skowronski

Thanks updated!

Collapse
 
codethug profile image
Nicolas Quijano

More in-depth articles of this kind will make this old code bum happy ;)

Collapse
 
zzcoder profile image
zzcoder

Just want point out one error (or misunderstanding), B-Tree is not binary as stated in "B-tree indexes B-tree indexes are binary trees"

Collapse
 
mostlyjason profile image
Jason Skowronski

Thanks I meant to say "balanced trees"

Collapse
 
vitorscassiano_34 profile image
Vitor Cassiano

Great article.
Congratz!

Collapse
 
pavelbisse profile image
Pavel Bisse

Hey Jason, thanks for nice post!
There is a HyperLogLog PostgreSQL extension btw.
github.com/citusdata/postgresql-hll

Collapse
 
hasii2011 profile image
Humberto A Sanchez II

Nice article