John Muthua

CockroachDB vs PostgreSQL

Table Of Contents

* Chapter 1
* Chapter 2




Chapter 1 Introduction

Roaches are among the most persistent creatures in the universe. They are so tenacious that they can go for weeks without their heads [1]. Their resilience is astonishing: they survive at high altitudes and in extreme weather conditions [2], and could even survive a nuclear holocaust [4]. Did the creators of CockroachDB have this in mind while designing the software? It's hard to say, but the parallel is quite obvious in the software's design.

Chapter 2 What is CockroachDB

As the developers describe the product on their website:

CockroachDB is a distributed SQL database built on a transactional and strongly-consistent key-value store. It scales horizontally; survives disk, machine, rack, and even datacenter failures with minimal latency disruption and no manual intervention; supports strongly-consistent ACID transactions; and provides a familiar SQL API for structuring, manipulating, and querying data.


CockroachDB was created with the cloud in mind. One of its biggest selling points is that it can be distributed across numerous geographic regions and is simple to scale. Like the creature it is named after, this DB can survive a digital nuclear holocaust thanks to the fault-tolerance methods built into its software architecture. In terms of Eric Brewer's CAP theorem, the DB favors CP (Consistency and Partition Tolerance) over Availability, while still staying functional when certain nodes are offline [5].

In computing, failure is inevitable. The question is not whether a failure will occur but when it will occur. CockroachDB addresses this by ensuring the recovery, protection, and continuity of data.

Why choose CockroachDB over PostgreSQL

1. Scaling

As pointed out earlier, the greatest benefit of using CockroachDB is its ability to scale. Unlike other RDBMSs, which scale through complex techniques such as sharding, CockroachDB scales seamlessly. By default, CockroachDB uses a range size threshold of 512MB. Once a range of data reaches this threshold, it is split into two parts. Each part splits again when it reaches the threshold in turn, which provides numerous avenues to scale. Moreover, the 512MB threshold makes it easy to replicate and distribute data across several geographic regions.
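The splitting behavior described above can be sketched as a small simulation. This is only a toy model under stated assumptions, not CockroachDB's actual implementation: the function name `split_ranges` is hypothetical, every split is assumed to produce two equal halves, and a range is assumed to split once it grows past the threshold.

```python
# Toy model of size-based range splitting (illustrative only).
# Assumptions: splits are exact halves, and a range splits when it
# grows past the threshold. Names here are hypothetical.
SPLIT_THRESHOLD_MB = 512  # default threshold cited in the article


def split_ranges(range_sizes_mb, threshold_mb=SPLIT_THRESHOLD_MB):
    """Keep halving any range larger than the threshold; return final sizes."""
    pending = list(range_sizes_mb)
    settled = []
    while pending:
        size = pending.pop()
        if size > threshold_mb:
            half = size / 2
            pending.extend([half, half])  # one range becomes two
        else:
            settled.append(size)
    return sorted(settled)


if __name__ == "__main__":
    ranges = split_ranges([1300])
    print(f"{len(ranges)} ranges: {ranges}")
```

In this sketch, a single 1300MB range ends up as four ranges of 325MB each, and each of those smaller ranges could then be replicated and placed on a different node.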

The case of Postgres on scaling

Cloud applications rely on their ability to scale to enhance efficiency. There are two types of scaling: horizontal and vertical. Postgres was built with vertical scaling in mind. This means that scaling a Postgres database involves increasing the hardware resources of a single node, which is constrained by an upper limit [6].

Note: This does not mean that Postgres databases have no options for horizontal scaling, but it is not a straightforward process [7].

Sebastian Insausti expounds further on scaling Postgres, and it is not a walk in the park [8]. CockroachDB does not suffer from these drawbacks: it scales both vertically and horizontally automatically, provided there are enough system resources or the number of nodes in the cluster increases.

2. Data Integrity

CockroachDB employs numerous technologies to ensure the integrity of data. Replication is one of the methods the DB uses. Replicas are rebalanced based on the number of nodes available in the cluster. Replication not only ensures availability when a node is down but also improves performance, as applications can read from and write to multiple sources.
The DB also incorporates functionality to repair data in the event of a server failure.
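To make the idea of spreading replicas across nodes more concrete, here is a minimal sketch. It is a hypothetical illustration, not CockroachDB's real allocator: the function `place_replicas`, the node names, and the "pick the least-loaded nodes" rule are all assumptions made for this example.

```python
# Toy replica placement (illustrative only, not CockroachDB's allocator).
# Assumption: each range puts its replicas on the least-loaded distinct nodes.
def place_replicas(num_ranges, nodes, replication_factor=3):
    """Return (placement per range, replica count per node)."""
    if replication_factor > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    load = {node: 0 for node in nodes}
    placement = {}
    for range_id in range(num_ranges):
        # sorted() is stable, so ties keep the original node order
        chosen = sorted(nodes, key=lambda node: load[node])[:replication_factor]
        for node in chosen:
            load[node] += 1
        placement[range_id] = chosen
    return placement, load


if __name__ == "__main__":
    placement, load = place_replicas(4, ["n1", "n2", "n3", "n4"])
    for range_id, replicas in placement.items():
        print(f"range {range_id}: {replicas}")
    print("replicas per node:", load)
```

With four ranges, four nodes, and three replicas per range, each node in this sketch ends up holding three replicas, so losing any single node still leaves two copies of every range.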

Postgres on Data Repair

Postgres does not support data repair.

CockroachDB Terms and Meanings

| Term | Meaning |
| --- | --- |
| Cluster | A CockroachDB deployment that has numerous nodes and acts as a single application. |
| Node | A computing machine that is running an instance of CockroachDB. |
| Range | A chunk of the user data stored in the database. By default, CockroachDB automatically splits a range into two when the 512MB threshold is reached. |
| Replica | An exact copy of user data that ensures availability when a node is offline. By default, the replication factor is 3, the minimum number needed to achieve a quorum. $\text{tolerance} = \frac{\text{replication factor} - 1}{2}$ With a replication factor of 3, the database can tolerate the unavailability of one node. |
| Raft | The algorithm CockroachDB uses to reach consensus on which replica is the leader of a group. The followers replicate data from the leader [9]. |
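The tolerance formula in the Replica entry can be worked through for a few replication factors. This is a small sketch with hypothetical helper names (`tolerated_failures`, `quorum_size`); rounding down for even replication factors is an assumption added here, since the formula above is only stated for the default of 3.

```python
# Worked example of tolerance = (replication factor - 1) / 2.
# Helper names are hypothetical; rounding down for even factors is an
# assumption of this sketch.
def tolerated_failures(replication_factor):
    """Number of replicas that can be offline while a majority remains."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return (replication_factor - 1) // 2


def quorum_size(replication_factor):
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


if __name__ == "__main__":
    for rf in (3, 5, 7):
        print(
            f"replication factor {rf}: quorum {quorum_size(rf)}, "
            f"tolerates {tolerated_failures(rf)} node(s) offline"
        )
```

For the default replication factor of 3, the quorum is 2 and one node can be offline, which matches the table entry.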

You can check out the complete comparison at [https://www.cockroachlabs.com/docs/v22.1/cockroachdb-in-comparison]
