I have always used relational databases for my web development and GIS projects. Most of us should be familiar with relational databases. Relational databases store data in tables that are made up of rows and columns where each data entry is referred to as a record. Each entry into the database is given a unique identifier called a primary key and different tables can be related to one another by referencing the primary key of a record from another table (foreign key). The composition of relational databases makes them an excellent choice for storing and querying large amounts of interconnected data using Structured Query Language (SQL).
Although relational databases have long been the standard in data storage, there's a (not so) new kid on the block that has gained a lot of popularity in recent years; the non-relational database. Non-relational databases, often referred to as NoSQL databases, ditch the use of tables and instead stores data in a variety of structures. Although they have been around since the 1960's, they haven't had widespread use until recent years.
The reason for the rise in popularity is that NoSQL databases are a faster and more scalable solution for storing large amounts of data when compared to relational databases. Large companies like Amazon, Facebook, and Google are utilizing these databases to increase the speed and performance of their immense data stores.
Several technology stacks, including the MERN and MEAN stacks utilize NoSQL databases for housing data. With all the buzz around non-relational databases, I wanted to compare them to relational databases and find a good use case for a future project.
What on Earth is a non-relational database?
At its core, a non-relational database is a database that stores data outside of the typical tabular format. There are several types of non-relational databases that differ in the way they structure stored data. Non-relational databases aren't associated with a schema and thus do not enforce restrictions on data types and data structuring.
The increasing popularity of NoSQL databases is partially due to the fact that they can be scaled much more easily than relational databases. There are two main ways to scale; up or out. Scaling up means you are increasing the power of your hardware to handle increasing data storage and manipulation. Scaling out is the practice of distributing data storage across more servers, thereby spreading out the data into smaller chunks.
Relational databases can be difficult to scale due to the joins and relationships between tables. Relational databases are much more efficient when all of the tables associated with that database exist on a single server. With non-relational databases, data is self-contained and does not need to interact with other pieces of data so it is easier to distribute across many systems.
There are numerous companies that produce NoSQL database products including MongoDB, Oracle NoSQL, DynamoDB, etc. NoSQL databases rely on several data models that excel in different use cases.
Document Data Model
One of the most common "styles" of non-relational databases is the document storage model. Data stored in a document is often stored in JSON or a similar data structure. Document databases are useful because the data is already formatted in a JSON-like structure that interacts well with programming languages. Document databases are excellent for storing user information or data pertaining to content management systems. One of the most popular document database platforms is MongoDB.
Graph Database Model
Graph databases are used to store relationships and are often used on social media platforms where large-scale user relationships exist (i.e. Twitter). Graph databases connect data through the use of nodes and edges. A node is a an entity (i.e. a user) that can potentially be connected to other nodes. Nodes are connected through edges, which can store properties related to the relationship. Edges also have directionality. For example, a user may choose to follow another user, but that followed user may not follow them back. This would be a unidirectional relationship. Retrieving relationship data from graph databases can be very fast because graph databases can ignore unrelated data during searches. Some common graph-based database systems include Amazon Neptune, Neo4j, and OrientDB.
Key-Value Database Model
The key-value pair data model is very similar to the document model. Key-value databases resemble a dictionary, where each value is given a unique key identifier. A variety of data can be stored as the values of this data model, including integers, strings, arrays, and objects. An example of a key-value NoSQL database could be a database of customers, where each customer ID maps to an object of customer information:
Some common key-value database systems include Amazon's DynamoDB, Oracle NoSQL, and Riak KV. Key-value systems are useful when handling a high number of read and write requests at scale. They are often used to manage store session data in large web applications as well as storing shopping carts and e-commerce information.