DEV Community

Cover image for SQL Versus NoSQL Databases: Which to Use, When, and Why
Yuval Hazaz for Amplication

Posted on • Originally published at amplication.com

SQL Versus NoSQL Databases: Which to Use, When, and Why

Choosing a suitable database for storing specific data types is vital in software development. The right choice means we can scale up an application quickly and handle increasing user requests without encountering problems. Appropriate data storage also gives us an easier time when adding new features to the application. It lets us manipulate users' data effectively and efficiently. However, choosing the correct database for your application is no mean feat, given the number of available databases.

Databases can be categorized into two primary types: SQL, which is an acronym for Structured Query Language, and NoSQL, which stands for "not only SQL" or "non-SQL." In this article, we'll review the differences between SQL and NoSQL databases, examine their appropriate use cases, and look at how to work with each using Node.js, so you can confidently opt for the right one in your next project.

Differences Between SQL and NoSQL Databases

There are many differences between SQL and NoSQL databases, such as how they handle relational data, their query languages, and their supported tools for development purposes. In this section, we'll look at their most significant differences.

Handling Relations

SQL databases handle data relations using tables. The tables can be interconnected using key constraints. There are three types of relations in SQL databases: one-to-one, one-to-many, and many-to-many.

On the other hand, NoSQL databases do not support handling data relations. So, for example, if we want to combine data from two documents in a MongoDB database, we need to write our own code to apply the logic we want.

Schema vs Schemaless

Schema is a term that describes the objects in a database, such as tables, views, relationships, and indexes. All SQL databases are schema databases. However, different SQL databases treat their schemas differently from one another. In Oracle and MySQL, there is no specific database object named schema; however, they have other database objects—like tables, views, and relationships. PostgreSQL and SQL Server have a specific database object: a schema. An SQL database can also have multiple schemas, as in the case of PostgreSQL.

By contrast, NoSQL databases are schemaless. As such, NoSQL databases provide flexibility when storing user data. We don't need to keep the data structurally consistent, as in SQL databases. Taking MongoDB as an example, we can store user data in different formats so long as it is in BSON (the data type that MongoDB supports.)

Querying Data

In SQL databases, the querying languages are mostly the same. There are some differences in the syntax and how the databases execute actions based on each syntax, but they're just minor cases. For example, with PostgreSQL, let's say we want to query all the users with all columns from the user table. We can write your query like this:

select * from users;
Enter fullscreen mode Exit fullscreen mode

To list all databases in Postgresql, we write:

\l
Enter fullscreen mode Exit fullscreen mode

With another SQL database, such as MySQL, the syntax is different when we try to list all the databases:

show databases;
Enter fullscreen mode Exit fullscreen mode

In NoSQL databases, the query language varies much more. Each of the NoSQL databases has its query language. For example, in MongoDB, if we want to search for all users from the users' collection, we write the query as:

db.users.find({})
Enter fullscreen mode Exit fullscreen mode

Whereas with another NoSQL database, like Redis, if we want to retrieve the value of a key in a Redis database, we write the query as:

GET {key_name}
Enter fullscreen mode Exit fullscreen mode

Scalability

There are two ways to scale the database for SQL and NoSQL databases: horizontal and vertical scaling. With horizontal scaling, we add more nodes to the databases. With vertical scaling, we add more RAM and CPU to the database node.

Horizontal Scaling for SQL Databases

Horizontal scaling for SQL databases is typically handled differently than for NoSQL databases. With SQL databases, like PostgreSQL, we can apply sharding or add replicas to scale the database horizontally.

If our use case targets the read capability, adding replicas is a great option; PostgreSQL provides built-in functionality for scaling replicas to improve readability. However, if we are targeting both read and write capabilities, the solution is much more complicated. PostgreSQL is not designed with heavy write capability or support for distributed databases. We must apply sharding data on partitioning tables or use logical replication to achieve both write and read capabilities.

Horizontal Scaling for NoSQL Databases

With NoSQL databases—say, MongoDB—we can use sharding and replica sets for horizontal scaling. Sharding is usually a preferred option for replica sets when performing horizontal scaling in MongoDB. With sharding, we copy the slices of the data from the primary node to multiple replica sets, and then the replica sets work together to join pieces of the data into a complete dataset. With replica sets, we copy the whole data from the primary node to other nodes. As a result, the write data capacity using replica sets is lowered than sharding.

The differences between sharding and replica sets for horizontal scaling in MongoDB are complex and worth further exploration.

Node.js Tooling

In the Node.js environment, several tools offer support for SQL or NoSQL databases or, in some cases, for both. Prisma is one of the libraries that supports working for SQL and NoSQL databases. With Prisma, we can work with several databases such as PostgreSQL, MySQL, MongoDB, or SQL Server.

SQL: MySQL

Let's say our SQL database has two tables: Author and Blog. The table Blog and Author can join using the authorId key. We will define these two tables in Prisma using schema.prisma file as below:

//schema.prisma
model Blog {
  id Int @id @default(autoincrement())
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
  title String @db.VarChar(255)
  content String?
  published Boolean @default(false)
  author Author @relation(fields: [authorId], references: [id])
  authorId Int
}

model Author {
  id Int @id @default(autoincrement())
  email String @unique
  name String?
  blog Blog[]
}
Enter fullscreen mode Exit fullscreen mode

In the Blog model, we defined an author field related to the model Author by authorId.

To migrate the currently defined schemas with our MySQL database, we run the following:

npx prisma migrate dev --name init
Enter fullscreen mode Exit fullscreen mode

(Note that this command is used for development purposes only. Production database migration requires a different process.)

We should see similar output from the console as the one below:

The following migration(s) have been created and applied from new schema changes:

migrations/
  └─ 20230212011801_init/
    └─ migration.sql
Enter fullscreen mode Exit fullscreen mode

Our database is now in sync with our schema. To add several authors to the Author table, we need to write the following code:

await prisma.author.createMany({
  data: [{
      email: "donald.le@gmail.com",
      name: "Donald Le"
    },
    {
      email: "evans.chris@gmail.com",
      name: "Evan Chris"
    }],
});
Enter fullscreen mode Exit fullscreen mode

Checking the Author table in the database, we see these two authors were added.

+----+-----------------------+------------+
| id | email | name |
+----+-----------------------+------------+
| 1 | donald.le@gmail.com | Donald Le |
| 2 | evans.chris@gmail.com | Evan Chris |
+----+-----------------------+------------+
Enter fullscreen mode Exit fullscreen mode

To add these authors' blog posts to the database, run the following:

await prisma.blog.createMany({
  data: [{
      title: "A great blog",
      content: "A great blog",
      authorId: 1
    }, {
      title: "Another great blog",
      content: "Another great blog",
      authorId: 2
    }],
});
Enter fullscreen mode Exit fullscreen mode

Checking the Blog table, we see these two blogs were added:

+----+-------------------------+-------------------------+--------------------+--------------------+-----------+----------+
| id | createdAt | updatedAt | title | content | published | authorId |
+----+-------------------------+-------------------------+--------------------+--------------------+-----------+----------+
| 1 | 2023-02-12 01:30:35.629 | 2023-02-12 01:30:35.629 | A great blog | A great blog | 0 | 1 |
| 2 | 2023-02-12 01:30:35.629 | 2023-02-12 01:30:35.629 | Another great blog | Another great blog | 0 | 2 |
+----+-------------------------+-------------------------+--------------------+--------------------+-----------+----------+
Enter fullscreen mode Exit fullscreen mode

To query all blogs of the author with authorId of 1, we write our code as:

async function main() {
  const results = await prisma.blog.findMany({
    where:{authorId:1}
  });
  console.log(results);
}
Enter fullscreen mode Exit fullscreen mode

Here's what we see as output:

[
  {
    id: 1,
    createdAt: "2023-02-12T01:30:35.629Z",
    updatedAt: "2023-02-12T01:30:35.629Z",
    title: "A great blog",
    content: "A great blog",
    published: false,
    authorId: 1
  }
]
Enter fullscreen mode Exit fullscreen mode

NoSQL: MongoDB

Let's say we want to work with a NoSQL database like MongoDB (Mongo Atlas), and we have two topics: Blog and Author. These topics have a relation to each other using authorId. We then define these topics in schema.prisma file as:

//schema.prisma
model Blog {
  id String @id @default(auto()) @map("_id") @db.ObjectId
  title String
  content String
  author Author @relation(fields: [authorId], references: [id])
  authorId String @db.ObjectId
}

model Author {
  id String @id @default(auto()) @map("_id") @db.ObjectId
  email String @unique
  name String?
  address Address?
  blogs Blog[]
}
Enter fullscreen mode Exit fullscreen mode

To apply these defined models to MongoDB, we run the following command:

npx prisma generate 
Enter fullscreen mode Exit fullscreen mode

Upon checking the Mongo Atlas database, we see two collections are created.

Two collections created in Mongo Atlas


Figure 1: Two collections created in Mongo Atlas

We can add a new author to the Author collection and also add a new blog for that author with the following code:

await prisma.author.create({
  data: {
    name: 'Donald Mathew',
    email: 'donald.matthew@gmail.com',
    blogs: {
      create: {
        title: 'This is an interesting blog post',
        content: 'Lots of really interesting stuff',
      },
    },
  },
});
Enter fullscreen mode Exit fullscreen mode

Under the Author collection in Mongo Atlas, we can see that the new author has been added successfully.

A new author is added to the Author collection


Figure 2: A new author is added to the Author collection

And under the Blog collection, we see the new blog, too.

A new blog is added to the Blog collection


Figure 3: A new blog is added to the Blog collection

We can also query all the blogs that belong to the author Donald Mathew with authorID of 63e8580b1076c768fd1fe772 by running the following code:

const allUsers = await prisma.blog.findMany({
  where:{authorId:"63e8580b1076c768fd1fe772"}
});

console.dir(allUsers, { depth: null })
Enter fullscreen mode Exit fullscreen mode

Running the code will show the blogs that belong to the author as:

[
  {
    id: "63e8580b1076c768fd1fe773",
    title: "This is an interesting blog post",
    content: "Lots of really interesting stuff",
    authorId: "63e8580b1076c768fd1fe772"
  }
]
Enter fullscreen mode Exit fullscreen mode

And that's how we can use a Node.js library like Prisma to work with NoSQL and SQL databases. Prisma's documentation page gives further details about this process.

Choosing the Right Database for a Specific Use Case

Now that we've reviewed the differences between SQL vs. NoSQL databases let's look at some specific use cases and determine the database type that would be the most appropriate fit in each instance.

Relational Data

Relational data might include users' personal information, including their full name, date of birth, children, and siblings—the information that would logically be stored in a table. An SQL database would be a good choice for these cases because features like join and primary key help us quickly query the relational data we need. Although a NoSQL database gives us flexibility when storing data, and we can quickly scale the MongoDB database, it is not designed for dealing with relational data.

Flexible Data Structures

With data that has flexible data structures, such as user-generated data, a NoSQL database would be a better choice. For example, if we wanted to research how users interact with our application and which functionalities users use most often to achieve this. In that case, we should first store all user logs in a database and then deep dive into these logs. Again, given the flexibility of the data structures from multiple functionalities in our application, a NoSQL database like MongoDB would be a good fit.

Low Latency Applications

When working with applications that require a very low latency—say, ten or twenty milliseconds, as might be expected in a trading application or an online role-play game—a NoSQL database like Redis would best serve the use case. Redis is known for its speed of reading data. However, some parts of the application that do not require such a fast server response would still benefit from using a traditional SQL database, like storing user profiles, a case in which the benefit of an SQL database's support for relational data outweighs the speed benefits of NoSQL.

Applications Targeting Fraud Detection or Personalization

For organizations and their applications that are designed to detect fraud, like International Consortium of Investigative Journalists, or try to improve customer experience via personalization, as in the case of Tourism Media, a NoSQL graph database like Neo4j is a good match. In these kinds of use cases, the quantity of data we're dealing with is enormous, and the pattern we're searching for in the data is often complex.

Neo4j addresses both of these problems effectively. Its design as a graph database means that Neo4j has a query language that provides a nested queries mechanism, which is much simpler than using join in an SQL database. This helps us to find a pattern in our vast ocean of data with relative ease. Neo4j also offers excellent performance with lightspeed throughput when working with massive quantities of data. This makes Neo4j ideal for fraud detection and personalization applications.

Tired of Writing Boilerplate Code when Working with Databases?

Amplication is an application development platform that generates all necessary boilerplate code, leaving you free to focus on creating business value. In addition, Amplication has first-class support for SQL and NoSQL databases, so you can generate applications quickly using whichever is the best fit for your use case and start shipping code immediately.

Top comments (0)