I consider myself a full-stack developer, but I lean toward the front-end. That is probably because I am a visual learner (i.e. I process information more easily if it is presented in a visual format). My assumption is that many front-end developers are also visual learners. When I learn a new concept, if I can see a picture or a video of that concept, then I am able to process that information faster and retain it longer.
That was the case when I was first introduced to graph databases. At first, the concept of storing data in nodes and then creating relationships with edges was strange because it was so foreign to the relational and document database concepts that I was used to. But after I used a data explorer to see how my data was organized into a graph, things clicked.
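To make the node-and-edge idea concrete, here is a minimal sketch in plain Python. It is not the API of any particular graph database; the names `nodes`, `edges`, and `neighbors` are made up purely for illustration.

```python
# Nodes hold the data; edges are (source, label, target) triples.
nodes = {
    "alice": {"type": "Person", "name": "Alice"},
    "bob": {"type": "Person", "name": "Bob"},
    "acme": {"type": "Company", "name": "Acme"},
}

edges = [
    ("alice", "knows", "bob"),
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
]


def neighbors(node_id, label):
    """Return the ids of nodes reached from node_id over edges with this label."""
    return [target for source, edge_label, target in edges
            if source == node_id and edge_label == label]


# Follow two hops: who does Alice know, and where do they work?
for friend in neighbors("alice", "knows"):
    for company in neighbors(friend, "works_at"):
        print(nodes[friend]["name"], "works at", nodes[company]["name"])
```

Answering a question here means following labeled edges from one node to the next, which is the same picture a data explorer draws on screen.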
If you would like to understand a bit more about what graph databases are, then you can read this for a nice overview: What is a Graph Database?
To be quite honest, graph databases are so intuitive to me now that it is difficult for me to switch back to relational and document databases because I can’t easily visualize the data and their relationships.
So when I say that graph databases are perfect for front-end developers, I am saying that their data models work really well for people who process information visually.
In this article I will introduce you to a graph database called TerminusDB. I stumbled upon TerminusDB a while ago when I was looking for a graph database that combined the graph features of Neo4j with the document storage capabilities of MongoDB. Since TerminusDB is still relatively new, I wanted to introduce it here for those of you who are evaluating graph databases (or any database) for a new project.
Why would I choose a graph database over other options?
Most graph databases can do almost everything that a relational or document database can do. TerminusDB can do everything that a relational or document database can do and much more.
The following are lists of pros and cons for relational, document, and graph databases. These lists are by no means comprehensive, but they are intended to illustrate some of the capabilities of these databases. (Apologies for any bias that I have shown below.)
Relational Databases
Pros:
- Great for structured data. If your data fits nicely into a table, then relational databases are great.
- SQL. You can use any relational database and the query language will be mostly the same.
- Data integrity. Data is usually very accurate, complete, and consistent.
- Little data duplication. Data tends to be well-organized with very little, if any, data duplication.
- Versatility. Can be used in a wide variety of applications.
- ACID transactions. ACID is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps.
Cons:
- Relationships are not always easily represented. This is one of the ironic things about relational databases. Yes, relationships can be represented (through foreign keys and join tables), but it's not always easy or intuitive to do so, and there is no first-class way to name or describe the relationships among entities.
- Queries can get a bit complex at times. Many queries in SQL aren't too complex, but sometimes you have to run queries that can feel a bit overwhelming. (I might feel this way just because graph queries are so much more intuitive to me.)
- Restrictive data model. It can be difficult sometimes to get relational databases to do what you want.
- Scaling can be challenging. Many other types of databases have been created to solve the scaling problem. However, this seems to have been addressed with newer, cloud-based relational databases.
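As a rough illustration of the first two cons, here is a sketch using Python's built-in `sqlite3` module. The `person` and `friendship` tables are a made-up schema, not taken from any real project. Even a simple "friends of friends" question needs a join table and three joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- The relationship lives in a separate join table.
    CREATE TABLE friendship (
        person_id INTEGER NOT NULL REFERENCES person(id),
        friend_id INTEGER NOT NULL REFERENCES person(id)
    );
    INSERT INTO person (id, name) VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO friendship (person_id, friend_id) VALUES (1, 2), (2, 3);
""")

# Friends of Alice's friends: hop through the join table twice.
friends_of_friends = conn.execute("""
    SELECT fof.name
    FROM person AS p
    JOIN friendship AS f1 ON f1.person_id = p.id
    JOIN friendship AS f2 ON f2.person_id = f1.friend_id
    JOIN person AS fof ON fof.id = f2.friend_id
    WHERE p.name = ?
""", ("Alice",)).fetchall()

print(friends_of_friends)
```

Each additional hop adds another join, which is where SQL queries over relationships start to feel heavy.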
Document Databases
Pros:
- Rapid application development. Document databases don't typically use a schema, so the data model is flexible. You can change data without needing to restructure or change your entire database.
- High performance. Document databases were designed to handle queries very quickly.
- Flexibility. You can do a lot of things easily that would be very difficult or even impossible to do with a relational database.
- Scalability. You can store huge amounts of data without worrying whether or not your database can handle it.
Cons:
- No schema enforcement. What is a pro is also a con. Your data can become a jumbled mess pretty easily.
- No standardized query language. Raise your hand if you like learning a new query language when all you want to do is build an app? How about context switching if you have to move between two different types of databases?
- Lack of data consistency. At large scales, some document databases struggle to maintain consistency between shards.
Graph Databases
Pros:
- Graph Query Language (GQL - not to be confused with GraphQL). GQL is an upcoming ISO standard language for querying property graphs. To put into context how significant that is, SQL is the only other database query language that has been standardized by ISO. Not even MongoDB, Redis, or DynamoDB, as popular as they are, has an ISO-standardized query language. Currently, many graph database vendors have their own unique query language. However, with GQL, graph databases will be able to share a common query language, which will make adoption of graph databases much easier for developers.
- Data is organized in the same way it is found in nature. We rarely find data in nature that is organized in nice and neat rows and columns. In nature, data is usually organized as entities (e.g. people, places, companies, animals, movies, books) and the relationships among those entities.
- Relationships. Relationships among entities are represented very clearly and can easily be displayed in data visualizations.
- High performance. Queries for data can traverse a graph and return your data very quickly.
Cons:
- Specialty applications. Some graph databases can be used as your primary database, but others can only be used for special use-cases.
- Scalability. As with relational databases, this has been a major problem for graph databases. However, many newer graph databases have been shown to scale well while maintaining data consistency.
- No schema. Many graph databases don't have schemas, which can be nice for rapid application development, but that can become a big problem later on when your data doesn't follow any guidelines.
- Compound data types are not stored easily. Each node can store simple data types (e.g. strings, numbers, booleans), but when you want to store a compound data structure, like a JavaScript object or an array, you either have to create a new node to hold the nested data (which doesn't always make sense - think of creating a separate record in a relational database for a person's address), or you have to convert the data structure to a string before you can store it. If you convert the data structure to a string, then the database can no longer see the nested fields, so you will often run into problems when performing a query for people from a specific city, for example.
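The last point is easier to see with a small example. This is a sketch in plain Python, not the behavior of any specific graph database: the `people` list stands in for nodes whose properties may only hold simple values, so the nested address has been serialized to a string. All names and values are invented.

```python
import json

# Hypothetical nodes: the compound "address" had to be stored as a string.
people = [
    {"name": "Alice", "address": json.dumps({"city": "Paris", "street": "1 Rue A"})},
    {"name": "Bob", "address": json.dumps({"city": "Lyon", "street": "2 Rue de Paris"})},
]

# A naive text search on the string also matches Bob, whose street contains "Paris".
naive = [p["name"] for p in people if "Paris" in p["address"]]

# Getting the right answer means parsing every string back into an object first.
parsed = [p["name"] for p in people if json.loads(p["address"])["city"] == "Paris"]

print(naive)
print(parsed)
```

The naive search returns both people, while the parsed version returns only Alice, at the cost of deserializing every record in application code instead of letting the database do the filtering.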
TerminusDB
Pros:
TerminusDB has most of the pros of other graph databases plus a couple of other features:
- Versatility. TerminusDB can be used as a primary database in a wide variety of applications.
- Flexibility. You can do a lot of things easily that would be very difficult or even impossible to do with a relational or a document database.
- An enforced schema. This prevents the structure of your data from becoming a jumbled mess over time. I once asked a web developer friend of mine what his experience was like when he worked for a company that used a document database. He said that when he got there, since the database had no way of enforcing the schema, other developers would just create new documents where they didn't belong and nest data inside of documents when they shouldn't have. My friend said it was pretty messy, to say the least.
- Document-based nodes. This is one of my favorite features of TerminusDB! The nodes in TerminusDB are essentially JSON documents and can store simple and compound data types easily.
- ACID transactions. TerminusDB is ACID compliant. That means that you can use TerminusDB to create apps where ACID transactions are critical, like banking apps.
- Data integrity. Due to the schema enforcement, data is accurate, complete, and consistent.
- Little data duplication. Since the data for each entity is stored in its own node, the database can be very well-organized with no data duplication.
- GraphQL integration. While we wait for GQL to be finalized, GraphQL can be used right now to access TerminusDB. Having a standardized query language that is already well-known makes it easier to get started with TerminusDB.
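To show what "document-based nodes with an enforced schema" means in practice, here is a minimal sketch in plain Python. It does not use TerminusDB's schema syntax or client API; `PERSON_SCHEMA`, `validate`, and `insert` are hypothetical names that only illustrate the idea of rejecting a badly shaped document before it is stored.

```python
# Hypothetical schema: field name -> expected Python type (or a nested schema).
PERSON_SCHEMA = {
    "name": str,
    "age": int,
    "address": {"street": str, "city": str},
}


def validate(document, schema):
    """Raise ValueError unless document has exactly the fields and types in schema."""
    if not isinstance(document, dict) or set(document) != set(schema):
        raise ValueError(f"expected fields {sorted(schema)}")
    for field, expected in schema.items():
        value = document[field]
        if isinstance(expected, dict):
            validate(value, expected)  # nested (compound) value
        elif not isinstance(value, expected):
            raise ValueError(f"{field!r} should be {expected.__name__}")


database = []


def insert(document):
    validate(document, PERSON_SCHEMA)  # reject before anything is stored
    database.append(document)


# A well-formed document, including a nested address, is accepted.
insert({"name": "Alice", "age": 30,
        "address": {"street": "1 Main St", "city": "Springfield"}})

# A document with a missing field and an unexpected one is rejected.
try:
    insert({"name": "Bob", "age": "unknown", "nickname": "Bobby"})
except ValueError as error:
    print("rejected:", error)
```

The nested `address` is stored as part of the document rather than as a string or a separate record, and the stray `nickname` field never makes it into the database.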
Cons:
- TerminusDB is newer than some other graph databases, so you may run into issues that are common with small software communities (e.g. lack of information, lack of people to help you).
- There is no way to store data about relationships, like you can with Neo4j. Relationships are represented through properties, so you have some information about the relationship due to the property name, but not much more than that.
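The difference in that last point can be sketched with plain Python dictionaries. These are illustrative shapes only, not the actual storage format of Neo4j or TerminusDB, and every field name here is made up.

```python
# Property-graph style: the edge itself can carry data about the relationship.
edge_with_properties = {
    "from": "Person/alice",
    "label": "works_at",
    "to": "Company/acme",
    "properties": {"since": 2019, "role": "Engineer"},
}

# Document style: the relationship is just a property holding a reference.
# The property name says what the link is, but there is nowhere to put "since".
alice = {"id": "Person/alice", "name": "Alice", "works_at": "Company/acme"}

# A general modeling workaround (an assumption, not something this article
# covers): turn the relationship into its own document.
employment = {
    "id": "Employment/1",
    "person": "Person/alice",
    "company": "Company/acme",
    "since": 2019,
    "role": "Engineer",
}

print(edge_with_properties["properties"]["since"])
print(alice["works_at"])
print(employment["since"])
```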
A couple of things before we continue
I want to share this bit of information that tripped me up about graph databases when I was first introduced to them:
When working with graph databases, the term "graph" is often used instead of "database" or "data". For example, when creating a new database, you might read something like this: "Now it’s time to create our graph." In this case, graph means database. Or when designing the schema for your graph database, you might read something like this: "When designing your schema, you will think in terms of the graph that matters for your app." In this case, graph means data.
Just keep that in mind when working with graph databases.
Onward
In Part 2, we will get started with TerminusDB.