Memgraph for Memgraph

Posted on Jan 24, 2023 • Originally published at memgraph.com

Graph Database Query Languages You Should Try

#memgraph #cypher #graphql #gremlin

Query languages used for graph data management are called graph query languages (GQLs). They define how data that has been modeled as a graph is retrieved and manipulated. Because clients talk to a database through its query language, the language determines which operations those clients can perform.

Below are five popular graph database query languages. Let’s dive into the details of each one, along with its pros and cons.

GraphQL

GraphQL is a query language for APIs and a modern alternative to REST-based architecture. Rather than querying a database directly, client-side applications send queries to a GraphQL server, whose schema specifies which data the backend exposes through the API. It also offers flexibility and improves client-server interactions by letting clients request exactly the data they need.
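
To make the idea of precise data requests concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not a GraphQL implementation: the `select_fields` helper, the `user` record, and the nested-dict `selection` format are all hypothetical names invented for this example. It only mimics how a GraphQL-style selection returns the requested fields and nothing else.

```python
# Toy illustration only: this is NOT a GraphQL server or client.
# It mimics how a GraphQL-style selection returns just the requested fields.

def select_fields(data, selection):
    """Return only the fields named in `selection` (a nested dict).

    A value of None means "take the field as-is"; a nested dict means
    "descend into this object (or list of objects) and select again".
    """
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        if sub_selection is None:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [select_fields(item, sub_selection) for item in value]
        else:
            result[field] = select_fields(value, sub_selection)
    return result


# Hypothetical backend record with more fields than the client needs.
user = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "posts": [
        {"id": 10, "title": "Graphs 101", "body": "..."},
        {"id": 11, "title": "Traversals", "body": "..."},
    ],
}

# Roughly corresponds to the GraphQL query: { name posts { title } }
selection = {"name": None, "posts": {"title": None}}

print(select_fields(user, selection))
# {'name': 'Ada', 'posts': [{'title': 'Graphs 101'}, {'title': 'Traversals'}]}
```

The client names the fields it wants, including nested ones, and receives a response of the same shape in a single round trip.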

Pros of GraphQL

GraphQL is best for complex microservices, and you can integrate multiple systems behind its API. Its server fetches the data from the existing systems and packages it up in the GraphQL response format. Thus, it is of great benefit for third-party APIs and legacy infrastructures that are difficult to handle and maintain due to their enormous size. Further, GraphQL handles the communication between multiple microservices by merging them into one GraphQL schema.
A REST response often contains either more data than the client needs or not enough, which creates the need for another request. GraphQL solves the problem by fetching exactly the needed data in a single request.
GraphQL keeps the documentation in sync with the API changes. As its API is tightly coupled with code, the documentation changes when there is a change in queries, fields, or types.

Cons of GraphQL

Sometimes, GraphQL queries encounter performance issues when clients ask for too many nested fields at once. Therefore, it might be worth using a REST API for complex queries.
File uploading is not part of the GraphQL specification, which has no built-in notion of files.
It is challenging to implement simple caching in GraphQL because each query can be different even if it operates on the same entity.

Gremlin

Gremlin is the graph traversal language of Apache TinkerPop, adopted by many graph database solutions. As a path-oriented language, it is used for:

Retrieving the data from the graph
Modifying the graph data
Expressing complex mutation operations and graph traversals

Developers can write Gremlin queries in many programming languages, such as Python, Java, JavaScript, Scala, and Groovy.

Pros of Gremlin

Gremlin allows the users to do imperative (procedural) and declarative (descriptive) traversals in a graph.
The Gremlin traversals can be written in any programming language that supports function composition and function nesting.
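
The point about function composition can be sketched in plain Python. The code below is a toy, in-memory imitation of step chaining and is not the gremlinpython API: the `Traversal` class, the `V` helper, and the `graph` dictionary are hypothetical names made up for this example. The chain at the end is modeled on the Gremlin traversal `g.V().has('name', 'marko').out('knows').values('name')`.

```python
# Toy sketch: a hypothetical in-memory traversal, NOT the gremlinpython API.
# Each step returns a new Traversal, so steps compose by method chaining.

class Traversal:
    def __init__(self, graph, current):
        self.graph = graph
        self.current = list(current)  # vertex ids the traverser is "standing on"

    def has(self, key, value):
        """Keep only vertices whose property `key` equals `value`."""
        kept = [v for v in self.current
                if self.graph["vertices"][v].get(key) == value]
        return Traversal(self.graph, kept)

    def out(self, label):
        """Move along outgoing edges with the given label."""
        targets = [dst
                   for v in self.current
                   for (src, edge_label, dst) in self.graph["edges"]
                   if src == v and edge_label == label]
        return Traversal(self.graph, targets)

    def values(self, key):
        """Terminal step: read one property from every current vertex."""
        return [self.graph["vertices"][v][key] for v in self.current]


def V(graph):
    """Start a traversal at every vertex of the graph."""
    return Traversal(graph, graph["vertices"].keys())


# Hypothetical sample data.
graph = {
    "vertices": {
        1: {"name": "marko"},
        2: {"name": "vadas"},
        3: {"name": "josh"},
    },
    "edges": [(1, "knows", 2), (1, "knows", 3)],
}

names = V(graph).has("name", "marko").out("knows").values("name")
print(names)  # ['vadas', 'josh']
```

Because every step is an ordinary method call, the same traversal style can be hosted in any language that supports chaining or nesting of functions.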

Cons of Gremlin

It is a low-level graph traversal language that is not easy to read.
Complex queries that require a lot of pattern matching can be difficult to write and do not perform well in Gremlin.

Cypher

Cypher is a graph database query language that allows you to retrieve data from a graph. It is often considered one of the easiest graph query languages to learn because it is intuitive and borrows familiar keywords from languages such as SQL.

Cypher is Neo4j’s query language, and it is unique because it is heavily based on patterns and provides a visual way to match relationships and patterns. You can use Object Graph Mappers (OGMs) to map the relationships and nodes in graphs to references and objects in a domain model.

The two well-known OGMs are:

Neo4j-OGM - It uses Cypher query statements alongside the Java driver and maps the existing domain objects to Neo4j. Neo4j-OGM supports features like fast class metadata scanning and optimized management of data loading.
GQLAlchemy - This Object Graph Mapper is an open-source Python library that acts as a link between Python objects and graph database objects. It provides a developer-friendly workflow: you write object-oriented code, and GQLAlchemy automatically translates it into Cypher queries.

Pros of Cypher

Cypher query language is compact and easy to learn. It helps users write intuitive and expressive queries for the fast retrieval of results.
It is data-rich, reliable, secure, and well-suited for application development.
It is visual and logical as it matches the patterns of nodes and relationships in the graph using the ASCII-Art syntax.
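
As a small illustration of the ASCII-art syntax, the Python sketch below assembles a Cypher pattern from its parts. The `node` and `relationship` helpers, the `Person` label, the `KNOWS` relationship type, and the `$name` parameter are assumptions made up for this example. The sketch only builds the query text and does not connect to a database.

```python
# Sketch only: builds a Cypher query string, it does not execute it.
# Helper names, labels, and the parameter are hypothetical.

def node(variable, label):
    """Render a node pattern such as (a:Person); round brackets look like a circle."""
    return f"({variable}:{label})"


def relationship(rel_type):
    """Render a directed relationship such as -[:KNOWS]->; it looks like an arrow."""
    return f"-[:{rel_type}]->"


pattern = node("a", "Person") + relationship("KNOWS") + node("b", "Person")
query = f"MATCH {pattern} WHERE a.name = $name RETURN b.name"
params = {"name": "Alice"}  # would be passed to a driver alongside the query

print(query)
# MATCH (a:Person)-[:KNOWS]->(b:Person) WHERE a.name = $name RETURN b.name
```

The pattern reads almost like a drawing of the graph: two nodes in round brackets joined by an arrow that carries the relationship type.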

Cons of Cypher

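Older versions of Neo4j Cypher (before Neo4j 3.4) had no native date data type, so dates had to be stored as strings or numbers. Newer versions provide built-in temporal types.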
It doesn’t have a good performance advantage for a simple data model that doesn’t require a lot of joining or aggregation.
It doesn’t scale writes very well in case of high write loads.

SPARQL

SPARQL is an acronym for “SPARQL Protocol and RDF Query Language.” It is a query language that enables users to query information from databases that can be mapped to RDF (Resource Description Framework). SPARQL has four types of queries, namely ASK, SELECT, CONSTRUCT, and DESCRIBE. The data in the SPARQL query is internally expressed as triples consisting of subject, predicate, and object.
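
The triple model can be sketched with a few lines of Python. This is a toy matcher for a single triple pattern, not a SPARQL engine: the `match` function, the sample `triples`, and the shorthand terms such as `alice` and `knows` (real RDF terms would be IRIs) are assumptions made up for this example. It imitates the query `SELECT ?person WHERE { ?person <knows> <bob> }`.

```python
# Toy sketch: matches one triple pattern against in-memory triples.
# NOT a SPARQL engine; terms are plain strings instead of IRIs.

triples = [
    ("alice", "knows", "bob"),
    ("carol", "knows", "bob"),
    ("alice", "likes", "graphs"),
]


def match(pattern, triples):
    """Return one binding dict per triple that fits the pattern.

    Terms that start with '?' are variables; all other terms must match exactly.
    A variable that appears twice must bind to the same value both times.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.get(term, value) != value:
                    break  # same variable, conflicting values
                binding[term] = value
            elif term != value:
                break  # constant term does not match
        else:
            results.append(binding)
    return results


# SELECT-style use: who knows bob?
print(match(("?person", "knows", "bob"), triples))
# [{'?person': 'alice'}, {'?person': 'carol'}]

# ASK-style use: does any matching triple exist?
print(bool(match(("carol", "likes", "?thing"), triples)))
# False
```

A SELECT query returns the variable bindings, while an ASK query only reports whether at least one match exists.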

Pros of SPARQL

SPARQL can federate queries across different repositories.
It can access RDF as well as relational data.
SPARQL can integrate data from different sources, provided that the data is in intersecting domains.

Cons of SPARQL

SPARQL is not easy to read and understand.
It does not allow unbounded recursive queries.
SPARQL is purpose-built for the RDF data, which not every database conforms to.

AQL

AQL is an acronym for “ArangoDB Query Language.” It lets users retrieve and modify data stored in ArangoDB, a scalable multi-model database with graph support. AQL is human-readable because it uses keywords from the English language, and it is declarative, which means the query describes what result should be achieved but not how it should be achieved.
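
The declarative style can be seen by placing an AQL query next to an imperative equivalent. In the Python sketch below, the `users` collection, its `name` and `age` attributes, the `@minAge` bind parameter, and the `run_equivalent` helper are hypothetical and invented for this example. The AQL text is only held in a string and is not sent to a database.

```python
# Sketch only: the AQL query is shown as text and is not executed.
# Collection name, attributes, and the bind parameter are hypothetical.

aql_query = """
FOR u IN users
  FILTER u.age >= @minAge
  SORT u.name
  RETURN u.name
"""
bind_vars = {"minAge": 30}

# In-memory stand-in for the hypothetical `users` collection.
users = [
    {"name": "Carol", "age": 41},
    {"name": "Alice", "age": 35},
    {"name": "Bob", "age": 22},
]


def run_equivalent(users, min_age):
    """Spell out in Python what the AQL query above describes declaratively."""
    return sorted(u["name"] for u in users if u["age"] >= min_age)


print(run_equivalent(users, bind_vars["minAge"]))
# ['Alice', 'Carol']
```

The AQL version states the filter, the order, and the returned value, and leaves the choice of execution strategy to the database.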

Pros of AQL

AQL supports various functions, such as ANALYZER(), BOOST(), and EXISTS(), to allow complex computations.
It helps the developers develop robust applications and map the data natively to the database.
It is a flexible language that helps the architects scale and adapt their architectures to changing needs with much less effort.

Cons of AQL

It does not support data definition operations like creating and dropping collections or databases.
It cannot have more than four thousand execution nodes in its initial execution plan.
It cannot use more than a thousand result registers.

Conclusion

We hope you now have an idea of how the popular graph database query languages are used. Each one has its upsides and downsides, and the right choice for extracting data from graphs depends on the available data and the operations the language supports. No single language is the best overall. They all have their use cases in various areas, as shown below:

GraphQL - It is best for complex systems and microservices.
Gremlin - It allows the users to do procedural and descriptive graph traversals.
Cypher - It is well-suited for data analytics and application development.
SPARQL - It is best for integrating data from various intersecting domains.
AQL - It helps the developers and the architects develop powerful applications and scaling architectures, respectively.
