Sidali Assoul

Posted on • Originally published at blog.spithacode.com

Mastering Cypher Query Language for Neo4j Graph NoSQL Databases

Introduction

Cypher, referred to in this guide as the Cypher Query Language (CQL), is a declarative language for querying graph databases such as Neo4j. Unlike traditional relational databases, graph databases excel at managing heavily connected data whose relationships are not fixed in advance by a rigid schema. CQL's pattern-based syntax makes it straightforward to create, read, update, and delete data stored in a graph. In this guide, we'll explore the features, constraints, terminology, and commands of CQL, along with practical examples.

Features of Cypher Query Language (CQL)

Suitable for Heavily Connected Data

One of the standout features of CQL is its suitability for heavily connected data. In relational databases, relationships are expressed through foreign keys and resolved with joins, which become cumbersome as they multiply; graph databases store the connections directly. CQL queries these relationships as patterns, making it a good fit for social networks, recommendation engines, and similar domains.

Multiple Labels for Nodes

In CQL, a node can be associated with multiple labels. This flexibility allows for better organization and categorization of data. For instance, a node representing a person can have labels such as Person, Employee, and Customer, each representing different aspects of the individual's identity.
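
As a minimal sketch of the idea above, the following creates one node carrying all three labels and then matches on two of them. The name "Alice" is illustrative and not part of the article's data.

```cypher
// Create a single node with three labels (the name is a made-up example)
CREATE (:Person:Employee:Customer {name: "Alice"});

// Match only nodes that carry both the Person and the Customer label
MATCH (p:Person:Customer)
RETURN p.name, labels(p);
```

The built-in labels() function returns the full list of labels on a node, which is handy for checking how a node has been categorized.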

Constraints of CQL

Fragmentation Limitations

While CQL is powerful, it does have some constraints. Fragmentation, that is, partitioning the graph across several machines, is only possible for certain domains. This means that, in some cases, data may need to be traversed in its entirety to retrieve a definitive answer.

Full Graph Traversal for Definitive Answers

For some queries, especially those involving complex relationships, the entire graph may need to be traversed to ensure that the returned data is accurate and complete. This can be resource-intensive and time-consuming, depending on the size and complexity of the graph.
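
One common way to keep a traversal from spreading across the whole graph is to bound its depth. The sketch below assumes a hypothetical KNOWS relationship between Person nodes (it does not appear elsewhere in this article) and limits the search to at most three hops:

```cypher
// Follow KNOWS relationships between 1 and 3 hops away from John.
// KNOWS is an assumed relationship type used only for illustration.
MATCH (a:Person {name: "John"})-[:KNOWS*1..3]->(b:Person)
RETURN DISTINCT b.name
LIMIT 25;
```

Whether a bounded search is acceptable depends on the question being asked; a query that needs a definitive answer over every path may still have to visit the full graph.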

Terminologies in CQL

Node

A node represents an entity in the graph. Nodes can have properties that store information about the entity, such as name, age, or any other relevant attribute.

Label

Labels allow for the grouping of nodes. They play a role similar to tables in SQL, although a node can carry several labels at once. For example, the label Person groups all nodes that represent people.

Relation

A relation (called a relationship in Cypher) is a directed, typed link stored between two nodes. It takes the place of foreign keys and join tables in SQL, so connected entities can be reached directly rather than through joins.

Attributes

Attributes are properties that a node or a relation can have. For instance, a Person node may have attributes such as name and age, while a LIKES relationship may have attributes like since.
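
To make the relationship case concrete, here is a small sketch that stores a since attribute on a LIKES relationship and filters on it. The names and the year are illustrative values.

```cypher
// Create two nodes and a LIKES relationship that carries its own attribute
CREATE (:Person {name: "Ann"})-[:LIKES {since: 2020}]->(:Food {name: "Sushi"});

// Relationship attributes can be filtered and returned like node attributes
MATCH (p:Person)-[l:LIKES]->(f:Food)
WHERE l.since >= 2020
RETURN p.name, f.name, l.since;
```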

Basic Commands in CQL

CREATE

The CREATE command is used to create nodes and relationships. This is fundamental for building the graph structure.

MATCH

The MATCH command is used to search for patterns in the graph. It is the cornerstone of querying in CQL, allowing you to retrieve nodes and relationships based on specified criteria.

Creating Nodes

Basic Node Creation

Creating nodes in CQL is straightforward. Use the CREATE command followed by the node details.

CREATE (:Person {name:"John", age:30})
CREATE (:Food {name:"Pizza"})

Creating Nodes with Properties

Nodes can be created with properties, which are key-value pairs that store information about the node.

CREATE (:Person {name:"Jane", age:25, occupation:"Engineer"})
CREATE (:Food {name:"Burger", calories:500})

Searching Nodes

Basic Node Search

The MATCH command allows you to search for nodes in the graph.

MATCH (p:Person) RETURN p

Advanced Search with WHERE Clause

For more specific searches, use the WHERE clause to filter nodes based on their properties.

MATCH (p:Person)
WHERE p.age > 20
RETURN p.name, p.age
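
Conditions in WHERE can be combined, and the result can be sorted and capped. The following sketch reuses the Person nodes from the earlier examples; the specific thresholds are arbitrary.

```cypher
// Combine two predicates, then sort by age and keep at most ten rows
MATCH (p:Person)
WHERE p.age >= 20 AND p.name STARTS WITH "J"
RETURN p.name, p.age
ORDER BY p.age DESC
LIMIT 10;
```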

Creating Relationships

Creating Relationships While Creating Nodes

You can create relationships between nodes as you create them.

CREATE (p:Person {name:"John", age:30})-[:LIKES]->(f:Food {name:"Pizza"})

Creating Relationships Between Existing Nodes

Relationships can also be created between existing nodes using the MATCH command.

MATCH (p:Person {name:"John"})
MATCH (f:Food {name:"Pizza"})
CREATE (p)-[r:LIKES]->(f)
RETURN r
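
Note that CREATE adds a new relationship every time the query runs, so running it twice yields two LIKES relationships between the same pair of nodes. If that is not what you want, one option is the MERGE clause, which creates the relationship only when it does not already exist. A minimal sketch:

```cypher
// MERGE matches an existing LIKES relationship or creates one if none is found
MATCH (p:Person {name: "John"})
MATCH (f:Food {name: "Pizza"})
MERGE (p)-[r:LIKES]->(f)
RETURN r;
```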

Modifying Nodes and Relationships

Adding Attributes

Attributes can be added to existing nodes using the SET command.

MATCH (p:Person {name:"John"})
SET p.occupation = "Developer"
RETURN p

Deleting Attributes

To delete an attribute, set its value to NULL. Cypher does not store null values, so the property is removed from the node.

MATCH (p:Person {name:"John"})
SET p.age = NULL
RETURN p
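
Cypher also has a dedicated REMOVE clause for this purpose, which some readers may find more explicit. A sketch of the same operation:

```cypher
// REMOVE deletes the age property from the matched node
MATCH (p:Person {name: "John"})
REMOVE p.age
RETURN p;
```

REMOVE can drop labels as well, for example REMOVE p:Employee.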

Modifying Attributes

Attributes can be modified by setting them to new values.

MATCH (p:Person {name:"John"})
SET p.age = 35
RETURN p

Using Aggregate Functions in CQL

COUNT

The COUNT function returns the number of nodes or relationships.

MATCH (n) RETURN count(n)

AVG

The AVG function calculates the average value of a numeric property.

MATCH (n) RETURN avg(n.age)

SUM

The SUM function calculates the total sum of a numeric property.

MATCH (n) RETURN sum(n.age)
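
Cypher has no GROUP BY clause: any non-aggregated expression in RETURN acts as a grouping key. The sketch below assumes Person nodes with the occupation and age properties used earlier, and computes a count and an average per occupation.

```cypher
// p.occupation is not aggregated, so it becomes the grouping key
MATCH (p:Person)
RETURN p.occupation AS occupation,
       count(*) AS people,
       avg(p.age) AS averageAge
ORDER BY people DESC;
```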

Advanced Queries in CQL

Number of Relations by Type

To get the count of each type of relationship in the graph, use the type function.

MATCH ()-[r]->() RETURN type(r), count(*)

Collecting Values into Lists

The COLLECT function gathers values into a list, one list per group. You can collect whole nodes, as in this example, or the values of a single property.

MATCH (p:Product)-[:BELONGS_TO]->(o:Order)
RETURN id(o) as orderId, collect(p)
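
To collect the values of one property rather than whole nodes, pass the property to collect(). This sketch assumes Product nodes have a name property and Order nodes an id property, as in the online store example later in this article.

```cypher
// One row per order, with the product names gathered into a list
MATCH (p:Product)-[:BELONGS_TO]->(o:Order)
RETURN o.id AS orderId, collect(p.name) AS productNames;
```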

Database Maintenance in CQL

Deleting Nodes and Relationships

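To delete all nodes and relationships, use DETACH DELETE, which removes each node together with its relationships. Matching only the pattern (a)-[r]->(b) would leave behind any node that has no relationships.

MATCH (n) DETACH DELETE n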
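
Deletion can also be targeted. The sketch below, which reuses the John and Pizza nodes from earlier, first removes a single relationship and then removes one node along with whatever relationships are still attached to it. A plain DELETE on a node fails while the node still has relationships, which is why DETACH DELETE is used in the second statement.

```cypher
// Delete only the LIKES relationship; both nodes stay in the graph
MATCH (:Person {name: "John"})-[r:LIKES]->(:Food {name: "Pizza"})
DELETE r;

// Delete one node and any relationships still attached to it
MATCH (f:Food {name: "Pizza"})
DETACH DELETE f;
```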

Visualizing Database Schema

Visualize the database schema to understand the structure of your graph.

CALL db.schema.visualization()

Practical Tricks and Tips

Finding Specific Nodes

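Here are three ways to find a node representing a person named Lana Wachowski. The first two are exact matches and are equivalent; the third uses a regular expression and also matches any name that merely contains the string.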

// Solution 1
MATCH (p:Person {name: "Lana Wachowski"})
RETURN p

// Solution 2
MATCH (p:Person)
WHERE p.name = "Lana Wachowski"
RETURN p

// Solution 3
MATCH (p:Person)
WHERE p.name =~ ".*Lana Wachowski.*"
RETURN p

Complex Query Examples

Display the name and role of people born after 1960 who acted in movies released in the 1980s.

MATCH (p:Person)-[a:ACTED_IN]->(m:Movie)
WHERE p.born > 1960 AND m.released >= 1980 AND m.released < 1990
RETURN p.name, a.roles

Add the label Actor to people who have acted in at least one movie.

MATCH (p:Person)-[:ACTED_IN]->(:Movie)
WHERE NOT (p:Actor)
SET p:Actor

Application Examples

Real-World Use Cases

Consider a database for an online store where you need to manage products, clients, orders, and shipping addresses. Here's how you might model this in CQL.

Example Queries

Let's create some example nodes and relationships for an online store scenario:

CREATE (p1:Product {id: 1, name: "Laptop", price: 1000})
CREATE (p2:Product {id: 2, name: "Phone", price: 500})
CREATE (c:Client {id: 1, name: "John Doe"})
CREATE (o:Order {id: 1, date: "2023-06-01"})
CREATE (adr:Address {id: 1, street: "123 Main St", city: "Anytown", country: "USA"})

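Now, let's create the relationships between these nodes. These clauses must run in the same statement as the CREATE clauses above so that the variables p1, p2, c, o, and adr are still bound; run separately, each unbound variable would create a new, empty node: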

CREATE (p1)-[:BELONGS_TO]->(o)
CREATE (p2)-[:BELONGS_TO]->(o)
CREATE (c)-[:MADE]->(o)
CREATE (o)-[:SHIPPED_TO]->(adr)
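
If the nodes already exist from an earlier statement, the variables have to be bound again with MATCH before the relationships are created. A sketch that looks the nodes up by the id properties assigned above:

```cypher
// Re-bind the existing nodes by their id property, then connect them
MATCH (p1:Product {id: 1}), (p2:Product {id: 2}),
      (c:Client {id: 1}), (o:Order {id: 1}), (adr:Address {id: 1})
CREATE (p1)-[:BELONGS_TO]->(o),
       (p2)-[:BELONGS_TO]->(o),
       (c)-[:MADE]->(o),
       (o)-[:SHIPPED_TO]->(adr);
```

This assumes each id value identifies exactly one node per label; with duplicates, the CREATE would run once for every matching combination.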

Querying Products Ordered in Each Order

To list the products in each order, collect the Product nodes per order. Each returned node carries its properties, including the unit price; this sample model does not store a quantity.

MATCH (p:Product)-[:BELONGS_TO]->(o:Order)
RETURN id(o) as orderId, collect(p)
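
A variation on this query returns the product names and an order total instead of whole nodes. It relies on the name and price properties and the Order id property from the sample data above.

```cypher
// One row per order: product names as a list, prices summed into a total
MATCH (p:Product)-[:BELONGS_TO]->(o:Order)
RETURN o.id AS orderId,
       collect(p.name) AS products,
       sum(p.price) AS total;
```

With the sample data created once, the single order should show the Laptop and the Phone with a total of 1500.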

Querying Clients and Shipping Addresses

To determine which client made each order and where each order was shipped, use this query:

MATCH (c:Client)-[:MADE]->(o:Order)-[:SHIPPED_TO]->(adr:Address)
RETURN c.name as client, id(o) as orderId, adr.street, adr.city, adr.country

FAQ

What is Cypher Query Language (CQL)?

Cypher Query Language (CQL) is a powerful query language designed specifically for querying and updating graph databases. It allows you to interact with data in a way that emphasizes the relationships between data points.

How does CQL differ from SQL?

While SQL is designed for querying relational databases, CQL is designed for graph databases. This means that CQL excels at handling complex, highly connected data, whereas SQL is better suited for tabular data structures.

Can I use CQL with any database?

CQL is primarily used with Neo4j, a popular graph database management system. Through the openCypher project, some other graph databases support Cypher as well, while others have their own query languages with similar capabilities.

What are the benefits of using CQL?

CQL allows for intuitive querying of graph databases, making it easier to manage and analyze data with complex relationships. It supports a rich set of commands for creating, updating, and deleting nodes and relationships, as well as powerful query capabilities.

Is CQL difficult to learn?

CQL is designed to be user-friendly and intuitive. If you are familiar with SQL, you will find many similarities in CQL. The main difference lies in how data relationships are handled.

How can I optimize my CQL queries?

Optimizing CQL queries involves understanding your graph's structure and using efficient query patterns. Indexing frequently searched properties and avoiding unnecessary full graph traversals can significantly improve performance.
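
As a rough illustration, the sketch below creates an index on a frequently searched property and then profiles a lookup that can use it. The index name person_name is an arbitrary choice, and the CREATE INDEX ... FOR ... ON syntax shown here assumes a recent Neo4j version (4.x or later).

```cypher
// Index the name property of Person nodes (the index name is illustrative)
CREATE INDEX person_name IF NOT EXISTS
FOR (p:Person) ON (p.name);

// PROFILE runs the query and shows the execution plan with row counts
PROFILE
MATCH (p:Person {name: "John"})
RETURN p;
```

In the resulting plan, an index seek in place of a label scan suggests the index is being used for the lookup.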

Conclusion

Cypher Query Language (CQL) is a robust tool for managing graph databases, offering powerful capabilities for querying and updating complex, highly connected data. By mastering CQL, you can leverage the full potential of graph databases, making it easier to handle intricate data relationships and perform sophisticated analyses.
