Today we are going to learn how to make Hybrid Queries using Apache-Age's extension for PostgreSQl.
If you don't already have PostgreSQl or AGE installed you can visit Apache-Age Setup and PostgreSQL Installation site to get more information on how you can get your own instance of PostgreSQL and the AGE extension.
One of the most powerful features of AGE is the ability to perform hybrid queries, which allow users to combine the power of graph queries with traditional SQL queries. In this blog post, we will explore the concept of hybrid queries and how they can be used to improve data analysis.
What are Hybrid Queries?
A hybrid query is a powerful feature of AGE for PostgreSQL that allows users to perform both graph and SQL queries on their data. With hybrid queries, users can take advantage of the strengths of both types of queries to gain deeper insights into their data.
Graph queries are designed to work with inter-connected data, such as social networks, recommendation engines, and knowledge graphs. They allow users to traverse the relationships between nodes in the graph and extract meaningful information. SQL queries, on the other hand, are ideal for working with structured data, such as tables and columns, and performing operations like aggregation, filtering, and sorting.
How to perform Hybrid Queries using AGE in PostgreSQL
To perform hybrid queries in AGE, you need to install and configure the extension. Once you've installed the extension and loaded AGE, you can create a graph in PostgreSQL using the AGE syntax. For example, the following command creates a graph called "my_graph":
demo=# SELECT * FROM ag_catalog.create_graph('my_graph');
NOTICE: graph "my_graph" has been created
Once you have created a graph, you can add vertices and edges into it. For example, the following command adds a node with the label "Person" and the property "name" set to "Daniel" and other properties to the graph:
demo=# SELECT * FROM cypher('my_graph', $$
CREATE (:Person {id: 123, name: 'Daniel', title: 'Software Engineer', city:'Lagos'}) $$)
as (n agtype);
n
---
(0 rows)
We can add another node:
demo=# SELECT * FROM cypher('my_graph', $$
CREATE (:Person {id: 124, name: "Emmanuel", title: "Developer", city: "Sydney"}) $$)
as (n agtype);
n
---
(0 rows)
And now we can add our edge by creating a relationship between 'Daniel' and 'Emmanuel':
demo=# SELECT * FROM cypher('my_graph', $$ MATCH (a:Person), (b:Person)
demo$# WHERE a.name = 'Daniel' AND b.name = 'Emmanuel'
demo$# CREATE (a)-[e:Friends]->(b)
demo$# RETURN e $$) as (e agtype);
We can now check to see the relationship between 'Daniel' and 'Emmanuel' using:
demo=# SELECT * FROM cypher('my_graph', $$ MATCH
(a {name: 'Daniel'})-[r]-(b {name: 'Emmanuel'})
RETURN type(r) $$) as (type agtype);
type
-----------
"Friends"
(1 row)
After creating our graph and inserting our vertices and edges, let's create and view our table in SQL:
demo=# CREATE TABLE Persons (
id INT,
name VARCHAR,
city VARCHAR,
demo(# hired_year INT);
CREATE TABLE
demo=# SELECT * FROM Persons;
id | name | city | hired_year
----+------+------+------------
And let's INSERT our VALUES:
demo=# INSERT INTO Persons (id, name, city, hired_year)
VALUES (123, '"Daniel"', 'Lagos', 2023);
INSERT 0 1
demo=# INSERT INTO Persons (id, name, city, hired_year)
VALUES (124, '"Emmanuel"', 'Sydney', 2018);
INSERT 0 1
When we view our Table now:
demo=# SELECT * FROM Persons;
id | name | city | hired_year
-----+------------+--------+------------
123 | "Daniel" | Lagos | 2023
124 | "Emmanuel" | Sydney | 2018
Now Let's Perform Hybrid Queries
To run a hybrid query, we first need to define the graph pattern we want to match using the AGE syntax.
For example, the inner cypher query searches for the name 'Daniel' in 'my_graph' in the Label :Person and then returns it to the outer query, which SELECTS and returns the id, city and hired_year of the person whose name matches the inner (cypher) query which in this case is 'Daniel':
demo=# SELECT p.id, p.city, p.hired_year FROM Persons AS p
demo-# WHERE p.name in (SELECT name from cypher('my_graph',
demo(# $$ MATCH (v:Person) WHERE v.name='Daniel'
demo$# RETURN v.name $$) as (name VARCHAR(25)));
id | city | hired_year
-----+-------+------------
123 | Lagos | 2023
(1 row)
The following is another example of Hybrid query, the inner cypher query MATCHES and returns all id(s) in 'my_graph' in the Label :Person to the outer SQL query which returns the names of the owners of the id(s) from the table Persons:
demo=# SELECT p.name FROM Persons AS p WHERE p.id in (
demo(# SELECT id FROM cypher('my_graph', $$
demo$# MATCH (v:Person) RETURN v.id $$)
demo(# AS (id int));
name
------------
"Daniel"
"Emmanuel"
(2 rows)
Benefits of Hybrid Queries
Hybrid queries offer several benefits over traditional SQL and graph queries.
Firstly, they allow users to work with complex, interconnected data sets in a more efficient and effective way. By combining the strengths of graph and SQL queries, users can gain deeper insights into their data and make more informed decisions.
Secondly, hybrid queries are highly flexible and customizable. Users can define complex graph patterns and apply SQL operations to the result set, allowing them to tailor the query to their specific needs.
Thirdly, hybrid queries are compatible with existing SQL tools and interfaces. This means that users can leverage their existing knowledge and skills in SQL to work with graph data, without needing to learn a new query language.
Finally, hybrid queries are supported by a growing number of open-source graph databases, including Apache AGE. This means that users have access to a range of tools and resources to help them get started with hybrid queries and graph databases.
Conclusion
In conclusion, hybrid queries are a powerful feature of the Apache AGE's extension for PostgreSQL that allow users to combine the strengths of graph and SQL queries to gain deeper insights into their data. By defining graph patterns and applying SQL operations to the result set, users can work with complex, interconnected data sets in a more efficient and effective way. Furthermore, the compatibility of hybrid queries with existing SQL tools and interfaces means that users can leverage their existing knowledge and skills in SQL to work with graph data. With the increasing popularity of graph databases and tools, hybrid queries are becoming an essential tool for data analysts and developers.
Relevant Resources
Apache-Age: An Introduction to Graph Database Management System
3 Common PostgreSQL and AGE Installation Errors and How to Solve Them
Visit Apache AGE Website: https://age.apache.org/
Visit Apache AGE GitHub: https://github.com/apache/age
Visit Apache AGE Viewer GitHub: https://github.com/apache/age-viewer
Top comments (1)
I'm working with Apache AGE and PostgreSQL, and I have a question regarding the design of relational tables and graphs. For example, if I create a SQL table called Company and another called Users, they would be relational tables in the SQL database.
However, I also need to represent the data in a graph, as I might have a structure with multiple services that don't follow a specific relationship and can be connected in various ways. The "root" node would be the Company.
My question is, should I create the Company node in the graph as well, or should I create the service nodes with a Company_id property referencing the Company table's ID, and perform queries based on this ID? Which approach would be more appropriate in this scenario?
Image to represent. I would like, at some point, to query all services related to this company, so it will return from 1 to 4.