DEV Community

Cover image for Understanding Recursive Queries in PostgreSQL: A Process Hierarchy Example
Dmitry Romanoff
Dmitry Romanoff

Posted on

Understanding Recursive Queries in PostgreSQL: A Process Hierarchy Example

In the world of databases, hierarchical data can often be tricky to handle. PostgreSQL, however, offers a powerful feature for this task: recursive Common Table Expressions (CTEs). In this article, we’ll dive into how you can use recursive CTEs to work with hierarchical data, using a practical example of process management.

The Problem: Hierarchical Data Representation
Consider a scenario where we need to represent a hierarchy of processes. Each process might have a parent process, forming a tree-like structure. Our goal is to query this hierarchical data efficiently and display it in a readable format.

Setting Up the Data
First, let’s create a table to store our process information. Each process has a name, a unique process ID (PID), and an optional parent PID indicating its parent process.

CREATE TABLE the_processes (
    process_name VARCHAR(100),
    pid VARCHAR(100),
    parent_pid VARCHAR(100)
);

INSERT INTO the_processes VALUES
('a.exe', '420', '428'),
('c.exe', '428', NULL),
('d.exe', '551', '420'),
('e.exe', '552', '428'),
('f.exe', '553', NULL),
('g.exe', '4', '553'),
('b.exe', '7', '4'),
('h.exe', '11', '7'),
('j.exe', '666', '428');
Enter fullscreen mode Exit fullscreen mode

After inserting the data, querying the table provides us with the following snapshot of our hierarchical structure:

SELECT * FROM the_processes;
Enter fullscreen mode Exit fullscreen mode

Output:

process_name | pid | parent_pid 
--------------+-----+-----------
 a.exe        | 420 | 428
 c.exe        | 428 | 
 d.exe        | 551 | 420
 e.exe        | 552 | 428
 f.exe        | 553 | 
 g.exe        | 4   | 553
 b.exe        | 7   | 4
 h.exe        | 11  | 7
 j.exe        | 666 | 428
Enter fullscreen mode Exit fullscreen mode

Recursive CTE: Building the Hierarchy
To fetch the hierarchical data, we use a recursive CTE. The CTE helps us traverse the tree structure by starting with root nodes (processes without a parent) and then recursively joining child nodes.

WITH RECURSIVE HierarchyCTE AS (
    SELECT 
        process_name,
        parent_pid,
        pid,
        0 AS level
    FROM 
        the_processes
    WHERE 
        parent_pid IS NULL

    UNION ALL

    SELECT 
        t.process_name,
        t.parent_pid,
        t.pid,
        h.level + 1
    FROM 
        the_processes t
    JOIN 
        HierarchyCTE h ON t.parent_pid = h.pid
)
SELECT 
    process_name,
    parent_pid,
    pid,
    level
FROM 
    HierarchyCTE
ORDER BY 
    level, pid;
Enter fullscreen mode Exit fullscreen mode

In this query:

  • The base case of the recursion selects the root nodes (where parent_pid is NULL).
  • The recursive step joins the CTE with the table to find child processes, increasing the level by 1 for each level of recursion.

Output:

process_name | parent_pid | pid | level 
--------------+------------+-----+-------
 c.exe        |            | 428 |     0
 f.exe        |            | 553 |     0
 g.exe        | 553        | 4   |     1
 a.exe        | 428        | 420 |     1
 e.exe        | 428        | 552 |     1
 j.exe        | 428        | 666 |     1
 d.exe        | 420        | 551 |     2
 b.exe        | 4          | 7   |     2
 h.exe        | 7          | 11  |     3

Enter fullscreen mode Exit fullscreen mode

Visualizing the Hierarchy
To visualize the hierarchical structure, you can format the output with indentation to reflect the tree structure:

WITH RECURSIVE HierarchyCTE AS (
    SELECT 
        process_name,
        parent_pid,
        pid,
        0 AS level
    FROM 
        the_processes
    WHERE 
        parent_pid IS NULL

    UNION ALL

    SELECT 
        t.process_name,
        t.parent_pid,
        t.pid,
        h.level + 1
    FROM 
        the_processes t
    JOIN 
        HierarchyCTE h ON t.parent_pid = h.pid
)
SELECT 
    REPEAT('----', level) || process_name
FROM 
    HierarchyCTE
ORDER BY 
    level, pid;
Enter fullscreen mode Exit fullscreen mode

Output:

c.exe
f.exe
 ----g.exe
 ----a.exe
 ----e.exe
 ----j.exe
 --------d.exe
 --------b.exe
 ------------h.exe
Enter fullscreen mode Exit fullscreen mode

This output uses dashes (----) to indicate the level of each process, making the hierarchy easier to understand at a glance.

Conclusion
Recursive CTEs in PostgreSQL provide a robust method for querying and visualizing hierarchical data. Whether you are managing processes, organizational structures, or any tree-like data, mastering recursive queries can significantly enhance your ability to work with complex datasets.

Top comments (0)