Apache AGE is an open-source graph database that allows users to store, manage, and analyze data in a highly connected and flexible manner. With its efficient data model and query language, AGE is an excellent tool for creating and exploring complex data relationships.
In this tutorial, we will focus on using Apache AGE to build a graph database for NBA statistics. We will cover the basics of graph databases and show you how to analyze NBA statistics with AGE. We will also demonstrate how to use AGE to perform analysis tasks, such as finding the player who scored the most and exploring the relationships between different players and teams. By the end of this tutorial, you will have a solid understanding of how to use Apache AGE to create and analyze graph databases for NBA statistics, and you will be well-equipped to apply these skills to other domains as well.
Let's tip off!
You can install Apache AGE on macOS, Windows and Linux.
To create a graph to store all of our players, teams and games, will need to execute the following query:
SELECT * FROM create_graph('NBA');
Now lets add some teams to our graph:
-- Sacramento Kings
SELECT * FROM cypher('NBA', $$
CREATE (t:Team {
name: 'Sacramento Kings',
city: 'Sacramento',
division: 'Pacific Division',
coach: 'Mike Brown',
players: ["De'Aaron Fox", 'Domantas Sabonis', 'Kevin Huerter', 'Harrison Barnes', 'Malik Monk'],
conference: 'Western Conference'})
RETURN t
$$) as (Team agtype);
-- Phoenix Suns
SELECT * FROM cypher('NBA', $$
CREATE (t:Team {
name: 'Phoenix Suns',
city: 'Phoenix',
division: 'Pacific Division',
coach: 'Monty Williams',
players: ['Chris Paul', 'Devin Booker', 'Deandre Ayton', 'Torrey Craig', 'Damion Lee'],
conference: 'Western Conference'})
RETURN t
$$) as (Team agtype);
To add the players stats for each team, we can do the following:
-- Sacramento Kings Players
SELECT * FROM cypher('NBA', $$
CREATE
(:Player {
name: "De'Aaron Fox", team: 'Sacramento Kings',
height: 6.4, weight: 185, age: 25,
nationality: 'USA'}),
(:Player {
name: 'Domantas Sabonis', team: 'Sacramento Kings',
height: 7.1, weight: 240, age: 26,
nationality: 'Lithuania / USA'}),
(:Player {
name: 'Kevin Huerter', team: 'Sacramento Kings',
height: 6.7, weight: 198, age: 24,
nationality: 'USA'}),
(:Player {
name: 'Harrison Barnes', team: 'Sacramento Kings',
height: 6.8, weight: 225, age: 30,
nationality: 'USA'}),
(:Player {
name: 'Malik Monk', team: 'Sacramento Kings',
height: 6.3, weight: 200, age: 25,
nationality: 'USA'})
$$) as (v agtype);
-- Phoenix Suns Players
SELECT * FROM cypher('NBA', $$
CREATE
(:Player {
name: 'Chris Paul', team: 'Phoenix Suns',
height: 6.0, weight: 175, age: 37,
nationality: 'USA'}),
(:Player {
name: 'Devin Booker', team: 'Phoenix Suns',
height: 6.5, weight: 206, age: 26,
nationality: 'USA'}),
(:Player {
name: 'Deandre Ayton', team: 'Phoenix Suns',
height: 7.0, weight: 250, age: 24,
nationality: 'USA'}),
(:Player {
name: 'Torrey Craig', team: 'Phoenix Suns',
height: 6.7, weight: 221, age: 32,
nationality: 'USA'}),
(:Player {
name: 'Damion Lee', team: 'Phoenix Suns',
height: 6.5, weight: 210, age: 30,
nationality: 'USA'})
$$) as (v agtype);
To add each player to their according team, we do the following:
SELECT * from cypher('NBA', $$
MATCH (t:Team),(p:Player)
WHERE p.name IN t.players
CREATE (p)-[e:PLAYS_FOR]->(t)
RETURN e
$$) as (e agtype);
Checking the box score
On the 14th of February, 2023, Phoenix Suns won a match at home against the Sacramento Kings. The final result was 120 - 109, with a incredible performance from Devin Booker, scoring 32 points.
We can create a new node with the game stats and connect the players and the team with it:
-- The Game
SELECT * FROM cypher('NBA', $$
CREATE (g:Game {
home_team: 'Phoenix Suns',
away_team: 'Sacramento Kings',
date: '2023-02-14',
home_score: 120,
away_score: 109})
RETURN g
$$) as (Game agtype);
-- Making the edges
SELECT * FROM cypher('NBA', $$
MATCH (g:Game), (ht:Team), (at:Team)
WHERE ht.name = g.home_team AND at.name = g.away_team AND g.home_score > g.away_score
CREATE (ht)-[:WON]->(g)
CREATE (at)-[:LOST]->(g)
$$) as (v agtype);
With the players stats, we can add them as an edge property between the players and the game:
-- Suns Players
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Torrey Craig' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 30, reb: 3, asst: 3, pts: 6}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Deandre Ayton' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 37, reb: 11, asst: 2, pts: 29}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Chris Paul' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 37, reb: 5, asst: 19, pts: 17}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Damion Lee' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 23, reb: 0, asst: 2, pts: 5}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Devin Booker' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 29, reb: 5, asst: 2, pts: 32}]->(g)
RETURN e
$$) as (player agtype);
-- Sacramento Kings Players
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Harrison Barnes' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 36, reb: 2, asst: 0, pts: 11}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Domantas Sabonis' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 38, reb: 15, asst: 7, pts: 24}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Malik Monk' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 0, reb: 0, asst: 0, pts: 0}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = 'Kevin Huerter' AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 35, reb: 3, asst: 2, pts: 18}]->(g)
RETURN e
$$) as (player agtype);
SELECT * FROM cypher('NBA', $$
MATCH (p:Player), (g:Game)
WHERE p.name = "De'Aaron Fox" AND g.date = '2023-02-14'
CREATE (p)-[e:PLAYED_AT {min: 38, reb: 4, asst: 4, pts: 35}]->(g)
RETURN e
$$) as (player agtype);
We can then visualize the graph in a deeper complexity:
SELECT * from cypher('NBA', $$
MATCH (V)-[R]-(V2)
RETURN V,R,V2
$$) as (V agtype, R agtype, V2 agtype);
How many buckets did you drop?
Let's analyze how many points each player got in the game
SELECT * FROM cypher('NBA', $$
MATCH (p:Player)-[e:PLAYED_AT]-(g:Game)
RETURN p.name, e.pts, g.date ORDER BY e.pts DESC
$$) as (player_name agtype, points agtype, date agtype);
player_name | points | date
--------------------+--------+--------------
"De'Aaron Fox" | 35 | "2023-02-14"
"Devin Booker" | 32 | "2023-02-14"
"Deandre Ayton" | 29 | "2023-02-14"
"Domantas Sabonis" | 24 | "2023-02-14"
"Kevin Huerter" | 18 | "2023-02-14"
"Chris Paul" | 17 | "2023-02-14"
"Harrison Barnes" | 11 | "2023-02-14"
"Torrey Craig" | 6 | "2023-02-14"
"Damion Lee" | 5 | "2023-02-14"
"Malik Monk" | 0 | "2023-02-14"
(10 rows)
Conclusion
In conclusion, Apache AGE is a powerful and flexible platform for creating database analytics for NBA. The flexibility of graph databases allows us to easily modify and extend our data model as needed, and the performance of Apache AGE enables us to handle large datasets with ease.
Whether you are a coach, a player, or a fan, the insights gained from database analytics can provide valuable insights into the game of basketball. By using Apache AGE, you can easily create and explore new data-driven approaches to understanding and improving performance in NBA.
Apache AGE Website : https://age.apache.org/
Apache AGE GitHub Repository : https://github.com/apache/age
Top comments (0)