How to Use Memgraph With Python and Jupyter Notebooks

One of the simplest ways to analyze graphs is to write a Python script or a Jupyter Notebook. Python is the language of choice for many data scientists, largely because it strives to be simple and straightforward.

So, in the spirit of this simplicity, here is a quick tutorial on how to do some basic network analysis using Memgraph and Python.

The general outline is:

Start Memgraph
Connect to Memgraph
Import a graph
Query the database
Run graph algorithms

Now, I am going to show you an example of how to accomplish this. You can also take a look at this Jupyter Notebook if it's more your style.

Prerequisites

For this tutorial, you will need to install:

Python3 & pip
Docker

Docker is used because Memgraph is a native Linux application and cannot run on Windows or macOS without it. Even if you are running Linux, I suggest the Docker installation, as it includes some helpful extras like Memgraph Lab.

1. Start Memgraph

While there are a few ways of installing Memgraph, the recommended one is to install Memgraph Platform with Docker. If you don't have Docker, I suggest you follow this guide.

After you install Docker, you can set up Memgraph by running:

docker run -it -p 7687:7687 -p 3000:3000 memgraph/memgraph-platform

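This command downloads the Memgraph Platform image and, once the download finishes, starts the Memgraph container. Port 7687 is used for database connections, and port 3000 serves Memgraph Lab.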

2. Connect to Memgraph

First, create a file named memgraph.py and add the following code:

from gqlalchemy import Memgraph


memgraph = Memgraph("127.0.0.1", 7687)

However, before we can execute this, we need to install the GQLAlchemy package. We will be using the GQLAlchemy object graph mapper (OGM) to connect to Memgraph and execute Cypher queries easily. GQLAlchemy also serves as a Python driver/client for Memgraph. You can install it using:

pip install gqlalchemy

Hint: You may need to install CMake before installing GQLAlchemy.

Now, make sure that everything is working correctly by running the script:

python3 memgraph.py

If you get an error or you need help with something else, definitely message us on Discord or through our forum.
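
If the script fails and you are not sure whether the container is reachable at all, a plain TCP check can narrow things down. The following is a minimal sketch that does not depend on GQLAlchemy; the helper name bolt_port_is_open is made up for this example, and the host and port are the same ones used in memgraph.py.

```python
import socket


def bolt_port_is_open(host: str = "127.0.0.1", port: int = 7687, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for troubleshooting; it only checks that something
    is listening on the port, not that it is actually Memgraph.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Calling bolt_port_is_open() with the defaults should return True while the Docker container from step 1 is running. If it returns False, check that the container is up and that port 7687 was published with -p 7687:7687.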

3. Import a graph

Let's make sure that Memgraph is empty before we start with anything else.
Add the following line to memgraph.py:

memgraph.drop_database()

You will need to download this file, which contains a simple dataset called Zachary's karate club. To import it into Memgraph, we first need to copy it into the Docker container where Memgraph is running.

Find the CONTAINER_ID by running:

docker ps

Copy the file with the following command (don't forget to replace CONTAINER_ID):

docker cp karate-club.csv CONTAINER_ID:/karate-club.csv

Now, we can execute the Cypher command LOAD CSV, which is used for loading data from CSV files:

memgraph.execute("""
    LOAD CSV FROM "/karate-club.csv" NO HEADER AS row
    MERGE (p1:Person {id: row[0]})
    MERGE (p2:Person {id: row[1]})
    MERGE (p1)-[:FRIENDS_WITH]->(p2);
""")

That's it for the import. After you run the script, Memgraph should be populated with the data. You can verify this by opening Memgraph Lab in your browser at http://localhost:3000. To retrieve everything from the database, execute the following query in the Query tab:

MATCH (n)-[r]-(m) RETURN n, r, m;

Hint: Memgraph Lab is running on http://localhost:3000 if you installed Memgraph Platform using Docker. Otherwise, you will need to download and install Memgraph Lab manually.
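
To get a feel for what the LOAD CSV query in this step does with each row, here is a rough Python sketch of the same logic. It assumes the file is a header-less, comma-separated list of id pairs; the sample rows are made up and are not the real dataset. The two sets play the role of MERGE, which only creates a node or relationship if it does not exist yet.

```python
import csv
import io


def read_edge_list(csv_text: str):
    """Parse header-less two-column CSV text into (node_ids, edges).

    Illustrative helper, not part of GQLAlchemy. Sets are used so that
    repeated rows do not create duplicates, similar to MERGE in Cypher.
    """
    node_ids = set()
    edges = set()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        source, target = row[0].strip(), row[1].strip()
        node_ids.update((source, target))
        edges.add((source, target))  # direction matters: source -> target
    return node_ids, edges


# Made-up sample rows; the last one repeats the first.
sample = "1,2\n1,3\n2,3\n1,2\n"
nodes, edges = read_edge_list(sample)
print(len(nodes), "nodes,", len(edges), "relationships")
```

Running the same function on the contents of karate-club.csv gives you counts to compare against what you see in Memgraph Lab, provided the file really uses commas as the delimiter.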

4. Query the database

Let's make sure that our data was imported correctly by retrieving it:

results = memgraph.execute_and_fetch("""
    MATCH (p1:Person)
    RETURN p1
    ORDER BY ToInteger(p1.id) ASC;
""")

for result in results:
    print(result['p1'])
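
The ToInteger call in the ORDER BY clause matters if the ids are stored as strings, which is what you get when values come straight from LOAD CSV without conversion. A short sketch with made-up ids shows the difference between sorting as text and sorting as numbers:

```python
# Made-up ids, stored as strings the way an unconverted CSV import would leave them.
ids = ["1", "10", "2", "34", "9"]

as_text = sorted(ids)              # lexicographic: compares character by character
as_numbers = sorted(ids, key=int)  # numeric: what ToInteger achieves in the query

print(as_text)
print(as_numbers)
```

Without the conversion, "10" would be listed before "2", so the people in the result would not appear in numeric order.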

5. Run graph algorithms

Now, let's do something clever with our graph. For example, we can calculate the PageRank of each node:

results = memgraph.execute_and_fetch("""
    CALL pagerank.get()
    YIELD node, rank
    RETURN node, rank;
""")

for result in results:
    print(f"The PageRank of node {result['node'].properties['id']}: {result['rank']}")
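
If you are curious what a PageRank score represents, the idea can be sketched in a few lines of plain Python. This is a textbook power-iteration version written for illustration only; it is not how Memgraph implements pagerank.get(), and the damping factor of 0.85, the iteration count, and the toy edges are all assumptions made for this example.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over directed (source, target) pairs.

    Illustrative sketch; parameters are assumed defaults, not Memgraph's.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    # Start with rank spread evenly over all nodes.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is shared by everyone.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        # Each node passes its rank on in equal parts along its outgoing links.
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Made-up toy graph, not the karate club data.
toy_edges = [("1", "2"), ("1", "3"), ("2", "3"), ("3", "1")]
for node, score in sorted(pagerank(toy_edges).items()):
    print(f"The PageRank of node {node}: {score:.4f}")
```

In the toy graph, node 3 ends up with the highest score because both other nodes point to it. The scores Memgraph returns for the karate club graph may differ from what this sketch would compute, since the actual implementation and its parameters are not the same.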

What's next?

As you can see, it's very easy to connect to Memgraph and run graph algorithms using Python. I even have a suggestion where you could apply this knowledge.