Since Memgraph is a graph database that stores data only in memory,
the GQLAlchemy library provides an on-disk storage solution for large properties not used in graph algorithms.
```python
from gqlalchemy import Memgraph, SQLitePropertyDatabase, Node, Field
from typing import Optional

graphdb = Memgraph()
SQLitePropertyDatabase('path-to-my-db.db', graphdb)

class User(Node):
    id: int = Field(unique=True, exists=True, index=True, db=graphdb)
    huge_string: Optional[str] = Field(on_disk=True)

my_secret = "I LOVE DUCKS" * 1000
john = User(id=5, huge_string=my_secret).save(graphdb)
john2 = User(id=5).load(graphdb)
print(john2.huge_string)  # prints "I LOVE DUCKS" 1000 times
```
What’s happening here?

- `graphdb` creates a connection to an in-memory graph database.
- `SQLitePropertyDatabase` attaches to `graphdb` in its constructor.
- When creating a definition for a node with the label `User`, two properties are defined.
- `User.id` is a required property of type `int` that creates UNIQUENESS and EXISTS constraints and an index inside Memgraph.
- `User.huge_string` is an optional `User` property that is saved to and loaded from the SQLite database.
- `my_secret` is an example of a huge string that would unnecessarily slow down a graph database.
- `User().save()` saves the node with the label `User` in the graph database and stores the `huge_string` property in the SQLite database.
- When loading the data, the inverse happens: the node is fetched from the graph database and the `huge_string` property from the SQLite database.
Saving large properties in an on-disk database
Many graphs used in graph databases have nodes with a lot of metadata that isn't used in graph computations. Graph databases aren't designed to perform effectively with large properties like strings or parquet files.
The problem is usually solved by using a separate SQL database or a key-value store to connect large properties with the ID of the node. Although the solution is straightforward, it is cumbersome to implement and maintain. Not to mention, you have to do it for each project from scratch.
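To see why that manual pattern gets cumbersome, here is a minimal sketch of what it typically looks like: a plain `sqlite3` key-value table keyed by node ID, kept in sync with the graph database by hand. The table and function names are hypothetical, for illustration only.

```python
import sqlite3
from typing import Optional

# Hypothetical manual workaround: a key-value table that maps a node's ID
# to its large property, maintained separately from the graph database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS node_properties (node_id INTEGER PRIMARY KEY, value TEXT)"
)

def save_large_property(node_id: int, value: str) -> None:
    # Must be called by hand every time the node is saved to the graph database.
    conn.execute(
        "INSERT OR REPLACE INTO node_properties (node_id, value) VALUES (?, ?)",
        (node_id, value),
    )
    conn.commit()

def load_large_property(node_id: int) -> Optional[str]:
    # Must be called by hand every time the node is loaded.
    row = conn.execute(
        "SELECT value FROM node_properties WHERE node_id = ?", (node_id,)
    ).fetchone()
    return row[0] if row else None

save_large_property(5, "I LOVE DUCKS" * 1000)
print(len(load_large_property(5)))  # 12000
```

Every save and load path in the application has to remember to call both stores, which is exactly the bookkeeping GQLAlchemy removes.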
We've identified the problem and decided to take action. With the release of GQLAlchemy 1.1, you can easily define which properties will be saved in a graph database, and which in an on-disk storage solution. You can do that once, in the model definition, and never worry again if properties are saved or loaded properly from the correct database.
How does it work?
GQLAlchemy is a Python library that aims to be the go-to Object Graph Mapper (OGM) -- a link between graph database objects and Python objects. It is built on top of Pydantic and provides object modeling, validation, serialization, and deserialization out of the box.
With GQLAlchemy, you can define Python classes that map to graph objects like Nodes and Relationships in graph databases.
Every such class has properties or fields that hold data about the graph objects.
When you want a property to be saved on disk instead of the in-memory database, you specify that with the `on_disk=True` argument of the `Field` class:

```python
from gqlalchemy import Node, Field
from typing import Optional

class User(Node):
    graphdb_property: Optional[str] = Field()
    on_disk_property: Optional[str] = Field(on_disk=True)
```
This instruction influences Node serialization and deserialization when it is being saved or loaded from a database.
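Conceptually, that split works like the following sketch. This is an illustration of the idea, not GQLAlchemy's actual internals: fields flagged as on-disk are routed to the property database, the rest go to the graph, and loading merges the two sources back together.

```python
# Conceptual sketch only -- not GQLAlchemy internals.
ON_DISK_FIELDS = {"on_disk_property"}  # fields declared with on_disk=True

def split_for_save(properties: dict) -> tuple:
    """Separate graph-bound properties from on-disk properties."""
    graph_props = {k: v for k, v in properties.items() if k not in ON_DISK_FIELDS}
    disk_props = {k: v for k, v in properties.items() if k in ON_DISK_FIELDS}
    return graph_props, disk_props

def merge_on_load(graph_props: dict, disk_props: dict) -> dict:
    """Recombine both sources into a single set of node properties on load."""
    return {**graph_props, **disk_props}

graph_part, disk_part = split_for_save(
    {"graphdb_property": "small", "on_disk_property": "huge"}
)
print(graph_part)  # {'graphdb_property': 'small'}
print(disk_part)   # {'on_disk_property': 'huge'}
```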
Before being able to use it, you have to specify which implementation of the `OnDiskPropertyDatabase` you'd like to use. For example, we'll use the SQLite implementation:

```python
from gqlalchemy import Memgraph, SQLitePropertyDatabase

db = Memgraph()
SQLitePropertyDatabase("property_database.db", db)
```
Now, every time you'd save or load a graph object from a graph database, the
on_disk properties are going to be handled automatically using the
user = User( graphdb_property="This property goes into the graph database", on_disk_property="This property goes into the sqlite database" ).save(db)
Now you know how to use on-disk properties, so your in-memory graph doesn't eat up too much RAM.
Graph algorithms should also run faster, because these large properties usually aren't needed for graph analytics.
If you have questions about how to use the on-disk storage, visit our Discord server and drop us a message.