Introduction to graph databases
Graph databases are gaining traction across a variety of applications and industry sectors.
From fraud detection and supply chain optimization to machine learning and artificial intelligence,
graph analytics enable developers to build a new generation of applications centered around network analysis.
In this article, we compare two leading graph databases, Memgraph and Neo4j, to help you choose the best graph analytics platform for your needs.
Although comparisons usually focus on performance benchmarks, there are many other crucial factors to consider when choosing a database for your business.
With that in mind, we’ll start by looking at some of the similarities between Memgraph and Neo4j before moving on to the
differentiating factors between the two graph databases.
What is Neo4j?
Neo4j is an ACID-compliant transactional native graph database. It is a disk-based system mainly implemented in Java and has been publicly available since 2007.
It's the most widely used graph database and one of the pioneers in the space.
What is Memgraph?
Memgraph is an open-source graph database built for real-time streaming and compatible with Neo4j. It is a high-performance, ACID-compliant, transactional native graph database, engineered from the ground up with an in-memory-first, durable, and redundant architecture.
It is implemented in C/C++ and supports both transactional and analytical workloads.
General Differences Between Neo4j and Memgraph
Feature | Neo4j | Memgraph |
---|---|---|
Initial release | 2007 | 2017 |
License | AGPLv3 / Commercial | BSL / Apache 2 / Commercial |
Written in | Java | C++ |
Data model | Labeled Property Graph | Labeled Property Graph |
Data storage | On-disk | In memory |
Source code | GitHub | GitHub |
Hosted Cloud service | Neo4j Aura | Memgraph Cloud |
Both solutions have been around for some time. Neo4j has the longer track record,
while Memgraph is implemented in C++, which gives it direct control over memory and avoids JVM overhead such as garbage-collection pauses.
The main difference between these two is the storage engine. Memgraph uses an in-memory storage engine
while Neo4j implements a traditional on-disk storage solution.
The main difference: On-disk vs in-memory storage
Although on-disk and in-memory storage differ in many ways, the choice between them depends primarily on your use case and requirements.
On-disk storage is the default choice if you are storing a large number of objects that don't need to be retrieved very often, that is,
if you need a system of record and a general-purpose graph storage solution. In that case, Neo4j will do a great job.
On the other hand, Memgraph is a fully in-memory solution focused on stream processing and real-time computations
that must finish in the shortest possible time. So, if you have a large graph that is analyzed frequently and you need real-time responses, Memgraph is the way to go.
Technical features
Feature | Neo4j | Memgraph |
---|---|---|
ACID transactions | Yes | Yes |
Replication | Yes | Yes |
Query language | Cypher | Cypher |
Drivers & clients | .Net, Clojure, Elixir, Go, Groovy, Haskell, Java, JavaScript, Perl, PHP, Python, Ruby, Scala | .Net, C, C++, Go, Haskell, Java, JavaScript, PHP, Python, Ruby, Scala |
Triggers | Yes | Yes |
Concurrency | Yes | Yes |
Durability | Yes | Yes |
Bolt protocol support | Yes | Yes |
Backups | Yes | Yes |
Streaming platform integrations | Apache Kafka, Redpanda | Apache Kafka, Redpanda, Apache Pulsar |
Query execution plans | Yes | Yes |
Authentication and Authorization | Yes | Yes |
Data encryption in transit | Yes | Yes |
Data science library | GDS | MAGE |
Custom procedures | Java | Python, C, C++, Rust [1] |
[1] You can write the procedures in any programming language which can work with C and can be compiled to the ELF shared library format.
There are of course many more features that Neo4j and Memgraph implement,
but we will be focusing on those that are necessary for stream processing.
Drivers & clients
A broad range of drivers in many programming languages is available for both solutions.
While Memgraph maintains only a few in-house drivers (C, C++, Python, Rust),
most Neo4j drivers can also be used with Memgraph, because both solutions use the Bolt protocol, the labeled property graph model and the Cypher query language.
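Because both databases speak Bolt and Cypher, the same client code can often target either one. The sketch below assumes the third-party `neo4j` Python driver, a server listening on the default Bolt port 7687, and placeholder credentials. The names `SETTINGS`, `connection_settings` and `run_query` are made up for illustration.

```python
from typing import Any, Dict, List, Optional

# Hypothetical connection settings. Both servers use port 7687 for Bolt by
# default; the credentials are placeholders and must match your own setup.
SETTINGS: Dict[str, Dict[str, Any]] = {
    "neo4j": {"uri": "bolt://localhost:7687", "auth": ("neo4j", "password")},
    "memgraph": {"uri": "bolt://localhost:7687", "auth": ("", "")},
}


def connection_settings(database: str) -> Dict[str, Any]:
    """Return the assumed Bolt URI and credentials for the chosen database."""
    try:
        return SETTINGS[database.lower()]
    except KeyError:
        raise ValueError(f"unknown database: {database!r}") from None


def run_query(
    database: str, query: str, parameters: Optional[Dict[str, Any]] = None
) -> List[Dict[str, Any]]:
    """Run one Cypher query over Bolt and return the rows as plain dicts."""
    # Imported lazily so this module loads even where the driver is missing.
    from neo4j import GraphDatabase  # pip install neo4j

    settings = connection_settings(database)
    with GraphDatabase.driver(settings["uri"], auth=settings["auth"]) as driver:
        with driver.session() as session:
            result = session.run(query, parameters or {})
            return [record.data() for record in result]
```

With a server running, a call such as `run_query("memgraph", "MATCH (n) RETURN count(n) AS nodes")` would use exactly the same code path as the `"neo4j"` variant; only the connection settings differ.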
Streaming platform integrations
Memgraph includes connectors out of the box for Apache Kafka and Apache Pulsar with a
few more on the way in future releases. Neo4j also offers a Kafka Connect plugin
that brings streaming support to the whole ecosystem. Memgraph has also been tested with Redpanda, a high-performance Kafka alternative.
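Whichever connector is used, a streaming integration ultimately has to turn each incoming message into graph updates. The following is a minimal, connector-agnostic sketch of that step. The JSON message shape (`from`, `to`, `amount`), the `Account` label and the `TRANSFERRED` relationship type are hypothetical, and neither Memgraph's nor Neo4j's actual connector API is shown.

```python
import json
from typing import Any, Dict, Tuple

# Parameterized Cypher: MERGE creates an Account node only if it does not
# exist yet, so repeated messages do not duplicate accounts.
UPSERT_TRANSFER = (
    "MERGE (a:Account {id: $from_id}) "
    "MERGE (b:Account {id: $to_id}) "
    "CREATE (a)-[:TRANSFERRED {amount: $amount}]->(b)"
)


def message_to_cypher(payload: bytes) -> Tuple[str, Dict[str, Any]]:
    """Translate one JSON-encoded stream message into (query, parameters)."""
    event = json.loads(payload.decode("utf-8"))
    parameters = {
        "from_id": event["from"],
        "to_id": event["to"],
        "amount": float(event["amount"]),
    }
    return UPSERT_TRANSFER, parameters
```

Keeping the query text fixed and passing values as parameters lets the database reuse the query plan for every message and avoids building Cypher strings from untrusted input.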
Neo4j GDS & Memgraph MAGE
Neo4j GDS, or Graph Data Science, is a library that provides efficiently implemented,
parallel versions of common graph algorithms for Neo4j, exposed as Cypher procedures.
It contains many of the most popular graph algorithms out there and you can use it to perform complex graph analysis tasks.
Memgraph MAGE, or Memgraph Advanced Graph Extensions,
is an open-source library of graph algorithms exposed as Cypher procedures.
It focuses on real-time analysis and implements online versions of a few algorithms, such as PageRank, community detection and node2vec.
These algorithms are suited for streaming data that needs to be processed incrementally whenever a new node or relationship is created, or existing ones are updated.
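To illustrate what "online" means here, the sketch below keeps a result up to date as relationships arrive instead of recomputing it over the whole graph. It uses incremental connected components with a union-find structure as a simple stand-in. It is not one of the MAGE algorithms, and the class and method names are made up for this example.

```python
class IncrementalComponents:
    """Tracks connected components while edges are added one at a time."""

    def __init__(self) -> None:
        self._parent = {}  # node -> parent node in the union-find forest
        self._size = {}    # root node -> number of nodes in its component
        self.component_count = 0

    def add_node(self, node) -> None:
        """Register a node as its own component if it is new."""
        if node not in self._parent:
            self._parent[node] = node
            self._size[node] = 1
            self.component_count += 1

    def _find(self, node):
        # Walk up to the root, shortening the path along the way.
        while self._parent[node] != node:
            self._parent[node] = self._parent[self._parent[node]]
            node = self._parent[node]
        return node

    def add_edge(self, u, v) -> None:
        """Process one new relationship; merges two components if needed."""
        self.add_node(u)
        self.add_node(v)
        root_u, root_v = self._find(u), self._find(v)
        if root_u == root_v:
            return  # already connected, nothing to update
        if self._size[root_u] < self._size[root_v]:
            root_u, root_v = root_v, root_u
        self._parent[root_v] = root_u  # attach the smaller tree to the larger
        self._size[root_u] += self._size[root_v]
        self.component_count -= 1

    def connected(self, u, v) -> bool:
        """Return True if both nodes are known and in the same component."""
        if u not in self._parent or v not in self._parent:
            return False
        return self._find(u) == self._find(v)
```

Each `add_edge` call touches only the two affected components, which is the property that makes incremental algorithms attractive for streams. This sketch handles insertions only; supporting deletions requires a different technique.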
Custom procedures in Neo4j and Memgraph
Neo4j and Cypher can be extended with User Defined Procedures and Functions. Neo4j itself provides and utilizes custom procedures.
Many of the monitoring, introspection and security features available through the Neo4j Browser are implemented using these custom procedures.
However, given that Neo4j is implemented in Java, the custom procedures and functions also depend on a Java API.
Memgraph is mainly focused on the Python ecosystem and community.
While the core engine is implemented in C++ to ensure the best resource utilization and performance,
custom procedures (called query modules in Memgraph) can be implemented in multiple programming languages and, most importantly, in Python as well.
These procedures can contain graph algorithms, utility tools, custom APIs... whatever you can come up with.
You can call them from the Cypher query language like you would any other query and combine them with other features such as streams or triggers.
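As a rough sketch of what a Python query module can look like, the example below wraps a small pure-Python function in a read procedure. It assumes the `mgp` Python API that Memgraph provides to query modules (`mgp.read_proc`, `mgp.ProcCtx`, `mgp.Record`, `ctx.graph.vertices`, `vertex.out_edges`); treat those names as assumptions to check against the Memgraph documentation. The module name `degree_stats` and both function names are hypothetical.

```python
# degree_stats.py (hypothetical module name)
from typing import Dict, Iterable

try:
    import mgp  # assumed to be importable only inside Memgraph
except ImportError:
    mgp = None  # lets the pure function below be tested outside Memgraph


def out_degree_histogram(out_degrees: Iterable[int]) -> Dict[int, int]:
    """Count how many nodes have each out-degree."""
    histogram: Dict[int, int] = {}
    for degree in out_degrees:
        histogram[degree] = histogram.get(degree, 0) + 1
    return histogram


if mgp is not None:

    @mgp.read_proc
    def histogram(ctx: mgp.ProcCtx) -> mgp.Record(out_degree=int, node_count=int):
        """Return one row per distinct out-degree in the stored graph."""
        degrees = (
            sum(1 for _ in vertex.out_edges) for vertex in ctx.graph.vertices
        )
        counts = out_degree_histogram(degrees)
        return [
            mgp.Record(out_degree=degree, node_count=count)
            for degree, count in sorted(counts.items())
        ]
```

Under these assumptions, once the file is loaded as a query module, the procedure would be called from Cypher as `CALL degree_stats.histogram() YIELD out_degree, node_count RETURN out_degree, node_count;`. Keeping the counting logic in a plain function makes it easy to unit-test without a running database.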
What is the best graph database for your use case?
As both databases offer a broad range of features, your decision will mostly depend on your specific use case. If performance and cost are not crucial factors, then Neo4j will work for you. However, if you are dealing with real-time data and need a faster and more optimized alternative, then you should go with Memgraph.
As already mentioned, Neo4j is a pioneer among graph databases and graph technologies in general. It is a good fit for Java-oriented developers and for largely static data that doesn't see frequent write operations. Memgraph, on the other hand, focuses on stream processing and real-time graph analytics, and caters more to Python, C++ and Rust developers. If you need to run complex graph algorithms and traversals often and expect the results in the shortest amount of time, Memgraph is the way to go.
If you want to try Memgraph, check out our Memgraph Demo on Playground (no installation or registration needed). Explore our guides, samples and references on Memgraph Docs, and if you have any questions, join our growing Community and share your projects with us.