TL;DR
Neo4j is a pretty solid solution with a lot of features and tooling around it.
Intro
I joined a new project recently, and it uses Neo4j as the main DBMS. Before I joined, my knowledge of Neo4j was limited to the fact that it exists. So, today I'm going to tell you about my experience of using Neo4j as the main storage and what I have learned so far.
Cypher
Neo4j uses Cypher as its query language. "It is like SQL," they said. Don't you see it?
MATCH (abs:OrgAbstract) WHERE abs.legacyId = toInteger(55)
CREATE (s:Standard:OrgStandard { legacyId: toInteger(1), dateFrom: date('1900-01-01') })
CREATE (f:Factor:OrgFactor { legacyId:toInteger(13) })
WITH abs, s, f, split('F1;42', ';') AS typeArray
UNWIND typeArray AS type
MERGE (ft:FactorType {name: type})
MERGE (abs)-[:HAS_STANDARD]-(s)
MERGE (s)-[:HAS_FACTOR]-(f)
MERGE (f)-[:HAS_FACTOR_TYPE]-(ft)
Okay, once you read the docs and try it out, you start to understand what is going on here. And then, it looks pretty beautiful.
Query execution tools
To try out Cypher queries you need a client, don't you?
IntelliJ IDEA plugin
The first thing that comes to mind is an IntelliJ IDEA plugin. Two seconds of googling, an IDEA restart, and you are already connecting to Neo4j with the Graph Database support plugin.
However, the first attempt with a simple query fails with an IDE plugin error:
java.lang.Throwable: class com.intellij.openapi.editor.EditorFactory it is a service, use getService instead of getComponent
At least this is the case for IntelliJ IDEA 2022.1.4, which is the version I have.
For some subsequent queries the plugin worked but returned incorrect results.
And if you look at the plugin repository on GitHub, it seems to be abandoned: there have been no updates for three years.
Neo4j Browser
Neo4j comes bundled with Neo4j Browser, where you can execute queries and interact with the graph. You don't even need to install anything: just open it in your web browser and start querying.
And the graph view looks amazing.
Neo4j Desktop
There is also the Neo4j Desktop app, an IDE for connecting to and managing your Neo4j instances. But I haven't tried it yet.
Migrations and data access
If you need schema migrations, you can use liquigraph or liquibase-neo4j. Note that liquigraph has reached end of life, and it is advised to migrate to Liquibase. There is a migration utility for this.
There are different options for accessing the data at the application level. You can use spring-data-neo4j if you are using Spring. Another option is the Object Graph Mapping library for Neo4j. And of course, you can simply use the database driver and execute Cypher queries directly.
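To make the last option concrete, here is a minimal sketch of running Cypher directly through a driver, written in Python against the official `neo4j` driver. It is an illustration under assumptions, not the project's actual code: the query is a simplified variant of the one shown earlier, the relationship direction is my own choice, and the connection URI and credentials in `main()` are placeholders.

```python
from typing import Any, Protocol


class SessionLike(Protocol):
    """Anything exposing run(query, **parameters), e.g. a neo4j driver session."""

    def run(self, query: str, **parameters: Any) -> Any: ...


# Simplified, hypothetical variant of the article's query.
# Values are passed as $parameters instead of being pasted into the text.
ATTACH_STANDARD = """
MATCH (abs:OrgAbstract {legacyId: $abstract_id})
MERGE (s:Standard:OrgStandard {legacyId: $standard_id})
MERGE (abs)-[:HAS_STANDARD]->(s)
RETURN s.legacyId AS standardId
"""


def attach_standard(session: SessionLike, abstract_id: int, standard_id: int) -> Any:
    """Run the query on the given session and return the driver's result object."""
    return session.run(
        ATTACH_STANDARD, abstract_id=abstract_id, standard_id=standard_id
    )


def main() -> None:
    # Imported lazily so the helper above works without the driver installed.
    from neo4j import GraphDatabase  # pip install neo4j

    # Placeholder URI and credentials: replace with your own.
    uri, auth = "bolt://localhost:7687", ("neo4j", "password")
    with GraphDatabase.driver(uri, auth=auth) as driver:
        with driver.session() as session:
            record = attach_standard(session, 55, 1).single()
            print(record["standardId"] if record else "no matching OrgAbstract")
```

Calling `main()` requires a running Neo4j instance. Keeping the query function dependent only on a session-like object also makes it easy to unit test with a stub.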
Community vs Enterprise Edition
If you are considering Neo4j as a database for your project, you need to be aware of the Neo4j editions and the features they support.
There is Neo4j Community Edition which is free. It is a fully functional edition of Neo4j that includes a single database. It can suit you if you are experimenting or delivering a non-critical project.
However, features like multiple databases, clustering, backup and restore, authentication and authorization, role-based access control, and LDAP and Active Directory integration are all in Neo4j Enterprise Edition. And it is paid. Furthermore, you cannot buy it in several countries. So, make sure to consult the documentation or the sales team before you choose Neo4j as the database for your project. Otherwise, you may end up needing Enterprise Edition features to release the project to production without being able to buy Enterprise Edition.
Multi-tenancy approaches
If you need to implement multi-tenancy in your system, here are some options:
- Each tenant has a standalone Neo4j instance. This is the most obvious option. You will probably have some overhead in terms of operations, cost, etc.
- Each tenant has its own database within a single Neo4j instance. This option is available in Enterprise Edition only. And it looks like this is the preferred way to do multi-tenancy in terms of security. Note that to add a new tenant you will need to do some ops/dev work: provision a new database, add some configurations to connect to that database on the application level, and so on.
- Tenants are separated using Neo4j authorisation, users, and ACLs. Again, this is available in Enterprise Edition only.
- Each tenant is simply a label on a node. This way you can implement application logic so that adding a new tenant is purely an administrator's task, with no ops/dev effort required. However, you will need to implement a good solution to separate the data at the application level to make sure that tenants cannot see each other's data.
- Each tenant is a node, and all the data nodes are connected to it. This approach has the same benefits and trade-offs as the previous one.
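The last two options put the isolation burden on the application, so here is a rough sketch of what that scoping could look like. Everything named here is hypothetical: the `Tenant_` label prefix, the `Tenant` node with an `id` property, and the `OWNS` relationship are my assumptions for illustration. As far as I know, a label cannot be supplied as an ordinary `$parameter`, so the label variant validates the tenant id before putting it into the query text.

```python
import re

# Only letters, digits and underscores may end up inside a label.
_TENANT_ID_RE = re.compile(r"[A-Za-z0-9_]+")


def tenant_label(tenant_id: str) -> str:
    """Turn a tenant id into a label name, rejecting anything unsafe."""
    if not _TENANT_ID_RE.fullmatch(tenant_id):
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    return f"Tenant_{tenant_id}"


def standards_by_label(tenant_id: str) -> str:
    """Tenant-as-label: every data node carries an extra tenant label."""
    return (
        f"MATCH (s:Standard:`{tenant_label(tenant_id)}`) "
        "WHERE s.legacyId = $legacy_id RETURN s"
    )


# Tenant-as-node: data nodes hang off a tenant node, so the tenant id
# can travel as a normal query parameter.
STANDARDS_BY_TENANT_NODE = (
    "MATCH (:Tenant {id: $tenant_id})-[:OWNS]->(s:Standard) "
    "WHERE s.legacyId = $legacy_id RETURN s"
)
```

Whichever variant you pick, the point is that every query goes through a helper like this, so no code path can forget the tenant filter.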
Which option is best in terms of performance is a good question, but I do not have the answer yet. I ran some heavy queries against the last two approaches, and there was no clear winner.
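If you want to repeat this kind of comparison yourself, a small timing harness is enough to get started. The sketch below is not how I measured; it is just one reasonable way to do it, and the function names are made up. Each variant is a zero-argument callable that runs one query and fully consumes its result.

```python
import statistics
import time
from typing import Callable, Dict


def median_seconds(
    run_query: Callable[[], object], repeats: int = 20, warmup: int = 3
) -> float:
    """Median wall-clock time of run_query over `repeats` timed runs."""
    for _ in range(warmup):
        run_query()  # warm caches before measuring
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(
    variants: Dict[str, Callable[[], object]], repeats: int = 20
) -> Dict[str, float]:
    """Time each named variant and return name -> median seconds."""
    return {name: median_seconds(fn, repeats) for name, fn in variants.items()}
```

With a real driver session, a variant could be something like `lambda: session.run(query, legacy_id=1).consume()`. Prefixing a query with `PROFILE` in Neo4j Browser is a useful complement, since it shows the execution plan rather than just the timing.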
And here are a couple of useful threads on multi-tenancy on the Neo4j forum:
Conclusion
So, what did I learn?
I learned that Neo4j is a pretty solid solution with a lot of features and tooling around it. And it should definitely be considered as an option if you are choosing a database for your project.
Take care. Tomorrow will be... better!