Playtomic

Choosing a graph database

Sergio Garcia Moratilla (sgmoratilla) ・2 min read

Disclaimer: I am writing this because I posted a question on dev.to about which graph database we should use at Playtomic, and I would like to give something back to the community. This is how we decided.

When starting with a new piece of tech, the question is always the same: which of the alternatives should we pick? Most of you have had to choose a relational database: MySQL vs PostgreSQL vs Oracle... Maybe, if you were lucky enough, a non-relational database too: MongoDB (document) vs Redis (key-value) vs Cassandra (wide-column) vs HBase (wide-column)...

What about graph databases (which are non-relational as well)? I only knew Neo4j. With every cloud vendor offering its own proprietary solution (e.g. Amazon Neptune), the choice is even harder.

We prefer open source solutions over proprietary ones: more community, less risk of vendor lock-in. We tend to host on cloud services, because we don't have an infrastructure team and we don't want to spend time on maintenance.

In our philosophy, experiments must be goal-oriented, not tech-oriented. What's the purpose of testing a graph database at Playtomic? We want to explore whether we can model relations between players better than we already do (with a relational database). The final aim is a recommendation system: new players to meet, new venues to play at, ... all based on your relations with players you already know.
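To make that goal a bit more concrete, here is a minimal sketch of a "players you may want to meet" recommendation over an in-memory graph. The sample names, the `PLAYED_WITH` mapping and the `recommend_players` function are all hypothetical and only illustrate the idea of ranking players two hops away by the number of partners you share; this is not how Playtomic models its data.

```python
from collections import Counter

# Hypothetical sample data: who has played with whom (undirected).
PLAYED_WITH = {
    "ana": {"luis", "marta"},
    "luis": {"ana", "pablo", "sara"},
    "marta": {"ana", "pablo"},
    "pablo": {"luis", "marta"},
    "sara": {"luis"},
}


def recommend_players(graph, player, limit=3):
    """Rank players two hops away by the number of shared partners."""
    known = graph.get(player, set())
    scores = Counter()
    for partner in known:
        for candidate in graph.get(partner, set()):
            # Skip the player themselves and people they already know.
            if candidate != player and candidate not in known:
                scores[candidate] += 1
    # Sort by score (descending), then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


print(recommend_players(PLAYED_WITH, "ana"))
```

In a relational database the same lookup typically needs a self-join on a "played with" table for every hop, which is the kind of query a graph database is meant to make easier to express.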

As our team is small, the time we can spend on experiments is limited too, so I have to reduce the number of options. Neptune is proprietary and pretty unknown to me, so I will drop it. I'm not very seduced by OrientDB either, as it looks like too general a database.

OrientDB:

  • Schema-less
  • SQL for queries (big win IMHO).
  • Great web console.

JanusGraph:

  • Schema.
  • Gremlin for queries (a functional-style graph traversal language).
  • Drawback: you have to choose which storage backend to use: HBase vs Cassandra. I'm not sure about the implications.
  • No web console. You have to use a third-party application to visualise data.

Neo4j:

  • Schema-less (I'd rather say it only supports key-value properties).
  • Cypher for queries (query language).
  • Web console.
  • Hosted by GraphStory and GrapheneDB (cheaper).
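To give a feel for Cypher, here is a hedged sketch of how a "players you may know" query could be sent from Python. The `Player` label, the `PLAYED_WITH` relationship and the `name` property are hypothetical, since no data model is described here. The `recommend` helper assumes a session-like object whose `run` method accepts the query text plus keyword parameters, which is how sessions in the official `neo4j` Python driver behave.

```python
# Cypher text for a "players you may know" lookup.
# The Player label, PLAYED_WITH relationship and name property are
# hypothetical; they are assumptions made for this illustration only.
RECOMMEND_QUERY = """
MATCH (me:Player {name: $name})-[:PLAYED_WITH]-(:Player)-[:PLAYED_WITH]-(other:Player)
WHERE other <> me AND NOT (me)-[:PLAYED_WITH]-(other)
RETURN other.name AS name, count(*) AS shared
ORDER BY shared DESC, name
LIMIT $limit
"""


def recommend(session, name, limit=3):
    """Run the query on a session-like object exposing run(query, **params)."""
    result = session.run(RECOMMEND_QUERY, name=name, limit=limit)
    # Each record is read by the column aliases used in the RETURN clause.
    return [(record["name"], record["shared"]) for record in result]
```

With the real driver, `session` would come from something like `GraphDatabase.driver(uri, auth=(user, password)).session()`, where `uri`, `user` and `password` are placeholders for your own instance.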

All of them are available on the AWS Marketplace. Neo4j has an online sandbox, which the others don't. You can play with all of them using Docker images.

At this point, I think all three would give us everything we need, but Neo4j seems easier to test.
Searching for how to integrate them with Micronaut and Spring Boot, Neo4j turns out to be more popular. That doesn't mean it is better, just that when I run into a problem (and I will), someone has already been there.

Thinking about putting this experiment into production, we found some managed Neo4j hosting providers that allow us to start cheap and scale later if we are happy with it.

So Neo4j was the chosen one for our experiment. We will probably write about the results in a few weeks x)

Playtomic

Playtomic is the biggest network of sports reservations in Spain. Through either our app or the web, users can find and book sports activities.

Discussion

Hey Sergio, having worked with Neo4j, I think this blog post might be useful to you:

inzamam.dev/optimization-essential...