How to use relational and graph db in one project?

#discuss #graph #rdbms #neo4j

I have some use cases which better fit graph db (neo4j), but most of the feature are handled in MySql.

I'm thinking of how to put a glue between both.
Would you duplicate some data to bind MySql with Neo4j, e.g nodes in neo4j keep reference to IDs from MySql or vice versa?

I'm using Spring and thought I'd use chained transactions to handle rollback gracefully of transactions which span both dbs.

For entities on the edge of both sql and graph worlds i plan to duplicate entities (not all properties, just the ones which are used in each db world + IDs)on both and on queries to join data on IDs.

In DDD it's a known technique to split domains and use separate DB for each domain, where certain DB technology fits better. Bit in my case it's really the same domain but different views.

I'm totally aware it all depends from the context and business use cases, but are you aware of some resources/examples of how people are doing it?

I know, "Google is my friend", but I had difficulties to find useful resources.

Top comments (3)

HS • Mar 6 '21

Had a poor experience with Spring Neo4j as it was mainly orientated to use it as relational instead of providing actual graph benefit of looking for related things in specifc ways - traversing but returning only small pieces.

Now regardless of framework or using just cypher builder another thing is I used MongoDB for GeoJSON. I thought about PostgreSQL and it could've been an easy replacement but regardless. Thig is some data should be duplicated in a proper maner you tend to use it. I used Neo4j mainly and if one is to request GeoJSON i first found paths in Neo4j and allowed objects for the user than collected just IDs and found them in MongoDB. Why? Because data relation was easy more dynamic and applicable in graph than any other DB type so an extra call is nothing compared to flexibility and speed provided with it. So I would say that Neo4j should suffice in many aspects but copying data is not a sin. In fact microservices have huge data redundancy in big projects if we are being honest. But if you tend to keep them together in monolith I think you need to pick the prime one. And keep ID references in another as easiest approach. As I said my case was geo data in mongo db had ref to IDs in neo as plan was to split it fully and have isolated geo microservice which depended on neo one for consistency and access control. As mongo is not so good with transactions I couldn't achive it nor cared about it as it was built to sync via events in message queue BUT I think transaction handled by Spring would work great in your case.

Marcin Piczkowski • Mar 6 '21

Thanks, looks like I'm going in the right direction.

Tomasz Kuczma • Mar 7 '21

Data duplication is a standard technic in the NoSQL world. Simply SQL normalization approach (third normal form etc) is not efficient on modern hardware where CPU (not storage) is the most expensive component. So you are good here.

The only question is how are you going to handle transactions and consistency (do you need strong consistency or just eventual consistency is fine). I hope your Spring chained transaction will do the job.

You could also look at it from a different angle and don't use transactions at all - just deal with EC data (it depends on your use case but worth mentioning that it is still an option). The second approach will be useful in distributed systems (SOA/microservices) but here I got the impression you have monolith.

Re DDD concern,
You do your design correctly. It's fine to create a separate module/service (or few) for reads and a separate one for writes. It's called Command Query Responsibility Segregation and helps a lot.

A real-world example for such an approach:
Imagine searching user in Users SQL table. You can search the user by name or id in SQL pretty easily. But doing a more advanced search (e.g. multi-column search with %pattern% ) is not efficient and tricky (imagine users have a lot of searchable columns like "job_title", "department", "team", "hobby" etc.) It would be great to provide separate services for advance searching (based e.g. on Elastic search).
Looks like you have a similar use case so I hope it will help set your mind.