re: Any NoSQL true believers out there?


"There is still a schema, but it is defined in data, not code"
Surely you have this the wrong way around? You're right that there is still a schema, but it is defined implicitly by how the domain model uses the data. By going down this route you're forgoing the data consistency guarantees that referential integrity constraints provide. What if we want object deletion to cascade? What if we want to be sure our data relations are still intact? All of these basic responsibilities have been moved from the database to the application. I've seen the amount of application logic necessary to ensure simple referential integrity in a large-scale application, and it's not pretty. Feel like null-checking everything? Me neither.
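
To make the cascade and "relations still intact" points concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `author` and `post` tables are made up for illustration; the point is only that the database itself rejects a dangling reference and removes dependent rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE author (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE post (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id) ON DELETE CASCADE,
        title     TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO author (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO post (author_id, title) VALUES (1, 'First post')")

# A post pointing at a non-existent author is refused by the database.
try:
    conn.execute("INSERT INTO post (author_id, title) VALUES (99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the author removes their posts without any application code.
conn.execute("DELETE FROM author WHERE id = 1")
remaining_posts = conn.execute("SELECT COUNT(*) FROM post").fetchone()[0]
```

In a document store, both the orphan check and the cascading delete would have to live in application code.
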
Even Mongo's de facto standard... O*D*M?... 'Mongoose' implements referential integrity disastrously.
Sure, you're right that defining data structures at run-time is easy. This is a huge boon to some applications, but I feel that even modestly complex applications will outgrow Mongo very quickly.
Also, have you ever had to use the aggregation framework for anything even moderately complex? It's a dumpster fire. It'll take you hundreds of lines to accomplish even the most basic aggregate queries that SQL is capable of.

My applications are more like spreadsheets in that the user defines the data structures and relationships. They do this at runtime; the data structures are stored as data and applied when data is submitted. We have introduced referential links between entities, and it is possible to create views which traverse the references. We have implemented GraphQL for fetching data, and it can also traverse between documents using references.

In relation to maintaining referential integrity, because there is no coupling to the domain there really is only one area of the code that needs to worry about this. We reap other benefits from this approach, including an elegant security model which means we have fine-grained access controls over what fields and documents are visible to users based on an access control policy.
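
A minimal sketch of what field-level visibility driven by a policy stored as data could look like. The policy shape, the role names and the `visible_fields` helper are all hypothetical and are not the implementation described above.

```python
from typing import Any

# Hypothetical policy: role -> entity -> set of field names the role may see.
POLICY: dict[str, dict[str, set[str]]] = {
    "admin": {"invoice": {"id", "customer", "total", "notes"}},
    "clerk": {"invoice": {"id", "customer"}},
}


def visible_fields(role: str, entity: str, document: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the document containing only the fields the role may see."""
    allowed = POLICY.get(role, {}).get(entity, set())
    return {key: value for key, value in document.items() if key in allowed}


invoice = {"id": 7, "customer": "ACME", "total": 120.0, "notes": "net 30"}
clerk_view = visible_fields("clerk", "invoice", invoice)
guest_view = visible_fields("guest", "invoice", invoice)
```

Because the entities are generic, a single function like this can sit in front of every read path; an unknown role or entity falls through to an empty result.
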

Trying to author your own aggregations is folly. In our application we have been able to do complex data transformations easily by having easy to configure transforms which generate the aggregations. Doing it by hand would be a living nightmare.
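
As a rough illustration of the idea of generating aggregations from configuration, the sketch below turns a small transform description into a MongoDB aggregation pipeline. The transform format and the `build_pipeline` name are assumptions made for this example; only the `$match`, `$group` and `$sort` stages and the `$sum`/`$avg` style accumulators are MongoDB's own.

```python
from typing import Any


def build_pipeline(transform: dict[str, Any]) -> list[dict[str, Any]]:
    """Translate a declarative transform (hypothetical format) into pipeline stages."""
    pipeline: list[dict[str, Any]] = []

    # Optional filter becomes a $match stage.
    if transform.get("filter"):
        pipeline.append({"$match": transform["filter"]})

    # Group by one field and compute each requested metric.
    group_stage: dict[str, Any] = {"_id": f"${transform['group_by']}"}
    for name, (operator, field) in transform.get("metrics", {}).items():
        group_stage[name] = {f"${operator}": f"${field}"}
    pipeline.append({"$group": group_stage})

    # Optional descending sort on one of the computed metrics.
    if transform.get("sort_by"):
        pipeline.append({"$sort": {transform["sort_by"]: -1}})

    return pipeline


pipeline = build_pipeline({
    "filter": {"status": "paid"},
    "group_by": "customer",
    "metrics": {"total": ("sum", "amount")},
    "sort_by": "total",
})
```

The resulting list could be handed to a driver's `aggregate` call; the user only ever edits the small transform description.
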

Is MongoDB the best solution for everything? Nah. For highly structured data like telco call records, SQL is the way. For apps that are tightly coupled to the domain, which is typically how things have been done, SQL is fine. But... and this is a big but... the way we tightly couple applications to the data model is making our applications less flexible than they need to be.

Schemaless systems are opening the door. Ten years ago I was where you are now; SQL was the light and the truth. Today my view is broader and I have been given good reason to question the accepted orthodoxy. That said, we can't be blind to the downsides.

"My applications are more like spreadsheets in that the user defines the data structures and relationships."

This is an excellent argument for you to use NoSQL stores. But I really don't think the scenario posted in this article needs one. What do you think? :-)

Do you implement GraphQL at the controller layer, or on top of HTTP REST APIs?

Controller. We used the standard Java API for GraphQL, but the schema is dynamically generated when the entities are changed. The schema is not fixed; rather, it is defined in data.
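
To illustrate a schema that is "defined in data", here is a minimal sketch that regenerates GraphQL schema text (SDL) from entity definitions. It is written in Python for brevity rather than the Java mentioned above, and the `ENTITIES` structure and `to_sdl` function are hypothetical.

```python
# Hypothetical entity definitions as they might be stored in the database:
# entity name -> field name -> GraphQL type (another entity name is a reference).
ENTITIES = {
    "Customer": {"name": "String", "email": "String"},
    "Order": {"total": "Float", "customer": "Customer"},
}


def to_sdl(entities: dict[str, dict[str, str]]) -> str:
    """Render one GraphQL object type per entity plus a by-id Query field for each."""
    blocks = []
    for type_name, fields in entities.items():
        lines = [f"type {type_name} {{", "  id: ID!"]
        lines += [f"  {field}: {gql_type}" for field, gql_type in fields.items()]
        lines.append("}")
        blocks.append("\n".join(lines))

    # Query fields use the entity name with a lower-case first letter.
    query_fields = [
        f"  {name[0].lower() + name[1:]}(id: ID!): {name}" for name in entities
    ]
    blocks.append("\n".join(["type Query {", *query_fields, "}"]))
    return "\n\n".join(blocks)


sdl = to_sdl(ENTITIES)
```

Calling `to_sdl` again whenever an entity definition changes yields a fresh schema; wiring that text into a GraphQL server is left out of this sketch.
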

Thanks Peter... I will come back with a couple of questions when we start implementing the system :D

"In relation to maintaining referential integrity because there is no coupling to the domain there really is only one area of the code that needs to worry about this"
What can this passage possibly mean? If there's no coupling to the domain, then what is the data doing there in the first place? The problem domain will impose some kind of invariants on your data, which the schema will need to enforce either explicitly (through database-level constraints such as PRIMARY KEY, NOT NULL, etc.) or implicitly through application logic.
If you're trying to say that there's no relationship between different collections, then your application is a much better candidate for NoSQL, but in my experience such cases are exceedingly rare.

"We reap other benefits from this approach, including a elegant security model which means we have fine grained access controls over what fields and documents are visible to users based on an access control policy."
The same thing can be implemented at the database level through views and roles in most SQL implementations, which in my experience tend to be much more robust than application logic. That's just my two cents on the matter, however. Security and access control in Mongo have always been pretty much abysmal.

"Trying to author your own aggregations is folly. In our application we have been able to do complex data transformations easily by having easy to configure transforms which generate the aggregations. Doing it by hand would be a living nightmare."
Why would I choose to use a database solution where writing aggregate queries by hand is 'folly', when I can easily pick ones where it isn't?
For a challenge, see how few lines it takes you to write a MongoDB query that finds all documents where an arbitrary date falls within the range defined by two date fields.
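
For reference, the SQL side of that comparison is a single `BETWEEN` predicate. A minimal sketch using Python's built-in `sqlite3`, with a made-up `booking` table and dates stored as ISO 8601 text (an assumption of this example, chosen because such strings sort chronologically):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking ("
    " id INTEGER PRIMARY KEY,"
    " starts_on TEXT NOT NULL,"
    " ends_on TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO booking (id, starts_on, ends_on) VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "2024-01-31"),
        (2, "2024-02-01", "2024-02-29"),
        (3, "2024-01-15", "2024-02-15"),
    ],
)


def bookings_covering(day: str) -> list[int]:
    """Return ids of bookings whose [starts_on, ends_on] range contains the given day."""
    rows = conn.execute(
        "SELECT id FROM booking WHERE ? BETWEEN starts_on AND ends_on ORDER BY id",
        (day,),
    )
    return [row[0] for row in rows]


matches = bookings_covering("2024-01-20")
```
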

"Ten years ago I was where you are now; SQL was the light and the truth. Today my view is broader and I have been given good reason to question the accepted orthodoxy"
Ignoring the obvious passive-aggression here, I have worked with MongoDB for years. I'm not some stuffy SQL shill who will never budge. I have worked with both for years, writing new applications and maintaining legacy ones. I have already "questioned the accepted orthodoxy" and come to my own conclusions.
