Thanks for sharing your experience. Sounds like it was a pretty frustrating project.
There’s no rules to say that everything in a relational database has to be completely normalised with primary/foreign keys on each table. There’s nothing to stop your team storing what is effectively a collection of spreadsheets. It’s an expensive way to go , but not flat out wrong.
Database development is a lot like software development. Sometimes a decision made a few years back causes headaches down the line - setting too strict data types, building metrics on dim tables, retrospectively adding keys, dropping tables without checking dependcies ... so a big bang overhaul to the architecture probably isn’t the best option.
What was the outcome for your project?
That's a good point about technical debt. I'm always cognizant that good database design is hard, but I suppose I should remember that it's not necessarily that someone sat down one day and said "I think this is a good design". These sort of situations can occur over time. And unlike software you cannot refactor a database, so any serious change will need an upgrade/transform step. And I've been involved in enough upgrades to know how scary that can seem.
We actually got it to a POC where its heuristics of inferring schema from unstructured/semi-structured sources worked fairly well, generating UNIQUE queries to find primary keys and JOIN queries for foreign keys. Then show our 'best guess' to a user who could fiddle with the diagram correcting mistakes before committing to the new schema and starting the transfer. But at a certain point if you had a very large, convoluted unstructured database it became a 'garbage in, garbage out' situation.
The actual frustrating part of the project is that priorities shifted and it got put on the shelf :(
We're a place where coders share, stay up-to-date and grow their careers.
We strive for transparency and don't collect excess data.