re: BI Series: Datamarts, Data Vault, Data Lake... Data Swamp?

re: Thanks for sharing your experience. Sounds like it was a pretty frustrating project. There’s no rules to say that everything in a relational datab...
 

That's a good point about technical debt. I'm always cognizant that good database design is hard, but I should remember that a bad schema isn't necessarily the result of someone sitting down one day and saying "I think this is a good design". These sorts of situations can build up over time. And unlike application code, you can't just refactor a database in place: any serious change needs an upgrade/transform step for the existing data. And I've been involved in enough upgrades to know how scary that can seem.

We actually got it to a POC where its heuristics for inferring a schema from unstructured/semi-structured sources worked fairly well: it generated UNIQUE queries to find candidate primary keys and JOIN queries to find candidate foreign keys. It then showed our 'best guess' to a user, who could fiddle with the diagram and correct mistakes before committing to the new schema and starting the transfer. But at a certain point, with a very large, convoluted unstructured database, it became a 'garbage in, garbage out' situation.
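For anyone curious what that kind of inference can look like, here's a minimal sketch in Python with `sqlite3`. This is not the code from our POC: the tables, columns, and the helpers `candidate_primary_keys` and `is_candidate_foreign_key` are made up for illustration, and it assumes the data has already been loaded into plain tables with trusted, simple identifiers.

```python
import sqlite3

def candidate_primary_keys(conn, table):
    """Return columns whose values are all non-null and all distinct."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    keys = []
    for col in cols:
        # COUNT(DISTINCT col) skips NULLs, so equality with the row count
        # means the column is both unique and fully populated.
        distinct = conn.execute(
            f"SELECT COUNT(DISTINCT {col}) FROM {table}"
        ).fetchone()[0]
        if total > 0 and distinct == total:
            keys.append(col)
    return keys

def is_candidate_foreign_key(conn, child, child_col, parent, parent_col):
    """True if every non-null child value has a matching parent row."""
    orphans = conn.execute(
        f"SELECT COUNT(*) FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{child_col} = p.{parent_col} "
        f"WHERE c.{child_col} IS NOT NULL AND p.{parent_col} IS NULL"
    ).fetchone()[0]
    return orphans == 0

# Hypothetical demo data standing in for an imported, schema-less source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id, name);
    CREATE TABLE orders (order_id, customer_id);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Ann');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")

print(candidate_primary_keys(conn, "customers"))
print(candidate_primary_keys(conn, "orders"))
print(is_candidate_foreign_key(conn, "orders", "customer_id", "customers", "id"))
```

These checks only produce guesses: a column can look unique or joinable purely by coincidence, especially on small or dirty data, which is why a review step by a human still matters.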

The actual frustrating part of the project is that priorities shifted and it got put on the shelf :(
