DEV Community

Ian Greengross
Data is hard.

I understand data. I spent my late teens and early 20s programming in the languages associated with some of the first DOS-based relational databases: Paradox and dBASE IV.

So I understand data structure and what data looks like.

But when you're trying to merge data from four or five different sources, where the only key across all of them is a person's name, and that name can be written differently in each set (Ian Greengross, I. Greengross, IG 01, etc.), you need to find a way to unify those keys.

And while my project only involves 32 individuals, I didn't want to do it the "cheap way" and just add a column and put in my own key manually.

So I actually had to think about the data and how to automate giving each individual one unifying key, so that when the data is merged, the right data gets assigned to the right individual.
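To make the idea concrete, here is a minimal sketch of one way to automate that kind of key. It is not the script from my project: the roster, the `P01`-style keys, the second person, and the function names are all hypothetical, and it assumes every name variant reduces to either "first last", "initial last", or plain initials once punctuation and digits are stripped.

```python
import re

# Hypothetical roster: unifying key -> full name (illustrative data only).
ROSTER = {"P01": "Ian Greengross", "P02": "Maria Lopez"}


def normalize(raw):
    """Lowercase, replace punctuation and digits with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z\s]", " ", raw.lower())
    return " ".join(cleaned.split())


def name_forms(full_name):
    """Return the normalized variants a full name is assumed to appear as."""
    parts = normalize(full_name).split()
    first, last = parts[0], parts[-1]
    return {
        f"{first} {last}",       # "Ian Greengross"
        f"{first[0]} {last}",    # "I. Greengross"
        f"{first[0]}{last[0]}",  # "IG 01"
    }


def build_lookup(roster):
    """Map every known variant to its key, refusing ambiguous variants."""
    lookup = {}
    for key, full_name in roster.items():
        for form in name_forms(full_name):
            if form in lookup and lookup[form] != key:
                raise ValueError(f"ambiguous name form: {form!r}")
            lookup[form] = key
    return lookup


LOOKUP = build_lookup(ROSTER)


def unified_key(raw):
    """Return the unifying key for a raw name, or None if it is unknown."""
    return LOOKUP.get(normalize(raw))
```

Failing loudly on an ambiguous variant matters here: two people who share initials would otherwise be merged silently into one record.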

And guess what? I did it. It wasn't elegant, but as Zax says, the first rule is that if it works, it's good enough (or he tells me something to that effect).

Once I finally had a way to make sure that each person's data would be properly merged into their "record", I had to deal with another problem: I would be assigning the same info types and column names to each individual's record, but for different years, and those years varied from individual to individual.

I wrote another inelegant script, which also worked. But I would have to rewrite dozens of lines in each script to cover the varying years, and rewrite each script again every time I moved on to a different data table.

This is where I had left off when I had my first meeting with my mentor, Matt.

In about 30 minutes, Matt showed me a much more efficient way to join the data: one where I didn't have to rewrite anything to cover the shift in years, and only had to change some column labels when I moved on to the next data set I wanted to merge.
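I won't try to reproduce Matt's exact code here, but the general shape of the idea can be sketched. The version below is an assumption-laden illustration, not his method: it treats the year as part of the key rather than part of a column name, so no line of code mentions a specific year, and moving to a new table only means supplying a new mapping of column labels. The tables, labels, and `merge_sources` function are all hypothetical.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge rows from several tables into one record per (person, year).

    Each source is a (column_labels, rows) pair, where column_labels maps
    a column name in that table to the label used in the merged record.
    """
    merged = defaultdict(dict)
    for column_labels, rows in sources:
        for row in rows:
            record = merged[(row["key"], row["year"])]
            for source_label, target_label in column_labels.items():
                record[target_label] = row[source_label]
    return dict(merged)


# Hypothetical tables; the years differ from person to person.
salaries = [
    {"key": "P01", "year": 2019, "pay": 50},
    {"key": "P01", "year": 2020, "pay": 55},
    {"key": "P02", "year": 2021, "pay": 60},
]
ratings = [
    {"key": "P01", "year": 2020, "score": 4},
    {"key": "P02", "year": 2021, "score": 5},
]

merged = merge_sources([
    ({"pay": "salary"}, salaries),
    ({"score": "rating"}, ratings),
])
```

Because the years live in the data instead of in the code, adding a person with a completely different span of years needs no changes at all.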

In those 30 minutes, Matt saved me from a few hours of rewriting and also helped me to see and think of the data and its structure in a new light.

Which makes data just a little less hard.

For now.
