Designing the data team

Helen Anderson. Originally published at helenanderson.co.nz. 7 min read.

So you want to start a data team.

Maybe you need reporting for sales and marketing. Perhaps you want to see if there are insights in the application data you’ve been collecting. Maybe you want to learn more about data science and if that will help keep investors happy.

Whatever your reasoning, there is some planning to do and some tough questions to ask yourself before you engage a recruiter.


Understanding
Collection
Cleansing
Structure
Exploration
Empower
Prediction


Understanding

The first hire you should make is not a data scientist or machine learning engineer. It's time to take a long hard look at your business. The purpose of this step is to understand where the gaps are, how decisions are made and what it is you really need from a data team.

While the idea of having a PhD-level data scientist on board has been romanticised, there must be a reason for them to start digging through data for those elusive insights. And when they have worked through a dataset, what happens next?

Appoint a Project Manager or team lead to sit down with the leaders of your sales, marketing, finance and product functions to identify where the gaps in knowledge or reporting are, and how the business fits together.


Collection

The next step may be a hire or, in a pinch, something that can be done with your current team: collecting the data from the sources identified in step one. You need something for your data team to analyse, and if there is nothing to look at, they will be stuck.

The data I’m referring to here comes from systems which already exist: log files, production databases if you are building software, CRM systems, billing platforms, financial systems and marketing automation tools. All of these systems tell a story about your organisation and customers.

If you are not aware of where and how the data is stored, no analysis can take place.

A data engineer, BI developer, or BI engineer can help get things moving. This role is focused on the ‘plumbing’ of the data world and sets up pipelines to collect data in a constant stream or in batches throughout the day. However, their first job will be to work with your Project Manager or team lead to determine what is available and how easy it is to get at.
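To make the 'plumbing' a little more concrete, here is a minimal sketch of a batch pipeline in Python. Everything in it is an assumption for illustration: the `invoices` and `raw_invoices` tables, the `copy_new_rows` helper, and the use of in-memory SQLite to stand in for a billing system and a landing area. A real pipeline would look different, but the idea of remembering how far the last batch got is a common one.

```python
import sqlite3

def copy_new_rows(source, target, last_seen_id):
    """Copy rows newer than last_seen_id and return the new high-water mark."""
    rows = source.execute(
        "SELECT id, customer, amount FROM invoices WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    target.executemany(
        "INSERT INTO raw_invoices (id, customer, amount) VALUES (?, ?, ?)", rows
    )
    target.commit()
    # If nothing new arrived, keep the previous high-water mark.
    return rows[-1][0] if rows else last_seen_id

# Hypothetical source system (standing in for a billing platform).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
source.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(1, "acme", 120.0), (2, "globex", 75.5)],
)
source.commit()

# Hypothetical landing area for the data team.
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE raw_invoices (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

watermark = copy_new_rows(source, target, 0)          # first batch: rows 1 and 2
source.execute("INSERT INTO invoices VALUES (3, 'initech', 42.0)")
source.commit()
watermark = copy_new_rows(source, target, watermark)  # second batch: row 3 only
landed = target.execute("SELECT COUNT(*) FROM raw_invoices").fetchone()[0]
```

Each run only picks up rows it has not seen before, so the same job can be scheduled throughout the day without landing duplicates.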

Data teams can struggle to produce reports, insights and models because they simply don’t have the data they need. If you have the luxury of starting a team from scratch, make this your priority.


Source: [Monica Rogati](https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007)


Cleansing

Data may not be structured in a way that’s right for modelling, or it may be so granular that it needs processing first. At this stage, a decision needs to be made on where the data is going to go and how it will be processed.

Just because you can get at the data, doesn’t mean you can get to work on analysis.
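As a rough illustration of what that work can look like, the sketch below takes a handful of made-up raw events and turns them into daily totals per customer. The field names, the sample records and the `cleanse` helper are all hypothetical, and the rules (drop duplicates, skip unparseable amounts, normalise customer names) are only examples of the kind of decisions a data engineer would need to agree with the business.

```python
from collections import defaultdict
from datetime import datetime

# Made-up raw events, as they might arrive from an application log.
raw_events = [
    {"event_id": "e1", "customer": " Acme ", "ts": "2024-03-01T09:15:00", "amount": "20.00"},
    {"event_id": "e1", "customer": " Acme ", "ts": "2024-03-01T09:15:00", "amount": "20.00"},  # duplicate
    {"event_id": "e2", "customer": "acme", "ts": "2024-03-01T17:40:00", "amount": "5.50"},
    {"event_id": "e3", "customer": "Globex", "ts": "2024-03-02T08:00:00", "amount": "n/a"},  # bad amount
    {"event_id": "e4", "customer": "GLOBEX", "ts": "2024-03-02T11:30:00", "amount": "12.25"},
]

def cleanse(events):
    """Deduplicate, validate and roll granular events up to daily totals."""
    seen = set()
    daily = defaultdict(float)
    for event in events:
        if event["event_id"] in seen:
            continue  # skip events we have already processed
        seen.add(event["event_id"])
        try:
            amount = float(event["amount"])
        except ValueError:
            continue  # skip records whose amount cannot be parsed
        customer = event["customer"].strip().lower()  # normalise the name
        day = datetime.fromisoformat(event["ts"]).date().isoformat()
        daily[(customer, day)] += amount
    return dict(daily)

daily_totals = cleanse(raw_events)
```

Five messy records become two tidy rows, one per customer per day, which is a far easier shape to report on.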

Will the data flow into a data warehouse, a data lake, or a database for the next step? Or does it make more sense to leave it where it is and query it there using another tool? A data engineer will be able to understand the complexity of landing and processing large amounts of data and the best way to get it where it needs to go. This repository becomes your ‘single source of truth’.

This isn’t the end of the story for the collection and cleansing of data. Changes at source and changes in what is required for analysis mean there is always a need for a data engineer.


Structure

Traditionally, data or business intelligence teams have reported into the IT function as they are heavy users of databases and have specialised infrastructure needs. These teams generally work closely with product or sales without being a part of them, and have the advantage of being able to lean on their IT teammates for support.

Data is becoming its own centralised team and is seen as a service function.

If the team are reporting into a product or sales function, this changes their focus. They have much more of an inside view of the team they are producing reports and analysis for. However, they may not be able to share knowledge as easily. This also brings up the topic of career progression. If you are the only analyst in the sales team, there is no obvious place to move up.

The third possible structure is a hybrid. Teams share knowledge with their peers but remain reporting into their functional teams.

Each structure has a tradeoff, so it’s important to decide which one suits your team once it is big enough for the choice to matter.


Source: DataCamp.com


  • Is the functional area specialised? If the analysis requires deep knowledge of the product or market it could be worth embedding analysts in teams.
  • How complex is the data? If there are gotchas and a steep learning curve just to get to grips with the data, a centralised function is best for learning and onboarding.
  • How distributed is the team? If you have regional offices it makes sense for the analyst to report in locally.

Exploration

Once a reliable flow of data is arriving on your chosen data platform, it’s time to start analysing. The place to start is descriptive analytics, and with it, your first data analyst.

Descriptive analytics is all about getting useful data in front of stakeholders and decision-makers. This answers questions like ‘Which customers have churned?’, ‘Do products sell faster in certain markets?’
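To show how small a first descriptive question can be, here is a sketch that answers ‘Which customers have churned?’ with SQL, run from Python against an in-memory SQLite database. The tables, the sample rows and the definition of churn (no order in the 90 days before the reporting date) are all assumptions made for the example; your business will have its own definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, market TEXT);
CREATE TABLE orders (customer_id INTEGER, order_date TEXT);
INSERT INTO customers VALUES (1, 'Acme', 'NZ'), (2, 'Globex', 'AU'), (3, 'Initech', 'NZ');
INSERT INTO orders VALUES
  (1, '2024-01-10'), (1, '2024-05-02'),
  (2, '2023-11-20'),
  (3, '2024-04-28');
""")

# Assumed definition: a customer has churned if their most recent order
# is more than 90 days before the reporting date.
CHURN_SQL = """
SELECT c.name, MAX(o.order_date) AS last_order
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name
HAVING MAX(o.order_date) < date(:as_of, '-90 days')
ORDER BY c.name
"""

churned = conn.execute(CHURN_SQL, {"as_of": "2024-06-01"}).fetchall()
```

With this sample data, only Globex comes back, as its last order was in November 2023. The query itself is the easy part; agreeing on what 'churned' means is where the analyst and the requester need to talk.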

Data professionals’ skills sit on a spectrum, with analysts sometimes also being interested in data science, data engineering, visualisation or business decisions.

Data analysts specialise in visualising and describing data using SQL, BI tools and spreadsheets. This role is responsible for taking deep dives into the data to answer business questions and create regular reporting when needed.

It’s an unfortunate fact that analysis is often ad hoc and never used again. While it is tempting to move on to the next thing, that code block might be useful down the track. While not strictly a part of creating a team, creating a culture of sharing and documenting findings is an important first step.

The most important part is for analysts to be able to communicate their findings in a way that makes sense to the requester.
  • Is data being captured? You cannot expect an analyst to create reports without access to data.
  • How will the insights be used? There should be a relationship between the analyst and the requester and some kind of feedback loop when they produce work.
  • Are you likely to need reports, graphs, or simply a number for a slide deck? Analysts are not "data vending machines" for dashboards.

Empower

As demand for analysis and reporting grows you may consider self-service analytics. These are tools designed to take the complexity away from the data itself and build in business rules.

Anyone with a bit of training and enthusiasm should be able to open the tool and get the answers they need.

Using tools rather than sharing spreadsheets around provides a level of tracking to see what is actually being used. They also build in data governance and security, something you can’t get from emailing a report around every Monday morning.

While this is great in theory, it requires ‘buy-in’ from the teams using the tool and preparation of the data so it’s ready to go. Consider bringing on ‘power users’ from functional areas to test the waters before making any investment in this area.

  • Are your Execs/Board asking for it? Better get to it then.

  • Are you getting the same kind of ad hoc requests again and again? If there’s a way to pull together a ‘drag and drop’ dashboard to answer the most common questions, it’s worth investigating.

  • Is there a desire for visualisation/automation? If these things will help drive action and business decision making it’s worth looking into a tool to provide a solution.


Prediction

Predictive analytics is the practice of building custom models that can help business decision-makers predict what is coming next. This answers questions like ‘Which customers are likely to churn this month?’, ‘What will sales be over the next year?’. This can also be seen in recommendation systems, suggestions in your email subject line and image recognition.

  • Is there data to support a data scientist’s work? - data science often requires more granular, and potentially sensitive, data to get going.

  • Is there a need for this kind of analysis? - data scientists who end up doing regular reporting will be neither fully utilised nor satisfied in their work.

  • How will the models be put into production? - is there a mechanism for their hard work to go anywhere beyond a sandpit and dummy data?


Creating a data team isn’t easy, especially if you are starting from scratch. These guidelines will help you better understand the workflow and building blocks that need to be in place to start your data journey.

This is the first part in a series on designing a data team. I hope you join me for the next post on Hiring a Data Analyst.



This post originally appeared on helenanderson.co.nz

Discussion (6)

ronsoak

first hire you should make is not a Data Scientist or Machine Learning Engineer

DYING AT THIS COMMENT

Also i'm pretty sure you've been asked to set up data teams in the past, why give it away for free in an article? :P

Helen Anderson Author

Glad you enjoyed it :D

I found myself nodding along to this article as it feels like there aren't enough resources out there for non-Data people who are hiring Data people. Hope my quick overview gives them a few pointers to get started.

Scott Czepiel

This is a fantastic article, thank you for providing such a balanced perspective on the tradeoffs involved in these complex decisions. I've worked on data teams at companies large and small, with all 3 organizational structures. Companies usually start with a central team, then move to an embedded or hybrid model as the data needs scale beyond what a central team can deliver. However, if this transition is managed poorly, it can kill morale on the data team as they now have less autonomy over their work and have to answer to less data savvy managers and often can't say no to time wasting ad hoc questions. It takes a strong leader to manage expectations with business teams, and mature leadership to grant the data team autonomy.

Helen Anderson Author

Thanks for the lovely comments, I enjoyed getting all my thoughts down on 'paper'.

The move from a central function to embedding Analysts in teams is really tricky. It's so easy to feel silo'd and miss out on all the things you pick up just by 'hanging around'. I certainly agree it needs to be handled carefully. I prefer the model where Analysts still report in to a central function for the same reasons you've described. It's too easy for Functional leads to treat the Analyst as a vending machine.

Max Ong Zong Bao • Edited

Haha looks like a must have for board members looking to setup a data science team.

Helen Anderson Author

Glad you liked it :D
