Why does it feel like so many more articles are discussing the data engineering profession?
Perhaps it's because Dice's 2020 tech jobs report cites data engineering as the fastest-growing field in 2020, with roles increasing by a staggering 50% while data science roles increased by only 10%.
Or maybe it's just because the Medium algorithm knows that you want to read data engineering articles.
That brings up an important question.
Is data engineering for you?
Data engineers move, remodel, and manage data sets from tens, if not hundreds, of internal company applications so analysts and data scientists don't need to spend their time constantly pulling data sets.
They may also create a core layer of data that lets different data sources connect to it to get more information or context.
These specialists are usually the first people to handle data. They process the data so it's useful for everyone, not just the systems that store it.
There are obvious reasons to become a data engineer, like a high salary and numerous opportunities due to limited competition within the job market, but we're not focusing on those today. Instead, consider the following thoughts, which are a bit more relevant to the job description.
Are you the type of person who enjoys "big picture" ideas and the way systems work? Data engineering may be the perfect job for you.
As a data engineer, you see the whole data lifecycle from end to end. You see how data works on the software side, how it's processed, and how it ends up in an analytical layer.
Along with understanding data, you understand the logic behind the software or application and how it functions, and you know how to translate the data so it's easy for analysts to use. That means knowing the analyst's use case.
Doing your job effectively means knowing how often the person who uses the data needs updates. Is it every day? Every hour? Do they have a model with live data updates?
These are common questions a data engineer asks themselves every day:
- How did the software developers set up the user-flows and business logic?
- What granularity do the data scientists need for the data to be most effective?
- Is the table a snapshot that updates the information while losing the historical data, or does it store historical data and need a way to manage duplicates?
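To make the snapshot-versus-history question concrete, here is a minimal sketch in plain Python. The table, the column names (`customer_id`, `plan`, `loaded_on`), and both helper functions are hypothetical, invented only for illustration. It shows one way to drop duplicate rows from an append-only history and then collapse that history into a snapshot that keeps only the latest row per key.

```python
from datetime import date

# Hypothetical append-only history table: one row per change.
# Re-running a load can introduce exact duplicates.
history = [
    {"customer_id": 1, "plan": "free", "loaded_on": date(2020, 1, 1)},
    {"customer_id": 1, "plan": "pro", "loaded_on": date(2020, 2, 1)},
    {"customer_id": 1, "plan": "pro", "loaded_on": date(2020, 2, 1)},  # duplicate from a re-run
    {"customer_id": 2, "plan": "free", "loaded_on": date(2020, 1, 15)},
]


def deduplicate(rows):
    """Drop exact duplicate rows while preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def latest_snapshot(rows, key="customer_id", version="loaded_on"):
    """Collapse history into a snapshot: keep only the newest row per key."""
    snapshot = {}
    for row in rows:
        current = snapshot.get(row[key])
        if current is None or row[version] > current[version]:
            snapshot[row[key]] = row
    return snapshot


clean_history = deduplicate(history)       # history kept, duplicates removed
snapshot = latest_snapshot(clean_history)  # current state only, history lost
```

The history version can answer "what plan was this customer on in January?", while the snapshot can only answer "what plan are they on now?". Which one the analyst needs is exactly the kind of thing you have to ask.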
In other words, you need to understand the workflow, not just the pipelines.
It's worth noting that many data engineers move into software architect or similar roles because they understand the entire system.
"Recovering data scientist" is a common moniker on LinkedIn. It often refers to data scientists who switched to a data engineering role.
If you really enjoy the data aspect but also appreciate the automation components, programming, and system design, you might prefer data engineering. Data engineers build systems.
While data scientists may also build sometimes, their role isn't geared for it. Data engineers regularly build systems, and the best ones enjoy it.
If you like creating automated systems, consider data engineering.
Data engineers develop systems that take data from Point A to Point B. They automate workflows and processes, including monitoring, data management, data movement, and data quality checks. Sometimes they also have opportunities to integrate ML or statistical models that the data scientists created into data pipelines.
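As a rough illustration of what an automated data quality check can look like, here is a small sketch in plain Python. The function name `check_batch` and the columns `order_id` and `amount` are assumptions made up for this example, not part of any particular tool. It scans a batch of rows and reports missing required values and duplicate keys before the data moves on.

```python
def check_batch(rows, required=("order_id", "amount"), unique_key="order_id"):
    """Return a list of human-readable problems found in a batch of rows."""
    problems = []
    if not rows:
        problems.append("batch is empty")

    seen_keys = set()
    for i, row in enumerate(rows):
        # Required columns must be present and not None.
        for col in required:
            if row.get(col) is None:
                problems.append(f"row {i}: missing {col}")

        # The unique key must not repeat within the batch.
        key = row.get(unique_key)
        if key is not None:
            if key in seen_keys:
                problems.append(f"row {i}: duplicate {unique_key}={key}")
            seen_keys.add(key)
    return problems


good_batch = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 5.5}]
bad_batch = [{"order_id": 1, "amount": 10.0}, {"order_id": 1, "amount": None}]

good_report = check_batch(good_batch)
bad_report = check_batch(bad_batch)
```

In a real pipeline, a non-empty report would typically stop the load or trigger an alert. How strict to be is a design choice that depends on who consumes the data.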
When you enjoy automation, you can do it every day. You'll find new systems to automate, new problems to solve, and daily opportunities to solve puzzles that involve taking data from one point to another.
You may need to optimize workflows because the data is taking too long to move from Point A to Point B. How can you circumvent the current system's limitations? Do you need to think about distributed systems? What technologies are available to help you solve the problem?
Data engineers are forced to be on the cutting edge of tools and systems. The amount of data companies need to handle grows daily, and these engineers are the ones best placed to deal with it. They figure out how to compile the data sources into one centralized system, how to process a data set that's too large for a single machine to handle, and how to distribute the processing over multiple nodes.
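The idea of spreading work over multiple nodes can be sketched on a single machine. The example below is a toy, not a real distributed system: the event data and function names are hypothetical, and a thread pool merely stands in for separate nodes. It splits the data into partitions, counts events in each partition independently, and then merges the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows into n_parts roughly equal slices (round-robin)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def count_events(part):
    """Work done independently on each partition: count events by type."""
    return Counter(event_type for event_type, _ in part)


def distributed_count(rows, n_parts=4):
    """Count per partition in parallel, then merge the partial counts."""
    parts = partition(rows, n_parts)
    # Each worker thread plays the role of a node in this sketch.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(count_events, parts))

    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


# Hypothetical (event_type, event_id) pairs.
events = [
    ("click", 1), ("view", 2), ("click", 3),
    ("purchase", 4), ("view", 5), ("click", 6),
]
totals = distributed_count(events, n_parts=3)
```

The pattern works because counting can be done on each slice without knowing about the others. Frameworks built for this kind of problem apply the same split, process, and merge idea across many machines.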
On a daily basis, you'll need to consider your tech options, whether that's older, open source tooling or newer, modern solutions. If you enjoy learning how other people are solving the Big Data problem and using those systems yourself, you will find data engineering to be a good choice.
Of course, you have to enjoy data. Data engineers like data and know how it moves, how it flows, and how events inside a system represent day-to-day events and complex interactions. The data could describe anything: people, cars, machinery, and so on. You enjoy thinking about data and how it comes together so people can build a logical analysis. If that's you, data engineering is probably a perfect fit.
Some people find data boring, but others enjoy thinking about the analytical side of things or developing systems. The field is expanding, and opportunities are everywhere. You'll find hundreds of articles online about data engineering, and hundreds of millions of dollars are flowing into the field.
There's no better time to jump into the world of data engineering.