If you were paying attention in 2021, you know that data engineering jobs are on the rise.
Jay Feng's report analyzing 10,000 data science interviews found that interviews for data engineering jobs increased by 40% in 2020. There has also been a massive increase in funding in the data engineering space and the explosion of data-driven content has led to a revival of the term data engineering. I've also seen an increase in the number of companies offering data engineering internships. This is not just at Amazon and Facebook, but also at companies like Spotify.
All of this information is good news for aspiring data engineers. It's possible to come out of college, land a good job, and thrive in this growing field.
But what does it take to become a data engineer in 2022? What skills do you need to have today so that they are still relevant five years from now?
In this article, I'll cover what skills emerging data engineers need, where they can apply for jobs, and what they can do to stand out in this competitive field.
What skills are companies asking for from junior data engineers?
It's important to know the basics. Most interviews at a junior or intern level won't expect you to know more than the three sections below.
- Data modeling and ETL development
Those generally come up in all interviews. The one area with some choice is coding, where you could learn Python, Java, or Scala. If you know two of those languages, you can usually figure out the last one.
If you're going into an internship or junior position, you shouldn't be expected to know much more than these requirements. If you're going into a mid-level position, then there might be an expectation to know a little more.
Let's take a look at this data engineering internship opportunity from Amazon. They expect that you have a bachelor's degree, which makes sense. They also expect you to know Python, how to create data pipelines, databases, and warehouse modeling concepts. All of this makes sense.
When you scroll down to the preferred qualifications, there are some questionable statements. Amazon is asking for a master's degree, which seems unnecessary for any sort of computer engineering, software engineering, or data engineering job. These skills are not generally learned in college and there's not exactly a data engineering master's degree.
I also find it frustrating that they want applicants to articulate the basic differences between data types. I'm not sure what they're going for there because NoSQL and relational data types are very similar. I understand what they're getting at but wish they had phrased that point better. They should have said something like understanding the difference between SQL and NoSQL or something similar.
However, they have stated that you'll need a lot of SQL and data modeling experience. I've interviewed at Amazon, and those are definitely skills you will need.
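To make the skills named in these postings (Python, SQL, pipelines, and warehouse modeling) a little more concrete, here is a minimal sketch of a toy pipeline. It is not taken from any Amazon interview or posting: the sample records, the table names `dim_customer` and `fact_order`, and both function names are hypothetical, and it uses Python's built-in `sqlite3` module in place of a real warehouse.

```python
import sqlite3

# Hypothetical raw records, e.g. as exported from an application.
RAW_ORDERS = [
    {"order_id": 1, "customer": "Ada", "amount": "19.99"},
    {"order_id": 2, "customer": "Grace", "amount": "5.00"},
    {"order_id": 3, "customer": "Ada", "amount": "12.50"},
]


def run_pipeline(raw_orders):
    """Load raw orders into a tiny dimension/fact model in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # Data model: one dimension table (customers) and one fact table (orders).
    conn.execute(
        "CREATE TABLE dim_customer ("
        "customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE fact_order ("
        "order_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES dim_customer(customer_id), "
        "amount_cents INTEGER)"
    )
    for row in raw_orders:
        # Transform: parse the amount string and store it as integer cents.
        amount_cents = round(float(row["amount"]) * 100)
        # Load: insert the customer once, then look up its surrogate key.
        conn.execute(
            "INSERT OR IGNORE INTO dim_customer (name) VALUES (?)",
            (row["customer"],),
        )
        (customer_id,) = conn.execute(
            "SELECT customer_id FROM dim_customer WHERE name = ?",
            (row["customer"],),
        ).fetchone()
        conn.execute(
            "INSERT INTO fact_order VALUES (?, ?, ?)",
            (row["order_id"], customer_id, amount_cents),
        )
    return conn


def revenue_by_customer(conn):
    """Aggregate the fact table by customer name using a SQL join."""
    return conn.execute(
        "SELECT c.name, SUM(f.amount_cents) "
        "FROM fact_order f JOIN dim_customer c USING (customer_id) "
        "GROUP BY c.name ORDER BY c.name"
    ).fetchall()


if __name__ == "__main__":
    print(revenue_by_customer(run_pipeline(RAW_ORDERS)))
```

If you can explain why the customer lives in its own table and write the join from memory, you are covering roughly the ground these junior postings describe.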
Taking a look at this data engineer intern role at Facebook, they also require experience with SQL and Python. They also call out the need to understand distributed systems, which, having worked at Facebook (now Meta), I think is unnecessary. You don't need to know things that are under the hood, like Presto or Hive, because your job will just involve writing SQL. Nor will you be writing any MapReduce jobs.
I found a lot of junior data engineer positions just by Googling. There are a lot more than there were when I was going into data engineering.
Some companies don't seem to understand the term junior and are asking for 2+ years of experience, whereas a junior role should require 0 to 2 years of experience.
However, there are plenty of junior and internship positions advertised. Thus, you can just ignore the ones that are asking for too much experience.
If a position asks for a lot of experience, at least make sure that it pays well ($80K to $100K or more). The Amazon internship that we looked at is paying $7,700 a month for a position based in Colorado, which works out to about $92,400 a year, close to $100,000.
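The annualized figure is simple arithmetic on the monthly rate. As a quick sketch (the function name `annualize` is just for illustration, and it assumes the monthly rate would be paid for a full 12 months, which an internship usually is not):

```python
MONTHS_PER_YEAR = 12


def annualize(monthly_pay: int) -> int:
    """Convert a monthly pay rate into its full-year equivalent."""
    return monthly_pay * MONTHS_PER_YEAR


if __name__ == "__main__":
    # Monthly rate quoted in the Amazon internship posting.
    print(annualize(7_700))  # 92400
```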
Although there are a lot of positions for data engineers, there are also a lot more people wanting to become data engineers. Entry-level positions are paying up to $100K per annum, which would put you into the top 15% of American earners. You could even make more if you worked at a successful startup with equity options. So, it's important to know how to stand out.
To get a job, you would typically study hard, apply for positions, network, and get referrals. However, to stand out, you need to promote yourself. You could be the smartest data engineer in the world, but if you don't know how to promote yourself, no one will know.
In 2022, there are a ton of avenues to promote yourself. You can make videos, write articles, and share your code, as well as what you're currently learning and/or working on.
Channels to publish content include:
This kind of content helps recruiters and companies get to know you as a person, not just as a resume that hits their desks.
The content publishing method can be tailored to suit your personality and skills. Some people are great at creating their own projects, like doing open-source work and writing code. If you don't like coding as much, you can work on more high-level concepts and write blog posts. This might involve understanding data modeling and writing basic data modeling breakdowns. For example, you could write articles about what the different tables in a warehouse are or compare modern concepts like data mesh, data fabric, and data warehousing.
This type of work helps you to prepare for interviews as well as produce content to share with potential employers. Platforms like YouTube are great because you can share your work and ideas with what feels like an infinite number of people. I've had hiring managers reach out to me because there are not a lot of people publishing videos about data engineering.
There are many things you can do to increase your chances of becoming a data engineer in 2022. You can start by learning the right skills and applying for positions. However, it's important to also promote yourself through content publishing. This will help you stand out from the crowd and show potential employers that you're passionate about data engineering. There are many avenues to publish content, so find one that best suits your personality and skillset.
✅ Website: https://www.theseattledataguy.com/
✅ LinkedIn: https://www.linkedin.com/company/18129251
✅ Personal LinkedIn: https://www.linkedin.com/in/benjaminrogojan/
✅ Facebook: https://www.facebook.com/SeattleDataGu