Path to become a junior+ data engineer?

Michel's fanboi ・1 min read

Henlo!

I'm an I.T. student and I'd like to work as a data engineer but I'm like a fish lost in an ocean of big data tools.

First off, I've got a strong web background, mainly back-end work such as building and deploying microservice-style applications. But what I like most is working with data, Big Data.

But I don't know where to start. Today I'm fairly comfortable with Apache Beam, SQL/NoSQL, message queues, cloud solutions... but that feels like nothing compared to the sheer variety of Big Data tools.

Should I go for open-source tools such as Kafka, Cassandra, HDFS, etc., or should I focus on the cloud side (Cloud Dataflow, AWS EMR, Pub/Sub, Kinesis...)?

I'd appreciate any help ;)

Discussion

Dmitry

Try setting up your first Hadoop cluster (hosted on Azure or AWS), then use a clustered database (Hive or another) for your regular tasks. That way you'll pick up the basics of big data tools.