
Muhammed Jimoh

Data Engineering

Week 5 of the Data Engineering Zoomcamp, featuring Sejal Vaidya, Ankush Khanna, and Alexey Grigorev, by #DataTalksClub.

Week 5 repo ➡️ GitHub

🛠 Tools
🧵 Apache Spark
🧵 Apache Hadoop
🧵 Google VMs
🧵 Apache Airflow

🏹 Week 5 (Batch Processing) Summary:

🎯 Streaming vs. Batch Processing: Advantages and Disadvantages
🎯 Theoretical and Practical Understanding of Batch Processing
🎯 Bash scripting
🎯 Anatomy of Spark
🎯 RDDs
🎯 Connecting Spark to BigQuery
🎯 Setting up Dataproc cluster
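
To make the RDD item a bit more concrete, here is a minimal sketch of the map and reduce-by-key pattern that RDDs are known for. It is plain Python, not Spark itself, so it runs without a cluster. The partitions, the zone names, and the helper functions `map_partition` and `reduce_by_key` are all made up for illustration and are not taken from the course material.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input: two "partitions" of (zone, amount) records.
partitions = [
    [("A", 10.0), ("B", 5.0)],
    [("A", 2.5), ("C", 7.0)],
]


def map_partition(records):
    """Map step: emit one (key, value) pair per record in a partition."""
    return [(zone, amount) for zone, amount in records]


def reduce_by_key(pairs, func):
    """Group values by key, then fold each group with func."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}


# Run the map step on every partition, then combine the results by key.
pairs = [pair for part in partitions for pair in map_partition(part)]
totals = reduce_by_key(pairs, lambda a, b: a + b)
print(totals)
```

In Spark the same idea is spread across executors, with each partition processed in parallel before the per-key results are merged. This sketch does everything in one process and only shows the shape of the computation.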

Apache Spark

What a beautiful Monday to begin the week.
