DEV Community

# dataengineering

Posts

Batch Processing vs Stream Processing: Why Batch is dying and Streaming takes over

Comments
13 min read
Structure Query Language

6
Comments
2 min read
Important Questions related to Data Engineering

2
Comments
1 min read
AWS Cloud9 for Data Engineers

1
Comments
5 min read
How we mastered dbt: A true story

3
Comments
14 min read
Using python dictionary in data engineering.

2
Comments 2
2 min read
Integrando uma Web API com Datastore Emulator

Comments
4 min read
Python functions and lambda functions in data engineering.

2
Comments
3 min read
Data Wrangling in Python: Tips and Tricks

Comments
3 min read
How I Decreased ETL Cost by Leveraging the Apache Arrow Ecosystem

Comments
6 min read
Data Platform Architecture Types

1
Comments
9 min read
Creating Data Pipelines as DAGs in Apache Airflow (Part 1)

Comments
6 min read
Website Monitoring using AWS Lambda and Aurora

2
Comments
4 min read
Apache Airflow - Deep Dive | All you need to know about Airflow

5
Comments
20 min read
SQL101: Introduction to SQL

Comments 2
12 min read
Data Pipelines with Great Expectations | Introduction

2
Comments
2 min read
22 Best DataOps Tools To Optimize Your Data Management and Observability In 2023

16
Comments 1
30 min read
Building a Data Lakehouse for Analyzing Elon Musk Tweets using MinIO, Apache Airflow, Apache Drill and Apache Superset

13
Comments 2
8 min read
Nesting Columns like a Pro: A Guide to Mastering Nested Structs in PySpark

Comments
4 min read
AWS Data Engineering Services: Everything you need to know

5
Comments
9 min read
PySpark: A brief analysis to the most common words in Dracula, by Bram Stoker

13
Comments
5 min read
Working with Map() function in Python, Pyspark and Apache Beam

1
Comments
3 min read
Working with large CSV files in Python from Scratch

6
Comments
1 min read
Job Search API

6
Comments
1 min read
Redshift Deep Dive

1
Comments
5 min read
Azure Data Factory - Incrementally load data from Azure SQL to Azure Data Lake using Watermark

4
Comments
1 min read
What is data integration?

10
Comments 2
4 min read
Data Engineering Trends for 2023

3
Comments
4 min read
The Changing Face Of ETL

3
Comments 1
12 min read
Ultimate guide to becoming a Data Analyst/Data Scientist

3
Comments
4 min read
SkyX: desenvolvimento de uma análise de tráfego aéreo em tempo real com Spark Structured Streaming e Apache Kafka

1
Comments
8 min read
2022 Beginner Friendly Modern Data Engineering Career path With Learning Resources.

20
Comments 2
2 min read
Learn Ansible and how to Install it in Ubuntu 22.04.

Comments
3 min read
Uma breve Introdução ao processamento de dados em tempo real com Spark Structured Streaming e Apache Kafka

4
Comments
8 min read
Apache-Spark introduction for SQL developers

2
Comments
7 min read
PySpark: uma breve análise das palavras mais comuns em Drácula, por Bram Stoker

4
Comments 6
6 min read
INTRODUCTION TO PYTHON FOR DATA ENGINEERING

Comments
4 min read
Create Jira Ticket on Prefect Task Failure

Comments
2 min read
Introdução à análise de dados com PySpark utilizando os dados dos campeões de League of Legends

3
Comments
8 min read
Pokemons Flow: desenvolvendo uma pipeline de dados com apache airflow para extração de pokemon via API

9
Comments
6 min read
Apache PySpark for Data Engineering

6
Comments 4
9 min read
Data Engineering 101: Introduction to Data Engineering

10
Comments
3 min read
Introduction to Python for Data Engineering

4
Comments
5 min read
Kubernetes Was Never Designed for Batch Jobs

3
Comments 2
17 min read
Data Engineering 102: Introduction to Python for Data Engineering.

5
Comments
10 min read
Introduction to Python for Data Engineering

4
Comments
7 min read
Data Engineering 102: Introduction to Python for Data Engineering

Comments
3 min read
Python For Data Engineering

9
Comments
5 min read
DATA ENGINEERING 101:INTRODUCTION TO DATA ENGINNERING.

5
Comments
2 min read
Online SQL Client for low code data management

5
Comments 1
5 min read
Data Engineering 101: Introduction to Data Engineering.

3
Comments
6 min read
Hash Personal Identifiable Information (PII) in your ELT pipelines

3
Comments
3 min read
Difference Between Data Engineer and Data Scientist?

7
Comments
3 min read
Learning Workflow Schedulers (Oozie)

1
Comments
5 min read
Solving AttributeError: 'float' object has no attribute 'rint'

3
Comments
2 min read
[Spark-k8s] - Getting started # Part 1

1
Comments
4 min read
Websites to find Dataset for your Data Engineering projects.

5
Comments
1 min read
Data engineers must-see: The future trend of big data cloud services

8
Comments
8 min read
Data Engineering Projects for Beginners

17
Comments 2
2 min read
Data Pipelines with Apache Airflow - Book Review

6
Comments
2 min read