DEV Community

# dataengineering

Posts

PySpark: missing value

Comments
2 min read
"Day 61 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Graph - 1)

1
Comments
1 min read
Data Engineer Academy Review

Comments
2 min read
HOW TO ADD A DATA DISK TO A VIRTUAL MACHINE

Comments
3 min read
"Day 58 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis (Probability - 4)

1
Comments
1 min read
Airflow DAG to Invoke a Google Cloud Function

Comments
3 min read
Top 10 Common Data Engineers and Scientists Pain Points in 2024

9
Comments
5 min read
"Day 60 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis (Probability - 6)

1
Comments
1 min read
Demystifying: Azure Data Factory

Comments
1 min read
Column Transformation in Machine Learning

1
Comments
3 min read
Data-driven customer acquisition: Machine Learning applied to Customer Lifetime Value

Comments
7 min read
"Day 56 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis (Probability - 2)

1
Comments
1 min read
Final project part 5

Comments
3 min read
"Day 55 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis (Probability - 1)

1
Comments
2 min read
"Day 54 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis ( Perm & Comb - 9)

1
Comments
2 min read
"Completed Weeks 3 and 4 of the AI Engineering Journey!. Ready to tackle the next leg of the journey! 🚀"

1
Comments
1 min read
Final project part 6

Comments
3 min read
How to Convert Dates in One Group into An Interval

5
Comments
2 min read
"Day 53 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis ( Perm & Comb - 8)

1
Comments
2 min read
ETL VS ELT (Data Pipeline)

Comments 1
1 min read
Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis

Comments
12 min read
"Day 48 of My Learning Journey: Setting Sail into Data Excellence! ⛵️ Today's Focus: Maths for Data Analysis ( Per & Com - 3)

1
Comments
2 min read
Unraveling the ETL Process: The Backbone of Data Science

Comments
2 min read
"Day 47 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis ( P & C- 2)

1
Comments
1 min read
“Data has a Dream” — A Short comic about data mesh and how it can transform your company

Comments
2 min read
How to Transpose Columns in Each Group to a Single Row

7
Comments
2 min read
"Day 44 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -22)

1
Comments
2 min read
"Day 45 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -24)

1
Comments
1 min read
The Importance of Data in Decision Making

4
Comments
2 min read
Apache Doris 2.1.0: TPC-DS, Parallel Adaptive Scan, Local Shuffle, Arrow Flight-based HTTP Data API

Comments
29 min read
AI and Data Sets – Maximizing the Power of Data

1
Comments
3 min read
"Day 43 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -22)

1
Comments
2 min read
"Day 50 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Maths for Data Analysis ( Per & Comb- 5)

2
Comments
2 min read
"Day 42 of My Learning Journey: Setting Sail into Data Excellence! Today's Focus: Mathematics for Data Analysis (Stats Day -21)

1
Comments
1 min read
The Apache Iceberg Lakehouse: The Great Data Equalizer (disrupting the Snowflake/Databricks status quo)

1
Comments
7 min read
Getting Started with SQL: An Overview of Its Role in Data Fields

Comments
3 min read
MWAA Plugins and Dependency Survival Guide

2
Comments
3 min read
Real Talk: My DE Academy Experience

Comments
2 min read
A deep dive into the concept and world of Apache Iceberg Catalogs

Comments
8 min read
📢 About job offers, innovation & data strategy 🔭

Comments 3
3 min read
RisingWave workshop

1
Comments
5 min read
Visualization in dbt

1
Comments
3 min read
Xavier's Insight: Overcoming Data Hoarding Disorder

5
Comments
3 min read
Production and CI/CD in dbt

1
Comments
3 min read
End-to-End Basic Data Engineering Tutorial (Spark, Dremio, Superset)

2
Comments
11 min read
The Pains of Data Ingestion

16
Comments 3
6 min read
Final project part 3

Comments
3 min read
Testing and documenting DBT models

Comments
3 min read
Final project part 2

Comments
2 min read
Building a project in DBT

Comments
5 min read
Final project part 1

2
Comments
2 min read
The Role of Ontologies in Data Management

Comments
6 min read
DBT (Data Build Tool)

1
Comments 1
4 min read
My Data Engineering Library

Comments
2 min read
When Metrics Go Awry: Analyzing KPIs using machine learning, regression analysis, and Shapley values

Comments
5 min read
XGBoost Training Speed: A Comparative Analysis

Comments
2 min read
Shipping Data in Real Time Debezium : Part 1

1
Comments
2 min read
Embarking on the Data Odyssey: A Deep Dive into Data Engineering for Tech Enthusiasts

Comments
3 min read
Big Data is dead & other stories

Comments
2 min read
My Experience with Apache Airflow

6
Comments
3 min read