DEV Community

# dataengineering

Posts

JOIN the data analytics race: Apache Doris vs. ClickHouse, Databricks, and Snowflake

6 min read
The "Shift-Left" Imperative: Implementing Data Contracts in CI/CD Pipeline

4 min read
Building a 75,000-Product Image Feature Dataset for the Amazon ML Challenge 2025

4 min read
Getting Started Building a Data Platform

3 min read
Building a Universal Lakehouse Catalog: Beyond Iceberg Tables

10 min read
Real-time Data Analytics at Scale: Integrating Apache Flink and Apache Doris with Flink Doris Connector and Flink CDC

10 min read
Chinese DBA's Story: Hu Zhonghao - The Journey of Becoming a DBA for Domestic Distributed Databases

7 min read
Comprehensive Guide: kwargs vs XCom in Python & Airflow

4 min read
Optimizing Kafka Performance: Best Practices for High Throughput and Low Latency

7 min read
Tutorial: Intro to Apache Iceberg with Apache Polaris and Apache Spark

20 min read
Precise Data Extraction: Pattern-Based Partitioning for Structured Extraction

3 min read
Chinese DBA's Story: Sui Haifeng - Grasp the two most important five-year periods of your career

5 min read
The New Era for Data Engineers Brought About by Snowflake's Autonomous Services, Part 2

1 min read
A Beginner’s Journey with PostgreSQL

3 min read
🎓 Building a Smart LMS Assistant: RAG System with Pinecone for Multi-Source Learning Data

3 min read
Big Data Processing (Hadoop, Spark)

5 min read
Building a Clean Energy Data Pipeline for Africa (from raw CSVs to MongoDB)

1 min read
From APIs to Aquifers: A Developer's Guide to Smart Water Management Data

7 min read
Data in the Cloud: Understanding 6 Common Data Formats in Analytics

3 min read
💥 Polars vs. Pandas: Why Your Next ETL Pipeline Should Run on Rust (Part 1/5)

2 min read
A Cutting-Edge Architectural Guide to Building a Data Platform

6 min read
Python For Data Engineering

3 min read
Picking the Right Data Format for Your Workflow

3 min read
Data Automation: A Deep Dive

5 min read
🔍 Understanding 6 Common Data Formats in Data Analytics (With Examples)

4 min read