DEV Community

# dataengineering

Posts

Ultimate Directory of Apache Iceberg Resources
14 min read
Understanding OLTP and Choosing the Right Database
6 min read
Change Data Capture (CDC) when there is no CDC
11 min read
Why Apache Spark RDD is immutable?
3 min read
Data Engineering in Observability: The Backbone of Modern Monitoring
5 min read
Achieving Clean and Scalable PySpark Code: A Guide to Avoiding Redundancy
5 min read
Data Showdown: OLAP vs. OLTP – The Battle of Real-Time and Analytics Titans
5 min read
Clear Link Between DevSecOps and Data Engineering
1 min read
Capture Browser XHR/Fetch API Response Automatically into JSON Files
1 min read
Building a User-Friendly, Budget-Friendly Alternative to dbt Cloud
1 min read
Building Powerful Social Media APIs for Twitter and Telegram: A Developer's Journey
3 min read
How SQL Spatial Data Solves Real-World Problems
6 min read
Working with Gigantic Google BigQuery Partitioned Tables in DBT
3 min read
Secure Data Stack: Navigating Adoption Challenges of Data Encryption
5 min read
The Ultimate Guide to Data Engineering
2 min read
Understanding the Apache Iceberg Manifest File
7 min read
Evolution of Data Sharding Towards Automation and Flexibility
15 min read
The Power of Data Analytics – Transforming Businesses with Insights
5 min read
Serverless PDF Processing with AWS Lambda and Textract
9 min read
The Simplest Data Architecture
20 min read
🌐 Get started: What is MongoDB Operational Data Layer? (Part 1)
2 min read
ETL Real Estate Data Engineering with Redfin: From Extraction to Visualization
3 min read
End-to-End AWS KMS Encryption and Decryption Tutorial
3 min read
Magic Mushrooms: Exploring and Handling Null Data with Mage
6 min read
Understanding Apache Iceberg Delete Files
4 min read
Apache Airflow
4 min read
Building and Managing Production-Ready Apache Airflow: From Setup to Troubleshooting
2 min read
The Must-Have Features of Modern Data Transformation Tools
6 min read
An End-to-End Guide to dbt (Data Build Tool) with a Use Case Example
4 min read
Data Pipeline Techniques in Action
1 min read
From Data Lakes to Data Mesh: The Emerging Trends of Data Management and Analytics
8 min read
One Minute: DatAasee
1 min read
Data Security Strategy Beyond Access Control: Data Encryption
5 min read
A beginner's guide to data engineering concepts, tools, and responsibilities.
1 min read
Ensuring Data Integrity: Comparing Soda and Great Expectations for Quality Assurance
4 min read
A Beginner's Guide To Data Engineering Concepts, Tools, And Responsibilities.
1 min read
Snowflake vs. BigQuery: Choosing the Right Cloud Platform for Your Data
2 min read
Building a data science career as a beginner. How can you do it?
4 min read
Hiring Alert!
1 min read
Top 5 Things You Should Know About Spark
3 min read
Uploading Files Using Pre-Signed URLs to a Specific Storage Class
2 min read
PySpark optimization techniques
4 min read
Avoid These Top 10 Mistakes When Using Apache Spark
8 min read
Data Engineer and Databricks
3 min read
Getting Started with Apache Kafka: A Beginner's Guide to Distributed Event Streaming
5 min read
Unlocking the Potential of Data with Azure Data Engineers
3 min read
RoadMap to Data-Analytics 2024!
2 min read
DBT and Software Engineering
7 min read
Effective Techniques for Handling Imbalanced Datasets: My Proven Approach
3 min read
Understanding Apache Iceberg's metadata.json file
7 min read
The Developer’s Guide to Real-Time Data Platforms!
6 min read
🌐 Get started: What is MongoDB operational data layer? (Part 2) 🌐
2 min read
🌐 Get started: What is MongoDB Operational Data Layer? (Part 1)
1 min read
Mastering SQL Joins and Unions: Integrate Data for Incredible Insights
6 min read
Feature Engineering: The Ultimate Guide
2 min read
🦆 💏 🐘 Let PostgreSQL & duckdb "sql" together
3 min read
What Apache Iceberg REST Catalog is and isn't
3 min read
Transforming Data Engineering: A Business Domain Approach with Data Mesh
5 min read
Speeding Up Data on AWS: From Ingestion to Insights
11 min read
Importing Data from a CSV File into Postgres: A Basic Data Engineer Skill
1 min read