DEV Community
#dataengineering
Posts
Hands-on with Apache Iceberg & Dremio on Your Laptop within 10 Minutes
Alex Merced
Oct 31
#database #datascience #dataengineering #dataanalytics
19 min read
Mastering Workflow Automation with Apache Airflow for Data Engineering
Aditya Pratap Bhuyan
Oct 31
#dataengineering #apacheairflow
6 min read
Data Modeling - Entities and Events
Alex Merced
Oct 30
#database #dataengineering #datamodeling #dataanalytics
6 min read
Real-Time Air Traffic Data Analysis with Spark Structured Streaming and Apache Kafka (original title: Análise de dados de tráfego aéreo em tempo real com Spark Structured Streaming e Apache Kafka)
Geazi Anc
Oct 28
#dataengineering #python #braziliandevs #spark
8 min read
My journey learning Apache Spark
Paulet Wairagu
Oct 26
#spark #sql #dataengineering
2 min read
SQL "SELECT INTO" vs "INSERT INTO SELECT" statements.
Danwycliff Ndwiga
Oct 30
#sql #database #data #dataengineering
1 min read
My Journey into Data AI and Machine Learning
Lusanda Ndlovu
Oct 20
#softwaredevelopment #ai #machinelearning #dataengineering
1 min read
Why Data Security is Broken and How to Fix it?
Lulu Cheng for jarrid.xyz
Oct 15
#security #automation #devops #dataengineering
1 reaction
5 min read
From ETL and ELT to Reverse ETL
luminousmen
Oct 15
#dataengineering #bigdata #data
4 min read
*Mastering Informatica Intelligent Cloud Services (IICS) for Cloud Data Integration*
Rodolfo Mendivil
Oct 18
#iics #data #etl #dataengineering
1 reaction
3 min read
Building a Big Data Playground Sandbox for Learning
Abdullah Haggag
Oct 17
#dataengineering #bigdata #opensource
4 reactions
5 min read
What is Data Engineering?
Norton Augusto Herrero dos Santos
Oct 12
#dataengineering #datascience
1 min read
Explaining the History of Data Lakehouse
Pavol Z. Kutaj
Oct 14
#lakehouse #dataengineering #warehouse
2 min read
End-to-End ETL and Sales Dashboard on WWI dataset in Microsoft Fabric
abdulmaleek mubaraq
Oct 8
#tutorial #dataengineering #devto #analytics
7 min read
All About Parquet Part 01 - An Introduction
Alex Merced
Oct 21
#database #dataengineering
4 min read
All About Parquet Part 09 - Parquet in Data Lake Architectures
Alex Merced
Oct 21
#data #database #datascience #dataengineering
5 min read
All About Parquet Part 02 - Parquet's Columnar Storage Model
Alex Merced
Oct 21
#database #datascience #dataengineering
4 min read
All About Parquet Part 10 - Performance Tuning and Best Practices with Parquet
Alex Merced
Oct 21
#database #datascience #dataengineering
6 min read
All About Parquet Part 06 - Encoding in Parquet | Optimizing for Storage
Alex Merced
Oct 21
#database #datascience #dataengineering
6 min read
All About Parquet Part 08 - Reading and Writing Parquet Files in Python
Alex Merced
Oct 21
#database #datascience #dataengineering #data
5 min read
All About Parquet Part 03 - Parquet File Structure | Pages, Row Groups, and Columns
Alex Merced
Oct 21
#database #datascience #dataengineering
5 min read
Data Analysis: The Unsung Hero of Modern Business
Milcah03
Oct 7
#datascience #dataengineering #writing #datastructures
2 min read
From a Unified Bronze Layer to Multiple Silver Layers: Streamlining Data Transformation in Databricks Unity Catalog
prakhyatkarri
Oct 20
#databricks #unitycatalog #medallionarchitecture #dataengineering
1 reaction
5 min read
Analyzing Airbnb Listings in Chicago: A Power BI Dashboard Project
Raj Tiwari
Oct 7
#datascience #dataengineering #data
1 reaction
4 min read
5 Best ETL Tools: A Comprehensive Comparison Guide
Sourabh Gupta
Oct 28
#etl #datascience #dataengineering #learning
1 reaction
3 min read
Data Engineering with Scala: Mastering Real-Time Data Processing with Apache Flink and Google Pub/Sub
Geazi Anc
Oct 18
#dataengineering #scala #datascience #flink
1 reaction
15 min read
AWS DATA ENGINEER - 101
Sajjad Rahman
Oct 24
#aws #dataengineering #awschallenge #awsbigdata
2 reactions
2 min read
The Journey From a CSV File to Apache Hive Table
Abdullah Haggag
Oct 24
#hadoop #hive #bigdata #dataengineering
6 reactions
6 min read
Why Apache Spark RDD is immutable?
luminousmen
Sep 29
#dataengineering #bigdata #data
3 min read
All About Parquet Part 04 - Schema Evolution in Parquet
Alex Merced
Oct 21
#database #datascience #dataengineering
1 reaction
5 min read
Data Engineering in Observability: The Backbone of Modern Monitoring
Emmanuel Awa
Sep 26
#dataengineering #observability #opensource #banking
1 reaction
5 min read
Oracle to Snowflake Migration: Steps, Challenges & Best Practices
Sourabh Gupta
Oct 28
#datascience #dataengineering #tutorial #learning
1 reaction
3 min read
Data Engineering in 2024: Innovations and Trends Shaping the Future
MissMati
Oct 27
#dataengineering #data #analytics
4 reactions
1 comment
13 min read
Achieving Clean and Scalable PySpark Code: A Guide to Avoiding Redundancy
Gustavo
Sep 19
#pyspark #dataengineering #cleancode #python
5 min read
Explaining CDC (Change Data Capture)
Pavol Z. Kutaj
Oct 11
#databricks #dataengineering
1 min read
Chapter 2 - Data Models and Query Languages (original title: Capítulo 2 - Modelos de Datos y Lenguajes de Consulta)
Pablo Arango Ramirez
Oct 22
#data #sql #nosql #dataengineering
2 reactions
8 min read
All About Parquet Part 05 - Compression Techniques in Parquet
Alex Merced
Oct 21
#database #datascience #dataengineering
1 reaction
5 min read
All About Parquet Part 07 - Metadata in Parquet | Improving Data Efficiency
Alex Merced
Oct 21
#data #database #dataengineering #datascience
1 reaction
5 min read
Clear Link Between DevSecOps and Data Engineering
Regnard Raquedan
Sep 13
#dataengineering #devops #devsecops #cloud
1 min read
Still Using SQL, Python, & Excel for Data Deduplication? Here's Why You Need Better Tools.
Farah Kim
Oct 17
#algorithms #ai #dataengineering
5 reactions
4 min read
Capture Browser XHR/Fetch API Response Automatically into JSON Files
Dendi Handian
Sep 12
#help #dataengineering #chrome #javascript
1 min read
The True Cost of Poor Data Quality: Why It Matters and How to Improve It
Mark Yu
Oct 16
#database #datascience #dataengineering #management
3 reactions
6 min read
Building a User-Friendly, Budget-Friendly Alternative to dbt Cloud
Marco Porracin
Sep 8
#dbt #dataengineering #opensource #datascience
1 min read
What Is Data Engineering? (original title: O que é Engenharia de Dados?)
Norton Augusto Herrero dos Santos
Oct 12
#dataengineering #datascience
3 reactions
1 min read
How SQL Spatial Data Solves Real-World Problems
Nuthan Kishore
Sep 7
#firstpost #spatialdata #dataengineering
6 min read
Working with Gigantic Google BigQuery Partitioned Tables in DBT
Stephen
Sep 20
#bigquery #dbt #dataengineering #googlecloud
1 reaction
3 min read
Handling Outliers 101: Why the IQR Method is Your Go-To Tool
allan-pg
Oct 10
#python #datascience #dataengineering #data
2 reactions
3 min read
Go vs Python for File Processing: A Performance and Architecture Perspective
Nico Bistolfi
Oct 9
#python #go #performance #dataengineering
2 reactions
2 comments
5 min read
Exploring Data Operations with PySpark, Pandas, DuckDB, Polars, and DataFusion in a Python Notebook
Alex Merced
Oct 7
#python #database #datascience #dataengineering
2 reactions
13 min read
Secure Data Stack: Navigating Adoption Challenges of Data Encryption
Lulu Cheng for jarrid.xyz
Sep 3
#security #dataengineering #encryption #infosec
1 reaction
5 min read
Python 101: Introduction to Python as a Data Analytics Tool
Gichuki Edwin
Oct 7
#python #analytics #datascience #dataengineering
3 min read
Ultimate Directory of Apache Iceberg Resources
Alex Merced
Oct 5
#database #dataengineering #datascience #elasticsearch
14 min read
Understanding OLTP and Choosing the Right Database
Chetan Gupta
Oct 4
#dataengineering #mongodb #postgressql #mysql
1 reaction
6 min read
Change Data Capture (CDC) when there is no CDC
Alex Merced
Oct 4
#database #dataengineering #postgres
11 min read
The Ultimate Guide to Data Engineering
Milcah
Aug 27
#dataengineering #data
2 min read
Evolution of Data Sharding Towards Automation and Flexibility
Apache Doris
Aug 27
#opensource #dataengineering #database #automation
15 min read
Data Showdown: OLAP vs. OLTP – The Battle of Real-Time and Analytics Titans
Chetan Gupta
Sep 29
#bigdata #dataengineering #understanding #database
5 min read
The Power of Data Analytics – Transforming Businesses with Insights
ismail courr
Sep 8
#dataengineering #datascience #startup
5 min read
Serverless PDF Processing with AWS Lambda and Textract
Olga Shabalina for AWS Community Builders
Sep 28
#cloudcomputing #serverless #lambda #dataengineering
9 reactions
1 comment
9 min read
The Simplest Data Architecture
Aram Panasenco
Sep 25
#data #architecture #dataengineering #analytics
1 reaction
21 min read