DEV Community

# dataengineering

Posts

Big data models 📊 vs. Computer memory 💾

186
Comments 3
11 min read
Machine learning 101

85
Comments
8 min read
A Beginner’s Guide to Building LLM-Powered Applications with LangChain!

54
Comments 6
7 min read
Python Cheat Sheet for Data Engineers and Data Scientists!

52
Comments
3 min read
Modern Data Engineering RoadMap - 2024

33
Comments 3
3 min read
Dummy Variable Trap in Machine Learning

19
Comments
2 min read
Data Analysis of the Titanic with Python!

17
Comments 1
6 min read
The Pains of Data Ingestion

16
Comments 3
6 min read
How NASCAR delivers realtime racing data to millions of fans around the world

16
Comments
2 min read
[Free E-Book] The World of Vector Databases & AI Applications!

14
Comments
3 min read
8 Rusty open source data projects to watch in 2024 🤩

14
Comments 11
7 min read
How moving from Pandas to Polars made me write better code without writing better code

12
Comments 3
14 min read
The Data Engineering Docker-Compose Starter Kit

10
Comments
13 min read
Sentiment Analysis Using Python: A Beginner-Friendly Tutorial!

10
Comments
4 min read
How Vector Databases Work: A Hands-On Tutorial!

10
Comments
9 min read
Data Engineer vs. Business Intelligence Data Analyst

9
Comments
4 min read
Top 10 Common Data Engineers and Scientists Pain Points in 2024

9
Comments
5 min read
Installing Python Packages in AWS Glue using AWS CodeArtifact

9
Comments
6 min read
A Comprehensive Dive into the New Time-Series Storage Engine - Mito

8
Comments
5 min read
Introduction to Data Science

8
Comments 4
2 min read
Different file formats, a benchmark doing basic operations

8
Comments 2
9 min read
The easiest way to navigate through MongoDB, PySpark, and Jupyter Notebook

7
Comments
3 min read
How to Transpose Columns in Each Group to a Single Row

7
Comments
2 min read
Ultimate Guide: Best Books for Data Science with Ratings for All Levels

7
Comments
8 min read
MLOps project setup: Airbyte - Supabase connection

7
Comments
1 min read
Exploratory Data Analysis Using Data Visualization Techniques 📊.

6
Comments
5 min read
The Wrath of Unicron - When Airflow Gets Scary

6
Comments
4 min read
What Is Data Analysis and How Can You Get Started?

6
Comments
4 min read
Data Modeling

6
Comments
5 min read
My Experience with Apache Airflow

6
Comments
3 min read
Data Engineering with Scala: learn to web-scrape the most-watched Netflix movies in each country

6
Comments 2
22 min read
Optimizing Data Analysis: A Guide to Handling Missing Data Effectively

6
Comments
3 min read
How to Import Existing Resources in your CloudFormation Stacks

6
Comments
4 min read
More data is better than a more efficient algorithm

6
Comments
3 min read
A mage on the Hero’s Journey: a fantasy epic on how a startup rose from the ashes

6
Comments
9 min read
Transform your Pandas Dataframes: Styles, 🎨 Colors, and 😎 Emojis

6
Comments
9 min read
How to pivot data using Dynamic SQL in SQL Server

5
Comments 4
3 min read
A Step-by-Step Guide to Implementing Data Version Control

5
Comments
4 min read
How to Convert Dates in One Group into An Interval

5
Comments
2 min read
What is data engineering and a B.I architecture

5
Comments
6 min read
Data Engineering For Beginners: A Step-By-Step Guide

5
Comments
8 min read
Navigating the Data Engineering Landscape: From Raw Data to Insights

5
Comments 1
7 min read
What's new and noteworthy on AWS - Summer 2023 edition

5
Comments
24 min read
Building ETL/ELT Pipelines For Data Engineers.

5
Comments 2
2 min read
Xavier's Insight: Overcoming Data Hoarding Disorder

5
Comments
3 min read
Automating Talend Jobs Using Apache Airflow .

5
Comments
3 min read
Handling NULL in the DBs

5
Comments 1
2 min read
Data teams can deliver 10x better to the rest of us

5
Comments
3 min read
How to build an Anomaly Detector using BigQuery

4
Comments
12 min read
Data Evolution - Databases to Data Lakehouse

4
Comments
4 min read
The Importance of Data in Decision Making

4
Comments
2 min read
Amazon Kinesis Data Streams (What is?, Benefits, Terminologies)

4
Comments
2 min read
PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows

4
Comments
1 min read
SQL + Docker: The combo for Quick and Safe Query Testing

4
Comments
5 min read
Data Version Control vs. Open Table Formats: Differences And Use Cases

4
Comments
5 min read
Benchmarking Python Processing Engines: Who’s the Fastest?

3
Comments
4 min read
Transactions and the ACID principle, going a little deeper.

3
Comments
11 min read
KNIME Analytics Platform for Data Science-1

3
Comments
4 min read
🦿🛴Smarcity garbage reporting automation w/ ollama

3
Comments 4
3 min read
The Mythical Data Team

3
Comments
6 min read