DEV Community

# dataengineering

Posts

XGBoost Training Speed: A Comparative Analysis

Comments
2 min read
Big Data is dead & other stories

Comments
2 min read
Exploring Feature Stores: Personal Insights and Notes on Hopsworks pt.2

1
Comments
1 min read
Benchmarking Python Processing Engines: Who’s the Fastest?

3
Comments
4 min read
AWS Kinesis - Stream Storage Layer

2
Comments
3 min read
Extracting data with dlt

Comments
7 min read
Hands-on Guide to Enable Compute Nodes for Data Lake Analytics in Apache Doris

Comments
4 min read
Since When Did APIs Become Databases?

Comments
4 min read
Generating Avro Schemas from Go types

Comments
5 min read
Amazon Quicksight vs Microsoft PowerBI

Comments
3 min read
Solving Pandas .to_sql Double Quotes Issue When Writing to Database

Comments
1 min read
🦿🛴Smarcity garbage reporting automation w/ ollama

3
Comments 4
3 min read
What to think about when designing, building, managing and operating data systems.

1
Comments
8 min read
BigQuery best practices

1
Comments
2 min read
The Mythical Data Team

3
Comments
6 min read
Saving Dataframes into Oracle Database with Python

Comments
1 min read
Using data for predictive analytics

Comments
6 min read
How to Use Pyinstaller to Generate an EXE File

Comments
3 min read
Glue Data Brew- Data Profiling & Data Quality

Comments
3 min read
Transform your R Dataframes: Styles, 🎨 Colors, and 😎 Emojis

2
Comments
9 min read
Modern Data Engineering RoadMap - 2024

28
Comments 1
3 min read
Introduction to Data Lakes

3
Comments
3 min read
Data Engineering Saga part 2

2
Comments
3 min read
Exploring Feature Stores: Personal Insights and Notes on Hopsworks

1
Comments
1 min read
Data Evolution - Databases to Data Lakehouse

4
Comments
4 min read
How to build an Anomaly Detector using BigQuery

4
Comments
12 min read
How proficient is generated AI in transforming text or natural language into SQL?

Comments
4 min read
How NASCAR delivers realtime racing data to millions of fans around the world

16
Comments
2 min read
VS Code Extensions for Data Engineering - Part 1

2
Comments
2 min read
Build a federated query solution with Apache Doris, Apache Flink, and Apache Hudi

Comments
5 min read
Data Warehouse Concepts, focusing on the Kimball vs. Inmon methodologies

2
Comments
9 min read
Beginner's guide to Apache Flink

1
Comments
3 min read
🎀 Domaine.nc data as Jupyter on Kaggle 📊

Comments 1
1 min read
Want to be an AWS-certified Data Engineer? You can't miss this certification!

Comments
2 min read
Transform your Pandas Dataframes: Styles, 🎨 Colors, and 😎 Emojis

6
Comments
9 min read
MLOps project setup: Airbyte - Supabase connection

3
Comments
1 min read
MLOps project setup: Supabase

2
Comments
1 min read
Top 5 Modern ETL Tools from AWS

Comments
3 min read
Introduction to Data Science

8
Comments 4
2 min read
I'm about to embark on a journey into data engineering

1
Comments
1 min read
Decoding a Data Model: Using SchemaSpy in Snowflake ❄️

Comments
4 min read
Test Driving Redshift AI-Driven Scaling

1
Comments
3 min read
What Is Data Analysis and How Can You Get Started?

6
Comments
4 min read
Flexibility in Integration Engineering

Comments
4 min read
Data Engineering for Beginners: Navigating the Foundations of a Data-Driven World

1
Comments
3 min read
DATA ENGINEERING ROADMAP FOR BEGINNERS.

Comments
8 min read
Introducing Memphis Functions

2
Comments
3 min read
8 Rusty open source data projects to watch in 2024 🤩

14
Comments 11
7 min read
Amazon EMR Summary

1
Comments 1
2 min read
More data is better than a more efficient algorithm

6
Comments
3 min read
Amazon Kinesis Firehose

2
Comments
2 min read
How Vector Databases Work: A Hands-On Tutorial!

10
Comments
9 min read
Transfer SQL-> analytics 30x faster with ConnectorX + arrow + dlt

2
Comments
1 min read
The easiest way to navigate through MongoDB, PySpark, and Jupyter Notebook

7
Comments
3 min read
Big data models 📊 vs. Computer memory 💾

186
Comments 3
11 min read
How to Import Existing Resources in your CloudFormation Stacks

6
Comments
4 min read
Data Engineering with Scala: learn to web-scrape the most-watched Netflix movies in each country

5
Comments 2
22 min read
Unveiling the Robust Architecture of SQL Server A Comprehensive Overview

Comments
2 min read
Maximizing Database Efficiency Mastering Query Optimization in SQL

Comments
2 min read
Multiple Regression

2
Comments
2 min read