DEV Community 👩‍💻👨‍💻

DEV Community 👩‍💻👨‍💻 is a community of 967,611 amazing developers

We're a place where coders share, stay up-to-date and grow their careers.

# bigdata

Posts

Example of applying CDC to JSON files with PySpark

Comments
7 min read
Technology will be the star of the World Cup

Reactions 4 Comments 1
2 min read
To study Apache Kafka architecture in detail, and how to install, deploy, and configure Apache Kafka.

Reactions 4 Comments
3 min read
How to create Stored Procedure in MySQL

Reactions 2 Comments
1 min read
How to use delimiter in MySQL

Reactions 2 Comments
1 min read
Apache Spark with java

Reactions 4 Comments
5 min read
Playing PyFlink from Scratch

Reactions 1 Comments
4 min read
Playing PyFlink in a Nutshell

Reactions 5 Comments
5 min read
Podcast with Josh Long on Apache Pulsar and Spring

Reactions 3 Comments
1 min read
O que é dark data? (What is dark data?)

Reactions 7 Comments
1 min read
Optimizing massive MongoDB inserts, load 50 million records faster by 33%!

Reactions 6 Comments
12 min read
Get started with Power Apps canvas apps

Reactions 5 Comments
20 min read
Docker Alternatives That Can Boost Your Productivity

Reactions 1 Comments
4 min read
Building Apache Pinot and Presto

Reactions 1 Comments
4 min read
Apache-Spark introduction for SQL developers

Reactions 2 Comments
7 min read
What is Big Data? Characteristics, types, and technologies

Reactions 1 Comments
11 min read
Design Pattern of Streaming Enrichment

Comments
6 min read
Learning Big Data - Step by Step

Reactions 2 Comments
1 min read
SeaTunnel Connector Access Plan

Reactions 4 Comments
12 min read
Why we don't use Spark

Reactions 6 Comments
7 min read
Entrepreneurs must learn from Lord Ganesha!!!

Reactions 6 Comments
2 min read
Problemas modernos: Big Data - Um resumo do New York Times (Modern problems: Big Data - A summary from the New York Times)

Reactions 5 Comments
4 min read
meatballs.live 〜 remixing the Hacker News experience with Redis Stack - Part 3

Comments
7 min read
Spark tip: Disable Coalescing Post Shuffle Partitions for compute intensive tasks

Reactions 1 Comments
3 min read
Top Skills You Need in Testing Big Data projects

Comments
3 min read
meatballs.live 〜 remix your social news experience with Redis Stack + Hacker News - Part 2

Reactions 5 Comments
6 min read
Data Lake vs Data Warehouse

Reactions 7 Comments
2 min read
Python For Data Engineering

Reactions 9 Comments
5 min read
Remix social news with Redis Stack - Part 1

Reactions 3 Comments
6 min read
Stream Processing Introduction

Reactions 2 Comments 1
6 min read
How to run Amazon EMR Serverless with --packages flag

Reactions 7 Comments 1
6 min read
The Relational DBs (RDB)

Reactions 9 Comments 1
4 min read
The story behind Apache SeaTunnel's evolving from a data integration component to an enterprise-level service

Reactions 5 Comments
12 min read
Big Data Vs Small Data

Reactions 7 Comments
2 min read
Learning Workflow Schedulers (Oozie)

Reactions 1 Comments
5 min read
Visual task orchestration & Drag & Drop, Scaleph Data integration practice based on SeaTunnel

Reactions 9 Comments
12 min read
The best Open-source lakehouse project, LakeSoul 2.0, supports snapshot, rollback, Flink, and Hive interconnection

Reactions 9 Comments
5 min read
A New One-stop AI development and production platform, AlphaIDE

Reactions 10 Comments
4 min read
There will be 175 Zettabytes of data in the world by 2025. Where will we store it?

Reactions 12 Comments 2
1 min read
Usage Guide: Quickly deploy an intelligent data platform with the One-stop AI development and production platform, AlphaIDE

Reactions 8 Comments
3 min read
How Discord manages 300M socket connections

Reactions 14 Comments
2 min read
How to filter columns in HBase Shell

Reactions 5 Comments
3 min read
Here is why you need a message broker

Reactions 57 Comments 4
6 min read
Creating a Subtitle Search Engine using the Stanford Parts of Speech Tagger

Reactions 3 Comments
4 min read
Data Mesh: Scaling Delivery of Data as Product

Reactions 4 Comments 1
9 min read
Introduction to Reinforcement Learning

Reactions 5 Comments
7 min read
Data engineers must-see: The future trend of big data cloud services

Reactions 8 Comments
8 min read
New release! Support for Kubernetes, multiple connectors added, SeaTunnel 2.1.2 is here!

Reactions 5 Comments
4 min read
Best Practices for Successful Data Quality

Reactions 5 Comments
3 min read
What's new in Apache Spark 3.3.0

Reactions 8 Comments 1
4 min read
Solved a practical business problem when using Hudi: LakeSoul supports null field non-override semantics

Reactions 7 Comments
3 min read
Data Pipelines with Apache Airflow - Book Review

Reactions 6 Comments
2 min read
What is big data analytics?

Reactions 7 Comments
7 min read
Why Big Data Analytics Is In The Big Picture in Banking Market?

Reactions 8 Comments 2
4 min read
What is the Lakehouse, the latest Direction of Big Data Architecture?

Reactions 9 Comments
10 min read
Leveraging Change Data Capture for Fraud Detection using Arcion Cloud

Reactions 10 Comments
9 min read
Dynamic way doing ETL through Pyspark

Reactions 16 Comments 2
4 min read
BigQuery transactions over multiple queries, with sessions

Reactions 10 Comments
3 min read
Auto discovering and auto actions in data monitoring or How to drink coffee instead of routine tasks

Reactions 13 Comments
9 min read
May 9th in Streaming

Reactions 6 Comments
1 min read