DEV Community

# bigdata

Posts

What is the Lakehouse, the latest Direction of Big Data Architecture?

Reactions 8 Comments
10 min read
May 9th in Streaming

Reactions 6 Comments
1 min read
Build a real-time machine learning sample library using the best open-source project about big data and data lakehouse, LakeSoul

Reactions 9 Comments
7 min read
Dynamic way doing ETL through Pyspark

Reactions 9 Comments
4 min read
Leveraging Change Data Capture for Fraud Detection using Arcion Cloud

Reactions 10 Comments
9 min read
How to prepare for the GCP Professional Data Engineer certification

Reactions 16 Comments
8 min read
BigQuery transactions over multiple queries, with sessions

Reactions 8 Comments
3 min read
AUTO DISCOVERING AND AUTO ACTIONS IN DATA MONITORING or HOW TO DRINK COFFEE INSTEAD OF ROUTINE TASKS

Reactions 10 Comments
9 min read
Fully Embracing K8s, Cisco Hangzhou Seeks to Support K8s Tasks Based on Apache DolphinScheduler

Reactions 4 Comments
5 min read
AWS Certified Big Data - Specialty Certification - Complete Study Guide

Reactions 4 Comments
4 min read
Apache Spark, Hive, and Spring Boot - Testing Guide

Reactions 32 Comments 1
18 min read
A Brief Comparison of Apache DolphinScheduler With Other Alternatives

Reactions 4 Comments
10 min read
Design concept of a best opensource project about big data and data lakehouse

Reactions 9 Comments
9 min read
DATA SCIENCE SALARIES IN INDIA 2022

Reactions 4 Comments
4 min read
Details of 4 best opensource projects about big data you should try out (Ⅰ)

Reactions 7 Comments
5 min read
How to Build A System Popular Among Data Analysts?

Reactions 7 Comments
5 min read
Create a Hadoop playground with Docker Desktop on Windows in minutes

Reactions 6 Comments
4 min read
Characteristics of Big Data

Reactions 4 Comments
8 min read
HIVE installation on WSL

Reactions 6 Comments
3 min read
How to create a DIY Inexpensive Cloud Data Lake

Reactions 5 Comments
3 min read
Quick use of CDC: A new demo from lakesoul makes it easier to set up the environment

Reactions 8 Comments
5 min read
Big Data in Cloud Computing - AWS

Reactions 14 Comments
2 min read
4 best opensource projects about big data you should try out

Reactions 15 Comments 3
3 min read
A new unified streaming and batch table storage solution similar to iceberg/hudi/delta lake but with several new functions

Reactions 8 Comments
2 min read
Fast Multivalue Look-ups For Huge Data Sets

Reactions 5 Comments
6 min read
Apache Spark Unit Testing Strategies

Reactions 7 Comments
3 min read
[OPINION] Building a Career as a Data Engineer

Reactions 2 Comments
2 min read
How to handle nested JSON with Apache Spark

Reactions 3 Comments
3 min read
Presenting ML-based COVID-19 Risk Assessment App Pandemonium

Reactions 3 Comments
3 min read
NodeJS - Get data from Redash v6 API

Reactions 6 Comments
2 min read
Building an Apache ECharts dashboard with React and Cube

Reactions 14 Comments
11 min read
Quill- Most efficient Scala driver for Apache Cassandra and Spark

Reactions 2 Comments
4 min read
What are the best practices while using BigQuery?

Reactions 10 Comments
2 min read
Building a Bubble Dashboard with Cube

Reactions 9 Comments
14 min read
[ARTICLE] Data Warehouse, Data Lake, and Data Lakehouse: Concepts and Differences

Reactions 4 Comments
3 min read
Dagster: The Best Free and Open-Source Alternative to Airflow With Python!

Reactions 3 Comments
1 min read
What is the SingleStore and why should we use it?

Reactions 9 Comments 2
3 min read
Machine Learning Lifecycle Process

Reactions 41 Comments
4 min read
Introduction to Hive(A SQL layer above Hadoop)

Reactions 6 Comments
9 min read
Cleaning And Normalizing Data Using AWS Glue DataBrew

Reactions 11 Comments
9 min read
Introduction to Apache Spark, SparkQL, and Spark MLib.

Reactions 11 Comments
15 min read
Data Lake explained

Reactions 6 Comments
4 min read
Build a small TA-Lib container image

Reactions 2 Comments
2 min read
SPOTLIGHT: A GENTLE INTRODUCTION TO MACHINE LEARNING CONCEPTS IN PYTHON

Reactions 5 Comments
5 min read
The World Beyond the Docker! $$ :)

Reactions 5 Comments
2 min read
Vitess: Easy database deployment, clustering, and scaling!

Reactions 5 Comments
5 min read
Zero to Deployment and Evolution Data Catalog!

Reactions 4 Comments
6 min read
How to choose a MongoDB shard key

Reactions 8 Comments 1
3 min read
Big Data Open Source Frameworks

Reactions 3 Comments
5 min read
Scala Vs Python Syntax Cheat Sheet

Reactions 3 Comments
5 min read
Scala For Beginners - Crash Course - Part 4

Reactions 2 Comments
4 min read
Scala For Beginners - Crash Course - Part 5

Reactions 3 Comments
6 min read
Scala For Beginners - Crash Course - Part 2

Reactions 2 Comments
6 min read
Scala For Beginners - Crash Course - Part 3

Reactions 2 Comments
6 min read
Creating and running Spark Jobs in Scala on Cloud Dataproc !!!

Reactions 6 Comments
3 min read
Best extensions for JupyterLab!!

Reactions 5 Comments
3 min read
Understanding Apache Hive LLAP

Reactions 3 Comments
7 min read
Airbyte: Data Integration / CDC Solution for Modern Data Teams!

Reactions 5 Comments
12 min read
Build an analytics app with React and Cube.js

Reactions 8 Comments
9 min read
Getting started with Spark

Reactions 7 Comments 2
6 min read