Prayson Wilfred Daniel

Down the Data Rabbit Hole: Alice's Adventure from Pandas to DuckDB Wonderland

Prologue: Falling Down the Data Rabbit Hole

In a universe where data unfolds like a never-ending story, meet Alice, a data analyst with an adventurous spirit. One day, a flicker of curiosity leads her to an unexpected rabbit hole. With a heart full of excitement, she dives in. This is no ordinary journey: it's Alice's vibrant plunge from the familiar world of Pandas into the magical realm of DuckDB. Join her as she discovers Wonderland on this captivating adventure, where every discovery is a thrill and every insight a delight.


Chapter 1: The Pandas Tea Party

Alice found herself at a peculiar tea party, hosted by none other than the Mad Hatter, known to the data world as Pandas. Surrounded by stacks of CSV files, she began her usual routine:

import pandas as pd
from pathlib import Path

# At the Pandas tea party, surrounded by piles of CSV files
file_list = Path("./teaparty").glob("cards*.csv")
# Read each file into its own DataFrame, then stack them into one
concatenated_df = pd.concat(
    (pd.read_csv(file) for file in file_list),
    ignore_index=True,
)


As she worked, concatenating file after file, Alice couldn't help but feel overwhelmed. The process was familiar but cumbersome, much like the chaotic chatter around the tea party table.


Chapter 2: Chasing the DuckDB White Rabbit

Just then, Alice spotted a White Rabbit, darting past the tea party. This rabbit, known as DuckDB, beckoned her with the promise of a more efficient way to handle her data. Intrigued, Alice followed and discovered a world where data was not a burden but a delight:

import duckdb

# Chasing the White Rabbit into a world of simpler data handling
concatenated_df = duckdb.query("""
      SELECT 
         * 
      FROM './teaparty/cards*.csv'
""").df()

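In awe, Alice watched as multiple files blended into one with a single query; she no longer needed to juggle them manually. The White Rabbit had shown her a more elegant path, and one that can be lighter on memory when a query filters or aggregates before the result is handed back to pandas with .df().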


Chapter 3: The DuckDB Wonderland of Transformations

As Alice ventured deeper into this wonderland, she encountered the Queen of Hearts, who demanded swift and complex data transformations. In the realm of DuckDB, this was not only possible but surprisingly straightforward:

transformed_df = duckdb.query("""
    SELECT
        queen,
        SUM(soldier) as total_soldiers,
        AVG(hearts) as avg_hearts
    FROM './teaparty/cards*.csv'
    GROUP BY queen
""").df()

With DuckDB, Alice easily satisfied the Queen's demands, performing tasks that once seemed as daunting as painting the roses red.

Epilogue: Awakening in a New World of Data Processing

As Alice awoke from her journey, she found herself back in her world, but with a new perspective on data processing. The lessons from DuckDB Wonderland had transformed her approach, making her tasks more efficient and her analyses more powerful.

Join Alice in her adventure and explore the wonders of DuckDB. Let it transform your data processing tasks from a mundane chore into a magical experience, just like a day spent in Wonderland.


Until then, keep on coding like a Cheshire Cat.
