DEV Community

FlyingBanana
PyGWalker: A Python Library for Exploratory Data Analysis with Visualization

PyGWalker simplifies data analysis and visualization in Jupyter Notebook by turning a pandas (or polars) dataframe into a Tableau-style user interface for visual exploration.

PyGWalker (pronounced like "Pig Walker", just for fun) is short for "Python binding of Graphic Walker". It integrates Jupyter Notebook (or other Jupyter-based notebooks) with Graphic Walker, an open-source alternative to Tableau. It lets data scientists analyze data and visualize patterns with simple drag-and-drop operations.

Visit Google Colab, Kaggle Code, Binder or Graphic Walker Online Demo to test it out!

Support for more languages, such as R, is planned for future releases.

Getting Started

Tested Environments

  • [x] Jupyter Notebook
  • [x] Google Colab
  • [x] Kaggle Code
  • [x] Jupyter Lab (WIP: there are still some minor CSS issues)
  • [x] Databricks Notebook (Since version 0.1.4)
  • [x] Jupyter Extension for Visual Studio Code (Since version 0.1.4)
  • [x] Hex Projects (Since version 0.1.4)
  • [x] Most web applications compatible with IPython kernels (Since version 0.1.4)
  • [ ] ...feel free to raise an issue for more environments.

Setup pygwalker

Before using pygwalker, install the package from the command line with pip or conda.

pip

pip install pygwalker

Note

To stay on the latest release, install with pip install pygwalker --upgrade. To try the newest features and bug fixes before they are released, install directly from the repository with pip install git+https://github.com/Kanaries/pygwalker@main.

Conda-forge

conda install -c conda-forge pygwalker

or

mamba install -c conda-forge pygwalker

See conda-forge feedstock for more help.
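Because several environments and features in this post depend on a minimum version (0.1.4, 0.1.4.7a0), it can help to confirm which version actually got installed. The snippet below is a minimal sketch using only the Python standard library; the helper name installed_version is made up for this example and is not part of pygwalker.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is missing.

    Hypothetical helper for illustration; not part of pygwalker.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


pygwalker_version = installed_version("pygwalker")
if pygwalker_version is None:
    print("pygwalker is not installed in this environment")
else:
    print(f"pygwalker {pygwalker_version} is installed")
```

Run it in the same kernel your notebook uses, since a notebook kernel and a terminal can point at different Python environments.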

Use pygwalker in Jupyter Notebook

Import pygwalker and pandas into your Jupyter Notebook to get started.

import pandas as pd
import pygwalker as pyg

You can use pygwalker without breaking your existing workflow. For example, you can call up Graphic Walker with the dataframe loaded in this way:

df = pd.read_csv('./bike_sharing_dc.csv', parse_dates=['date'])
gwalker = pyg.walk(df)
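If you don't have the CSV file on hand, a small in-memory sample is enough to try the same workflow. The sketch below is only an illustration: the column names and rows are invented and may not match the real bike_sharing_dc.csv. It also shows what parse_dates does, namely converting the named column to a datetime type so it can be treated as a temporal field.

```python
import io

import pandas as pd

# Invented sample rows for illustration; the real dataset's columns may differ.
SAMPLE_CSV = """date,season,rentals
2011-01-01,winter,16
2011-01-02,winter,40
2011-07-04,summer,210
"""


def load_sample() -> pd.DataFrame:
    """Read the in-memory CSV, parsing the 'date' column as datetimes."""
    return pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["date"])


df = load_sample()
print(df.dtypes)  # 'date' is now a datetime column rather than plain text

# In a notebook cell, the dataframe can then be passed to pygwalker as before:
# import pygwalker as pyg
# gwalker = pyg.walk(df)
```

The pyg.walk call is left commented out so the snippet also runs outside a notebook; uncomment it in Jupyter to open the interface.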

You can also use pygwalker with polars (requires pygwalker>=0.1.4.7a0):

import polars as pl
df = pl.read_csv('./bike_sharing_dc.csv', try_parse_dates=True)
gwalker = pyg.walk(df)

You can even try it online by visiting Binder, Google Colab, or Kaggle Code.

That's it. Now you have a Tableau-like user interface to analyze and visualize data by dragging and dropping variables.

Cool things you can do with Graphic Walker:

  • You can change the mark type to make different charts, for example, a line chart.
  • To compare different measures, you can create a concat view by adding more than one measure to rows or columns.
  • To make a facet view, with subviews split by the values of a dimension, put dimensions into rows or columns. The rules are similar to Tableau.
  • You can view the data frame in a table and configure the analytic types and semantic types.
  • You can save the data exploration result to a local file.

For more detailed instructions, visit the Graphic Walker GitHub page.

Resources

  • Check out more resources about Graphic Walker on Graphic Walker GitHub
  • We are also working on RATH: an open-source, automated exploratory data analysis tool that redefines the workflow of data wrangling, exploration, and visualization with AI-powered automation. Check out the Kanaries website and RATH GitHub for more!
  • If you encounter any issues and need support, join our Slack or Discord channels.
