DEV Community

abhiphull

Exploratory Data Analysis With 2 Lines Of Python Pandas

In this post, I will talk about how to do exploratory data analysis using Python Pandas. The purpose of this post is not to introduce Pandas. There are already plenty of tutorials and wikis on Pandas. If you want to learn the basics of Python Pandas, I would suggest the following links:

https://pandas.pydata.org/
https://www.nbshare.io/

This post introduces a Pandas package that makes exploratory data analysis much faster.

Requirements

  1. Python 3.5+
  2. Jupyter Notebook
  3. Pandas-Profiling

For this example, I have downloaded the COVID-19 US data from the following GitHub link:

https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-states.csv

Let us first read the above CSV into a Pandas DataFrame.

import pandas as pd
# Assumes us-states.csv was downloaded to the current working directory.
df = pd.read_csv('us-states.csv')
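
Before profiling, it can help to confirm that the file loaded the way you expect. The sketch below is not part of the original workflow. It uses a tiny inline sample in place of the downloaded file, the values are made up, and the column names (date, state, fips, cases, deaths) are an assumption about the CSV header. The parse_dates and dtype arguments are optional choices, not requirements.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for us-states.csv; the column names
# are an assumption about the file's header, and the values are made up.
SAMPLE_CSV = """date,state,fips,cases,deaths
2020-03-01,Washington,53,10,1
2020-03-02,Washington,53,14,2
2020-03-01,Illinois,17,3,0
2020-03-02,Illinois,17,4,0
"""


def load_states(source):
    """Read the CSV, parsing dates and keeping FIPS codes as strings."""
    return pd.read_csv(source, parse_dates=["date"], dtype={"fips": str})


sample_df = load_states(io.StringIO(SAMPLE_CSV))

# Quick sanity checks: row/column counts and the type of each column.
print(sample_df.shape)
print(sample_df.dtypes)
```

With the real file, the same helper could be called as load_states('us-states.csv'). Keeping fips as a string avoids treating an identifier as a number, which would otherwise show up in summary statistics as if it were a measurement.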

Now our DataFrame is ready. Let us do some initial data analysis. Pandas has a describe() function which gives us a nice summary. In this post, however, I will introduce the Pandas library Pandas-Profiling, which is a kind of extension to the describe() utility but provides a lot more analysis and information with just one line of code.
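
For reference, this is roughly what the describe() baseline gives you. The sketch uses a small made-up frame (the column names and values are illustrative, not the real data) so that it runs without the CSV.

```python
import pandas as pd

# Made-up values for illustration only; not taken from us-states.csv.
sample_df = pd.DataFrame(
    {
        "state": ["Washington", "Washington", "Illinois", "Illinois"],
        "cases": [10, 14, 3, 4],
        "deaths": [1, 2, 0, 0],
    }
)

# By default describe() summarises only the numeric columns:
# count, mean, std, min, quartiles, and max.
numeric_summary = sample_df.describe()

# include="all" also covers non-numeric columns (unique, top, freq).
full_summary = sample_df.describe(include="all")

print(numeric_summary)
print(full_summary)
```

This is a plain table of statistics per column, which is the part the profiling report expands on.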
Note: Pandas-Profiling works in a Jupyter notebook, where it embeds the report as an interactive widget.
Let us import Pandas-Profiling and run the exploratory analysis.

from pandas_profiling import ProfileReport
profile = ProfileReport(df, title='Pandas Analysis', explorative=True)
profile.to_notebook_iframe()

Once you run the above code, you will see a lot of information. Below are a few snapshots that I have put together to give you a glimpse of the output.

[Images: four snapshots of the Pandas-Profiling report]

Remember that all the above snapshots are widgets, so in your notebook you will be able to interact with them and choose different options too. Before I wrap up this post, I want you to look at a few more Pandas packages which are excellent add-ons to have:
https://nbshare-io.github.io/top-pandas-packages/
