DEV Community

Satwik Kansal
Plotting a group level comparison histogram in pandas

Let's say I have a DataFrame like the one below:

import pandas as pd

# Sample data: one row per vehicle
years = [2014, 2014, 2014, 2015, 2015, 2015, 2015]
vehicle_types = ['Truck', 'Truck', 'Car', 'Bike', 'Truck', 'Bike', 'Car']
companies = ["Mercedes", "Tesla", "Tesla", "Yamaha", "Tesla", "BMW", "Ford"]

df = pd.DataFrame({'year': years,
                    'vehicle_type': vehicle_types,
                    'company': companies
                   })

df.head()

And I want to plot the distribution of vehicle types per year as a grouped bar chart: one group of bars per year, with one bar per vehicle type.

It turns out this can be done in one line with pandas:

df.groupby(['year'])['vehicle_type'].value_counts().unstack().plot.bar()

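It's amazing how a single statement takes care of:

  • Missing combinations (for example, no bikes in 2014), which become NaN after unstack() and are simply left without a visible bar
  • Plotting the bars for each vehicle type side by side within each year
  • Aesthetics like axis labels and the legend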
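To make the missing-combination handling concrete, here is a minimal sketch using the same sample data (only the two columns the plot needs). The variable names counts, table, and table_filled are just illustrative. It shows that the absent (2014, Bike) pair ends up as NaN, and that passing fill_value=0 to unstack() is one way to turn it into an explicit zero count instead.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    "vehicle_type": ["Truck", "Truck", "Car", "Bike", "Truck", "Bike", "Car"],
})

# Count each vehicle type within each year
counts = df.groupby("year")["vehicle_type"].value_counts()

# No bikes appear in 2014, so that cell is NaN after unstack()
table = counts.unstack()

# fill_value=0 replaces the missing combination with an explicit zero
table_filled = counts.unstack(fill_value=0)

print(table)
print(table_filled)
```

Either table can be passed to plot.bar(); the filled version is mostly useful when the counts are reused elsewhere and integer zeros are preferable to NaN.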

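The critical part here is the unstack() method and how well it fits the MultiIndex produced by groupby() followed by value_counts(): it pivots the inner vehicle_type level into columns, leaving one row per year, which is exactly the shape plot.bar() needs to draw grouped bars.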
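Breaking the one-liner into steps may help show what unstack() is doing. This is a rough sketch on the same sample data; the intermediate names counts and table are mine, and the plotting call is left as a comment because it needs matplotlib installed.

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2014, 2014, 2014, 2015, 2015, 2015, 2015],
    "vehicle_type": ["Truck", "Truck", "Car", "Bike", "Truck", "Bike", "Car"],
})

# Step 1: a Series of counts indexed by two levels (year, vehicle type)
counts = df.groupby(["year"])["vehicle_type"].value_counts()
print(isinstance(counts.index, pd.MultiIndex))
print(counts.index.nlevels)

# Step 2: unstack() moves the innermost level (vehicle type) into columns,
# leaving one row per year
table = counts.unstack()
print(table.shape)
print(list(table.columns))

# Step 3 (requires matplotlib): one group of bars per row, one bar per column
# table.plot.bar()
```

Each row of table becomes a group on the x-axis and each column becomes a bar colour in the legend, which is why the reshaped counts can go straight into plot.bar().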
