DEV Community

Cover image for Seaborn - Unlock the Power of Data Visualization
Debjoty Mitra
Debjoty Mitra

Posted on

Seaborn - Unlock the Power of Data Visualization

What is Seaborn?

Seaborn is a matplotlib-based Python data visualization library. It provides a sophisticated interface for creating visually appealing and instructive statistical visuals. It has beautiful default styles, and it is also designed to work very well with Pandas data frame objects.

Installation

To install Seaborn on your system, you have to run this code on your command line:

>> pip install seaborn
Enter fullscreen mode Exit fullscreen mode

Importing

To use seaborn in our code, first, we have to import it. We will import the seaborn module under the name sns (the tidy way):

Seaborn

Data

Now, where can we find our data to visualize? The good thing about Seaborn is that it comes with built-in data sets.

Image description

Let's talk about some plots that help us see how a set of data is spread out. These plots are:

1. displot

Image description

Now, the displot is showing our data in histogram form. We only need to pass a single column from our data frame to displot.

2. jointplot

jointplot() lets you match up two scatterplots for two sets of data by letting you choose wh*ich* parameter to compare:

  • “scatter”
  • “reg”
  • “resid”
  • “kde”
  • “hex”

Image description

Image description

Image description

Here are different kinds of plots we can create by changing the value of the kind attribute.

3. pairplot

pairplot will plot pairwise relationships across an entire data frame (for the numerical columns) and supports a color hue argument (for categorical columns). We just have to pass the data through the method.

Image description

4. kdeplot

kdeplots are Kernel Density Estimation plots. These KDE plots replace every single observation with a Gaussian (Normal) distribution centered around that value.

Image description

Categorical Data Plots

5. B*arplot and Countplot*

B*arplot*

These extremely similar plots enable the extraction of aggregate data from a categorical feature in the data. The barplot is an all-purpose plot that aggregates categorical data based on a function, by default the mean. We can use an estimator attribute to change the default. Here we are using standard deviation.

Image description

Countplot

This is essentially the same as a barplot except the estimator is explicitly counting the number of occurrences. Which is why we only pass the x value:

Image description

6. B*oxplot and Violinplot*

Boxplots and violinplots are two types of graphs that can be used to demonstrate the distribution of categorical data. A box plot, also known as a box-and-whisker plot, is a type of graph that displays the distribution of quantitative data in a manner that makes it easier to make comparisons between different variables or between different levels of a categorical variable. The whiskers extend to show the rest of the distribution, with the exception of points that are determined to be "outliers" using a method that is a function of the interquartile range. The box displays the quartiles of the dataset, while the whiskers show the rest of the distribution.

B*oxplot*

Image description

Violinplot

A violin plot plays a similar role as a box and whisker plot. It shows the distribution of quantitative data across several levels of one (or more) categorical variables such that those distributions can be compared. Unlike a box plot, in which all of the plot components correspond to actual data points, the violin plot features a kernel density estimation of the underlying distribution.

Top comments (0)