# Data Science: What is a box plot?

A box plot, also known as a box and whisker plot, is a graphical representation of a dataset that shows the distribution of values in the data. It is a useful tool for visualizing the spread and skewness of a dataset, as well as identifying outliers.

The box plot can be used to compare the distribution of multiple datasets by creating a box plot for each dataset and placing them side by side. It is also possible to overlay box plots on top of each other to compare the distributions more closely.

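• A box plot is drawn from a handful of summary values:

• The top line (upper whisker) is the largest value that is not an outlier. It equals the maximum only when there are no outliers above the box.
• The bottom line (lower whisker) is the smallest value that is not an outlier. It equals the minimum only when there are no outliers below the box.
• The centre line is the median.
• The top of the box is the 75th percentile value.
• The bottom of the box is the 25th percentile value.
• The circles outside the whiskers are called 'outliers'. By default, seaborn marks a value as an outlier when it lies more than 1.5 times the height of the box (the interquartile range) beyond the box edges.
• Let's see how to create one with Python.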
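Before drawing anything, it can help to see where those values come from. The following is a minimal sketch using only Python's standard library. The function name `box_plot_summary`, the `whis` argument, and the sample numbers are illustrative assumptions, not part of seaborn. It assumes the 1.5 × IQR whisker rule and linear interpolation for the quartiles, so other tools may report slightly different quartiles.

```python
import statistics


def box_plot_summary(values, whis=1.5):
    """Return the values a box plot draws for a list of numbers."""
    data = sorted(values)

    # Quartiles with linear interpolation between neighbouring data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # height of the box

    # Values beyond these fences are treated as outliers.
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr

    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),   # end of the bottom whisker
        "whisker_high": max(inside),  # end of the top whisker
        "outliers": outliers,
    }


# Made-up sample: nine small values and one unusually large one.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = box_plot_summary(sample)
print(summary)
```

In this sample the value 30 lies above the upper fence of 14.5, so the top whisker stops at 9 and 30 is reported as an outlier rather than as the top line.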

• Start by importing necessary packages.
• We will use seaborn to create the plot.
``````python
import seaborn as sns
import matplotlib.pyplot as plt
``````
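• Let's use `taxis`, one of the example datasets that seaborn can load, store it in `df`, and set the style of the graph to white grid.
``````python
sns.set(style="whitegrid")
# Load the example dataset (fetched online the first time, then cached).
df = sns.load_dataset("taxis")
``````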
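• Now define the columns to use for the x-axis and y-axis, and a list of the boroughs you want to create box plots for. The list is named `cities` here, and its entries are values from the `pickup_borough` column.
``````python
x = "pickup_borough"  # categorical column: one box per borough
y = "total"           # numeric column whose distribution is plotted
cities = ["Queens"]   # boroughs to include, in display order
``````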
• Create the plot with the `sns.boxplot()` function: pass `df` as `data`, set `x` and `y` to the column names defined above, and use `order` to draw only the boroughs in the `cities` list, in that order. Then call `plt.show()` to display the graph.
``````python
ax = sns.boxplot(data=df, x=x, y=y, order=cities)

plt.show()
``````

Result: a single box showing the distribution of `total` for pickups in Queens.
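To compare several groups side by side, as mentioned at the start, the same call can be given a longer `order` list. The sketch below is hedged: `df_demo` is a hypothetical stand-in with made-up values that reuses the `pickup_borough` and `total` column names, so it runs without downloading anything. The per-group medians are computed with pandas only to give numbers to read alongside the plot.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in data: same column names as the taxis example,
# but the values are invented for illustration.
df_demo = pd.DataFrame({
    "pickup_borough": ["Queens"] * 5 + ["Manhattan"] * 5 + ["Brooklyn"] * 5,
    "total": [12.0, 18.5, 25.0, 31.0, 60.0,
              8.0, 10.5, 13.0, 16.0, 45.0,
              9.5, 14.0, 19.0, 22.5, 28.0],
})

# One box per borough, drawn left to right in this order.
boroughs = ["Queens", "Manhattan", "Brooklyn"]

sns.set(style="whitegrid")
ax = sns.boxplot(data=df_demo, x="pickup_borough", y="total", order=boroughs)
ax.set_title("Synthetic totals by pickup borough")

# Median of each group, i.e. the centre line of each box.
medians = df_demo.groupby("pickup_borough")["total"].median()
print(medians)
```

Call `plt.show()` afterwards to display the figure. With the real dataset, replacing `df_demo` with `df` and editing the `boroughs` list should give the same kind of side-by-side comparison.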

I hope this tutorial has helped you understand the basics of box plots. If you have any questions, leave them in the comments below and I will be more than happy to answer them.