How To Combine Two Line Charts In Seaborn and Python?

Kedar Ghule
I love to code and travel. They're both filled with a lot of exploration and adventure.
Originally published at kedar.hashnode.dev ・7 min read

In this tutorial, we will learn how to combine two line charts using seaborn and Python. The combined chart shares a common x-axis but has two separate y-axes. Suppose you have two line charts, A and B. Once they are merged into one chart, the y-axis of line chart A will be on the left and the y-axis of line chart B will be on the right, or vice versa.

Let us combine two line charts using seaborn in Python.

Importing the libraries and dataset

We kick things off by importing the necessary libraries for our tutorial. Next, we will import our dataset. You can find the dataset at this link. Here is a download link for the same. The dataset is by the World Bank on Brazil's Environment Indicators.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

df = pd.read_csv('environment_bra.csv')
df.head()

importing-libraries-and-dataset.png

Simple Exploratory Data Analysis And Data Cleaning

As you can see the first row does not contain any useful data. Instead, it specifies what each column contains. This row can be removed. Let us drop the first row and then check the length of the resulting dataframe.

df.drop(index=0, inplace=True)
df.head()

len(df)

And the output is -

eda-1.png
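As an alternative to dropping the row after loading, the descriptive row can be skipped while reading the file. The sketch below assumes the descriptive row is the second line of the file; the miniature CSV and its values are made up for illustration and are not taken from the real dataset.

```python
import io
import pandas as pd

# Hypothetical miniature of the file: a header row, then a row that only
# describes the columns, then the actual data (values are made up).
raw = io.StringIO(
    "Country Name,Year,Value\n"
    "#country+name,#date+year,#indicator+value+num\n"
    "Brazil,2015,33.8\n"
    "Brazil,2016,33.9\n"
)

# skiprows=[1] skips file line 1 (the descriptive row); line 0 is still
# used as the header.
sample = pd.read_csv(raw, skiprows=[1])
print(len(sample))
print(sample.dtypes)
```

Because the text-only row never reaches pandas in this version, the numeric columns are parsed as numbers right away.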

Great! Next, let us check for any missing values in the dataframe.

df.isna().sum()

eda-2.png

Good! There are no missing values in the dataframe. Now, let us get some information on the dataframe using the .info() function.

df.info()

eda-3.png

Hmmm. The Year and the Value columns are numerical but have their datatype as "object". This may cause some problems for us while doing the visualizations. To be on the safer side, let us convert the Year column to int and the Value column to float.

df['Value'] = df['Value'].astype(float)
df['Year'] = df['Year'].astype('int64')
df.info()

eda-4.png
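astype() raises an error if a column contains an entry that cannot be parsed as a number. If that happens with your own data, pd.to_numeric() with errors="coerce" is one way around it. This is only a sketch on made-up values, where '..' stands in for a hypothetical non-numeric entry.

```python
import pandas as pd

# Made-up values for illustration; '..' plays the role of a bad entry.
sample = pd.DataFrame({"Year": ["2014", "2015", "2016"],
                       "Value": ["33.7", "..", "33.9"]})

# errors="coerce" turns anything unparseable into NaN instead of raising.
sample["Value"] = pd.to_numeric(sample["Value"], errors="coerce")
sample["Year"] = pd.to_numeric(sample["Year"]).astype("int64")

print(sample.dtypes)
print(sample["Value"].isna().sum())  # how many entries failed to parse
```

Counting the NaN values afterwards tells you how many entries were not numeric, so nothing is lost silently.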

Fantastic! Next, let us see all the unique values in the Indicator Name column. The Indicator Name column contains the name of the environment indicators for which we have the data.

df['Indicator Name'].unique()

eda-5.png

Wow! That's a lot of information packed in one dataset.

Data Visualization

Let us use line charts to plot the information for the following indicators - 'Agricultural land (% of land area)' and 'Forest area (sq. km)'.

First, let's start with the 'Agricultural land (% of land area)' indicator. We will create a new dataframe that only has the details of this indicator.

agri_land = df[df['Indicator Name'] == 'Agricultural land (% of land area)']
agri_land.head()

dv-1.png

Great! Now, let us plot a line chart for this using seaborn to see the trend of the increase/decrease in agricultural land cover in Brazil from 1961 to 2016.

First, we define the figure size. I have defined it as (12, 6). Feel free to use a different figure size. Remember, the format for figure size is (width, height), measured in inches.

Next, we will call the lineplot() function from seaborn and specify the x and y axis values and the dataframe.

We use sns.despine() to get rid of the top and the right hand side border that comes with the chart.

Lastly, we use plt.ylabel() to specify the label of the y-axis and plt.title() to specify the title of the line chart.

fig, ax = plt.subplots(figsize=(12,6))
lineplot = sns.lineplot(x=agri_land['Year'], y=agri_land['Value'], data=agri_land)
sns.despine()
plt.ylabel('% of land area')
plt.title('Agricultural land cover trend in Brazil', pad=20);

After running this code, you will get the below line chart visualization.

line-chart-brazil-agriculture-land.png

Similarly, let us draw a line chart for the 'Forest area (sq. km)' indicator.

forest_land = df[df['Indicator Name'] == 'Forest area (sq. km)']
forest_land = forest_land.reset_index(drop=True)  # renumber the rows from 0
fig, ax = plt.subplots(figsize=(12,6))
lineplot = sns.lineplot(x=forest_land['Year'], y=forest_land['Value'], data=forest_land)
sns.despine()
plt.ylabel('sq. km')  # forest area is measured in sq. km, not % of land area
plt.title('Forest land area trend in Brazil', pad=20);

After running this code, you will get the below line chart.

line-chart-brazil-forest-land.png

Now let us combine the above plots and try to draw some conclusions from the resulting combination line chart.

First, we will use the same code that we used to plot the line chart for the agricultural land cover indicator. We will just add in two more parameters called label and legend to the seaborn lineplot() function.

We will do the same for the forest cover chart as well. However, before we write the code for the forest cover line chart, we need one line of code that combines the two charts. That line of code is -

ax2 = ax.twinx()

twinx() is a method of matplotlib's Axes class. It creates a second Axes that shares the x-axis with the original one but has its own, independent y-axis. This new y-axis is drawn on the right side of the chart.

So, in our visualization, the left side y-axis is for the Agricultural land cover while the right side y-axis is for the forest cover line chart.
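Before applying it to the dataset, here is twinx() on its own, using plain matplotlib. This is a minimal sketch: the two series and their labels are made up, chosen only so that their scales differ widely.

```python
import matplotlib.pyplot as plt

years = [2014, 2015, 2016]
left_values = [1.0, 2.0, 3.0]             # made-up series on a small scale
right_values = [3000.0, 2000.0, 1000.0]   # made-up series on a large scale

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(years, left_values, label="left series")
ax.set_ylabel("left units")

ax2 = ax.twinx()  # new Axes: same x-axis, independent y-axis on the right
ax2.plot(years, right_values, color="r", label="right series")
ax2.set_ylabel("right units")

fig.legend()  # a figure-level legend collects the labels from both Axes
```

Each Axes keeps its own y-scale, so both lines stay readable even though their values differ by orders of magnitude.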

# Line chart for agricultural land cover (left y-axis)
fig, ax = plt.subplots(figsize=(12,6))
lineplot = sns.lineplot(x=agri_land['Year'], y=agri_land['Value'], data=agri_land, ax=ax,
                        label='Agricultural land cover', legend=False)
sns.despine()
ax.set_ylabel('% of land area')
ax.set_title('Agricultural land cover trend in Brazil', pad=20)

# Line chart for forest cover (right y-axis)
ax2 = ax.twinx()
lineplot2 = sns.lineplot(x=forest_land['Year'], y=forest_land['Value'], ax=ax2, color="r",
                         label='Forest Cover', legend=False)
sns.despine(right=False)  # keep the right border, which now carries the second y-axis
ax2.set_ylabel('sq. km')  # forest area is measured in sq. km, not % of land area
ax.figure.legend();

dv-3.png

As agricultural land increased, forest-covered land decreased. However, this visualization isn't good! As you can see, the forest cover indicator has no data before 1990, so the two lines cover different time spans. We need to change this visualization so that we see both trends from 1990 to 2016.
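If you would rather not read the start year off the chart, the range of years that two dataframes share can be computed. The sketch below uses two small made-up frames with a Year column; the names first and second are placeholders, not part of the tutorial's code.

```python
import pandas as pd

# Made-up frames: the first one starts earlier than the second.
first = pd.DataFrame({"Year": [1988, 1989, 1990, 1991], "Value": [1, 2, 3, 4]})
second = pd.DataFrame({"Year": [1990, 1991, 1992], "Value": [9, 8, 7]})

# The shared range runs from the later start year to the earlier end year.
start = int(max(first["Year"].min(), second["Year"].min()))
end = int(min(first["Year"].max(), second["Year"].max()))

# between() includes both endpoints by default.
first_overlap = first[first["Year"].between(start, end)]
second_overlap = second[second["Year"].between(start, end)]
print(start, end)
```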

This time, however, let us write a function that lets you display a single line chart or a combined line chart - according to the parameters you pass.

Let us write the below quick_line_plot() function. It takes five arguments - df1, title and y_label1, which are compulsory, and df2 and y_label2, which are optional.

def quick_line_plot(df1, title, y_label1, df2=None, y_label2=None):
    """
    df1: Dataframe 1
    title: Title of the chart
    y_label1: Y axis label for the plot of dataframe 1
    df2: Dataframe 2 (optional)
    y_label2: Y axis label for the plot of dataframe 2 (optional)
    """
    df1 = df1.sort_values(by='Year')
    year_list = df1.Year.unique()
    year_max = year_list[-1]
    year_min = year_list[0]
    x_tick_list = list(range(year_min, year_max, 2))
    # .iloc[0] takes the first row by position, whatever the index labels are
    Label1 = df1['Indicator Name'].iloc[0]

    fig, ax = plt.subplots(figsize=(12,6))
    lineplot = sns.lineplot(x=df1['Year'], y=df1['Value'], data=df1, ax=ax, label=Label1, legend=False)
    lineplot.set(xlim=(year_min-1, year_max+1))
    plt.xticks(x_tick_list, rotation=45)  # Rotate the x-axis labels
    sns.despine()
    ax.set_ylabel(y_label1)
    ax.set_title(title, pad=20)

    if df2 is not None:
        ax2 = ax.twinx()
        Label2 = df2['Indicator Name'].iloc[0]
        lineplot2 = sns.lineplot(x=df2['Year'], y=df2['Value'], ax=ax2, color="r", label=Label2, legend=False)
        sns.despine(right=False)  # keep the right border for the second y-axis
        ax2.set_ylabel(y_label2)
    ax.figure.legend()

We sort the first dataframe by the values in the Year column. Next, we get the maximum and minimum year values and create a list of years using the range function with the step value as 2. This will help set our x-axis scale of the graph as 1 unit = 2 years.
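One detail of range() is worth knowing here: it stops before its second argument. A quick sketch with the years used in this tutorial shows the effect, along with a variant that includes the last year; the variable names are only for illustration.

```python
year_min, year_max = 1990, 2016

# range() excludes the stop value, so year_max itself is left out.
ticks = list(range(year_min, year_max, 2))
# Adding 1 to the stop value includes year_max when it falls on the step.
ticks_inclusive = list(range(year_min, year_max + 1, 2))

print(ticks[-1], ticks_inclusive[-1])  # prints: 2014 2016
```

Either version works for the chart, since the x-axis limits are set separately; the difference is only whether the final year gets its own tick.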

Then we plot the line chart for the first dataframe.

Now, if and only if the second dataframe has been passed into the arguments of the function, we will proceed with turning the above line chart into a combination line chart.
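A word of caution about reading the indicator name out of a filtered dataframe. Filtering keeps the original index labels, and on a Series with an integer index, square brackets with a number look up a label, not a position. The sketch below uses a made-up four-row frame to show the difference.

```python
import pandas as pd

sample = pd.DataFrame({"Indicator Name": ["A", "A", "B", "B"],
                       "Value": [1.0, 2.0, 3.0, 4.0]})
subset = sample[sample["Indicator Name"] == "B"]  # keeps index labels 2 and 3

# .iloc[0] is positional: the first row, whatever its label.
first_by_position = subset["Indicator Name"].iloc[0]
print(first_by_position)

try:
    subset["Indicator Name"][1]  # label 1 is not in the filtered index
except KeyError:
    print("label 1 not found")
```

Using .iloc[0], or resetting the index after filtering, avoids depending on which labels happen to survive the filter.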

Let us see this function in action. Let us plot the 'Agricultural land (% of land area)' indicator line chart. You'll see that writing this function helped us avoid repeating the code.

quick_line_plot(agri_land, 'Agricultural land cover trend in Brazil', '% of land area')

line-chart-brazil-agriculture-land-2.png

Similarly, let us use the quick_line_plot() function and plot a line chart for the 'Forest area (sq. km)' indicator.

quick_line_plot(forest_land, 'Forest area trend in Brazil', 'sq. km')

line-chart-brazil-forest-land-2.png

Finally, let us make the correct combo line chart of Agricultural land and Forest cover using the quick_line_plot() function.

quick_line_plot(agri_land[agri_land.Year >= 1990], 'Agricultural and Forest land trend in Brazil', '% of land area', 
                forest_land, 'sq. km')

line-chart-brazil-agriculture-and-forest-land.png

Much better! Through this visualization, we can see that forest cover in Brazil declined over the same period in which agricultural land rose.

After looking up this topic on Google, I found that this was indeed true. Here are some links I found about the topic -

Conclusion

Here is a link to the code for the above tutorial.

Thank you for reading this article. I would love to hear from you! Do flood me with comments and share this post if you found it useful! You can get in touch with me via Twitter or LinkedIn.

Until next time! Have a good day! :)
