Patricia Carro Garcia for Anvil

Posted on Apr 13, 2022 • Edited on Dec 20, 2022

Build a Web App with Pandas

#python #pandas #datascience #webdev

A Web App to Show your Insights to the World

Pandas is one of the most popular data science libraries for analysing and manipulating data. But what if you want to present your data to other people in a visual manner? With Anvil, you can wrangle data with pandas and build a dashboard for it, entirely in Python.

In this tutorial, we're going to build a dashboard app using the Netflix Movies and TV Shows dataset by Shivam Bansal. We'll use Pandas to manipulate our data, then Plotly to visualize it and turn it into an interactive dashboard.

These are the steps that we will cover:

We'll start by creating our Anvil app
Then, we'll design our dashboard using Anvil components
We'll upload and read our CSV data
We'll clean and prepare the data for plotting
Using Plotly, we will build our plots
Finally, we will add some finishing touches to the design

Let's get started!

Step 1: Create your Anvil app

Go to the Anvil editor, click on “Blank App” and choose “Material Design”. We just want an empty app, so we'll delete the current Form1 and then add a new Blank Panel form.

Now let's rename our app. Click on the app’s name, on the top left corner of the screen. This takes us to the General Settings page, where we can modify our app’s name.

Step 2: Design your dashboard

We will build our user interface by adding drag-and-drop components from the Toolbox. First, add a ColumnPanel, where we will fit all our other components. Then, let’s add three Plots into the ColumnPanel - we can adjust their sizes by clicking and dragging their edges (use Ctrl+Drag for finer adjustments).

We’ll also drop a Label into our app and place it above the plots. Select your Label and, in the Properties Panel on the right, change the text to “Netflix Content Dashboard”. Set the font_size to 32 and, if you want, you can customize the font.

Finally, we’ll set the spacing_above property of all the components to large (you can find this under the Layout section of the Properties Panel).

Step 3: Upload your data into Anvil

We need our data to be accessible to our code in Anvil to be able to work with it. We can upload our dataset as a Static Data File, which will make it available in our Server Module.

To do that, first, we need to add the Data Files service. Click the + button in the Sidebar Menu, and select Data Files. Then, add the dataset by drag-and-dropping the file onto the Data Files panel or clicking the Upload button and selecting it from your computer.

Now that our file has been uploaded into Anvil, we can read its contents. Go back to the App Browser and, under Server Code, click on Add Server Module. We will be writing our code in our Server Module, since we need to be on the server side to use pandas and the data API. If you want to learn more about the structure of the web and the differences between Client and Server code, check out our Client vs Server Code in Anvil explainer.

First, we need to make Pandas available in our app. Go to Settings, click on Python versions and select Python 3.10 (Beta) from the Python version dropdown. Pandas is included as part of the Standard base environment, so we’ll go ahead and select it from the Base packages dropdown. Your Python Version and Packages page should now show Python 3.10 (Beta) with the Standard base environment selected.

If you want to try a different base environment, or you want to install other custom packages, you can learn how to in our documentation.

The Python 3.10 environment is only available on a paid plan. If you don’t have a paid plan, don’t worry - just follow these instructions and click Start Trial when prompted.

Now, we can import pandas at the top of the module and write the csv_to_df function, which will fetch our dataset and convert it into a Pandas DataFrame. We will get our file path using data_files['filename'].

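import pandas as pd
from anvil.files import data_files

def csv_to_df(f):
  # data_files[f] gives the path of the uploaded file on the server
  df = pd.read_csv(data_files[f], index_col=0)
  return df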

With this, we are now able to transform our data into a dataframe.
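To see what index_col=0 does without running an Anvil app, here is a minimal sketch that reads a few made-up rows from an in-memory string instead of a Data File. The sample rows and the read_sample helper are assumptions for illustration; only the column names are taken from this tutorial.

```python
import io

import pandas as pd

# Made-up rows that mimic the layout of the CSV (not real data)
SAMPLE_CSV = """show_id,type,country,date_added
s1,Movie,United States,"September 25, 2021"
s2,TV Show,South Africa,"September 24, 2021"
s3,Movie,,"September 23, 2021"
"""

def read_sample(text):
  # index_col=0 turns the first column (show_id) into the row index
  return pd.read_csv(io.StringIO(text), index_col=0)

sample_df = read_sample(SAMPLE_CSV)
print(sample_df.index.tolist())
print(sample_df.columns.tolist())
```

The first column no longer appears among the regular columns, and the empty country field in the last row is read as a missing value.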

Data exploration

With our data in a DataFrame, we can start exploring it. For this, we will want to see print statements as we fiddle with the data, so we’ll create explore – a function in which we’ll call our csv_to_df function and write our print statements.

@anvil.server.callable
def explore():
  netflix_df = csv_to_df('netflix_titles.csv')
  print(netflix_df.head())

We made this function available to the client side by decorating it with @anvil.server.callable, which means we can call it from the client using anvil.server.call and see our print statements in the console. So let's do just that: add the following line to your Form1 __init__ method:

class Form1(Form1Template):
  def __init__(self, **properties):
    self.init_components(**properties)

    anvil.server.call('explore')

All we have to do now is run our app and check the printed output in the console.

After that, we can continue adding print statements to our function on the Server and run it as we explore the data.

Note that pandas will assume you're on a narrow display, so it may not show all your dataframe columns. To avoid this, you will need to modify pandas's display width by adding the line pd.set_option('display.width', 0) to your explore function.

Step 4: Get the data into shape

Before we can plot our data, we need to clean it and prepare it for plotting. We'll do all the necessary transformations to our data inside a function that we'll call prepare_netflix_data, which we'll place above our explore function. That way, when we're done transforming our data, we'll be able to easily print out our results using explore.

The first thing we need to do is call our csv_to_df function to fetch our data.

def prepare_netflix_data():
  netflix_df = csv_to_df('netflix_titles.csv')

Our figures will only use data from the type, country and date_added columns, so we'll slice our DataFrame using the loc property.

netflix_df = netflix_df.loc[:,['type', 'country', 'date_added']]

There are also some missing values that we’ll have to deal with, so that we don't include missing data in our plots. Pandas has a few different ways to handle missing values, but in this case we’ll just use dropna to get rid of them:

netflix_df = netflix_df.dropna()
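dropna can be called with or without a subset argument, and the two forms remove different rows. The sketch below shows the difference on a made-up three-row DataFrame; the rows are illustrative, not taken from the dataset.

```python
import pandas as pd

# Made-up rows: one is missing a country, another is missing a date
toy_df = pd.DataFrame({
  'type': ['Movie', 'TV Show', 'Movie'],
  'country': ['India', None, 'Japan'],
  'date_added': ['January 1, 2020', 'March 5, 2021', None],
})

# Drops every row that has a missing value in any column
all_complete = toy_df.dropna()

# Drops only the rows where 'country' is missing
has_country = toy_df.dropna(subset=['country'])

print(len(all_complete), len(has_country))
```

Without subset, only the fully populated row survives; with subset=['country'], the row with a missing date is kept.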

The country column contains the production country of each movie or TV show on Netflix. However, some of them have more than one country listed. For simplicity, we're going to assume the first one mentioned is the most important one, and we'll ignore the others. We’ll also create a separate DataFrame that only contains the value counts for this column, sorted by country name. This will be useful later on when we input the data into our plots.

netflix_df['country'] = [countries[0] for 
  countries in netflix_df['country'].str.split(',')]
country_counts = pd.DataFrame(
  netflix_df['country'].value_counts().rename_axis(
    'countries').reset_index(name='counts')
  ).sort_values(by=['countries'])
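Here is a small sketch of the same two steps on made-up values, so the shape of country_counts is easier to picture. It uses .str.split(',').str[0] as an alternative spelling of the list comprehension; the sample countries are assumptions for illustration.

```python
import pandas as pd

# Made-up country strings, some listing more than one country
toy_df = pd.DataFrame({
  'country': ['United States, Canada', 'India', 'United States', 'India, France', 'Japan']
})

# Keep only the text before the first comma
toy_df['country'] = toy_df['country'].str.split(',').str[0]

# Count titles per country, then sort alphabetically by country name
country_counts = (
  toy_df['country']
  .value_counts()
  .rename_axis('countries')
  .reset_index(name='counts')
  .sort_values(by='countries')
)
print(country_counts)
```

The result has one row per country with two columns, countries and counts, ordered by country name rather than by count.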

Our date_added variable currently contains only strings, which is not a very easy format to work with when we want to plot something in chronological order. Because of this, we'll convert it into datetime format using to_datetime, which will allow us to easily order the data by year later on.

netflix_df['date_added'] = pd.to_datetime(netflix_df['date_added'])
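As a quick illustration of why the conversion helps, the sketch below parses a few made-up date strings and then sorts them and extracts the year. The explicit format argument is an assumption about how the strings are written; the tutorial's call lets pandas infer it.

```python
import pandas as pd

# Made-up date strings in a "Month day, year" style
dates = pd.Series(['September 25, 2021', 'January 1, 2020', 'March 5, 2021'])

# Assumed format: full month name, day, four-digit year
parsed = pd.to_datetime(dates, format='%B %d, %Y')

# Once parsed, the values sort chronologically and expose .dt accessors
years = parsed.dt.year.tolist()
in_order = parsed.sort_values().dt.strftime('%Y-%m-%d').tolist()
print(years)
print(in_order)
```

Sorting the original strings would have ordered them alphabetically by month name, which is not what we want for a timeline.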

We'll return our transformed dataframe and the country_counts variable that we just created. This is how the full function should look by the end:

def prepare_netflix_data():
  netflix_df = csv_to_df('netflix_titles.csv')

  netflix_df = netflix_df.loc[:,['type', 'country', 'date_added']]
  netflix_df = netflix_df.dropna()
  netflix_df['country'] = [countries[0] for countries in netflix_df['country'].str.split(',')]
  country_counts = pd.DataFrame(
    netflix_df['country'].value_counts().rename_axis('countries').reset_index(name='counts')
    ).sort_values(by=['countries'])
  netflix_df['date_added'] = pd.to_datetime(netflix_df['date_added'])
  return netflix_df, country_counts

Finally, we can print the output of prepare_netflix_data inside our explore function, to make sure everything looks right.

@anvil.server.callable
def explore():
  netflix_df = csv_to_df('netflix_titles.csv')

  # print(netflix_df.head())
  print(prepare_netflix_data())

This is what the output should look like when we run our app:

Now that we're happy with how our data looks, we can move on to building our plots.

Step 5: Build the plots

Our data is in a Pandas DataFrame, so we can only work with it on the Server, where we can access external packages. This means that we’ll have to create our plots in our Server Module.

Our dashboard contains three figures:

A map showing the number of films per production country
A content type pie chart
A line chart of content added through time

Now, we need to create them - import plotly.graph_objects at the top of the Server Module, and write the create_plots function, which should be decorated as @anvil.server.callable. Before anything else, we'll call our prepare_netflix_data function to fetch our transformed data. After that, we'll first create and return our map plot, using Plotly's Scattergeo.

import plotly.graph_objects as go

@anvil.server.callable
def create_plots():
  netflix_df, country_counts = prepare_netflix_data()

  fig1 = go.Figure(
    go.Scattergeo(
      locations=sorted(netflix_df['country'].unique().tolist()), 
      locationmode='country names',  
      text = country_counts['counts'],
      marker= dict(size= country_counts['counts'], sizemode = 'area')))

  return fig1

In order to display our figures, we need to access the output of our Server function on the Client side, so we'll call create_plots using anvil.server.call inside the __init__ method in our Form1 code. Anvil’s Plot component has a figure property, which we can set to the figure that we want to display:

class Form1(Form1Template):
  def __init__(self, **properties):
    self.init_components(**properties)

    # anvil.server.call('explore')

    fig1 = anvil.server.call('create_plots')

    self.plot_1.figure = fig1

We can check that everything is working like we want it to by running our app. This is how it looks so far:

Let's now do the same with our other two plots. We'll add a pie chart and a line chart to our create_plots function:

@anvil.server.callable
def create_plots():
  netflix_df, country_counts = prepare_netflix_data()

  fig1 = go.Figure(
    go.Scattergeo(
      locations=sorted(netflix_df['country'].unique().tolist()), 
      locationmode='country names',  
      text = country_counts['counts'],
      marker= dict(size= country_counts['counts'], sizemode = 'area')))

  # Count titles per type once so the labels and values stay aligned
  type_counts = netflix_df['type'].value_counts()
  fig2 = go.Figure(go.Pie(
    labels=type_counts.index, 
    values=type_counts
  ))

  fig3 = go.Figure(
    go.Scatter(
      x=netflix_df['date_added'].dt.year.value_counts().sort_index().index, 
      y=netflix_df['date_added'].dt.year.value_counts().sort_index()
    ))

  return fig1, fig2, fig3

And we'll call them from the client in the same way we did our previous plot:

class Form1(Form1Template):
  def __init__(self, **properties):
    self.init_components(**properties)

    # anvil.server.call('explore')

    fig1, fig2, fig3 = anvil.server.call('create_plots')

    self.plot_1.figure = fig1
    self.plot_2.figure = fig2
    self.plot_3.figure = fig3

With that, we have a functional dashboard.
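One detail of create_plots worth a closer look is that a chart's labels and values need to line up element by element. The sketch below, on a made-up four-row DataFrame, derives both from a single value_counts result so they cannot drift apart. The variable names are hypothetical, and the resulting lists are the kind of input that could be passed as labels/values to go.Pie or x/y to go.Scatter.

```python
import pandas as pd

# Made-up rows standing in for the prepared DataFrame
toy_df = pd.DataFrame({
  'type': ['Movie', 'TV Show', 'Movie', 'Movie'],
  'date_added': pd.to_datetime(['2019-05-01', '2020-02-10', '2020-07-04', '2021-01-15']),
})

# One entry per category: the index holds the labels, the values hold the counts
type_counts = toy_df['type'].value_counts()
pie_labels = type_counts.index.tolist()
pie_values = type_counts.tolist()

# Titles added per year, in chronological order, computed once for both axes
per_year = toy_df['date_added'].dt.year.value_counts().sort_index()
line_x = per_year.index.tolist()
line_y = per_year.tolist()

print(pie_labels, pie_values)
print(line_x, line_y)
```

Computing the counts once also avoids repeating the same value_counts call for each axis.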

Step 6: Finishing touches

With everything set up, we can move onto customizing our plots and styling our app. For my dashboard, I decided to go with a Netflix-like color scheme.

We'll use Plotly's built-in styling capabilities to modify the look of our figures. Let's start by updating the plots' markers. We need to add some lines of code to our current plots to set their colors, sizes and positioning.

@anvil.server.callable
def create_plots():
  netflix_df, country_counts = prepare_netflix_data()

  fig1 = go.Figure(
      go.Scattergeo(
      locations=sorted(netflix_df['country'].unique().tolist()), 
      locationmode='country names',  
      text = country_counts['counts'],
      marker= dict(
        size= country_counts['counts'],
        line_width = 0,
        sizeref = 2,
        sizemode = 'area',
        color='#D90707' # Making the map bubbles red
      ))
  )

  # Count titles per type once so the labels and values stay aligned
  type_counts = netflix_df['type'].value_counts()
  fig2 = go.Figure(go.Pie(
    labels=type_counts.index, 
    values=type_counts,
    marker=dict(colors=['#D90707', '#A60311']), # Making the pie chart two different shades of red
    hole=.4, # Adding a hole to the middle of the chart
    textposition= 'inside', 
    textinfo='percent+label'
  ))

  fig3 = go.Figure(
    go.Scatter(
      x=netflix_df['date_added'].dt.year.value_counts().sort_index().index, 
      y=netflix_df['date_added'].dt.year.value_counts().sort_index(),
      line=dict(color='#D90707', width=3) # Making the line red
    ))

This is how the dashboard looks with the modified markers:

Now, let's move on to modifying our plots' layouts, for which we'll use Plotly’s update_layout inside our create_plots function. With it, we'll be able to change our figures' margins, set their titles and background color, and make other small styling changes.

  # Still inside create_plots, after the three figures have been created
  fig1.update_layout(
    title='Production countries',
    font=dict(family='Raleway', color='white'), # Customizing the font
    margin=dict(t=60, b=30, l=0, r=0), # Changing the margin sizes of the figure
    paper_bgcolor='#363636', # Setting the card color to grey
    plot_bgcolor='#363636', # Setting background of the figure to grey
    hoverlabel=dict(font_size=14, font_family='Raleway'),
    geo=dict(
      framecolor='rgba(0,0,0,0)',
      bgcolor='rgba(0,0,0,0)',
      landcolor='#7D7D7D',
      lakecolor='rgba(0,0,0,0)'))

  fig2.update_layout(
    title='Content breakdown by type',
    margin=dict(t=60, b=30, l=10, r=10),
    showlegend=False,
    paper_bgcolor='#363636',
    plot_bgcolor='#363636',
    font=dict(family='Raleway', color='white'))

  fig3.update_layout(
    title='Content added over time',
    margin=dict(t=60, b=40, l=50, r=50),
    paper_bgcolor='#363636',
    plot_bgcolor='#363636',
    font=dict(family='Raleway', color='white'),
    hoverlabel=dict(font_size=14, font_family='Raleway'))

  return fig1, fig2, fig3
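The three layouts repeat the same background and font settings. One optional way to keep them in sync is sketched below with plain dictionaries; BASE_LAYOUT and themed_layout are hypothetical names introduced for illustration, not part of Plotly or Anvil.

```python
# Settings shared by every figure in the dashboard (values taken from the layouts above)
BASE_LAYOUT = {
  'paper_bgcolor': '#363636',
  'plot_bgcolor': '#363636',
  'font': {'family': 'Raleway', 'color': 'white'},
}

def themed_layout(title, **overrides):
  # Copy the shared settings, then add the title and any per-figure options
  layout = dict(BASE_LAYOUT)
  layout['title'] = title
  layout.update(overrides)
  return layout

pie_layout = themed_layout(
  'Content breakdown by type',
  margin={'t': 60, 'b': 30, 'l': 10, 'r': 10},
  showlegend=False,
)
# Inside create_plots this could be applied as: fig2.update_layout(**pie_layout)
print(pie_layout)
```

Since update_layout accepts keyword arguments, unpacking the dictionary with ** passes the same options as writing them out by hand.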

Let's now change the background colour of our app to a dark grey. To do so, open theme.css, which is in the Assets section of the App Browser. Inside it, use Ctrl+F to find the body section, and add the line background-color: #292929; to it. This is how it should look:

body {
  font-family: Roboto, Noto, Arial, sans-serif;
  font-size: 14px;
  line-height: 1.4286;
  background-color: #292929;
}

After adjusting the plot sizes and changing our Label's color to white, our dashboard is ready to be shared! All you need to do now is publish your app to make it available to the world.

Clone the App

If you want to check out the source code for our Netflix dashboard, click on the link below and clone the app:

Clone the app

This app, as cloned, will only work if you're on a Personal plan or above, or on a Full Python interpreter trial.

New to Anvil?

If you're new here, welcome! Anvil is a platform for building full-stack web apps with nothing but Python. No need to wrestle with JS, HTML, CSS, Python, SQL and all their frameworks – just build it all in Python.

Yes – Python that runs in the browser. Python that runs on the server. Python that builds your UI. A drag-and-drop UI editor. We even have a built-in Python database, in case you don’t have your own.

Why not have a play with the app builder? It's free! Click here to get started:

Get building
