DEV Community

Mac McCarthy
Mac McCarthy

Posted on

Pandas and Beer

For the first time since its inception in 1982, the Great American Beer Festival has been canceled, in turn, replaced by an opportunity for a lonely pub crawl. Digital passports were handed out, and "attendees" got their first looks at the available breweries in their area:

Alt Text

Now, that view is all well and good. Provides the needed information, along with directions or a website a user could peruse to find beers that can quench their unshakeable thirst during this pandemic.

What is it missing? Python. And Pandas.

After scraping (may write a future post) the data into a CSV, I had a collection separated by three columns: Brewery name, Location (city/state), and Deal.

As always, the first step is to import all of that and turn it into a DataFrame:

import pandas as pd
df = pd.read_csv("breweries.csv")

pd.set_option("max_colwidth", None)

The set_option was an important addition. Otherwise, crucial information would be cut off in the limited column width.

1) Find all breweries in a city

This required a simple column lookup via location:

denver_breweries = df.loc[df["Location"] == "Denver, Colorado"]

This got me a better view of the possibilities around me:

Alt Text

2) Find breweries according to deals

Now, what if I wasn't alone on my boozy adventures? What if I actually had friends that wanted to see me? This is when a Buy One - Get One Free deal would come in handy. I could search by substrings within the Deal column.

def find_bogo_deals(brew_list):    
    new_df = brew_list[brew_list['Deal'].str.contains("BOGO") | brew_list['Deal'].str.contains("buy one, get one", case=False)]
    df = pd.DataFrame(new_df)
    return df

find_bogo_deals(denver_breweries)

I declared the variable denver_breweries at the beginning of the notebook

Now, we are cooking with fire!

Alt Text

But what if we wanted the good stuff? Those special releases that have been brewed and barrel-aged since last year, in anticipation for the fall of 2020. Then we would just change up our substring filter using regex.

import re

def find_specials(brew_list):

    special_list = ["special", "exclusive", "special release"]

    esc_lst = [re.escape(s) for s in special_list]   

    spec_pattern = '|'.join(esc_lst)

    new_df = brew_list[brew_list['Deal'].str.contains(spec_pattern, case=False)]    

    df = pd.DataFrame(new_df)

    return df


find_specials(denver_breweries)

Alt Text

Excellent! This list should ensure that I won't be driving.

Obviously, we could do a lot more manipulation here: splitting up location to be City - State; cleaning up the text in the Deal column to remove newlines; even exporting that CSV into a SQL database to play around some more.

In any case, I hope this post inspired you to break out your notebooks and apply Python & Pandas to your everyday life. GABF deals end October 17th (this Saturday), so don your mask and get out there to support your local breweries in these trying times.

Top comments (0)