DEV Community

Muhammad

Posted on • Originally published at muhammadraza.me on

My Favorite Pandas Tricks

In this post I'll share my favorite pandas tricks, the ones I use most when doing data analysis.

  • Finding unique values
import pandas as pd
data = pd.read_csv('https://gist.githubusercontent.com/tiangechen/b68782efa49a16edaf07dc2cdaa855ea/raw/0c794a9717f18b094eabab2cd6a6b9a226903577/movies.csv')
data.Film.unique()

This returns only the unique values in the Film column of the CSV.
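As a quick sketch of what `unique()` gives back, the example below uses a small hand-made DataFrame (the rows are made up for illustration and are not taken from the CSV above). It also shows `nunique()` and `value_counts()`, two related Series methods that are handy in the same situation.

```python
import pandas as pd

# Hypothetical sample rows, standing in for the movies CSV
movies = pd.DataFrame({
    "Film": ["Up", "Tangled", "Up", "Juno"],
    "Genre": ["Animation", "Animation", "Animation", "Comedy"],
})

# Distinct values, in order of first appearance
unique_films = movies["Film"].unique()

# Number of distinct values
film_count = movies["Film"].nunique()

# How many rows each distinct value has
genre_counts = movies["Genre"].value_counts()

print(unique_films)
print(film_count)
print(genre_counts)
```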

  • Filtering Data

Let's say you are only looking for movies in the dataset that are comedies with an audience score above 50. Filtering with boolean conditions is really useful in this case.

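# Bracket indexing is needed because the column name contains spaces and a percent sign
new_data = data[(data['Audience score %'] > 50) & (data['Genre'] == 'Comedy')]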
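
Because the column name `Audience score %` contains spaces and a percent sign, it cannot be reached with attribute access (`data.Something`), so bracket indexing is needed. Here is a minimal sketch on made-up rows; the sample data and the replacement name `audience_score` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample rows using the same column names as the CSV
movies = pd.DataFrame({
    "Film": ["A", "B", "C", "D"],
    "Genre": ["Comedy", "Comedy", "Drama", "Comedy"],
    "Audience score %": [70, 40, 80, 55],
})

# Build a boolean mask, then use it to select the matching rows
mask = (movies["Audience score %"] > 50) & (movies["Genre"] == "Comedy")
comedies = movies[mask]

# Optional: rename the awkward column so attribute access works
renamed = movies.rename(columns={"Audience score %": "audience_score"})
same_rows = renamed[(renamed.audience_score > 50) & (renamed.Genre == "Comedy")]

print(comedies)
```

Each condition needs its own parentheses, since `&` binds more tightly than `>` and `==` in Python.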
  • Saving to CSV

Pandas has a function, to_csv, that saves data to a CSV file. For instance, to save all the unique movie names, we first have to convert the array returned by unique() into a DataFrame:

uniq = data.Film.unique()
out = pd.DataFrame(uniq)
out.to_csv('uniq.csv')

This creates a CSV file, uniq.csv, containing the unique movie names.
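
Note that by default `to_csv` also writes the row index, and a DataFrame built from a bare array gets a column header of `0`. Below is a small sketch of one way to get a cleaner file. It writes to an in-memory buffer instead of a path so it can be run anywhere, and the sample rows are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical sample rows, standing in for the movies CSV
movies = pd.DataFrame({"Film": ["Up", "Tangled", "Up", "Juno"]})

# Name the column explicitly instead of relying on the default header 0
out = pd.DataFrame({"Film": movies["Film"].unique()})

# The buffer stands in for a file path such as 'uniq.csv'
buffer = io.StringIO()
out.to_csv(buffer, index=False)  # index=False skips the row-number column

csv_text = buffer.getvalue()
print(csv_text)
```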

  • Groupby

This lets us split the data into groups and summarize each one. For instance, if we want the number of movies in each genre, we can use groupby.

data.groupby('Genre').Film.agg(['count'])

This outputs the number of movies in each genre. You can also pass other aggregations such as sum, mean and median, which make sense on numeric columns.
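
To illustrate the other aggregations, here is a minimal sketch that applies several of them at once to a numeric column. The rows are made up for illustration; only the column names mirror the CSV.

```python
import pandas as pd

# Hypothetical sample rows using the same column names as the CSV
movies = pd.DataFrame({
    "Film": ["A", "B", "C", "D"],
    "Genre": ["Comedy", "Comedy", "Drama", "Comedy"],
    "Audience score %": [70, 40, 80, 55],
})

# Several aggregations at once on a numeric column, one row per genre
summary = movies.groupby("Genre")["Audience score %"].agg(["count", "mean", "median"])

print(summary)
```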

  • String Operations

You can also use string operations when working with text data.

To lowercase a specific column:

data['Genre'] = data['Genre'].str.lower()

This will lowercase the Genre column in the data. You can also use str.upper() for uppercase, and you can apply your own regex by using str.replace().
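
Here is a short sketch of those string methods together, on a made-up Series of genre labels (the values are assumptions for illustration). In recent pandas versions `str.replace` treats the pattern as a literal string unless `regex=True` is passed, so the example passes it explicitly.

```python
import pandas as pd

# Hypothetical genre labels with inconsistent case and stray whitespace
genres = pd.Series(["Comedy", "comedy ", "ROMANCE", "Drama"])

upper = genres.str.upper()

# Strip whitespace, then lowercase, so the variants collapse together
cleaned = genres.str.strip().str.lower()

# Regex replacement: remove every lowercase vowel
no_vowels = cleaned.str.replace(r"[aeiou]", "", regex=True)

print(cleaned.tolist())
print(no_vowels.tolist())
```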

Anyway, these were my favorite things about pandas, and I hope you enjoyed reading about them. Let me know in the comments what your favorite thing about pandas is.

Top comments (4)

Mpho Mphego

Nice post.
I created a pandas utility package that's available on PyPI.
Contributions are very welcome: github.com/mmphego/pandas_utility

Check it out and contribute where you can.

Happy coding.

Muhammad

Nice !!

Matt Ellen

I haven't used pandas, so I apologise if I'm jumping the gun, but your filtering example doesn't look like valid python.

Does pandas do something to allow python syntax to change?

Muhammad

No, it's not changing the syntax. It's actually the column name, Audience score %, which is really weird, but it's totally correct. In this case you might need to change the column name, since it has a lot of spaces.