DEV Community

josh friedlander


4 Easy Pieces

Here are some minor hacks to improve your experience with Pandas and Jupyter Notebook. Jupyter Notebooks are incredible tools; Donald Knuth spent years unsuccessfully trying to make literate programming a thing, but this might finally be the form in which it takes off.

(By the way, you may think that this post is named after the "Feynman Lectures for Dummies" book Six Easy Pieces, but in fact that book probably referenced the 1970 film Five Easy Pieces, to which this post serves as a prequel in title, though not in substance.)

0

I usually start off my notebooks, after the usual import statements, with:

pd.options.display.max_rows = 999
pd.options.display.max_columns = 99

It's often convenient to be able to see a whole df, or a big chunk of it, while working with it. Also, if your notebook is getting out of hand because you always use the same Untitled7.ipynb (who among us has not etc.), consider using the Codefolding extension in nbextensions so you can hide away all the old code you swear you might need someday.
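If you'd rather not change the display limits for the whole notebook, here is a minimal sketch of the same settings scoped to a single block. The frame `df` and its columns are made up for illustration; `pd.set_option` and `pd.option_context` are the standard Pandas calls for this.

```python
import pandas as pd

# Hypothetical demo frame: 200 rows, 3 columns.
df = pd.DataFrame({"a": range(200), "b": range(200), "c": range(200)})

# Global settings; equivalent to assigning to pd.options.display.*
pd.set_option("display.max_rows", 999)
pd.set_option("display.max_columns", 99)

# Temporary override: these limits only apply inside the with-block.
with pd.option_context("display.max_rows", 10, "display.max_columns", 5):
    rows_inside = pd.get_option("display.max_rows")
    short_repr = repr(df)  # truncated view of the frame

# Back outside, the global limit of 999 rows is in force again.
full_repr = repr(df)  # all 200 rows
```

This way one giant frame can be peeked at in truncated form without undoing the generous defaults everywhere else.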

1

If you've run a calculation in a Jupyter notebook but forgot to assign it, Jupyter's got your back. The variable _ holds the most recent cell output, so in the next cell you can assign the result from it.

[1] long_calculation.solve() 
# Aaaah, no!
[2] result = _
# Phew

2

Ben Lindsay says

One of the most underrated #Pandas functions in #Python is .query(). I use it all the time.
data = data.query('age==42')
looks so much nicer than:
data = data[data['age'] == 42]
And it allows chaining like:
data = data.query('age >18').query('age < 32')

I tend to agree. I like brevity, although it's hard to get over how grotesquely unpythonic it looks (even by the already low standards of Pandas).
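To make the comparison concrete, here is a small sketch on a made-up frame (the names and ages are invented for illustration). It checks that .query() and boolean indexing select the same rows, and shows that local variables can be referenced inside the query string with an @ prefix.

```python
import pandas as pd

# Hypothetical data for illustration only.
data = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cy", "Di"], "age": [17, 25, 42, 31]}
)

# The two spellings from the quote select the same rows.
by_query = data.query("age == 42")
by_mask = data[data["age"] == 42]

# Chained queries, and the same filter written as a single expression.
chained = data.query("age > 18").query("age < 32")
combined = data.query("age > 18 and age < 32")

# Local Python variables are available inside the string via @.
min_age = 18
with_var = data.query("age > @min_age")
```

The single-expression form avoids building an intermediate frame, though for notebook-sized data the difference is mostly cosmetic.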

2.5

Lastly: you can't write a chained comparison directly in Pandas (18 <= df.age <= 60 raises a ValueError, because Python asks for the truth value of a whole Series), but a Stack Overflow rando points out that you sort of can, like this:

df[df.eval('18 <= age <= 60')]

or with the abovementioned query

df.query('18 <= age <= 60')

It all goes through some kind of eval-like parsing of a string expression, which is generally not a good idea in Python. Yeah, I know, I don't like it either.
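If the string parsing bothers you, here is a short sketch putting the options side by side on a made-up frame. It shows the plain chained comparison failing, the string-based versions using df.eval and df.query, and two string-free alternatives: Series.between (inclusive on both ends by default) and an explicit pair of comparisons joined with &.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"age": [12, 18, 35, 60, 75]})

# The plain Python chained comparison fails: it expands to
# (18 <= df.age) and (df.age <= 60), and `and` needs a single bool.
try:
    df[18 <= df.age <= 60]
    raised = False
except ValueError:
    raised = True

# String-based versions, parsed by Pandas.
via_eval = df[df.eval("18 <= age <= 60")]
via_query = df.query("18 <= age <= 60")

# String-free alternatives.
via_between = df[df["age"].between(18, 60)]
via_ops = df[(df["age"] >= 18) & (df["age"] <= 60)]
```

All four working versions select the same rows here, so the choice comes down to taste.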

Happy data scienc-ing.
