
Jeff Hale


How to Remember Pandas Index Methods

When method names are similar, it's hard to keep them separate in your mind, which makes them harder to remember.

Pandas has a slew of methods for creating and adjusting a DataFrame index.
This is a brief guide to help you create a little mental space between methods for easier memorization.

The Jupyter Notebook is on Kaggle here.

import pandas as pd
import numpy as np

Make a DataFrame without specifying an index (you get a default index).

df = pd.DataFrame(dict(a=[1,2,3,4], b=[2,5,6,4]))
df
   a  b
0  1  2
1  2  5
2  3  6
3  4  4

Make a DataFrame with an index by using the index keyword argument.

df2 = pd.DataFrame(dict(a=[1,2,3,4], b=[2,5,6,4]), index = [1,2,5,6])
df2
   a  b
1  1  2
2  2  5
5  3  6
6  4  4

Move a column to be the index with .set_index()

df3 = df2.set_index("a")
df3
   b
a
1  2
2  5
3  6
4  4

Rename the index values from scratch with .index

df3.index = [2,3,4,5]
df3
   b
2  2
3  5
4  6
5  4

Note that index is a property of the DataFrame, not a method, so the syntax is different: you assign to it instead of calling it, and the assignment changes the DataFrame in place.
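To make the method-versus-property difference concrete, here is a minimal sketch that rebuilds df2 and df3 from the steps above. It assumes nothing beyond pandas itself, and the printed checks are only there for illustration.

```python
import pandas as pd

df2 = pd.DataFrame(dict(a=[1, 2, 3, 4], b=[2, 5, 6, 4]), index=[1, 2, 5, 6])

# set_index() is a method: it returns a new DataFrame and leaves df2 alone.
df3 = df2.set_index("a")
print(list(df2.columns))   # df2 still has both columns
print(df3.index.name)      # the new index keeps the column's name

# .index is a property: assigning to it relabels df3 in place.
df3.index = [2, 3, 4, 5]
print(df3.index.name)      # a plain list carries no name
```

This is also why the "a" label above the index disappears in the output after the assignment.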

Nuke the index values and start over from 0 with .reset_index()

df4 = df3.reset_index()
df4
   index  b
0      2  2
1      3  5
2      4  6
3      5  4
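The new column is called index here because df3's index had no name after the list assignment. As a small sketch of the difference (the variable names named, unnamed, and round_trip are made up for this example), compare resetting a named index with resetting an unnamed one:

```python
import pandas as pd

df2 = pd.DataFrame(dict(a=[1, 2, 3, 4], b=[2, 5, 6, 4]), index=[1, 2, 5, 6])

# A named index comes back as a column with that name.
named = df2.set_index("a")
round_trip = named.reset_index()
print(list(round_trip.columns))

# An unnamed index comes back as a column called "index".
unnamed = named.copy()
unnamed.index = [2, 3, 4, 5]
print(list(unnamed.reset_index().columns))
```

So set_index() followed by reset_index() moves the column into the index and back out again.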

If you don't want the index to become a column, pass drop=True to reset_index().

df5 = df3.reset_index(drop=True)
df5
   b
0  2
1  5
2  6
3  4

Reorder the rows with .reindex()

df6 = df5.reindex([2,3,1,0])
df6
   b
2  6
3  4
1  5
0  2

Passing a label that isn't in the index adds a row filled with NaN. The integer column becomes float so it can hold the NaN.

df7 = df5.reindex([2,3,1,0,6])
df7
     b
2  6.0
3  4.0
1  5.0
0  2.0
6  NaN
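If NaN isn't what you want for the missing label, reindex() accepts a fill_value argument. The sketch below rebuilds df5 from above; the choice of 0 as the placeholder is only an assumption for illustration.

```python
import pandas as pd

df5 = pd.DataFrame(dict(b=[2, 5, 6, 4]))

# Default: the missing label 6 gets NaN, which forces the column to float.
with_nan = df5.reindex([2, 3, 1, 0, 6])
print(with_nan["b"].dtype)

# fill_value supplies a placeholder for missing labels instead of NaN.
filled = df5.reindex([2, 3, 1, 0, 6], fill_value=0)
print(filled)
```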

Advice

Ideally, set the index when you create your DataFrame by passing the index keyword argument.

If you're reading from a .csv file, you can choose the index column by passing its column number (or its name) as index_col.

For example:

df = pd.read_csv(my_csv, index_col=3)

Or pass index_col=False to keep pandas from using the first column as the index.
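Here is a small runnable sketch of those index_col options. The CSV text and its column names (id, a, b) are hypothetical, and io.StringIO stands in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV contents, used in place of a real file.
csv_text = "id,a,b\n10,1,2\n20,2,5\n30,3,6\n"

# Pick the index column by position...
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# ...or by name.
by_name = pd.read_csv(io.StringIO(csv_text), index_col="id")

# index_col=False keeps every column as data and uses the default 0, 1, 2 ... index.
no_index = pd.read_csv(io.StringIO(csv_text), index_col=False)

print(by_position)
print(no_index)
```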

How to set or change the index:

  • df.set_index() - move a column to the index

  • df.index - add an index manually

  • df.reset_index() - reset the index to 0, 1, 2 ...

  • df.reindex() - reorder the rows

Word associations to remember:

  • set_index() - move column

  • index - manual

  • reset_index() - reset

  • reindex() - reorder
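The two that are easiest to mix up are index and reindex(), since both take a list of labels. A minimal sketch of the difference, using made-up variable names (reordered, relabeled):

```python
import pandas as pd

df = pd.DataFrame(dict(b=[2, 5, 6, 4]))  # default index 0, 1, 2, 3

# reindex(): rows move, and each value keeps its original label.
reordered = df.reindex([2, 3, 1, 0])
print(reordered.loc[0, "b"])   # label 0 still points at the value 2

# .index assignment: rows stay put, and the labels are overwritten.
relabeled = df.copy()
relabeled.index = [2, 3, 1, 0]
print(relabeled.loc[0, "b"])   # label 0 now points at the last row, value 4
```

The same list gives different results: reindex() looks rows up by their existing labels, while index pastes new labels onto the rows in their current order.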

Wrap

I hope this article helped you create a little mental space to keep Pandas index methods straight. If it did, please give it some love so other people can find it, too.

I write about Data Science, DevOps, Python and other stuff. Check out my other articles if any of that sounds interesting.

Follow me and connect:
Medium
Dev.to
Twitter
LinkedIn
Kaggle
GitHub


Happy indexing!
