How to Remember Pandas Index Methods

Jeff Hale ・4 min read

When method names are similar, it's hard to keep them separate in your mind, and that makes them harder to remember.

Pandas has a slew of methods for creating and adjusting a DataFrame index.
This is a brief guide to help you create a little mental space between methods for easier memorization.

The Jupyter Notebook is on Kaggle here.

import pandas as pd
import numpy as np

Make a DataFrame without specifying an index (you get a default index).

df = pd.DataFrame(dict(a=[1,2,3,4], b=[2,5,6,4]))
df
   a  b
0  1  2
1  2  5
2  3  6
3  4  4

Make a DataFrame with an index by using the index keyword argument.

df2 = pd.DataFrame(dict(a=[1,2,3,4], b=[2,5,6,4]), index = [1,2,5,6])
df2
   a  b
1  1  2
2  2  5
5  3  6
6  4  4
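To see what those two constructors actually gave you, it can help to look at the index objects directly. Here is a small sketch (the names default_df and labeled_df are just illustrative, not from the notebook):

```python
import pandas as pd

data = dict(a=[1, 2, 3, 4], b=[2, 5, 6, 4])

default_df = pd.DataFrame(data)                      # no index given
labeled_df = pd.DataFrame(data, index=[1, 2, 5, 6])  # explicit labels

# The default index is a RangeIndex that counts up from 0.
print(type(default_df.index).__name__)  # RangeIndex
print(list(default_df.index))           # [0, 1, 2, 3]

# An explicit index keeps exactly the labels you passed.
print(list(labeled_df.index))           # [1, 2, 5, 6]

# .loc looks a row up by label, .iloc by position.
print(labeled_df.loc[5, "a"])   # 3
print(labeled_df.iloc[2, 0])    # 3
```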

Move a column to be the index with .set_index()

df3 = df2.set_index("a")
df3
   b
a
1  2
2  5
3  6
4  4
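Two details are easy to miss here: set_index() returns a new DataFrame instead of changing the original, and by default the column leaves the columns when it becomes the index. A quick sketch to check both (moved and kept are made-up names for illustration):

```python
import pandas as pd

df2 = pd.DataFrame(dict(a=[1, 2, 3, 4], b=[2, 5, 6, 4]), index=[1, 2, 5, 6])

moved = df2.set_index("a")             # "a" leaves the columns
kept = df2.set_index("a", drop=False)  # "a" stays as a column too

print(list(moved.columns))  # ['b']
print(list(kept.columns))   # ['a', 'b']
print(moved.index.name)     # a

# set_index returns a new DataFrame; df2 itself is unchanged.
print(list(df2.index))      # [1, 2, 5, 6]
```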

Rename the index values from scratch with .index

df3.index = [2,3,4,5]
df3
   b
2  2
3  5
4  6
5  4

Note that index is a property of the DataFrame, not a method, so you assign to it instead of calling it. The assignment changes df3 in place, and the new list needs one label per row.
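A rough sketch of what assigning to .index does, rebuilding df3 by hand so the snippet runs on its own. The length_error variable is only there to capture the exception for illustration:

```python
import pandas as pd

# Same shape as df3 above: column "b", index named "a".
df3 = pd.DataFrame(dict(b=[2, 5, 6, 4]), index=pd.Index([1, 2, 3, 4], name="a"))

df3.index = [2, 3, 4, 5]   # mutates df3 in place
print(list(df3.index))     # [2, 3, 4, 5]
print(df3.index.name)      # None (the old name "a" is gone)

# The new labels must match the number of rows.
length_error = None
try:
    df3.index = [0, 1]     # 2 labels for 4 rows
except ValueError as err:
    length_error = err
    print("ValueError:", err)
```

The failed assignment leaves the index as it was, so df3 still has the labels 2 through 5 afterwards.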

Nuke the index values and start over from 0 with .reset_index()

df4 = df3.reset_index()
df4
   index  b
0      2  2
1      3  5
2      4  6
3      5  4

If you don't want the index to become a column, pass drop=True to reset_index().

df5 = df3.reset_index(drop=True)
df5
   b
0  2
1  5
2  6
3  4
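The new column was called "index" above because df3's index had no name. If the index does have a name, reset_index() uses that name for the column instead. A small sketch comparing the cases (named and unnamed are illustrative names):

```python
import pandas as pd

named = pd.DataFrame(dict(b=[2, 5, 6, 4]), index=pd.Index([2, 3, 4, 5], name="a"))
unnamed = pd.DataFrame(dict(b=[2, 5, 6, 4]), index=[2, 3, 4, 5])

# The old index becomes a column, named after the index if it has a name.
print(list(named.reset_index().columns))           # ['a', 'b']
print(list(unnamed.reset_index().columns))         # ['index', 'b']

# drop=True throws the old index away instead.
print(list(named.reset_index(drop=True).columns))  # ['b']

# Either way, the new index counts up from 0.
print(list(named.reset_index().index))             # [0, 1, 2, 3]
```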

Reorder the rows with .reindex()

df6 = df5.reindex([2,3,1,0])
df6
   b
2  6
3  4
1  5
0  2

Passing a label that isn't in the index adds a row filled with NaN. Because NaN is a float, the integer column is upcast to float.

df7 = df5.reindex([2,3,1,0,6])
df7
     b
2  6.0
3  4.0
1  5.0
0  2.0
6  NaN
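If you'd rather not end up with NaN and floats, reindex() accepts a fill_value argument. A minimal sketch, assuming 0 is a sensible placeholder for your data (with_nan and with_fill are illustrative names):

```python
import pandas as pd

df5 = pd.DataFrame(dict(b=[2, 5, 6, 4]))

with_nan = df5.reindex([2, 3, 1, 0, 6])                 # missing label -> NaN
with_fill = df5.reindex([2, 3, 1, 0, 6], fill_value=0)  # missing label -> 0

print(with_nan["b"].dtype.kind)   # f (float, because of the NaN)
print(with_fill["b"].dtype.kind)  # i (still integer)
print(with_fill["b"].tolist())    # [6, 4, 5, 2, 0]

# reindex returns a new DataFrame; df5 keeps its original order.
print(list(df5.index))            # [0, 1, 2, 3]
```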

Advice

Ideally, add an index when you create your DataFrame with the index keyword argument.

If you're reading from a .csv file, you can set an index column by passing its zero-based column number (or its column name) to index_col.

For example:

df = pd.read_csv(my_csv, index_col=3)

Or pass index_col=False to force pandas not to use the first column as the index.
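Here is a self-contained sketch of those read_csv options. It reads from an in-memory string instead of a real file, and the column names (id, a, b) are made up for the example:

```python
import io
import pandas as pd

# Hypothetical CSV contents, standing in for a file on disk.
csv_text = "id,a,b\n10,1,2\n20,2,5\n30,3,6\n"

by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)   # first column
by_name = pd.read_csv(io.StringIO(csv_text), index_col="id")    # same column, by name
no_index = pd.read_csv(io.StringIO(csv_text), index_col=False)  # keep default index

print(list(by_position.index))  # [10, 20, 30]
print(by_name.index.name)       # id
print(list(no_index.columns))   # ['id', 'a', 'b']
print(list(no_index.index))     # [0, 1, 2]
```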

How to set or change the index:

  • df.set_index() - move a column to the index

  • df.index = [...] - assign new index labels manually

  • df.reset_index() - reset the index to 0, 1, 2 ...

  • df.reindex() - reorder the rows

Word associations to remember:

  • set_index() - move column

  • index - manual

  • reset_index() - reset

  • reindex() - reorder

Wrap

I hope this article helped you create a little mental space to keep Pandas index methods straight. If it did, please give it some love so other people can find it, too.

I write about Data Science, DevOps, Python and other stuff. Check out my other articles if any of that sounds interesting.

Follow me and connect:
Medium
Dev.to
Twitter
LinkedIn
Kaggle
GitHub

Happy indexing!