
Beryl Chebet


Pandas map() function with example

The map() function lets us transform the values in a pandas Series, such as a single DataFrame column, one value at a time. A DataFrame is a table in which each value corresponds to a row and column entry. An example of creating a DataFrame is shown below:

import pandas as pd
table = pd.DataFrame({'Age': [16, 18, 25, 27, 28, 30, 45],
                      'Gender': ['Female', 'Male', 'Female', 'Male', 'Male', 'Male', 'Female'],
                      'Marks': [66, 70, 88, 90, 95, 50, 88]})

The output should look like this:

[Image: table DataFrame]
In the above example we created a DataFrame named table using pd.DataFrame(), with the columns Age, Gender and Marks. The entries for each column are given in the list inside the square brackets.
Now let's jump into using the map() function.
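As a quick check of the structure described above, the following minimal sketch rebuilds the same DataFrame and inspects its shape, column names and first row. The variable names n_rows, n_cols, column_names and first_row are introduced here only for illustration.

```python
import pandas as pd

# Same data as in the article's example
table = pd.DataFrame({
    'Age': [16, 18, 25, 27, 28, 30, 45],
    'Gender': ['Female', 'Male', 'Female', 'Male', 'Male', 'Male', 'Female'],
    'Marks': [66, 70, 88, 90, 95, 50, 88],
})

# shape is a (rows, columns) tuple
n_rows, n_cols = table.shape

# Column labels come from the dictionary keys
column_names = list(table.columns)

# iloc[0] selects the first row by position
first_row = table.iloc[0].to_dict()

print(n_rows, n_cols)
print(column_names)
print(first_row)
```

Each dictionary key becomes a column and each list becomes that column's values, so all lists need the same length.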

Mapping Female to 1 and Male to 0, then storing the result in a new column.

table['sex_num'] = table.Gender.map({'Female': 1, 'Male': 0})
table.loc[:, ['Gender', 'sex_num']]

table['sex_num'] creates a new column called sex_num, and table.Gender.map() replaces each value in the Gender column with the value it is paired with in the dictionary. DataFrame.loc is used here to select multiple columns. In this case we want the columns 'Gender' and 'sex_num'.

[Image: comparison of Gender and sex_num]
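One detail worth knowing about dictionary mapping: a value with no matching key in the dictionary comes back as NaN. The sketch below illustrates this on a small made-up Series. The 'Other' entry and the -1 sentinel are assumptions for illustration and are not part of the example data above.

```python
import pandas as pd

# Made-up data: 'Other' has no key in the mapping dictionary
gender = pd.Series(['Female', 'Male', 'Other', 'Male'])
codes = {'Female': 1, 'Male': 0}

# Values without a matching key become NaN (missing)
mapped = gender.map(codes)
n_missing = int(mapped.isna().sum())

# One possible way to handle them: fill with a sentinel, then convert to integers
filled = mapped.fillna(-1).astype(int)

print(mapped.tolist())
print(filled.tolist())
```

Checking isna() after a dictionary map is a cheap way to catch misspelled or unexpected categories.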

Finding each mark's deviation from the mean mark.

dev_mean = table.Marks.mean()
table['Deviation_From_Mean'] = table.Marks.map(lambda p: p - dev_mean)
table.loc[:, ['Deviation_From_Mean', 'Marks']]

table.Marks.mean() tells pandas to calculate the mean of the Marks column, which we assign to dev_mean. table['Deviation_From_Mean'] creates a new column called Deviation_From_Mean, and map() applies the lambda function lambda p: p - dev_mean to each value of the Marks column. A lambda function can take any number of arguments but evaluates a single expression. It has three parts: the keyword lambda, which must always be included, one or more parameters, and an expression. In lambda p: p - dev_mean, p stands for each of the entries in the Marks column, and the expression p - dev_mean subtracts dev_mean from that entry.

[Image: output]
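To make the role of the lambda more concrete, here is a minimal sketch that computes the same deviations three ways: with the lambda, with an equivalent named function, and with plain column arithmetic. The names mean_mark and deviation_from_mean are hypothetical and chosen only for this illustration.

```python
import pandas as pd

marks = pd.Series([66, 70, 88, 90, 95, 50, 88], name='Marks')
mean_mark = marks.mean()

# The lambda from the example: p is one mark at a time
with_lambda = marks.map(lambda p: p - mean_mark)

# The same logic written as a named function (hypothetical name)
def deviation_from_mean(p):
    return p - mean_mark

with_function = marks.map(deviation_from_mean)

# Alternative: subtract the mean from the whole column at once
vectorized = marks - mean_mark

print(with_lambda.equals(with_function))
print((with_lambda - vectorized).abs().max())
```

For simple arithmetic like this, the column-wide subtraction is usually shorter and faster. map() is more useful when the per-value logic cannot be written as a single column operation.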
