DEV Community

Purva Masurkar

Day 5-7 of 100 Days of ML Code: Data Manipulation with Pandas

"Embrace the struggle, trust the process, and persist with effort."

To start using the Pandas library, we first need to import it into our Python environment, conventionally under the alias pd. For this example, I downloaded the Shop Customer Data dataset from Kaggle, which describes the ideal customers of an imaginary shop. To read the CSV file into a Pandas DataFrame, we can use the pd.read_csv() function.
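
As a rough sketch of the import and read step: the snippet below reads CSV text from an in-memory string so it runs without the Kaggle download. The column names and the file name in the comment are assumptions for illustration, not necessarily those of the real dataset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded file. With the real dataset you
# would pass its path instead, e.g. pd.read_csv("Customers.csv")
# (the file name here is an assumption).
csv_text = """CustomerID,Gender,Age,Annual Income ($),Spending Score (1-100)
1,Male,19,15000,39
2,Female,35,35000,81
3,Female,52,86000,6
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the DataFrame
print(df.shape)    # (number of rows, number of columns)
```

Calling df.head() and df.shape right after loading is a quick way to check that the file was parsed the way you expected.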

Indexing

Indexing in Pandas refers to the process of selecting specific rows and/or columns from a Pandas DataFrame or Series. There are several ways to perform indexing in Pandas:

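iloc[]: This indexer (used with square brackets, not called like a method) allows you to select rows and columns by integer position. Positions start at 0, and slices exclude the end position.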
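
A minimal sketch of position-based selection, assuming a small made-up DataFrame (the columns Gender, Age, and Income are illustrative stand-ins for the Kaggle data):

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

first_row = df.iloc[0]         # first row, returned as a Series
first_two = df.iloc[0:2]       # rows at positions 0 and 1 (end is exclusive)
cell = df.iloc[2, 1]           # row position 2, column position 1 (Age)
block = df.iloc[[0, 3], 0:2]   # rows 0 and 3, first two columns

print(first_two)
print(cell)
```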

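loc[]: This indexer (also used with square brackets) allows you to select rows and columns by label or boolean mask. Unlike position-based slices, label slices include the end label.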
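
A minimal sketch of label-based selection. The string index labels (c1 to c4) and the column names are assumptions chosen so that labels are easy to tell apart from positions:

```python
import pandas as pd

# Hypothetical sample data with explicit row labels.
df = pd.DataFrame(
    {
        "Gender": ["Male", "Female", "Female", "Male"],
        "Age": [19, 35, 52, 23],
        "Income": [15000, 35000, 86000, 59000],
    },
    index=["c1", "c2", "c3", "c4"],
)

row = df.loc["c2"]                           # one row by label
sub = df.loc["c1":"c3", ["Age", "Income"]]   # label slice includes "c3"
older = df.loc[df["Age"] > 30, "Income"]     # boolean mask plus column label

print(sub)
print(older)
```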

Boolean Indexing: This method allows you to filter a DataFrame by a boolean condition.
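
A minimal sketch of boolean indexing, again on made-up data with assumed column names. The comparison produces a Series of True/False values, and passing that mask to the DataFrame keeps only the rows marked True:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
})

mask = df["Age"] > 30   # boolean Series, one value per row
over_30 = df[mask]      # keeps rows where the mask is True

print(mask)
print(over_30)
```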

Filtering

Filtering in Pandas refers to the process of selecting a subset of data from a DataFrame based on certain conditions.
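
A sketch of a few common filtering patterns, assuming illustrative columns (Age, Income, Profession). Conditions are combined with & (and) and | (or), each wrapped in parentheses; isin() checks membership in a list, and query() expresses the same kind of condition as a string:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
    "Profession": ["Artist", "Engineer", "Doctor", "Lawyer"],
})

# Two conditions combined with &; the parentheses are required.
young_high = df[(df["Age"] < 30) & (df["Income"] > 50000)]

# Rows whose Profession is one of the listed values.
selected = df[df["Profession"].isin(["Engineer", "Doctor"])]

# The first filter again, written as a query string.
via_query = df.query("Age < 30 and Income > 50000")

print(young_high)
print(selected)
```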

Updating Rows and Columns

Updating rows and columns in a Pandas DataFrame involves changing the values of specific cells, rows, or columns.

Updating a specific cell:
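
One way to sketch a single-cell update, on made-up data with assumed column names. loc[] addresses the cell by label, iloc[] by position, and at[] is a label-based accessor meant for single values:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

df.loc[0, "Age"] = 20   # row label 0, column "Age"
df.iloc[1, 1] = 36      # row position 1, column position 1 (Age)
df.at[2, "Age"] = 53    # scalar access by label

print(df)
```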

Updating a specific row:
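
A sketch of updating a row, assuming the same kind of made-up data. Assigning a list to df.loc[label] replaces the whole row (the list must match the column order), while passing a list of column names updates only part of it:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

# Replace every value in the row labelled 1.
df.loc[1] = ["Female", 36, 40000]

# Update only some columns of the row labelled 3.
df.loc[3, ["Age", "Income"]] = [24, 60000]

print(df)
```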

Updating a specific column:
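
A sketch of updating whole columns, with assumed column names. Assigning to df["column"] overwrites every value in that column, and the right-hand side can be computed from the existing values:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
})

df["Age"] = df["Age"] + 1               # arithmetic on the whole column
df["Gender"] = df["Gender"].str.upper() # string method on the whole column

print(df)
```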

Along with filter, two methods are commonly used in Pandas for transforming data:

apply: applies a function to each value of a Series, or to each column (axis=0, the default) or each row (axis=1) of a DataFrame.
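
A sketch of the three ways apply is typically used, on made-up data. The column names and the derived values (AgeGroup, the range, the income-to-age ratio) are purely illustrative:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

# Series.apply: the function is called once per value.
age_group = df["Age"].apply(lambda age: "under 30" if age < 30 else "30+")

# DataFrame.apply with the default axis=0: the function receives each column.
col_range = df.apply(lambda col: col.max() - col.min())

# DataFrame.apply with axis=1: the function receives each row.
ratio = df.apply(lambda row: row["Income"] / row["Age"], axis=1)

print(age_group)
print(col_range)
print(ratio)
```

For simple arithmetic like the ratio, plain column operations (df["Income"] / df["Age"]) are usually faster; apply is most useful when the logic does not fit a vectorized expression.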

map: applies a function to each element of a Series. It also accepts a dictionary, in which case each value is looked up and replaced.
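
A minimal sketch of map on made-up data, once with a dictionary and once with a function. The new column names (GenderCode, AgeNextYear) are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
})

# Dictionary lookup: values missing from the dict would become NaN.
df["GenderCode"] = df["Gender"].map({"Male": "M", "Female": "F"})

# Function applied to each element.
df["AgeNextYear"] = df["Age"].map(lambda age: age + 1)

print(df)
```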

Adding/Removing Rows and Columns

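append(): You can use this method to add one or more rows to an existing DataFrame; it returns a new DataFrame rather than modifying the original. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions use pd.concat() instead.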
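
A sketch of adding rows and a column on made-up data. It uses pd.concat() rather than append() so that it also runs on pandas 2.x; the Senior column is a hypothetical example of a derived column:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

# Rows to add, with the same columns as df.
new_rows = pd.DataFrame({"Gender": ["Female"], "Age": [41], "Income": [72000]})

# ignore_index=True renumbers the rows 0..n-1 in the result.
df = pd.concat([df, new_rows], ignore_index=True)

# Adding a column is a plain assignment.
df["Senior"] = df["Age"] >= 50

print(df)
```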

drop(): You can use this method to remove one or more rows or columns from a DataFrame. By default it returns a new DataFrame and leaves the original unchanged.
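
A minimal sketch of drop() on made-up data, using the index= and columns= keywords to make it explicit whether rows or columns are removed:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

without_rows = df.drop(index=[0, 2])        # remove rows labelled 0 and 2
without_col = df.drop(columns=["Income"])   # remove the Income column

print(without_rows)
print(without_col)
print(df.shape)   # the original DataFrame is unchanged
```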

Sorting Data

Sorting data is an important operation in data analysis. In Pandas, you can sort data using the sort_values() method.
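
A sketch of sort_values() on made-up data with assumed column names: ascending by one column (the default), descending, and by several columns with a different direction for each:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

by_age = df.sort_values("Age")                             # ascending
by_income_desc = df.sort_values("Income", ascending=False) # descending

# Gender ascending, then Age descending within each gender.
by_gender_age = df.sort_values(["Gender", "Age"], ascending=[True, False])

print(by_age)
print(by_income_desc)
print(by_gender_age)
```

sort_values() returns a sorted copy and keeps the original row labels, so the index shows where each row came from.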

Grouping and Aggregating Data

Grouping and aggregating data is a common operation in data analysis. In Pandas, you can group data using the groupby() method and aggregate it using the agg() method.
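
A sketch of groupby() followed by agg(), on made-up data. It uses named aggregation, where each output column is given as name=(source column, function); the output names (customers, mean_age, max_income) are assumptions for illustration:

```python
import pandas as pd

# Hypothetical sample data standing in for the Kaggle dataset.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 52, 23],
    "Income": [15000, 35000, 86000, 59000],
})

summary = df.groupby("Gender").agg(
    customers=("Age", "count"),    # number of rows per group
    mean_age=("Age", "mean"),      # average age per group
    max_income=("Income", "max"),  # highest income per group
)

print(summary)
```

The result is indexed by the grouping column, so a single value can be read with, for example, summary.loc["Female", "mean_age"].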
