DEV Community

MdMusfikurRahmanSifar


Pandas-Basics In Short

Pandas is a Python library for analysing data. It is table-oriented, like a spreadsheet in Excel, unlike NumPy, which is matrix-oriented. It lets us analyse, manipulate and explore large amounts of data.

For the basics, we will cover a few topics:

  • Series
  • DataFrame
  • Missing data
  • Groupby
  • Merging, joining & concatenating

To start, we need to import NumPy and pandas:

import numpy as np
import pandas as pd

Series:

Syntax: pd.Series(data, index)

Here data and index can be set according to our needs. The data can be a list, a NumPy array or even a dictionary.

Here are some examples (try out the variations):

If no index is given, a default integer index starting from 0 is used.
With a dictionary, the keys become the index and the values become the data.
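A minimal sketch of the variations described above. The labels and numbers are made up for illustration, and the variable names (s_list, s_array and so on) are just hypothetical names for this example.

```python
import numpy as np
import pandas as pd

labels = ["a", "b", "c"]   # hypothetical index labels
values = [10, 20, 30]      # hypothetical data

s_list = pd.Series(values, labels)               # list as data, explicit index
s_array = pd.Series(np.array(values), labels)    # NumPy array as data
s_default = pd.Series(values)                    # no index given: 0, 1, 2
s_dict = pd.Series({"a": 10, "b": 20, "c": 30})  # keys become index, values become data

print(s_list)
print(s_default)
print(s_dict)
```

Printing s_list and s_dict should show the same labels and values, since both end up with the same index and data.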
A Series is a building block that we rarely use on its own. Think of it as one piece of a larger structure, like a string in a list. It does not display as a full table, only as a table-like listing of index and values. What we will mostly use is the DataFrame, which gives us the tabular output we expect.

DataFrame:

Syntax: pd.DataFrame(data, index, columns)
The DataFrame is the fundamental object in pandas, so we need to know some of its common uses:

  • Selection and indexing
  • Conditional selection
  • Creating new column
  • Removing rows and columns
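
Before going through these, here is a minimal sketch of building a DataFrame with the syntax above. The numbers, the row labels A to C and the column names W to Z are assumptions made up for this example. The name arr follows the syntax lines used in this post.

```python
import numpy as np
import pandas as pd

data = np.arange(12).reshape(3, 4)  # sample numbers: 3 rows x 4 columns
arr = pd.DataFrame(data, index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

print(arr)
```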

Selection and indexing:

Selecting a row or column:

Syntax:
Column selection: arr[column] returns a Series
Row selection by label: arr.loc[row] returns a Series
Row selection by position: arr.iloc[row number] (positions start from 0) also returns a Series
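
The three forms above can be sketched like this, assuming a small made-up DataFrame with row labels A to C and columns W to Z.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
arr = pd.DataFrame(np.arange(12).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

col = arr["W"]               # column by name: a Series
row_by_label = arr.loc["B"]  # row by index label: a Series
row_by_pos = arr.iloc[1]     # row by position (0-based): the same row B

print(col)
print(row_by_label)
print(row_by_pos)
```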


Selecting range:

By rows-
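
A minimal sketch of selecting a range of rows, assuming the same kind of made-up DataFrame (rows A to C, columns W to Z). Note that a label slice with .loc includes both ends, while a position slice with .iloc leaves out the end.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
arr = pd.DataFrame(np.arange(12).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

rows_by_label = arr.loc["A":"B"]  # rows A and B (end label included)
rows_by_pos = arr.iloc[0:2]       # positions 0 and 1 (end position excluded)

print(rows_by_label)
```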


By columns-
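
A minimal sketch of selecting several columns, again on made-up data. Passing a list of names is one way; slicing with .loc or .iloc is another. Which one the original screenshots used is not known, so treat these as possible variations.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
arr = pd.DataFrame(np.arange(12).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

two_cols = arr[["W", "Y"]]       # list of column names: a DataFrame
col_slice = arr.loc[:, "X":"Z"]  # all rows, columns X through Z (end included)
col_by_pos = arr.iloc[:, 1:4]    # all rows, column positions 1 to 3

print(two_cols)
print(col_slice)
```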


Selecting data:

By combining the previous methods, we can get a single value (or a smaller block of values) from the DataFrame.
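
As a sketch, assuming a made-up DataFrame with rows A to C and columns W to Z, a row selector and a column selector can be combined like this:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
arr = pd.DataFrame(np.arange(12).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

value = arr.loc["B", "Y"]   # row label, column label
same = arr.iloc[1, 2]       # row position, column position
chained = arr["Y"]["B"]     # pick the column first, then the row label

subset = arr.loc[["A", "C"], ["W", "Z"]]  # a smaller block of rows and columns

print(value)
print(subset)
```

All three single-value forms point at the same cell here. For assignment, .loc or .iloc is generally the safer choice over chained indexing.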


Conditional Selection:

Here we apply a condition. Applying a condition on its own, such as arr > 0, gives a boolean result. Passing that result back into the DataFrame, as in arr[arr > 0], keeps the values where the condition is True and puts NaN where it is False.
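
A minimal sketch of both steps, using made-up numbers that include some negative values. The last line, filtering whole rows by a condition on one column, is a common extension that is assumed here rather than taken from the original screenshots.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: values from -4 to 7
arr = pd.DataFrame(np.arange(-4, 8).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

mask = arr > 0        # DataFrame of True/False
positive = arr[mask]  # values where True, NaN where False

rows = arr[arr["W"] > 0]  # keep only the rows where column W is positive

print(mask)
print(positive)
print(rows)
```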


We can also combine conditions. In pandas we combine them with '&' and '|' instead of Python's 'and' and 'or', and each condition needs its own parentheses.
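
A short sketch of combined conditions on made-up data. The parentheses matter because '&' and '|' bind more tightly than comparisons like '>'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: values from -4 to 7
arr = pd.DataFrame(np.arange(-4, 8).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

both = arr[(arr["W"] >= 0) & (arr["Z"] > 5)]   # both conditions must hold
either = arr[(arr["W"] < 0) | (arr["Z"] > 5)]  # at least one must hold

print(both)
print(either)
```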

Creating columns:

Syntax: arr[column name] = data for the column
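
A minimal sketch, assuming a made-up DataFrame. The new column names "new" and "flag" are hypothetical. The data can be computed from existing columns or given directly, as long as its length matches the number of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
arr = pd.DataFrame(np.arange(12).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

arr["new"] = arr["W"] + arr["X"]   # built from existing columns
arr["flag"] = [True, False, True]  # given directly, one value per row

print(arr)
```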

Removing a row or column:

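When removing with .drop():
axis=0 -> Row
axis=1 -> Column
inplace=True applies the change to the DataFrame itself; without it, the original stays unchanged and a new DataFrame is returned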
By default, axis=0.
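
A minimal sketch of these options with DataFrame.drop, which is assumed to be the method the original screenshots showed. The data is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
arr = pd.DataFrame(np.arange(12).reshape(3, 4),
                   index=["A", "B", "C"], columns=["W", "X", "Y", "Z"])

no_row = arr.drop("A")          # axis=0 by default: returns a copy without row A
no_col = arr.drop("W", axis=1)  # returns a copy without column W

arr.drop("Z", axis=1, inplace=True)  # changes arr itself, returns None

print(no_row)
print(no_col)
print(arr)
```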

Missing value:

Adding missing value:

Use np.nan in the data.
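
For instance, a small made-up DataFrame with a few missing entries could be built like this (the column names and values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values marked by np.nan
df = pd.DataFrame({"A": [1, 2, np.nan],
                   "B": [5, np.nan, np.nan],
                   "C": [1, 2, 3]})

print(df)
print(df.isna())  # True wherever a value is missing
```

A column that contains np.nan is stored as floating point, even if the other values are whole numbers.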


Removing NaN:

By default, .dropna() removes every row that contains a NaN.
To remove columns instead, use .dropna(axis=1).
We can also keep rows or columns that have at least a certain number of non-missing values, using the thresh parameter.
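
A minimal sketch of the three variations on a made-up DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values
df = pd.DataFrame({"A": [1, 2, np.nan],
                   "B": [5, np.nan, np.nan],
                   "C": [1, 2, 3]})

rows_kept = df.dropna()            # drops every row that has a NaN
cols_kept = df.dropna(axis=1)      # drops every column that has a NaN
at_least_two = df.dropna(thresh=2) # keeps rows with 2 or more non-missing values

print(rows_kept)
print(cols_kept)
print(at_least_two)
```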

Filling missing values:
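
Missing values can be filled with .fillna(). A minimal sketch on made-up data follows; filling with a constant and filling with the column mean are two common choices, assumed here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values
df = pd.DataFrame({"A": [1, 2, np.nan],
                   "B": [5, np.nan, np.nan],
                   "C": [1, 2, 3]})

filled_const = df.fillna(0)                  # replace every NaN with 0
filled_mean = df["A"].fillna(df["A"].mean()) # replace NaN in A with the mean of A

print(filled_const)
print(filled_mean)
```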


Groupby:

We can group rows that share the same value in a column and work with each group.
After .groupby(), the rows are stored in groups. Nothing useful is printed until we apply an operation to the groups, such as a sum or a count.
Try .min(), .max(), .describe(), .mean() and so on.
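
A minimal sketch of grouping and then aggregating. The company names and sales figures are made up for illustration.

```python
import pandas as pd

# Hypothetical sales data
sales = pd.DataFrame({"company": ["GOOG", "GOOG", "MSFT", "MSFT"],
                      "sales": [200, 120, 340, 124]})

grouped = sales.groupby("company")  # a GroupBy object, not a table yet
print(grouped)                      # only shows the object description

totals = grouped["sales"].sum()     # one total per company
means = grouped["sales"].mean()     # one average per company

print(totals)
print(means)
```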

Concatenating, Merging, Joining:

Concatenating:

Attaches DataFrames row-wise or column-wise: pd.concat([...], axis=...). By default axis=0, which stacks them row-wise.
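
A minimal sketch with two small made-up DataFrames. With axis=1 the frames are lined up by index, so positions with no matching index are filled with NaN.

```python
import pandas as pd

# Hypothetical frames with different index values
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]}, index=[2, 3])

stacked = pd.concat([df1, df2])               # axis=0: rows of df2 below df1
side_by_side = pd.concat([df1, df2], axis=1)  # axis=1: columns of df2 beside df1

print(stacked)
print(side_by_side)
```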

Merging:

Attaches DataFrames based on a common column.
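
A minimal sketch with pd.merge, assuming two made-up DataFrames that share a column named "key". The how argument controls which keys are kept; "inner" is the default.

```python
import pandas as pd

# Hypothetical frames sharing the column "key"
left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"key": ["K0", "K1", "K3"], "C": ["C0", "C1", "C3"]})

inner = pd.merge(left, right, on="key")               # keys found in both frames
outer = pd.merge(left, right, on="key", how="outer")  # keys from either frame

print(inner)
print(outer)
```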

Joining:

Attaches DataFrames based on a common index.
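
A minimal sketch with DataFrame.join on two made-up DataFrames. By default the index of the calling (left) frame is kept, and missing matches become NaN.

```python
import pandas as pd

# Hypothetical frames with partly overlapping index labels
left = pd.DataFrame({"A": ["A0", "A1", "A2"]}, index=["K0", "K1", "K2"])
right = pd.DataFrame({"C": ["C0", "C2", "C3"]}, index=["K0", "K2", "K3"])

joined = left.join(right)                    # how="left" by default
inner_joined = left.join(right, how="inner") # only index labels found in both

print(joined)
print(inner_joined)
```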

Summary:


These were the basics of pandas, the truly fundamental parts. There are many more features for file handling, data analysis, plotting and so on.
Let's keep exploring... let's dive in together😉
