Today we are talking about pandas: what DataFrames are and how to create them. First, a look at pandas itself.
Pandas
Pandas is an open-source Python library that provides high-performance data manipulation and analysis tools built on its own data structures. The name is derived from "panel data", an econometrics term for data sets that include observations over multiple time periods for the same individuals.
Pandas has so many uses that it might make sense to list the things it can't do instead of what it can do.
This tool is essentially your data’s home. Through pandas, you get acquainted with your data by cleaning, transforming, and analyzing it.
We import it as follows:
>>> import pandas as pd
Pandas has three main data structures, although the third (Panel) has been removed from recent versions:
1. Series: A Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:
>>> s = pd.Series(data, index=index)
Here, data can be many different things:
- a Python dict
- an ndarray
- a scalar value (like 5)
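The three kinds of input listed above can be sketched as follows. The variable names and sample values are made up for illustration; only `pd.Series` and its `index` argument come from the text.

```python
import numpy as np
import pandas as pd

# From a dict: the keys become the index labels.
s_dict = pd.Series({"a": 1, "b": 2, "c": 3})

# From an ndarray: pass an index of the same length
# (or omit it to get the default labels 0..n-1).
s_array = pd.Series(np.array([10.0, 20.0, 30.0]), index=["x", "y", "z"])

# From a scalar: the value is repeated once for every label in the index.
s_scalar = pd.Series(5, index=["a", "b", "c"])

print(s_dict)
print(s_array)
print(s_scalar)
```

In each case the result is looked up by label, for example `s_dict["b"]` or `s_array["z"]`.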
2. DataFrame: A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects. It is generally the most commonly used pandas object. Like Series, DataFrame accepts many different kinds of input:
- Dict of 1D ndarrays, lists, dicts, or Series
- 2-D numpy.ndarray
- Structured or record ndarray
- A Series
- Another DataFrame
>>> d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']), 'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
>>> df = pd.DataFrame(d)
>>> df
   one  two
a  1.0  1.0
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0
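The example above uses a dict of Series. As a rough sketch of some of the other inputs from the list (a 2-D ndarray, a structured ndarray, and a single Series), with made-up values and column names:

```python
import numpy as np
import pandas as pd

# 2-D ndarray: rows and columns get default integer labels unless given.
arr = np.array([[1, 2], [3, 4], [5, 6]])
df_2d = pd.DataFrame(arr, columns=["one", "two"], index=["a", "b", "c"])

# Structured ndarray: the field names become the column names.
records = np.array(
    [(1, 2.0, "x"), (2, 3.0, "y")],
    dtype=[("A", "i4"), ("B", "f4"), ("C", "U1")],
)
df_rec = pd.DataFrame(records)

# A single named Series becomes a one-column DataFrame.
df_s = pd.DataFrame(pd.Series([1, 2, 3], name="values"))

print(df_2d)
print(df_rec)
print(df_s)
```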
3. Panel: Panel was a somewhat less-used container for 3-dimensional data. It was deprecated in pandas 0.20 and removed in pandas 0.25, so it is described here mainly as background; for 3-dimensional data, the pandas documentation recommends a DataFrame with a MultiIndex or the xarray package. The term panel data is derived from econometrics and is partially responsible for the name pandas: pan(el)-da(ta)-s. The names for the 3 axes are intended to give some semantic meaning to describing operations involving panel data and, in particular, econometric analysis of panel data. However, for the strict purposes of slicing and dicing a collection of DataFrame objects, you may find the axis names slightly arbitrary:
- items: axis 0, each item corresponds to a DataFrame contained inside
- major_axis: axis 1, it is the index (rows) of each of the DataFrames
- minor_axis: axis 2, it is the columns of each of the DataFrames
>>> wp = pd.Panel(data)  # works only in pandas versions before 0.25, where Panel still exists
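Since `pd.Panel` is not available in pandas 0.25 and later, one possible way to hold a collection of same-shaped DataFrames is a single DataFrame with a two-level index. This is only a sketch of that idea, not a drop-in replacement for Panel; the item names, row labels, and values are hypothetical.

```python
import pandas as pd

# Two DataFrames with the same rows and columns (the "items" of a Panel).
item1 = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r1", "r2"])
item2 = pd.DataFrame({"x": [5, 6], "y": [7, 8]}, index=["r1", "r2"])

# pd.concat with a dict stacks them; the dict keys become the outer index level.
stacked = pd.concat({"item1": item1, "item2": item2}, names=["item", "row"])

# Select one "item" back out as an ordinary DataFrame.
first = stacked.loc["item1"]

print(stacked)
print(first)
```

Here the outer index level plays the role of the items axis, the inner level the major axis, and the columns the minor axis.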
The most commonly used data structure in pandas is the DataFrame. Now let's look at different ways to create a DataFrame using pandas.
The first way is to create a DataFrame from a list of lists:
Example:
import pandas as pd
data = [['Ram', 10], ['Aman', 15], ['Rishi', 14]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])
df
The next method is to create a DataFrame from a Python dict or ndarray.
Example:
import pandas as pd
data = {'Name':['Ram', 'jhon', 'krish', 'jack'],
'Age':[20, 21, 19, 18]}
df = pd.DataFrame(data)
df
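The example above uses a dict of lists. A similar sketch with a dict of 1-D ndarrays might look like this; the values are reused from the earlier list-of-lists example and the row labels are made up.

```python
import numpy as np
import pandas as pd

# Dict of 1-D ndarrays: each key is a column name; the arrays must share a length.
data = {
    "Name": np.array(["Ram", "Aman", "Rishi"]),
    "Age": np.array([10, 15, 14]),
}
df = pd.DataFrame(data)

# An explicit index can replace the default 0..n-1 row labels.
df_labeled = pd.DataFrame(data, index=["r1", "r2", "r3"])

print(df)
print(df_labeled)
```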
Next is importing data from CSV files. For this we use the pd.read_csv() function.
Example:
import pandas as pd
df = pd.read_csv('data.csv')
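To try this without a file on disk, `pd.read_csv()` also accepts a file-like object. The sketch below uses an in-memory string as a stand-in for the contents of a file such as `data.csv`; the sample rows are made up.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file (made-up sample data).
csv_text = "Name,Age\nRam,10\nAman,15\nRishi,14\n"

# read_csv accepts a path or a file-like object; the first row is used as the header.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.dtypes)
```

The `Age` column is parsed as integers automatically, because every value in it looks like a whole number.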
The next way is to connect to a database. We can also create a DataFrame from a database. Here is example code that connects to an SQLite database and creates a DataFrame.
For this, first create a Connection object, and then use pd.read_sql_query() to create the DataFrame.
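import pandas as pd
import sqlite3
conn = sqlite3.connect("database.db")  # put the name of your database here
query = "SELECT * FROM table_name"  # placeholder: replace table_name with a real table
df = pd.read_sql_query(query, conn)  # needs both the SQL text and the connection
conn.close()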
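For a version that runs end to end, the sketch below builds a throwaway in-memory SQLite database first. The `people` table and its rows are hypothetical and exist only to give the query something to read.

```python
import sqlite3

import pandas as pd

# In-memory database, so the example needs no file on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, created only for this example.
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ram", 10), ("Aman", 15), ("Rishi", 14)],
)
conn.commit()

# read_sql_query takes the SQL text and the open connection.
df = pd.read_sql_query("SELECT name, age FROM people ORDER BY age", conn)
conn.close()

print(df)
```

The column names of the resulting DataFrame come from the `SELECT` clause.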
These are some of the methods for creating a DataFrame in pandas, but there are several more. The pandas IO tools support multiple file formats for reading and writing data, such as CSV, JSON, HTML, SAS, and many more. To read more about the pandas IO tools, open this link:
https://pandas.pydata.org/docs/user_guide/io.html#io
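As a small illustration of the reading and writing pattern, the sketch below writes a DataFrame to JSON and CSV strings and reads them back. The sample data is made up, and in-memory strings stand in for real files.

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["Ram", "Aman"], "Age": [10, 15]})

# Writers are DataFrame methods named to_<format>;
# with no path argument they return the output as a string.
json_text = df.to_json(orient="records")
csv_text = df.to_csv(index=False)

# Readers are top-level functions named read_<format>.
from_json = pd.read_json(io.StringIO(json_text))
from_csv = pd.read_csv(io.StringIO(csv_text))

print(from_json)
print(from_csv)
```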
Thanks for reading