
Anderson Braz

Posted on • Originally published at andersonbraz.com on

Data Science in Python: Pandas Read Sources

In this post I share basic knowledge and notes for data science beginners. You will also find a link to a Jupyter notebook with the code and its execution.

Pandas Basics

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.

Use the following import convention:

import pandas as pd


Important

This post continues the previous one, Data Science in Python: Pandas Introduction.

In this post I cover three data sources: CSV files, XLSX files, and SQL queries.

Read and Write CSV

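# Read only the first 5 rows; header=None means the file has no header row
df = pd.read_csv('origin-file.csv', header=None, nrows=5)
# Write the DataFrame to a new CSV file
df.to_csv('destin-file.csv')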
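
To see what header=None and nrows actually change, here is a minimal round-trip sketch. The sample data, the column names, and the temporary directory are assumptions made only for illustration; they are not part of the original notebook.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({"name": ["Ana", "Bruno", "Carla"], "score": [9.5, 7.0, 8.2]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "destin-file.csv")

    # index=False keeps the row index out of the file.
    df.to_csv(path, index=False)

    # Default behaviour: the first line of the file becomes the column names.
    with_header = pd.read_csv(path)

    # header=None: the first line is treated as data and columns are numbered.
    # nrows=2 stops reading after two rows.
    no_header = pd.read_csv(path, header=None, nrows=2)

print(with_header.columns.tolist())
print(no_header.shape)
print(no_header.iloc[0].tolist())
```

With header=None the original column names come back as the first data row, so this option is best reserved for files that really have no header line.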


Read and Write Excel

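# Read the first sheet of the workbook into a DataFrame
df = pd.read_excel('origin-sheet.xlsx')
# Write the DataFrame to a sheet named Sheet1
df.to_excel('destin-sheet.xlsx', sheet_name='Sheet1')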
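
The same round trip works for Excel files. The sketch below assumes that the openpyxl package is installed, since pandas relies on it to read and write .xlsx files; the sample data and the temporary path are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({"product": ["pen", "notebook"], "quantity": [10, 3]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "destin-sheet.xlsx")

    # Write to a named sheet; index=False leaves the row index out of the sheet.
    df.to_excel(path, sheet_name="Sheet1", index=False)

    # Read the same sheet back by name.
    restored = pd.read_excel(path, sheet_name="Sheet1")

print(restored)
```

Passing sheet_name to read_excel is optional here, because the first sheet is read by default, but it makes the intent explicit when a workbook has several sheets.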


Read and Write to SQL Query or Database Table

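from sqlalchemy import create_engine
# In-memory SQLite database; my_table must exist before it can be read
engine = create_engine('sqlite:///:memory:')

# read_sql accepts either a SQL query or a table name
pd.read_sql('SELECT * FROM my_table;', engine)
# read_sql_table loads a whole table by name
pd.read_sql_table('my_table', engine)
# read_sql_query runs a SQL query
pd.read_sql_query('SELECT * FROM my_table;', engine)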
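
For the write side, a DataFrame can be stored with to_sql and then queried back. This is a minimal sketch that swaps the SQLAlchemy engine for a connection from the standard library sqlite3 module, which pandas accepts for SQLite, so it runs without extra dependencies. read_sql_table is left out because it requires SQLAlchemy. The sample data is hypothetical.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({"id": [1, 2, 3], "city": ["Recife", "Lisbon", "Osaka"]})

# Assumption: a plain sqlite3 connection instead of a SQLAlchemy engine.
conn = sqlite3.connect(":memory:")
try:
    # Create my_table from the DataFrame, replacing it if it already exists.
    df.to_sql("my_table", conn, index=False, if_exists="replace")

    # Read the whole table back with a query.
    all_rows = pd.read_sql("SELECT * FROM my_table;", conn)

    # Pass values through params instead of formatting them into the SQL string.
    filtered = pd.read_sql_query(
        "SELECT city FROM my_table WHERE id > ? ORDER BY id;", conn, params=(1,)
    )
finally:
    conn.close()

print(all_rows)
print(filtered)
```

Because the database lives in memory, the table disappears when the connection is closed; pointing the connection at a file path would keep it.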


Conclusion

Pandas makes it easy to read, analyze, and manipulate data from external sources such as CSV files, Excel workbooks, and SQL databases.

See It in Practice - Code and Execution

colab.research.google.com/drive/1XAr9EMsuwH..

Credits

Photo by Markus Spiske on Unsplash
