Awesome Data Science with Python


I have created a list of useful Python packages for data science.

r0f1 / datascience

Curated list of Python packages and tutorials for data science.

Data Science Awesome

Core

pandas | Data structures built on top of numpy.
scikit-learn | Core ML library.
matplotlib | Plotting library.
seaborn | Python data visualization library based on matplotlib.
pandas_summary | Basic statistics using DataFrameSummary(df).summary().
pandas_profiling | Descriptive statistics using ProfileReport.
sklearn_pandas | Helpful DataFrameMapper class.
janitor | Clean messy column names.
missingno | Missing data visualization.
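
As a rough illustration of what "clean messy column names" means in practice, here is a minimal standard-library sketch. The helper name `clean_column_name` and the sample labels are hypothetical and chosen for illustration only; this is not janitor's implementation or API, just one plausible way to normalize labels to snake_case.

```python
import re


def clean_column_name(name: str) -> str:
    """Hypothetical helper: normalize one column label to snake_case."""
    name = name.strip().lower()
    # Replace every run of non-alphanumeric characters with one underscore.
    name = re.sub(r"[^0-9a-z]+", "_", name)
    # Drop underscores left over at either end.
    return name.strip("_")


# Made-up labels of the kind often seen in spreadsheet exports.
messy = ["  First Name ", "Last-Name", "Age (years)", "E-mail Address!"]
cleaned = [clean_column_name(label) for label in messy]
print(cleaned)
```

With a pandas DataFrame `df`, the same helper could be applied via `df.columns = [clean_column_name(c) for c in df.columns]`.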

Pandas and Jupyter

General tricks: link
nteract | Open Jupyter Notebooks with a double-click.
modin | Parallelization library for faster pandas DataFrame.
xarray | Extends pandas to n-dimensional arrays.
blackcellmagic | Code formatting for jupyter notebooks.
pivottablejs | Drag n drop Pivot Tables and Charts for jupyter notebooks.
qgrid | Pandas DataFrame sorting.
nbdime | Diff two notebook files. Alternative GitHub app: ReviewNB.
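
To show why notebook files benefit from a dedicated diff tool, here is a small standard-library sketch. Notebook files are JSON, so a plain text diff also picks up outputs and metadata; the sketch compares only the source of code cells. The function names `code_sources` and `diff_notebooks` and the two tiny in-memory notebooks are assumptions made for illustration. This is not how nbdime works internally, and it covers far less.

```python
import difflib
import json
from typing import List


def code_sources(notebook_json: str) -> List[str]:
    """Collect the source lines of all code cells in a notebook JSON string."""
    notebook = json.loads(notebook_json)
    lines: List[str] = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format allows source as a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        lines.extend(source.splitlines())
    return lines


def diff_notebooks(old_json: str, new_json: str) -> List[str]:
    """Unified diff of the code cells only, ignoring outputs and metadata."""
    return list(
        difflib.unified_diff(
            code_sources(old_json),
            code_sources(new_json),
            fromfile="old.ipynb",
            tofile="new.ipynb",
            lineterm="",
        )
    )


def make_notebook(csv_name: str) -> str:
    """Build a tiny one-cell notebook as a JSON string (illustrative data)."""
    cell = {
        "cell_type": "code",
        "source": ["import pandas as pd\n", f"df = pd.read_csv('{csv_name}')"],
        "outputs": [],
        "metadata": {},
        "execution_count": None,
    }
    return json.dumps(
        {"cells": [cell], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
    )


diff = diff_notebooks(make_notebook("a.csv"), make_notebook("b.csv"))
print("\n".join(diff))
```

Only the changed `read_csv` line shows up as removed and added; a real tool would also handle outputs, cell reordering, and merging.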

Extraction

textract | Extract text from any document.

Big Data

spark | DataFrame for big data.
spark cheatsheet
dask | Pandas DataFrame for big data…

In some places, I have also linked to YouTube talks, other GitHub repos that contain short examples, etc.

Want to contribute? Let me know.


Short examples are great in this space. Appreciate the list.


Florian Rohrer
such software.. much wow!
