I have created a list of useful Python packages for data science.
r0f1 / datascience
Curated list of Python resources for data science.
Awesome Data Science with Python
A curated list of awesome resources for practicing data science using Python, including not only libraries, but also links to tutorials, code snippets, blog posts and talks.
Core
pandas - Data structures built on top of numpy.
scikit-learn - Core ML library.
matplotlib - Plotting library.
seaborn - Data visualization library based on matplotlib.
pandas_summary - Basic statistics using DataFrameSummary(df).summary().
pandas_profiling - Descriptive statistics using ProfileReport.
sklearn_pandas - Helpful DataFrameMapper class.
missingno - Missing data visualization.
rainbow-csv - Plugin to display .csv files with nice colors.
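To give a feel for the kind of summary these helper packages build on, here is a minimal sketch that uses only pandas itself. It does not call pandas_summary, pandas_profiling or missingno, whose output is richer and formatted differently. The small DataFrame and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data with a few missing values (not from the original list).
df = pd.DataFrame({
    "age": [23, 35, None, 41],
    "city": ["Vienna", "Graz", "Linz", None],
    "income": [2100.0, 3200.0, 2800.0, None],
})

# Basic descriptive statistics for the numeric columns.
summary = df.describe()

# Number and share of missing values per column.
missing_per_column = df.isna().sum()
missing_share = df.isna().mean()

print(summary)
print(missing_per_column)
print(missing_share)
```

The dedicated packages in the list wrap this sort of computation in a single call and add extras such as per-column type detection or plots.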
Environment and Jupyter
General Jupyter Tricks
Fixing environment: link
Python debugger (pdb) - blog post, video, cheatsheet
cookiecutter-data-science - Project template for data science projects.
nteract - Open Jupyter Notebooks with a double-click.
papermill - Parameterize and execute Jupyter notebooks, tutorial.
nbdime - Diff two notebook files. Alternative GitHub App: ReviewNB.
RISE - …
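As a small illustration of the Python debugger (pdb) entry in the list above, the sketch below drives pdb with scripted commands instead of an interactive prompt, so it can run unattended. The function mean and the command sequence are made up for this example. In day-to-day use you would more likely drop breakpoint() into your code or use the %debug magic in Jupyter.

```python
import io
import pdb


def mean(values):
    total = sum(values)
    return total / len(values)


# Scripted debugger session: print the argument, step one line,
# print the new local variable, then let the function finish.
commands = io.StringIO("p values\nnext\np total\ncontinue\n")
output = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=output)
result = debugger.runcall(mean, [1, 2, 3])

print(output.getvalue())  # transcript of the debugger session
print(result)
```

The same commands (p, next, continue) work when typed at the interactive (Pdb) prompt.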
Sometimes, I have also linked to YouTube talks, other GitHub repos that contain short examples, etc.
Want to contribute? Let me know.
Top comments (2)
Short examples are great in this space. Appreciate the list.