Eduardo Blancas

Originally published at ploomber.io

The 10 Trending Python Repositories on GitHub (May 2022)

A few months ago, I discovered that GitHub keeps track of trending repositories, and since then, I have often checked the list to see what's up. So this month, I decided to share my thoughts on what I found; let's get started!

DALL·E Mini

AI model that generates images from text.

The announcement of OpenAI's DALL·E 2 took the community by storm, but given that it's not publicly available, it's no surprise that this project is seeing significant interest.

PaddleNLP

NLP library with pre-trained models.

PaddleNLP is a library for natural language processing. It provides a comprehensive set of Chinese transformer models, and its design is based on Hugging Face's Transformers library.

ColossalAI

A framework for large-scale Deep Learning parallel training.

As transformer architectures become the standard in many CV and NLP tasks, better performance comes with larger model sizes. ColossalAI aims to provide a simple API to train large models in parallel.

DeepFaceLive

A library to swap faces from a webcam or video.

DeepFaceLive lets you change a face in real time or in a recording. Imagine hopping on a Zoom call and looking like Keanu Reeves. Crazy!

Label Studio

A data labeling tool for audio, text, images, videos, and time series via a UI.

Getting accurately labeled data is the first task in many ML projects. Label Studio supports many types of data and offers a graphical user interface for labeling them.

Intermission: Ploomber

Ploomber is a framework to develop pipelines interactively (Jupyter, VSCode) and deploy them to the cloud (K8s, Airflow, AWS, SLURM).

Interactive tools like Jupyter make it hard to develop maintainable projects; Ploomber allows data scientists to keep the interactive workflow they are used to while embracing best practices from software engineering to ease the transition to production.

DevOps Exercises

A collection of >2.2k DevOps interview questions.

The first non-AI repository on the list! This repository hosts more than 2.2k DevOps questions to help you prepare for your interview!

PaddleOCR

A library for creating Optical Character Recognition tools.

PaddleOCR supports many OCR-related algorithms to help users through data production, model training, compression, inference, and deployment.

DeepFaceLab

DeepFaceLab is a library to replace faces in videos.

Another deepfake library! According to the repository, more than 95% of deepfake videos are created with DeepFaceLab.

Ivy

Ivy aims to provide a single interface for ML frameworks.

With the explosion of computational frameworks such as JAX, TensorFlow, PyTorch, MXNet, and NumPy, it's hard for practitioners to keep up and master them. Ivy aims to unify them so you can write once and export to any of them.

Airflow

Airflow is a platform to author, schedule, and monitor workflows.

Airflow is one of the most widely used platforms for managing workflows. It allows you to define workflows as directed acyclic graphs of tasks and schedule them.
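
To make the "directed acyclic graph of tasks" idea concrete, here is a minimal sketch that uses only Python's standard library (graphlib, Python 3.9+). It is not Airflow's API: the task names, the dependencies mapping, and the run_workflow helper are all hypothetical and exist only to illustrate how declaring dependencies determines execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tasks: each one just records that it ran.
log = []

def extract():
    log.append("extract")

def transform():
    log.append("transform")

def load():
    log.append("load")

tasks = {"extract": extract, "transform": transform, "load": load}

# Map each task name to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

def run_workflow(tasks, dependencies):
    """Run every task only after the tasks it depends on have run."""
    # static_order() yields nodes so that dependencies come first;
    # it raises graphlib.CycleError if the graph is not acyclic.
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order

order = run_workflow(tasks, dependencies)
print(order)
```

A real scheduler such as Airflow adds much more on top of this ordering step (time-based scheduling, retries, monitoring), but the dependency graph is the core abstraction.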

