Active Learning for Natural Language Processing

#machinelearning #activelearning #ai

More than 90% of machine learning applications improve with human feedback. For example, a model that classifies news articles into pre-defined topics has been trained on thousands of examples where humans have manually annotated the topics. However, if there are tens of millions of news articles, it might not be feasible to manually annotate even 1% of them. If we only sample randomly, we will mostly get popular topics like “politics” that the machine learning model can already identify accurately. So, we need to be smarter about how we sample. This talk is about “Active Learning”, the process of deciding which raw data is most valuable to send for human review, covering: Uncertainty Sampling; Diversity Sampling; and some advanced methods like Active Transfer Learning.
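To make the idea of Uncertainty Sampling concrete, here is a minimal sketch in Python. It is not taken from the talk or the book: the function names, the article IDs, and the probability values are all hypothetical, and it assumes the topic classifier already returns a probability for each topic. It scores each unlabeled article with normalized least confidence and picks the articles the model is least sure about for human review.

```python
from typing import Dict, List, Sequence, Tuple


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score in [0, 1]; 1 means the model is maximally unsure.

    Uses normalized least confidence: (1 - max p) * n / (n - 1),
    where n is the number of topics.
    """
    n = len(probs)
    if n < 2:
        return 0.0  # with a single topic there is nothing to be unsure about
    return (1.0 - max(probs)) * n / (n - 1)


def select_for_review(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[Tuple[str, float]]:
    """Return the `budget` items with the highest uncertainty scores."""
    scored = [(item_id, least_confidence(p)) for item_id, p in predictions.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # most uncertain first
    return scored[:budget]


# Hypothetical model outputs: one probability per topic for each article.
predictions = {
    "article-1": [0.95, 0.03, 0.02],  # confident, e.g. clearly "politics"
    "article-2": [0.40, 0.35, 0.25],  # model is unsure
    "article-3": [0.70, 0.20, 0.10],
}

chosen = select_for_review(predictions, budget=2)
for item_id, score in chosen:
    print(f"{item_id}: uncertainty {score:.2f}")
```

With these made-up numbers, the confidently classified article is skipped and the two ambiguous ones are sent to annotators. Diversity Sampling would need a different selection rule, for example clustering the unlabeled articles and sampling from each cluster, which this sketch does not attempt.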

Robert Munro has worked as a leader at several Silicon Valley machine learning companies and also led AWS’s first Natural Language Processing and Machine Translation solutions. Robert is the author of Human-in-the-Loop Machine Learning, covering practical methods for Active Learning, Transfer Learning, and Annotation. Robert organizes Bay Area NLP, the world’s largest community of Language Technology professionals. Robert is also a disaster responder and is currently helping with the response to COVID-19.

The slides are available on this link.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
