Leonard Püttmann for Kern AI

Active Learning with Transformer-Based Machine Learning Models

The combination of active learning and transformer-based machine learning models provides a powerful tool for efficiently training deep learning models. By leveraging active learning, data scientists are able to reduce the amount of labeled data required to train a model while still achieving high accuracy. This post will explore how transformer-based machine learning models can be used in an active learning setting, as well as which models are best suited for this task.

What is Active Learning?

Active learning is an iterative process that uses feedback from previously acquired labels to inform the selection of new data points to label. It works by continuously selecting the most informative unlabeled data points, i.e. those with the greatest potential to improve the model's performance once labeled and incorporated into training. This creates an efficient workflow that lets you get high-quality models quickly and with minimal effort. With each iteration the performance increases, allowing you to observe the improvement of the machine learning model over time.
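To make the selection step concrete, here is a minimal sketch of uncertainty sampling, one common strategy for choosing which records to label next. It assumes a scikit-learn-style classifier with a predict_proba method and a matrix of embeddings for the still-unlabeled records; the names are placeholders for illustration, not a specific tool's API.

```python
import numpy as np

def pick_next_batch(model, unlabeled_embeddings, batch_size=20):
    """Return the indices of the records the model is least certain about."""
    # Predicted class probabilities for every unlabeled record
    probas = model.predict_proba(unlabeled_embeddings)
    # Entropy as an uncertainty score: higher means less certain
    entropy = -np.sum(probas * np.log(probas + 1e-12), axis=1)
    # Hand the most uncertain records to the labelers next
    return np.argsort(entropy)[-batch_size:]
```

After these records are labeled, the model is retrained and the selection step is repeated, which is exactly the loop described above.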

[Image: the active learning loop. Source: Active Learning with AutoNLP and Prodigy]

For example, an experiment on the MRPC dataset with the bert-base-uncased transformer model found that 21% fewer labeled examples were needed with the active learning approach than with a fully labeled dataset from the start.

Transformer-Based Machine Learning Models for Active Learning

Transformer-based machine learning models such as BERT and its many variants are well suited for active learning due to their ability to capture contextual information in text data. These models have been shown to achieve state-of-the-art results on many natural language processing tasks such as question answering, sentiment analysis, and document classification. By utilizing these types of models in an active learning setting, you can quickly identify the most important samples that need labeling and use them to effectively train your model. Additionally, these models are easy to deploy on cloud platforms like AWS or Azure, which makes them even more convenient to use in an active learning environment.

How we approach active learning in Kern AI refinery

In refinery, we use state-of-the-art (SOTA) transformer models from Hugging Face to create embeddings of text datasets.
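In practice, creating such embeddings can look like the following minimal sketch using the sentence-transformers library; the model name is just an example choice, not necessarily the one refinery uses under the hood.

```python
from sentence_transformers import SentenceTransformer

# Any Hugging Face sentence-embedding model works here;
# "all-MiniLM-L6-v2" is simply a small, fast example.
model = SentenceTransformer("all-MiniLM-L6-v2")

texts = [
    "The delivery arrived two weeks late.",
    "Great customer support, thank you!",
]

# One dense vector per text record
embeddings = model.encode(texts)
print(embeddings.shape)  # (2, 384) for this example model
```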


This is usually done at the start of a new project, because having embeddings for all of our text data allows us to quickly find similar records by calculating the cosine similarity between the embedded texts. This can drastically increase labeling speed.
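As a rough illustration, and reusing the embeddings array from the sketch above, similar records can be retrieved with scikit-learn's cosine_similarity:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Pairwise similarity of every record to every other record
similarities = cosine_similarity(embeddings)

# For one record of interest, rank the remaining records by similarity
query_idx = 0
ranked = np.argsort(similarities[query_idx])[::-1]
neighbors = [i for i in ranked if i != query_idx][:5]
print(neighbors)  # indices of the most similar records
```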


After some of the data has been labeled, we can use these text embeddings to train simple machine learning algorithms, such as a logistic regression or a decision tree. We do not use the embeddings to train another transformer-based model, because they are of such high quality that even simple models deliver high-accuracy results. So in addition to the time and money saved by the active learning approach, you also save a lot of computational workload down the road.
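Here is a minimal sketch of that step, assuming labeled_embeddings and labels hold the embeddings and class labels of the records annotated so far (both names are placeholders):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hold out a slice of the labeled data to check the model's accuracy
X_train, X_test, y_train, y_test = train_test_split(
    labeled_embeddings, labels, test_size=0.2, random_state=42
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print(f"Accuracy: {clf.score(X_test, y_test):.2f}")

# clf.predict_proba(unlabeled_embeddings) can then feed the
# uncertainty-based selection of the next records to label.
```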

In conclusion, transformer-based machine learning models provide a powerful tool for efficiently training deep learning models using active learning techniques. By leveraging their ability to capture contextual information from text data, you can quickly identify which samples should be labeled next in order to effectively train your model with minimal effort and cost. Furthermore, these types of models are highly scalable and easy to deploy on cloud platforms, making them ideal for use in an active learning setting. With all of these advantages combined, it's no wonder that transformer-based machine learning models are becoming increasingly popular among developers and data scientists alike.

Top comments (2)

Divyanshu Katiyar

A very informative post! Clearly covers the use cases of transformer-based learning not only in general, but also in refinery :)

Leonard Püttmann

Thanks Div! Glad you liked it. :-)