
David Mezzetti for NeuML

Originally published at neuml.hashnode.dev

Train a QA model

The Hugging Face Model Hub has a wide range of models that can handle many tasks. While these models perform well out of the box, fine-tuning a model with task-specific data often gives the best performance.

Hugging Face provides a number of full-featured examples to assist with training task-specific models. When building models from the command line, these scripts are a great way to get started.

txtai provides a training pipeline that can be used to train new models programmatically using the Transformers Trainer framework.

This example trains a small QA model and then further fine-tunes it with a few new examples (few-shot learning).

Install dependencies

Install txtai and all dependencies.

pip install txtai[pipeline-train] datasets pandas

Train a SQuAD 2.0 Model

The first step is training a SQuAD 2.0 model. SQuAD is a question-answering dataset in which each record pairs a question and a context passage with the answer identified in that context. In SQuAD 2.0, a question may also have no answer in its context. See the SQuAD dataset website for more information.
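As a rough illustration of that record layout, the sketch below builds two hand-written records in the style of SQuAD 2.0 as loaded by the datasets library, where `answers` holds parallel `text` and `answer_start` lists and an unanswerable question has empty lists. The records and the `answer_spans` helper are made up for this example; they are not rows from the dataset.

```python
# Hand-written records in the style of SQuAD 2.0 (illustrative, not real dataset rows)
records = [
    {
        "question": "What ingredient?",
        "context": "Pour 1 can whole tomatoes",
        # Answerable: the answer text plus its character offset in the context
        "answers": {"text": ["tomatoes"], "answer_start": [17]},
    },
    {
        "question": "What temperature?",
        "context": "Pour 1 can whole tomatoes",
        # Unanswerable: both lists are empty
        "answers": {"text": [], "answer_start": []},
    },
]


def answer_spans(record):
    """Return the answer strings sliced from the context using the stored offsets."""
    context, answers = record["context"], record["answers"]
    return [
        context[start:start + len(text)]
        for text, start in zip(answers["text"], answers["answer_start"])
    ]


for record in records:
    spans = answer_spans(record)
    print(record["question"], "->", spans if spans else "no answer")
```

Slicing the context with the stored offset is a quick sanity check that an answer really is a span of its context.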

We'll use a tiny BERT model with a portion of SQuAD 2.0 for efficiency purposes.

from datasets import load_dataset
from txtai.pipeline import HFTrainer

# Load the SQuAD 2.0 dataset
ds = load_dataset("squad_v2")

# Train a tiny BERT model on the first 3,000 training rows and save it to bert-tiny-squadv2
trainer = HFTrainer()
trainer("google/bert_uncased_L-2_H-128_A-2", ds["train"].select(range(3000)), task="question-answering", output_dir="bert-tiny-squadv2")
print("Training complete")

Fine-tune with new data

Next we'll add a few additional examples. Fine-tuning a QA model can help it handle a certain type of question or improve performance for a specific use case.

For smaller models with a narrow use case, this helps the model zero in on the types of questions that will be asked. In this case, we want to tell the model exactly what type of information we're looking for when asking for ingredients. This will help improve confidence in the answers the model generates.

# Training data
data = [
    {"question": "What ingredient?", "context": "Pour 1 can whole tomatoes", "answers": "tomatoes"},
    {"question": "What ingredient?", "context": "Dice 1 yellow onion", "answers": "onion"},
    {"question": "What ingredient?", "context": "Cut 1 red pepper", "answers": "pepper"},
    {"question": "What ingredient?", "context": "Peel and dice 1 clove garlic", "answers": "garlic"},
    {"question": "What ingredient?", "context": "Put 1/2 lb beef", "answers": "beef"},
]

# Fine-tune the SQuAD 2.0 model saved in the previous step on the new examples
model, tokenizer = trainer("bert-tiny-squadv2", data, task="question-answering", num_train_epochs=10)
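The training data above gives each answer as a plain string, which is the form the training call in this article accepts. For comparison with SQuAD-style records, the sketch below shows one way the character offsets for such answers could be derived. `with_offsets` is a hypothetical helper written for this illustration, not part of txtai, and it assumes each answer appears verbatim in its context.

```python
def with_offsets(rows):
    """Expand {"answers": "text"} rows into SQuAD-style text/answer_start lists."""
    expanded = []
    for row in rows:
        # Character offset of the first occurrence of the answer in the context
        start = row["context"].find(row["answers"])
        if start < 0:
            raise ValueError(f"answer {row['answers']!r} not found in {row['context']!r}")
        expanded.append({
            "question": row["question"],
            "context": row["context"],
            "answers": {"text": [row["answers"]], "answer_start": [start]},
        })
    return expanded


# Two of the rows from the training data above
sample = [
    {"question": "What ingredient?", "context": "Dice 1 yellow onion", "answers": "onion"},
    {"question": "What ingredient?", "context": "Put 1/2 lb beef", "answers": "beef"},
]

for row in with_offsets(sample):
    print(row["context"], "->", row["answers"])
```

A check like this is also a cheap way to catch typos in hand-written examples, since an answer that is not a substring of its context raises an error.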

Test the model

Now we're ready to test the results! The following sections run the same question against the original model, trained only on SQuAD 2.0, and against the further fine-tuned model.

from transformers import pipeline

# Load the model trained only on SQuAD 2.0 from its output directory
questions = pipeline("question-answering", model="bert-tiny-squadv2")
questions("What ingredient?", "Peel and dice 1 shallot")

{'answer': 'dice 1 shallot', 'end': 23, 'score': 0.05128436163067818, 'start': 9}

from transformers import pipeline

# Use the fine-tuned model and tokenizer returned by the trainer, moved to the CPU
questions = pipeline("question-answering", model=model.to("cpu"), tokenizer=tokenizer)
questions("What ingredient?", "Peel and dice 1 shallot")

{'answer': 'shallot', 'end': 23, 'score': 0.13187439739704132, 'start': 16}
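To make the comparison concrete, the sketch below takes the two result dictionaries printed above, checks that each `start`/`end` pair indexes the reported answer in the context, and computes the ratio between the two scores. The variable names are chosen for this illustration.

```python
context = "Peel and dice 1 shallot"

# Results copied from the two pipeline runs above
base = {"answer": "dice 1 shallot", "end": 23, "score": 0.05128436163067818, "start": 9}
tuned = {"answer": "shallot", "end": 23, "score": 0.13187439739704132, "start": 16}

for name, result in [("SQuAD 2.0 only", base), ("fine-tuned", tuned)]:
    # start and end are character offsets into the context
    span = context[result["start"]:result["end"]]
    print(f"{name}: {span!r} (score {result['score']:.3f})")

# Ratio of the fine-tuned score to the original score
ratio = tuned["score"] / base["score"]
print(f"score ratio: {ratio:.2f}")
```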

The fine-tuned model gives a better answer ('shallot' instead of 'dice 1 shallot') with a higher score (about 0.13 versus 0.05). This method makes it possible to use a smaller model for a narrow set of functionality, with the upside of increased speed. Give it a try with your own data!
