Gauging the performance of a model in Natural Language Processing is notoriously difficult. Facebook has launched Dynaboard, which ranks state-of-the-art language models such as BERT, RoBERTa, ALBERT, T5, and DeBERTa on four common NLP tasks:
- Natural Language Inference
- Question Answering
- Sentiment Analysis
- Hate Speech Detection
To evaluate models on these tasks, a new performance measure called Dynascore was created. It takes several metrics into consideration:
- Accuracy - the percentage of examples the model gets right
- Compute - to account for computation, the number of examples a model can process per second on its instance in the evaluation cloud
- Memory - memory usage averaged over the duration the model is running, with measurements taken every N seconds
- Robustness - how much a model's predictions change after perturbations are added to the examples
- Fairness - the original datasets are perturbed by, for instance, changing noun phrase gender (e.g., replacing "sister" with "brother", or "he" with "they") or substituting names with ones statistically predictive of another race or ethnicity. For Dynaboard scoring, a model is considered more "fair" if its predictions don't change after such a perturbation
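The robustness and fairness checks above share the same shape: perturb each example, re-run the model, and count how often the prediction stays the same. Here is a minimal sketch of the fairness variant; `toy_model`, the word-swap table, and the examples are all made up for illustration, not taken from Dynaboard itself.

```python
# Hypothetical word swaps for gender-based fairness perturbations.
GENDER_SWAPS = {"he": "they", "she": "they", "sister": "brother", "brother": "sister"}

def perturb(text: str) -> str:
    """Replace gendered tokens with swapped or neutral alternatives."""
    return " ".join(GENDER_SWAPS.get(tok, tok) for tok in text.split())

def fairness_score(model_predict, examples):
    """Fraction of examples whose prediction is unchanged after perturbation."""
    unchanged = sum(model_predict(x) == model_predict(perturb(x)) for x in examples)
    return unchanged / len(examples)

# Toy stand-in for a sentiment model: "positive" if the text contains "good".
toy_model = lambda text: "positive" if "good" in text else "negative"

examples = ["he had a good day", "my sister is tired"]
print(fairness_score(toy_model, examples))  # 1.0 - the toy model ignores gender terms
```

A robustness score works the same way, with the perturbation function adding typos or paraphrases instead of demographic swaps.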
Dynascore is calculated by assigning different weights to these metrics and combining them, with the weighting depending on the type of task. Previously, the tasks mentioned above, which form Dynabench, were evaluated statically; Dynaboard has made this process dynamic.
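To make the weighted combination concrete, here is a simplified sketch. The Dynascore paper describes a more principled aggregation than a plain weighted average, so treat this as an illustration of the idea only; all metric values and weights below are invented.

```python
def weighted_score(metrics: dict, weights: dict) -> float:
    """Weighted average of metric values; weights are normalized to sum to 1."""
    total = sum(weights.values())
    return sum(metrics[name] * w / total for name, w in weights.items())

# Hypothetical normalized metric values (higher is better) for one model.
metrics = {"accuracy": 0.91, "compute": 0.70, "memory": 0.80,
           "robustness": 0.85, "fairness": 0.95}

# Hypothetical weighting for a task that prioritizes accuracy.
weights = {"accuracy": 4, "compute": 1, "memory": 1,
           "robustness": 2, "fairness": 2}

print(round(weighted_score(metrics, weights), 3))  # 0.874
```

Changing the weights per task is what lets the same five metrics produce different leaderboard orderings for, say, question answering versus hate speech detection.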
The objectives achieved by Dynaboard are:
- Backwards Compatibility
- Forward Compatibility
- Prediction Costs
To learn more about Dynaboard, read the official Facebook blog post; for further implementation details, read the paper.