
Abzal Seitkaziyev

Classifiers' Evaluation Metrics

Confusion matrix
A confusion matrix is a table that holds the counts of True Positives (TP) and False Positives (FP), as well as True Negatives (TN) and False Negatives (FN), produced by a classifier.

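Here is a minimal sketch of how these four values can be pulled out of a confusion matrix with scikit-learn; the y_true and y_pred arrays are hypothetical labels used only for illustration.

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual classes (1 = positive)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions

# With labels=[0, 1], the matrix is laid out as:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"TP={tp}, FP={fp}, TN={tn}, FN={fn}")
```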

What is important for the project
For example, suppose we have an image classifier that identifies whether a rock is a precious stone (e.g., a diamond), and we use it for automated mining.
In this context, we may want to capture as many precious stones as possible (maximize TP), even if some non-precious stones get identified as diamonds (FP), because those can be sorted out by an expert at a later stage.
Now imagine instead that we are buying stones based on the predictions of the same classifier. We do not want to pay for non-precious stones (FP), so the model should be very careful about False Positive predictions. The sketch below contrasts the two scenarios.
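This small sketch (with assumed, purely illustrative counts) shows how the two scenarios weight the confusion-matrix cells differently: mining cares about recall (catch as many diamonds as possible), while buying cares about precision (avoid paying for non-precious stones).

```python
tp, fp, fn = 80, 30, 5  # hypothetical counts, for illustration only

recall = tp / (tp + fn)     # fraction of real diamonds we actually caught
precision = tp / (tp + fp)  # fraction of predicted diamonds that are real

print(f"recall    = {recall:.2f} -> optimize this for automated mining")
print(f"precision = {precision:.2f} -> optimize this when buying stones")
```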

Common Evaluation Metrics
To evaluate and quantify the performance of a classification model, we can use common evaluation metrics: accuracy, balanced accuracy, precision, recall (a.k.a. sensitivity or True Positive Rate), specificity (= 1 - False Positive Rate), the ROC curve (TPR vs. FPR), and the F1 score.
As we can see, there are many options to choose from. However, all of these metrics are built from the confusion matrix values (TP, FP, TN, and FN); the ROC curve simply plots TPR against FPR across different decision thresholds. So the main idea is to know which metrics matter most for the project and how balanced the target we are trying to predict (classify) is.
The most general approach is to choose a few metrics to optimize (e.g., accuracy, recall, precision, F1 score, ROC-AUC).
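As a quick sketch, all of these metrics are available in scikit-learn; y_true, y_pred, and y_score below are hypothetical arrays used only for illustration (ROC-AUC needs predicted scores or probabilities rather than hard labels).

```python
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             precision_score, recall_score, f1_score,
                             roc_auc_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]                    # hard class predictions
y_score = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]   # predicted probabilities

print("accuracy:         ", accuracy_score(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))
print("precision:        ", precision_score(y_true, y_pred))
print("recall:           ", recall_score(y_true, y_pred))
print("F1 score:         ", f1_score(y_true, y_pred))
print("ROC-AUC:          ", roc_auc_score(y_true, y_score))
```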
