Recently, I published an article about binary classification metrics. It briefly explains the most traditional metrics and presents lesser-known ones such as NPV, specificity, MCC, and EER.
In this article, I share implementations of these metrics for deep learning frameworks: recall, precision, specificity, negative predictive value (NPV), F1-score, F-beta, Matthews Correlation Coefficient (MCC), and Equal Error Rate (EER). You can use them with both Keras and TensorFlow v1/v2.
The Code
Here's the complete code for all metrics:
import tensorflow as tf
from keras import backend as K  # with tf.keras: from tensorflow.keras import backend as K


def recall(y_true, y_pred):
    # TP / (TP + FN)
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall_keras = true_positives / (possible_positives + K.epsilon())
    return recall_keras


def precision(y_true, y_pred):
    # TP / (TP + FP)
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision_keras = true_positives / (predicted_positives + K.epsilon())
    return precision_keras


def specificity(y_true, y_pred):
    # TN / (TN + FP)
    tn = K.sum(K.round(K.clip((1 - y_true) * (1 - y_pred), 0, 1)))
    fp = K.sum(K.round(K.clip((1 - y_true) * y_pred, 0, 1)))
    return tn / (tn + fp + K.epsilon())


def negative_predictive_value(y_true, y_pred):
    # TN / (TN + FN)
    tn = K.sum(K.round(K.clip((1 - y_true) * (1 - y_pred), 0, 1)))
    fn = K.sum(K.round(K.clip(y_true * (1 - y_pred), 0, 1)))
    return tn / (tn + fn + K.epsilon())


def f1(y_true, y_pred):
    # harmonic mean of precision and recall
    p = precision(y_true, y_pred)
    r = recall(y_true, y_pred)
    return 2 * ((p * r) / (p + r + K.epsilon()))


def fbeta(y_true, y_pred, beta=2):
    # computed per sample along axis 1, so inputs must be 2-D: (batch, outputs)
    y_pred = K.clip(y_pred, 0, 1)
    tp = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)), axis=1)
    fp = K.sum(K.round(K.clip(y_pred - y_true, 0, 1)), axis=1)
    fn = K.sum(K.round(K.clip(y_true - y_pred, 0, 1)), axis=1)
    p = tp / (tp + fp + K.epsilon())
    r = tp / (tp + fn + K.epsilon())
    num = (1 + beta ** 2) * (p * r)
    den = (beta ** 2 * p + r + K.epsilon())
    return K.mean(num / den)


def matthews_correlation_coefficient(y_true, y_pred):
    tp = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    tn = K.sum(K.round(K.clip((1 - y_true) * (1 - y_pred), 0, 1)))
    fp = K.sum(K.round(K.clip((1 - y_true) * y_pred, 0, 1)))
    fn = K.sum(K.round(K.clip(y_true * (1 - y_pred), 0, 1)))
    num = tp * tn - fp * fn
    den = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return num / K.sqrt(den + K.epsilon())


def equal_error_rate(y_true, y_pred):
    # tf.math.count_nonzero replaces tf.count_nonzero, which is not available in TF v2
    n_imp = tf.math.count_nonzero(tf.equal(y_true, 0), dtype=tf.float32) + tf.constant(K.epsilon())
    n_gen = tf.math.count_nonzero(tf.equal(y_true, 1), dtype=tf.float32) + tf.constant(K.epsilon())
    # scores of the negative (label 0) and positive (label 1) samples
    scores_imp = tf.boolean_mask(y_pred, tf.equal(y_true, 0))
    scores_gen = tf.boolean_mask(y_pred, tf.equal(y_true, 1))
    # loop state: threshold t, false positive rate, false negative rate
    loop_vars = (tf.constant(0.0), tf.constant(1.0), tf.constant(0.0))
    # raise t in steps of 0.001 until FPR drops below FNR; the t <= 1 bound
    # keeps the loop finite when a batch contains no positive samples
    cond = lambda t, fpr, fnr: tf.logical_and(tf.less_equal(t, 1.0), tf.greater_equal(fpr, fnr))
    body = lambda t, fpr, fnr: (
        t + 0.001,
        tf.divide(tf.math.count_nonzero(tf.greater_equal(scores_imp, t), dtype=tf.float32), n_imp),
        tf.divide(tf.math.count_nonzero(tf.less(scores_gen, t), dtype=tf.float32), n_gen)
    )
    t, fpr, fnr = tf.while_loop(cond, body, loop_vars, back_prop=False)
    eer = (fpr + fnr) / 2
    return eer
Almost all of these metrics are described in the previously mentioned article, where you can find a detailed explanation of each one.
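As a quick sanity check on the confusion-matrix formulas used in the code, here is a minimal pure-Python sketch. It is not part of the original implementation. It assumes binary labels and scores in [0, 1], treats a score strictly above 0.5 as a positive prediction (roughly mirroring the rounding in the Keras functions), and returns 0 for an empty denominator instead of adding an epsilon. The helper names `confusion_counts`, `safe_div`, and `reference_metrics` are illustrative only.

```python
import math


def confusion_counts(y_true, y_score, threshold=0.5):
    """Count TP, TN, FP, FN; a score strictly above `threshold` is a positive."""
    tp = tn = fp = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score > threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 0 and predicted == 0:
            tn += 1
        elif label == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def reference_metrics(y_true, y_score):
    """Compute the confusion-matrix metrics from plain Python lists."""
    tp, tn, fp, fn = confusion_counts(y_true, y_score)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(tn, tn + fp),
        "npv": safe_div(tn, tn + fn),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Made-up example data: 4 positives and 4 negatives
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1, 0.4, 0.7]

counts = confusion_counts(y_true, y_score)
metrics = reference_metrics(y_true, y_score)
print(counts)          # (3, 3, 1, 1) -> TP, TN, FP, FN
print(metrics["mcc"])  # 0.5
```

On the same data, the Keras versions should agree with these values up to the small offset introduced by `K.epsilon()`.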
How to use in Keras or TensorFlow
If you use Keras or TensorFlow (especially v2), it’s quite easy to use such metrics. Here’s an example:
model = ...  # define your model as usual
model.compile(
    optimizer="adam",  # you can use any other optimizer
    loss='binary_crossentropy',
    metrics=[
        "accuracy",
        precision,
        recall,
        f1,
        fbeta,
        specificity,
        negative_predictive_value,
        matthews_correlation_coefficient,
        equal_error_rate
    ]
)
model.fit(...)  # train your model
As you can see, you can compute all the custom metrics at once. Please remember that:
- as they are binary classification metrics, you can only use them in binary classification problems. They will still return numbers for multiclass or regression problems, but those numbers will be incorrect.
- they are meant to be used as metrics only, not as losses. Most of them round the predictions, so they provide no useful gradient for training. For the loss, use one designed for binary classification, such as “binary_crossentropy”.
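To illustrate why these functions are unsuitable as losses, the sketch below nudges each prediction slightly and measures how much a rounded recall changes. It is a plain-Python approximation under stated assumptions, not the Keras code itself: `recall_from_scores` and `finite_difference` are hypothetical helpers, and a score strictly above 0.5 counts as a positive prediction.

```python
def recall_from_scores(y_true, y_score):
    """Recall after thresholding the scores at 0.5, as the rounding does."""
    true_positives = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s > 0.5)
    possible_positives = sum(1 for t in y_true if t == 1)
    return true_positives / possible_positives if possible_positives else 0.0


def finite_difference(metric, y_true, y_score, index, step=1e-3):
    """Approximate d(metric)/d(score[index]) with a forward difference."""
    nudged = list(y_score)
    nudged[index] += step
    return (metric(y_true, nudged) - metric(y_true, y_score)) / step


# Made-up example data
y_true = [1, 1, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.2]

base_recall = recall_from_scores(y_true, y_score)
slopes = [
    finite_difference(recall_from_scores, y_true, y_score, i)
    for i in range(len(y_score))
]
print(base_recall)  # 0.5
print(slopes)       # [0.0, 0.0, 0.0, 0.0]
```

Because the thresholded metric only changes when a score crosses 0.5, small changes in the predictions leave it untouched, so an optimizer would get no signal about which direction improves it.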
Final Words
