A confusion matrix is a table that describes the performance of a classifier/classification model. It contains information about the **actual and predicted classifications** made by the classifier, and this information is used to evaluate the classifier's performance.

Note that the confusion matrix is only used for classification tasks, and as such cannot be used in regression models or other non-classification models.

Before we go on, let's look at some terms.

- Classifier: A classifier is an algorithm that uses "knowledge" gained from training data to map input data to a particular category or class. Classifiers are either binary classifiers or multi-class (also called multi-categorical, multi-label or multi-output) classifiers.
- Training and test data: When building a classification model/classifier, datasets are split into **training data** and **test data**, each with associated labels. A label is the expected output, i.e. the category or class the data belongs to.
- Actual classifications: This is the expected output (labels) of the data.
- Prediction classifications: This is the output given by the classifier for a particular input.

**An example**: Let's say we have built a classifier to categorize an input image of a car as either a sedan or not, and we have an image in our dataset that has been labeled as a non-sedan but the classification model classifies as a sedan.

In this scenario, the actual classification is **non-sedan** while the prediction classification is **sedan**.

### Types of Confusion Matrices

There are two types of confusion matrices:

- 2-class confusion matrix
- Multi-class confusion matrix

### 2-Class Confusion Matrix

A 2-class confusion matrix, as the name implies, is a confusion matrix that describes the performance of a binary classification model. A 2-class matrix for the **sedan** classifier I described earlier can be visualized as such:

|                       | Predicted: Non-sedan | Predicted: Sedan |
| --------------------- | -------------------- | ---------------- |
| **Actual: Non-sedan** | a                    | b                |
| **Actual: Sedan**     | c                    | d                |

In this visualization, we have two sections which have been outlined. We have the **predicted** classifications section which contains two subsections for each of the classes and the **actual** classifications section which has two subsections for each of the classes.

If this is your first time seeing a confusion matrix, I know you must be wondering what all the variables in the table represent. It is quite simple actually, and I will explain it as simply as I can. But before I do, it is important to know that each of these variables represents a number of predictions.

##### The variable a

The variable **a** falls under the **Non-sedan** sub-section in both the **Actual** and **Predicted** classification sections. This means **a** predictions were made that correctly classified an image of a non-sedan [as a non-sedan].

##### The variable b

The variable **b** falls under the **Non-sedan** sub-section in the **Actual** classification section and under the **Sedan** sub-section in the **Predicted** classification section. This means **b** predictions were made that incorrectly classified an image of a non-sedan as a sedan.

##### The variable c

The variable **c** falls under the **Sedan** sub-section in the **Actual** classification section and under the **Non-sedan** sub-section in the **Predicted** classification section. This means **c** predictions were made that incorrectly classified an image of a sedan as a non-sedan.

##### The variable d

The variable **d** falls under the **Sedan** sub-section in both the **Actual** and **Predicted** classification sections. This means **d** predictions were made that correctly classified an image of a sedan [as a sedan].
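The four variables can be counted directly from a set of labels. Here is a minimal sketch using a small, made-up set of actual and predicted labels (0 = non-sedan, 1 = sedan):

```python
# Hypothetical labels for illustration: 0 = non-sedan, 1 = sedan
y_actual    = [0, 0, 0, 1, 1, 1, 1, 0]
y_predicted = [0, 1, 0, 1, 1, 0, 1, 0]

pairs = list(zip(y_actual, y_predicted))
a = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-sedan correctly classified
b = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-sedan classified as sedan
c = sum(1 for t, p in pairs if t == 1 and p == 0)  # sedan classified as non-sedan
d = sum(1 for t, p in pairs if t == 1 and p == 1)  # sedan correctly classified
print(a, b, c, d)  # 3 1 1 3
```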

Easy peasy lemon squeezy. (I hope? 😅)

##### But wait, we're not done yet...

Now we have our confusion matrix for our **sedan** classifier, but how does this help us ascertain our classifier's performance/efficiency?

To ascertain the performance of a classifier using the confusion matrix, there are some standard metrics that we can calculate using the data (variables) in the confusion matrix.

##### Accuracy

Accuracy in a 2-Class confusion matrix is the ratio of the total number of correct predictions to the total number of predictions.

From our confusion matrix, we can see that **a** and **d** predictions were made that correctly classified the input image and **b** and **c** predictions were made that incorrectly classified the input image.

Therefore, accuracy can be calculated as:

**Accuracy** = (**a + d**) / (**a + b + c + d**)

Where, **a + d** is the total number of correct predictions and **a + b + c + d** is the total number of predictions made.
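As a quick sketch of the formula with made-up counts:

```python
# Made-up counts for illustration
a, b, c, d = 50, 5, 10, 35

accuracy = (a + d) / (a + b + c + d)  # correct predictions / all predictions
print(accuracy)  # 0.85
```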

##### True positives, True negatives, False positives and False negatives

With relation to our classifier and confusion matrix:

**True positives (TP)** are the number of predictions where an image of a sedan is correctly classified [as a sedan].

From our confusion matrix, the variable **d** is also the **TP**.

**True negatives (TN)** are the number of predictions where an image of a non-sedan is correctly classified [as a non-sedan].

From our confusion matrix, the variable **a** is also our **TN**.

**False positives (FP)** are the number of predictions where an image of a non-sedan is incorrectly classified as a sedan.

From our confusion matrix, the variable **b** is also our **FP**.

**False negatives (FN)** are the number of predictions where an image of a sedan is incorrectly classified as a non-sedan.

From our confusion matrix, the variable **c** is also our **FN**.
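With scikit-learn, the four counts can be read straight off a 2-class confusion matrix: for binary labels, `confusion_matrix(...).ravel()` returns them in the order TN, FP, FN, TP. The labels below are made up for illustration:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 0 = non-sedan, 1 = sedan
y_actual    = [0, 0, 0, 1, 1, 1, 1, 0]
y_predicted = [0, 1, 0, 1, 1, 0, 1, 0]

# For binary labels, ravel() flattens the 2x2 matrix to (TN, FP, FN, TP),
# which corresponds to our variables (a, b, c, d)
tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```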

##### True Positive Rate

The true positive rate is a ratio of the **true positives** to the sum of the **true positives** and **false negatives**. It shows how often the classifier classifies an image of a sedan as a sedan.

Therefore, the true positive rate can be calculated as:

**True Positive Rate** = **d** / (**c + d**)

Where **d** is **TP** and **c** is **FN**

True positive rate is also known as **recall** or **sensitivity**.
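A minimal sketch with made-up counts for **c** (FN) and **d** (TP):

```python
c, d = 10, 40  # made-up FN and TP counts

tpr = d / (c + d)  # also called recall or sensitivity
print(tpr)  # 0.8
```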

##### False Positive Rate

The false positive rate is a ratio of the **false positives** to the sum of the **true negatives** and **false positives**. It shows how often the classifier classifies an image of a non-sedan as a sedan.

Therefore, the false positive rate can be calculated as:

**False Positive Rate** = **b** / (**a + b**)

Where **a** is **TN** and **b** is **FP**.
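A quick sketch with made-up counts for **a** (TN) and **b** (FP):

```python
a, b = 45, 5  # made-up TN and FP counts

fpr = b / (a + b)  # how often a non-sedan is called a sedan
print(fpr)  # 0.1
```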

##### True Negative Rate

The true negative rate is a ratio of the **true negatives** to the sum of the **true negatives** and **false positives**. It shows how often the classifier classifies an image of a non-sedan as a non-sedan.

Therefore, the true negative rate can be calculated as:

**True Negative Rate** = **a** / (**a + b**)

Where **a** is **TN** and **b** is **FP**.

The true negative rate is also known as **specificity**.
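Sketched with the same made-up **a** and **b** counts; note that the true negative rate is the complement of the false positive rate:

```python
a, b = 45, 5  # made-up TN and FP counts

tnr = a / (a + b)  # specificity; equals 1 - FPR
print(tnr)  # 0.9
```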

##### False Negative Rate

The false negative rate is a ratio of the **false negatives** to the sum of the **false negatives** and **true positives**. It shows how often the classifier classifies an image of a sedan as a non-sedan.

Therefore, the false negative rate can be calculated as:

**False Negative Rate** = **c** / (**c + d**)

Where **d** is **TP** and **c** is **FN**
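Sketched with made-up counts; the false negative rate is the complement of the true positive rate:

```python
c, d = 10, 40  # made-up FN and TP counts

fnr = c / (c + d)  # equals 1 - TPR
print(fnr)  # 0.2
```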

##### Precision

The precision is a ratio of the **true positives** to the sum of the **true positives** and **false positives**. It shows how often the classifier classifies an input image as a sedan and it turns out to be correct.

It is calculated as:

**Precision** = **d** / **(b + d)**

Where **d** is **TP** and **b** is **FP**
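A minimal sketch with made-up counts for **b** (FP) and **d** (TP):

```python
b, d = 5, 45  # made-up FP and TP counts

precision = d / (b + d)  # of all "sedan" predictions, the fraction that were right
print(precision)  # 0.9
```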

### An Example

Suppose we have the confusion matrix below for our classifier; we can use the metrics defined above to evaluate its performance.

|                       | Predicted: Non-sedan | Predicted: Sedan |
| --------------------- | -------------------- | ---------------- |
| **Actual: Non-sedan** | 4252                 | 875              |
| **Actual: Sedan**     | 421                  | 4706             |

From the confusion matrix, we can see that:

- 4252 predictions were made that correctly classified a non-sedan [as a non-sedan]. Therefore, our variable **a** and the **True Negative (TN)** count is 4252.
- 875 predictions were made that incorrectly classified a non-sedan as a sedan. Therefore, our variable **b** and the **False Positive (FP)** count is 875.
- 421 predictions were made that incorrectly classified a sedan as a non-sedan. Therefore, our variable **c** and the **False Negative (FN)** count is 421.
- 4706 predictions were made that correctly classified a sedan [as a sedan]. Therefore, our variable **d** and the **True Positive (TP)** count is 4706.

Using the data we've "extracted", we can calculate the aforementioned metrics and ascertain the performance of the classifier. We can already tell that the classifier performs well since the number of correct predictions is greater than the number of incorrect predictions.

##### Accuracy

Accuracy = **(a + d)** / **(a + b + c + d)**

= **(4252 + 4706)** / **(4252 + 875 + 421 + 4706)**

= **(8958)** / **(10254)**

= **0.8736102984201287**

Accuracy = **0.87**

Therefore the classifier has an accuracy of 0.87 which is 87%

##### True Positive Rate

TPR = **TP / (TP + FN)**

= **4706** / **(4706 + 421)**

= **4706** / **5127**

= **0.917885703140238**

TPR = **0.92**

Therefore the classifier has a True Positive Rate of 0.92 which is 92%

##### False Positive Rate

FPR = **FP / (FP + TN)**

= **875** / **(875 + 4252)**

= **875** / **5127**

= **0.1706651062999805**

FPR = **0.17**

Therefore the classifier has a False Positive Rate of 0.17 which is 17%

##### True Negative Rate

TNR = **TN / (TN + FP)**

= **4252 / (4252 + 875)**

= **4252 / 5127**

= **0.8293348937000195**

TNR = **0.83**

Therefore the classifier has a True Negative Rate of 0.83 which is 83%

##### False Negative Rate

FNR = **FN / (FN + TP)**

= **421 / (421 + 4706)**

= **421 / 5127**

= **0.082114296859762**

FNR = **0.08**

Therefore the classifier has a False Negative Rate of 0.08 which is 8%

##### Precision

Precision = **TP / (TP + FP)**

= **4706 / (4706 + 875)**

= **4706 / 5581**

= **0.8432180612793406**

Precision = **0.84**

Therefore the classifier has a Precision of 0.84 which is 84%
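All of the calculations above can be reproduced in a few lines of Python using the counts from the example matrix:

```python
a, b, c, d = 4252, 875, 421, 4706  # TN, FP, FN, TP from the example

accuracy  = (a + d) / (a + b + c + d)
tpr       = d / (c + d)
fpr       = b / (a + b)
tnr       = a / (a + b)
fnr       = c / (c + d)
precision = d / (b + d)

for name, value in [("Accuracy", accuracy), ("TPR", tpr), ("FPR", fpr),
                    ("TNR", tnr), ("FNR", fnr), ("Precision", precision)]:
    print(f"{name}: {value:.2f}")
```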

### How to generate a confusion matrix using Python

```
import itertools

import matplotlib.pylab as plt
import numpy as np
from sklearn.metrics import confusion_matrix


def plot_confusion_matrix(cm, classes, normalize=False):
    plt.figure(figsize=(5, 5))
    plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)
    plt.title('Confusion matrix')
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=90)
    plt.yticks(tick_marks, classes)
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('Actual')
    plt.xlabel('Predicted')


# 'model', 'test_data' and 'test_labels' are assumed to already exist
dict_characters = {0: 'Non-sedan', 1: 'Sedan'}
y_pred = model.predict(test_data)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = np.argmax(test_labels, axis=1)
confusion_mat = confusion_matrix(y_true, y_pred_classes)
plot_confusion_matrix(confusion_mat, classes=list(dict_characters.values()))
```

To generate a confusion matrix, we utilize numpy, matplotlib.pylab to visualize the matrix, the confusion_matrix function from the sklearn.metrics package to generate the confusion matrix, and itertools for looping/iteration.

First, we define a function **plot_confusion_matrix** that takes the generated confusion matrix and the expected/possible classes as arguments and then uses matplotlib.pylab to visualize the confusion matrix.

In the snippet, we assume we already have our trained model and our training and test data with associated labels.

**dict_characters** is a dictionary of the two possible classes, in our case, "**non-sedan**" and "**sedan**".

**y_pred** is a numpy array of predictions done by the classifier on the test data

**model** is our trained classifier/algorithm

**test_data** is our test data

**y_pred_classes** is a numpy array of predicted class indices, obtained by taking the argmax of each row of **y_pred** (the array of predictions made by the classifier on the test data).

**y_true** is a numpy array of the actual/correct class indices, obtained by taking the argmax of each row of the one-hot encoded **test_labels**.

**test_labels** is a list of labels of the test data.

Using the above, we use the **confusion_matrix** function from **sklearn.metrics** to generate the confusion matrix, passing in the correct values (**y_true**) and the estimated values returned by the classifier (**y_pred_classes**) and we store the generated confusion matrix in a variable **confusion_mat**.

We then pass the confusion matrix (**confusion_mat**) and a list of the values of our possible classes (**dict_characters**) as arguments to the **plot_confusion_matrix** function which then visualizes the confusion matrix.

In my next post, I will (hopefully) write about the multi-class confusion matrix.
