Have you too been struggling to understand what `precision` and `recall` ACTUALLY mean? Not in terms of formulae, but how to actually interpret them? Well, I struggled with this too.
So, let me try to explain to you, without using any formulae, just intuition and a real, practical example, what they actually mean.
For this, we'll take 2 different binary classification scenarios:
Scenario 1. Classifying whether a person has committed murder (=1) or not (=0).
Scenario 2. Classifying whether a person has diabetes (=1) or not (=0).
Now, what if I were to ask you, which of these has more dire consequences:
- **Falsely marking 0 as 1**, or
- **Falsely marking 1 as 0**?
Let's take the first case, the murder one:

- **Falsely marking 0 as 1**: You falsely convict an innocent person of murder. Consequence: that person dies.
- **Falsely marking 1 as 0**: You falsely classify the murderer as innocent. Consequence: the murderer goes free.
Which of these 2 cases has more dire consequences?
Don't answer yet.
Let's do the same for the second case now, the diabetes one:

- **Falsely marking 0 as 1**: You falsely classify a person as having diabetes, even though that person doesn't have it. Consequence: they start taking medicines, suffer a few side effects, and maybe, just maybe, die because of those medicines.
- **Falsely marking 1 as 0**: You falsely classify a person as NOT having diabetes, even though that person actually has it. Consequence: near-certain death, since the disease goes untreated.
Now let's answer that question for each scenario:

- Wrongly convicting an innocent person of murder, and subsequently letting that person die by capital punishment, is far, far more dire than letting a murderer go free.
- Wrongly diagnosing a diabetic person as NOT having diabetes, and letting that person go without any medication, which will ultimately lead to the person's death, is far, far more dire than wrongly diagnosing a non-diabetic person as diabetic.
That means:
- For case 1, the consequence of **Falsely marking 0 as 1** (a False Positive) is greater than that of **Falsely marking 1 as 0** (a False Negative).
- For case 2, the consequence of **Falsely marking 1 as 0** (a False Negative) is greater than that of **Falsely marking 0 as 1** (a False Positive).
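(These four outcomes, correct `1`'s, correct `0`'s, false positives, and false negatives, are exactly the cells of a confusion matrix. If it helps to see them as code, here's a minimal sketch that counts each one directly; the labels below are made up purely for illustration.)

```python
# A minimal sketch: counting the four confusion-matrix outcomes by hand.
# The labels below are invented purely for illustration.

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # what actually happened
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]  # what the model said

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # correctly flagged 1's
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # correctly flagged 0's
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # falsely marking 0 as 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # falsely marking 1 as 0

print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=3, TN=3, FP=1, FN=1
```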
And this is where `precision` and `recall` come into play:
- Wherever the consequences of a **False Positive** are greater than those of a **False Negative** (case 1 here), go with higher `precision`.
- Wherever the consequences of a **False Negative** are greater than those of a **False Positive** (case 2 here), go with higher `recall`.
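One concrete way to feel this trade-off is the classification threshold: flag `1` only when the model is very confident and precision goes up; flag `1` readily and recall goes up. Here's a hedged sketch, with invented scores and labels (nothing here comes from a real dataset):

```python
# A sketch of the precision/recall trade-off: the same model scores,
# binarized at two different thresholds. Scores and labels are invented.

y_true = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.3, 0.45, 0.5, 0.6, 0.7, 0.8, 0.9]  # model's predicted probability of 1

def precision_recall(y_true, scores, threshold):
    """Binarize scores at `threshold`, then compute precision and recall."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Strict threshold: flag 1 only when very confident -> fewer FPs -> higher precision.
print(precision_recall(y_true, scores, 0.75))  # (1.0, 0.5)
# Lenient threshold: flag 1 readily -> fewer FNs -> higher recall.
print(precision_recall(y_true, scores, 0.40))  # (0.666..., 1.0)
```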
Let me expand on this with another example, this time a little less intuitive, but closer to the numbers:
- Consider a certain binary classification problem in which there are 8 true `1` cases, but your model predicted 10 (the 8 real ones plus 2 extras). Those 2 extra `1`'s are `false positives`. This is what `precision` can mathematically tell you. This here has a `precision` of: 8/(8+2) => 0.8 => 80%.
- Now consider that the true number of `1` outcomes is actually 10, but your model caught only 8 (the remaining 2 were classified as `0`). Those 2 missed `1`'s, which were classified as `0`'s, are `false negatives`. This is what `recall` can mathematically tell you. This here has a `recall` of: 8/(8+2) => 0.8 => 80%.
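If you'd like to sanity-check that arithmetic in code, here's a tiny sketch that just hard-codes the counts from the two bullets above:

```python
# Reproducing the two bullets above with plain Python.

# Precision case: 8 true 1's; the model predicted 10, i.e. all 8 real ones plus 2 extras.
tp, fp = 8, 2                      # true positives, false positives
precision = tp / (tp + fp)
print(f"precision = {precision}")  # 0.8 -> 80%

# Recall case: 10 true 1's; the model caught only 8, missing 2.
tp, fn = 8, 2                      # true positives, false negatives
recall = tp / (tp + fn)
print(f"recall = {recall}")        # 0.8 -> 80%
```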
I hope this helps!
Do point out any issues you find in this; I'm still learning all of this.