Ogunbiyi Ibrahim
Intuitive Introduction to Logistic Regression (Understanding the mathematics behind the model)


Introduction

As a beginner, I've been fitting Logistic Regression models to my data and making good predictions. But as soon as I was done I felt empty, simply because I was performing the same task over and over (just fitting and predicting, which gets boring when you don't understand what's going on behind the scenes). I've always wondered how the model is able to make its predictions.

But then I sat down one day and studied how Logistic Regression makes its predictions. Funny enough, I was able to pick up some knowledge of it, so I will be sharing what I've learned so far. Relax, because it's going to be interesting. Thanks in advance for reading!

What is Logistic Regression?

I would like to use the definition that Aurélien Géron gives in his book:

Logistic Regression (also called Logit Regression) is commonly used to estimate the probability that an instance belongs to a particular class.

To better understand the above definition, I like to use this analogy. Consider a doctor looking through a patient's records to decide whether the patient has a headache. Given those records, the doctor can predict either that the patient has a headache (usually referred to as the positive class, labeled 1) or that the patient does not have a headache (the negative class, usually labeled 0). This is how Logistic Regression works as well: if the estimated probability (i.e. the chance of the headache occurring) is greater than 50%, or 0.5, the model predicts 1 (the positive class); otherwise it predicts 0 (the negative class).
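That decision rule is simple enough to sketch directly. Here is a minimal, hypothetical Python example of the 0.5 threshold (the probabilities are made up for illustration):

```python
def predict_class(estimated_probability):
    """Apply the 0.5 decision threshold: 1 = positive class, 0 = negative class."""
    return 1 if estimated_probability > 0.5 else 0

print(predict_class(0.83))  # model is 83% sure -> predicts 1 (headache)
print(predict_class(0.27))  # model is 27% sure -> predicts 0 (no headache)
```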


Please note that Logistic Regression is not limited to predicting between two discrete values (i.e. 0 and 1, whether something is the case or not), which is usually referred to as binary classification. You can also use it to predict more than 2 labels. For instance, say you wanted to build a model that can predict whether an image is a dog, a cat, or a wolf. Logistic Regression can do that too. This is referred to as Multinomial Classification.
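For intuition only: in the multinomial case the model produces one score per class, and the softmax function squashes those scores into probabilities that sum to 1. A hypothetical sketch (the class scores below are made up, not from a fitted model):

```python
import math

def softmax(scores):
    """Turn one raw score per class into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the classes dog, cat, wolf.
labels = ["dog", "cat", "wolf"]
probs = softmax([2.0, 1.0, 0.1])
print(labels[probs.index(max(probs))])  # the highest-probability class wins
```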

How Does Logistic Regression Make its Predictions?


Hmm, now we've gotten to the part that I, and most people, are curious about. To comprehend how it makes its predictions, I will go through the formula step by step. I will use binary classification with one input variable (or predictor variable) in this article; with that, we can grasp how it predicts with more input variables, and how Multinomial Classification works as well.

Consider, for instance, that we want to predict whether an individual will go bankrupt given his credit card balance.

In simple mathematical terms, this is how Logistic Regression does its prediction.


Pr(individual=bankruptcy|balance)

The above expression can be read as the probability of an individual going bankrupt given his balance. The pipe symbol | denotes "given".

However, it doesn't stop there. The above expression is just how a lazy statistician would write the Logistic Regression formula.

Going deeper, the above expression miraculously transforms into this:

p(X) = e^(β₀ + β₁X) / (1 + e^(β₀ + β₁X))

Now you might be thinking: what the heck is this? Let's unpack what we have above.

To better understand the above equation, let's look at another equation that we might be more familiar with:

y = mx + c

The above equation is the most popular linear regression formula. More conventionally, it is written as:

y = β₀ + β₁x

Looking at this equation, we can see that it looks a lot like the logistic regression equation, if not identical in parts. Linear regression is used to find the best-fit line that can be used to predict the value of y. Check the image below.
[Image: a best-fit line drawn through scattered data points]

Let's say we want to predict the value of y where the value of x (the input variable) is 5:

y = β₀ + β₁(5)
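As a quick worked example with made-up parameters (suppose β₀ = 2 and β₁ = 3, purely for illustration):

```python
# Hypothetical fitted parameters for the best-fit line y = b0 + b1*x.
b0, b1 = 2.0, 3.0

x = 5
y = b0 + b1 * x
print(y)  # 2 + 3*5 = 17.0
```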

Now we might be curious about the two parameters, β₀ and β₁. These two parameters are needed for our model to pick the best-fit line (i.e. they are needed for us to accurately predict the values of y). To obtain their values, there are mathematical approaches such as the normal equation, gradient descent, SVD, etc. But since our main focus is not linear regression, let's head back to how the parameters are calculated for Logistic Regression.

Note that the formula is used for predicting whether an individual will go bankrupt given his balance. On the other hand, if we want to calculate the probability that an individual will not go bankrupt given his credit card balance, we can easily do that using a basic law of probability. That formula is stated later in the article.

To calculate the parameters of a logistic function, we use a technique called maximum likelihood estimation. Before going into the mathematical details of this technique, I would like to point out two ideas from computer science: abstraction and the leap of faith. They mean that you don't necessarily need to dig into the details of how things work (i.e. if you don't know how something works for now, that's okay; as you advance in your learning you surely will). Thanks to the mathematicians and statisticians who came up with this formula. Therefore, I won't be talking about how the parameters are calculated using maximum likelihood. However, if you wish to know more about it, you can check out this link.
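Once the parameters have been estimated, evaluating the logistic function itself is straightforward. Here is a minimal sketch, where the values of β₀ and β₁ are hypothetical, chosen only to show the behaviour:

```python
import math

def logistic(x, b0, b1):
    """p(x) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))"""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))

# Hypothetical parameter values for the bankruptcy example.
print(round(logistic(1000, -10.0, 0.005), 4))  # low balance: probability near 0
print(round(logistic(3000, -10.0, 0.005), 4))  # high balance: probability near 1
```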

Another way of expressing Logistic Regression is in terms of the log odds. Before we do that, let's define the odds.

Odds can be defined as the probability that an event will occur divided by the probability that an event will not occur.

From the logistic function, the probability that an individual will go bankrupt is:

p(X) = e^(β₀ + β₁X) / (1 + e^(β₀ + β₁X))

The probability that the individual will not go bankrupt is:

1 − p(X) = 1 / (1 + e^(β₀ + β₁X))

Now, to find the odds, we divide the probability of the individual going bankrupt by the probability of not going bankrupt:

p(X) / (1 − p(X)) = ( e^(β₀ + β₁X) / (1 + e^(β₀ + β₁X)) ) ÷ ( 1 / (1 + e^(β₀ + β₁X)) )

When you carry out the division (the 1 + e^(β₀ + β₁X) terms cancel), your result should be this:

p(X) / (1 − p(X)) = e^(β₀ + β₁X)

Now when we take the natural log of both sides we have:

ln( p(X) / (1 − p(X)) ) = β₀ + β₁X

Remember that the natural log of e^a is a, i.e. ln(e^a) = a.

We can see that the Logistic Regression formula now looks like that of linear regression. That means we can arrive at Logistic Regression in two ways:

  1. By finding the probability estimate of the logistic function based on x:
     p(x) = e^(β₀ + β₁x) / (1 + e^(β₀ + β₁x))

  2. By finding the log odds of the linear function based on x:
     ln( p(x) / (1 − p(x)) ) = β₀ + β₁x

Now that we've understood that, let's look at why we can't predict classification problems using linear regression.

Why Linear Regression Won't Work

The reason linear regression won't work is simple: looking at the image below, we can see that a linear regression model can predict values less than 0 (i.e. negative values) and values greater than 1; imagine our model predicting a value such as 5. That is no use for classification. The goal of classification is to produce values between 0 and 1.

[Image: a fitted straight line crossing below 0 and above 1]
(c) Author: Saishruthi Swaminathan, Linear Regression
The logistic function solves this problem. The numerator ensures that the values are greater than or equal to 0, and the denominator ensures that the values are less than or equal to 1. As a result, the Logistic Regression graph forms an S-shaped curve, unlike linear regression, which is just a straight line.

[Image: the S-shaped logistic curve bounded between 0 and 1]
(c) Author: z_ai, towardsdatascience
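As a quick sanity check (a sketch, not tied to any fitted model), we can feed the logistic function increasingly extreme inputs and confirm the output never leaves the 0–1 range:

```python
import math

def logistic(z):
    """The S-shaped curve: e^z / (1 + e^z)."""
    return math.exp(z) / (1 + math.exp(z))

# A straight line b0 + b1*x would overshoot both bounds for extreme x,
# but the logistic curve stays strictly between 0 and 1.
for z in (-8, -4, 0, 4, 8):
    p = logistic(z)
    assert 0 < p < 1
    print(round(p, 4))
```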

Conclusion

Thank you for taking the time to read this article. Logistic Regression is one of the most popular estimators used for classification problems. This article is just a beginner's guide to how it makes its predictions. You can check books and blog posts if you wish to learn more about the mathematical concepts behind it. Thank you.
