
Ajaykrishnan Selucca

Machine Learning - Over-fitting & Under-fitting

In my last post, "Bias and Variance", we came across two terms: under-fitting and over-fitting. In this post, I will explain precisely what under-fitted and over-fitted models are.

UNDER-FITTING:

Under-fitting occurs when the model is too simple, i.e. when it has low variance and high bias. When the model's accuracy is much lower than we expect, even on the data it was trained on, the model is said to be under-fit.

Below is a graphical representation of an under-fit model.
(The red dots are the data points; most of them lie far away from the fitted line.)

[Figure: a straight line that misses most of the data points]
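To make this concrete, here is a minimal numpy sketch on made-up data: the data follows a quadratic trend, but we fit a straight line to it. Because the model is too simple to capture the curvature, its error is high even on the training data itself. (The data, seed, and degree are all illustrative assumptions, not from the post.)

```python
import numpy as np

# Synthetic (made-up) data with a quadratic trend.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = x**2 + rng.normal(0, 0.5, size=x.size)

# Under-fit: a degree-1 (straight line) model is too simple for the
# curvature, so its error stays high even on the data it was trained on.
coeffs = np.polyfit(x, y, deg=1)
train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
```

No amount of extra training data fixes this: the straight line simply cannot follow the curve, which is the high-bias situation described above.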

OVER-FITTING:

Over-fitting occurs when the model is too complex, i.e. when it has low bias and high variance. (A machine learning model should not aim for 100% accuracy on its training data; a model that fits the training data perfectly is usually over-fit.)

Below is a graphical representation of an over-fit model.
(The line bends to pass through every red dot, i.e. every data point.)

[Figure: a wiggly curve that passes through every data point]
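The opposite failure can be sketched the same way, again on illustrative synthetic data: fitting a degree-11 polynomial through 12 noisy points interpolates every point exactly, so the training error is essentially zero, while the error on fresh points from the same underlying curve is far larger. (The sine curve, seed, and degrees are assumptions chosen for the demonstration.)

```python
import numpy as np

# Synthetic data: a noisy sine wave sampled at 12 training points.
rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 12)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, size=x_train.size)
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test)  # noise-free targets for evaluation

def train_test_mse(deg):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    c = np.polyfit(x_train, y_train, deg)
    train = np.mean((np.polyval(c, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(c, x_test) - y_test) ** 2)
    return train, test

# Degree 11 through 12 points interpolates every noisy point exactly:
# training error is essentially zero, but test error is far larger.
train_hi, test_hi = train_test_mse(11)
# A modest degree captures the sine's overall shape instead of the noise.
train_lo, test_lo = train_test_mse(3)
```

The near-zero training error is exactly the "should not always be 100% accurate" warning above: the complex model has memorized the noise, not learned the trend.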

Bias and variance both contribute to a model's error, but ideally there is a right-fit point where the two are balanced. Ultimately, it is the total prediction error you want to minimize, not the bias or the variance individually.

Below is a graphical representation of the right-fit point, where the model achieves good accuracy without being over-fit or under-fit.

[Figure: a smooth curve that follows the overall trend of the data points]

Ideally we want low variance and low bias. In reality, though, there's usually a trade-off.

A suitable fit should acknowledge significant trends in the data and play down or even omit minor variations.

This might mean re-randomizing the training/test split, using cross-validation, adding more data to better detect the underlying patterns, or even switching algorithms. For example, switching from linear regression to a non-linear model reduces bias at the cost of increased variance.
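One simple way to hunt for the right-fit point is a hold-out validation split: train candidate models of increasing complexity and keep the one with the lowest error on data it never saw. Below is a hedged numpy sketch of that idea, reusing the noisy-sine setup (data, seed, and degree range are illustrative assumptions); cross-validation would repeat this over several splits.

```python
import numpy as np

# Synthetic data: a noisy sine, sampled at 60 random positions.
rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=x.size)

# Hold-out split: train on the first 40 points, validate on the other 20.
x_tr, y_tr = x[:40], y[:40]
x_va, y_va = x[40:], y[40:]

def val_mse(deg):
    """Train a polynomial of this degree; score it on the held-out points."""
    c = np.polyfit(x_tr, y_tr, deg)
    return np.mean((np.polyval(c, x_va) - y_va) ** 2)

# Pick the degree with the lowest validation error: too low a degree
# under-fits, too high a degree over-fits, and the minimum sits in between.
best = min(range(1, 10), key=val_mse)
```

Because the validation points were never used for fitting, a model that merely memorized the training noise scores badly here, which is what pushes the selection toward the balanced, right-fit model.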
