
Ajaykrishnan Selucca

Machine Learning - Over-fitting & Under-fitting

In my last post, "Bias and Variance", we came across two terms: under-fitting and over-fitting. In this post, I will explain precisely what under-fitted and over-fitted models are.

UNDER-FITTING:

Under-fitting occurs when the model is too simple, i.e. when it has low variance and high bias. When the model's accuracy is much lower than we expect, even on the data it was trained on, the model is said to be under-fit.

Below is a graphical representation of an under-fit model.
(The red dots are the data points; most of them lie far away from the fitted line.)

[Figure: a straight line that misses most of the data points]
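To make this concrete, here is a minimal numpy sketch on made-up data: the data follows a quadratic trend, but we fit a straight line to it. Because the model is too simple to capture the curvature, its error is high even on the training data itself. (The data, seed, and degree are all illustrative assumptions, not from the post.)

```python
import numpy as np

# Synthetic (made-up) data with a quadratic trend.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = x**2 + rng.normal(0, 0.5, size=x.size)

# Under-fit: a degree-1 (straight line) model is too simple for the
# curvature, so its error stays high even on the data it was trained on.
coeffs = np.polyfit(x, y, deg=1)
train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
```

No amount of extra training data fixes this: the straight line simply cannot follow the curve, which is the high-bias situation described above.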

OVER-FITTING:

Over-fitting occurs when the model is too complex, i.e. when it has low bias and high variance. (A machine learning model should not aim for 100% accuracy on its training data; a model that fits the training data perfectly is usually over-fit.)

Below is a graphical representation of an over-fit model.
(The line bends to pass through every red dot, i.e. every data point.)

[Figure: a wiggly curve that passes through every data point]
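The opposite failure can be sketched the same way, again on illustrative synthetic data: fitting a degree-11 polynomial through 12 noisy points interpolates every point exactly, so the training error is essentially zero, while the error on fresh points from the same underlying curve is far larger. (The sine curve, seed, and degrees are assumptions chosen for the demonstration.)

```python
import numpy as np

# Synthetic data: a noisy sine wave sampled at 12 training points.
rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 12)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, size=x_train.size)
x_test = np.linspace(0.02, 0.98, 50)
y_test = np.sin(2 * np.pi * x_test)  # noise-free targets for evaluation

def train_test_mse(deg):
    """Fit a polynomial of the given degree; return (train MSE, test MSE)."""
    c = np.polyfit(x_train, y_train, deg)
    train = np.mean((np.polyval(c, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(c, x_test) - y_test) ** 2)
    return train, test

# Degree 11 through 12 points interpolates every noisy point exactly:
# training error is essentially zero, but test error is far larger.
train_hi, test_hi = train_test_mse(11)
# A modest degree captures the sine's overall shape instead of the noise.
train_lo, test_lo = train_test_mse(3)
```

The near-zero training error is exactly the "should not always be 100% accurate" warning above: the complex model has memorized the noise, not learned the trend.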

Bias and variance both contribute to a model's error, but ideally there is a right-fit point where the two are balanced. Ultimately, it is the total prediction error you want to minimize, not the bias or the variance individually.

Below is a graphical representation of the right-fit point, where the model achieves good accuracy without being over-fit or under-fit.

[Figure: a smooth curve that follows the overall trend of the data points]

Ideally we want low variance and low bias. In reality, though, there's usually a trade-off.

A suitable fit should acknowledge significant trends in the data and play down or even omit minor variations.

This might mean re-randomizing the training/test split, using cross-validation, adding more data to better detect the underlying patterns, or even switching algorithms. For example, switching from linear regression to a non-linear model reduces bias at the cost of increased variance.
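One simple way to hunt for the right-fit point is a hold-out validation split: train candidate models of increasing complexity and keep the one with the lowest error on data it never saw. Below is a hedged numpy sketch of that idea, reusing the noisy-sine setup (data, seed, and degree range are illustrative assumptions); cross-validation would repeat this over several splits.

```python
import numpy as np

# Synthetic data: a noisy sine, sampled at 60 random positions.
rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=x.size)

# Hold-out split: train on the first 40 points, validate on the other 20.
x_tr, y_tr = x[:40], y[:40]
x_va, y_va = x[40:], y[40:]

def val_mse(deg):
    """Train a polynomial of this degree; score it on the held-out points."""
    c = np.polyfit(x_tr, y_tr, deg)
    return np.mean((np.polyval(c, x_va) - y_va) ** 2)

# Pick the degree with the lowest validation error: too low a degree
# under-fits, too high a degree over-fits, and the minimum sits in between.
best = min(range(1, 10), key=val_mse)
```

Because the validation points were never used for fitting, a model that merely memorized the training noise scores badly here, which is what pushes the selection toward the balanced, right-fit model.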
