Before discussing bias and variance, let's clarify what a hypothesis set is and how we define it. When you train a model, you are searching for a hypothesis function within a space of candidate functions. Whether you train a linear regression, a logistic regression, or a deep network, you need to understand what the hypothesis set is and how you will find the function you are looking for within it.
Choosing a model to approximate a given target function means defining a hypothesis set. The trained model is a single point in that set, and it can lie close to the target function or far from it.
Now let's define our hypothesis set. Assume we have chosen a model, which defines a hypothesis set H.
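As a minimal sketch of this idea, assume the hypothesis set H is the family of polynomials of a fixed degree, and that the target function is a sine wave. Both choices, along with the names `target`, `fit_hypothesis`, and `distance_to_target`, are illustrative assumptions and not part of the original setup. Training picks one point of H, and we can then measure how far that point is from the target.

```python
import numpy as np

# Assumed target function, chosen only for illustration.
def target(x):
    return np.sin(2 * np.pi * x)

# A small noisy training set sampled from the assumed target.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=20)
y_train = target(x_train) + rng.normal(0.0, 0.1, size=20)

def fit_hypothesis(x, y, degree):
    """Pick one point of H_degree: the least-squares polynomial of that degree."""
    return np.poly1d(np.polyfit(x, y, deg=degree))

# Two different hypothesis sets: straight lines and cubic polynomials.
h_line = fit_hypothesis(x_train, y_train, degree=1)
h_cubic = fit_hypothesis(x_train, y_train, degree=3)

# Distance to the target, measured as mean squared difference on a dense grid.
x_grid = np.linspace(0.0, 1.0, 200)

def distance_to_target(h):
    return float(np.mean((h(x_grid) - target(x_grid)) ** 2))

print("line :", distance_to_target(h_line))
print("cubic:", distance_to_target(h_cubic))
```

Here the set of lines is contained in the set of cubics, so the larger set can hold a point that is closer to the target.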
There is an optimal point where bias and variance are balanced and both take reasonable values. To find it, we plot bias and variance as curves against the complexity of the model. By the complexity of the model we mean the complexity of the hypothesis set, that is, its size.
Our goal is to minimize the total loss, which consists of bias, variance, and a small noise term. The curves show that increasing the complexity of the model decreases the bias but increases the variance, so beyond the optimal point the total loss rises again.
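For squared-error loss, this decomposition can be written out explicitly. The notation is an assumption introduced here: f is the target function, observations are y = f(x) + ε with zero-mean noise of variance σ² that is independent of the training data, h_D is the hypothesis selected from H after training on dataset D, and h̄(x) = E_D[h_D(x)] is the average hypothesis over datasets.

$$
\mathbb{E}_{D,\varepsilon}\left[\big(y - h_D(x)\big)^2\right]
= \underbrace{\big(\bar{h}(x) - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\left[\big(h_D(x) - \bar{h}(x)\big)^2\right]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{noise}}
$$

The noise term does not depend on the model, so only the bias and variance terms change as the hypothesis set grows.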
We can't choose a model that is too simple, because it cannot even approximate the target function, and we can't choose one that is too complex either, because it has high variance.
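The trade-off can be checked empirically. The sketch below assumes a sine target, Gaussian noise, polynomial hypothesis sets, and fixed training inputs with only the noise resampled. All of these, and the function name `bias_variance`, are illustrative assumptions. It trains many models on freshly sampled noisy datasets and estimates the squared bias and the variance for several degrees.

```python
import numpy as np

# Assumed target function, chosen only for illustration.
def target(x):
    return np.sin(2 * np.pi * x)

def bias_variance(degree, n_datasets=200, n_points=30, noise_std=0.3, seed=0):
    """Estimate squared bias and variance of degree-`degree` polynomial fits."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_points)  # fixed inputs, noise is resampled
    x_test = np.linspace(0.05, 0.95, 50)
    preds = np.empty((n_datasets, x_test.size))
    for i in range(n_datasets):
        y_train = target(x_train) + rng.normal(0.0, noise_std, size=n_points)
        coeffs = np.polyfit(x_train, y_train, deg=degree)
        preds[i] = np.polyval(coeffs, x_test)
    mean_pred = preds.mean(axis=0)  # the average hypothesis
    bias_sq = float(np.mean((mean_pred - target(x_test)) ** 2))
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

results = {d: bias_variance(d) for d in (1, 3, 9)}
for d, (b, v) in results.items():
    print(f"degree={d}: bias^2={b:.4f}  variance={v:.4f}  sum={b + v:.4f}")
```

Under these assumptions, the straight line should show the largest bias and the degree-9 polynomial the largest variance, which matches the shape of the curves described above.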
This is a well-known problem in machine learning, and in deep learning in particular. Practitioners know it as underfitting (high bias) and overfitting (high variance). These are among the main problems everybody faces, and there are many approaches to addressing them, including the following:
- Model Selection / Early Stopping
- Normalization Functions
- Augmentation Techniques
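As a minimal sketch of the first approach, model selection, assume again a sine target and polynomial hypothesis sets. The held-out validation split and the names `validation_mse` and `candidate_degrees` are illustrative assumptions. The idea is to train one model per candidate complexity and keep the one with the lowest error on data it was not trained on.

```python
import numpy as np

# Assumed target function, chosen only for illustration.
def target(x):
    return np.sin(2 * np.pi * x)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=60)
y = target(x) + rng.normal(0.0, 0.3, size=60)

# Hold out part of the data; it is never used for fitting.
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

def validation_mse(degree):
    """Fit on the training split, score on the validation split."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    return float(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))

candidate_degrees = range(0, 10)
scores = {d: validation_mse(d) for d in candidate_degrees}
best_degree = min(scores, key=scores.get)

print("best degree:", best_degree, "validation MSE:", scores[best_degree])
```

Early stopping follows the same pattern, with the number of training steps playing the role of the degree.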
You can find more in the following article.