In machine learning, the bias-variance trade-off is an important concept when it comes to choosing an algorithm for a problem. Bias is the expected error in a model's predictions, and variance is the variability of those predictions when the training data changes.
In this blog we will look at why we need the bias-variance trade-off, how it affects accuracy, and how to balance bias and variance to avoid overfitting and underfitting.
Why do we need to know about Bias Variance trade off?
We need to know about the bias-variance trade-off because whenever we use a machine learning model:
We want the model to fit our data well.
We want it to be consistent and make few errors in prediction, and
We also want similar results when it is trained on similar datasets.
So, understanding what bias, variance, and their trade-off mean in a machine learning model is very important.
To be precise, understanding the bias-variance trade-off helps:
To avoid overfitting and underfitting conditions
To have consistent predictions
Errors in Machine Learning models:
The errors in any machine learning model are mainly due to bias and variance. There is also some irreducible error, such as noise in the data, that cannot be avoided. But let’s see how to reduce the errors due to bias and variance.
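For squared error, this split into bias, variance, and irreducible error can be written out explicitly. The formula below is the standard decomposition, given here as a sketch under assumptions that are not stated in this post: the actual value is Y = f(x) + \varepsilon, where f is the unknown true function and \varepsilon is zero-mean noise with variance \sigma^2 that is independent of the training data, and \hat{Y} is the model's prediction at x. The symbols f, \varepsilon and \sigma are introduced only for illustration, and the expectation is taken over training datasets and noise.

```latex
\mathbb{E}\big[(Y - \hat{Y})^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{Y}] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{Y} - \mathbb{E}[\hat{Y}])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

Only the first two terms depend on the model we choose, which is why the rest of this post focuses on them.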
Errors due to Bias:
Mathematically, bias is the expected error in the predictions of a model. It can be simply understood as the error that occurs due to the assumptions we make in the model.
For example, when we try to solve a linear regression problem, we assume that the target has a linear relationship with its features. That assumption may not be right, and the errors caused by forcing a linear model onto the data are bias errors.
Bias is formally defined as
E[Y` - Y]
Y` - Predicted Value
Y - Actual Value (to be predicted)
High bias signifies that the model is underfitting
Low bias signifies that the model captures the underlying pattern; when it comes with high variance, the model is overfitting
Errors due to Variance:
Variance is a measure of the variability in the results predicted by the model. To put this simply, variance quantifies how much the predictions change when we change the training dataset.
High variance signifies that the model is overfitting
Low variance signifies that the model is consistent; when it comes with high bias, the model is underfitting
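To make the two definitions concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the ground-truth function `true_f`, the noise level, the query point `x0`, and the two toy models (one that always predicts the average target, and a 1-nearest-neighbor model). It trains each model on many independent datasets and measures the bias and the variance of the predictions at a single point.

```python
import random
import statistics

def true_f(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return x * x

def make_dataset(rng, n=20, noise_sd=0.2):
    # Draw n noisy training points (x, y) with y = true_f(x) + noise.
    xs = [rng.random() for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0.0, noise_sd)) for x in xs]

def mean_model(data, x0):
    # Very rigid model: ignores x0 and always predicts the average target.
    return statistics.fmean(y for _, y in data)

def nearest_neighbor_model(data, x0):
    # Very flexible model: copies the target of the closest training point.
    return min(data, key=lambda point: abs(point[0] - x0))[1]

def estimate_bias_variance(model, x0=0.9, trials=500, seed=0):
    # Train on many independent datasets and collect the predictions at x0.
    rng = random.Random(seed)
    preds = [model(make_dataset(rng), x0) for _ in range(trials)]
    bias = statistics.fmean(preds) - true_f(x0)  # average prediction minus true value
    variance = statistics.pvariance(preds)       # spread of predictions across datasets
    return bias, variance

for name, model in [("mean", mean_model), ("1-NN", nearest_neighbor_model)]:
    bias, variance = estimate_bias_variance(model)
    print(f"{name:>5}: bias={bias:+.3f}  variance={variance:.4f}")
```

Under these assumptions the rigid model should show the larger bias and the flexible model the larger variance, which is the underfitting and overfitting pattern described above.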
To understand the Bias and Variance errors better let’s look at the bullseye diagram given below:
In the bullseye diagram, the red circle represents the target and here are the findings:
Having low bias and low variance results in the model predicting values very close to the target. As we can see, the predicted values are on the red circle, which is the target.
Having high variance and low bias means the predictions are highly variable. Such a model is not consistent as we would get high variation in the predictions.
Having high bias and low variance means the model is consistent but the predictions are far away from the target. So, even this situation is not desirable.
Having high bias and high variance is a total disaster. The model is not consistent, and the predictions are both highly variable and far away from the target.
Now, let us take an example. Say we have 2 training datasets. Our task is to fit a curve on these datasets. Remember, a machine learning model’s accuracy is a measure of how well it does on a test dataset.
First, suppose we assume a linear relationship between the target and its features, as shown in the figure below. The lines fitted to the two datasets are very similar, so the model is consistent: the difference in predictions is very low. But a line may not be a correct representation of the points, and that adds an error, which is the bias error.
If we instead assume a polynomial curve, the difference in the predictions is huge, as shown in the figure below. The predictions may not even hold on a test dataset, and a situation like this gives rise to the variance error.
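The two-dataset experiment can be sketched in a few lines of plain Python. The data here is made up for illustration (a gentle curve plus noise on eight shared x positions), and the "polynomial curve" is assumed to be the degree-7 polynomial that passes through every training point; the figures in this post may have used something different. The sketch fits each kind of model to two datasets and reports how far apart the two fitted curves are on average.

```python
import random
import statistics

XS = [i / 7 for i in range(8)]  # eight shared x positions on [0, 1]

def make_targets(rng, noise_sd=0.1):
    # Hypothetical data: a gentle curve plus noise, for illustration only.
    return [x * x + rng.gauss(0.0, noise_sd) for x in XS]

def fit_line(ys):
    # Ordinary least squares for a straight line.
    x_bar, y_bar = statistics.fmean(XS), statistics.fmean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(XS, ys))
             / sum((x - x_bar) ** 2 for x in XS))
    return lambda x: y_bar + slope * (x - x_bar)

def fit_polynomial(ys):
    # Degree-7 polynomial through all eight points (Lagrange form).
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(XS, ys)):
            weight = 1.0
            for j, xj in enumerate(XS):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict

def prediction_gap(fit, seed):
    # Train the same kind of model on two datasets and measure how far
    # apart the two fitted curves are, averaged over a grid of query points.
    rng = random.Random(seed)
    model_a, model_b = fit(make_targets(rng)), fit(make_targets(rng))
    grid = [k / 50 for k in range(51)]
    return statistics.fmean(abs(model_a(x) - model_b(x)) for x in grid)

print("line gap:      ", prediction_gap(fit_line, seed=1))
print("polynomial gap:", prediction_gap(fit_polynomial, seed=1))
```

Because both calls use the same seed, the line and the polynomial are trained on the same pair of datasets, so the two gaps are directly comparable.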
So, the bias-variance trade-off is nothing but trying to find an optimal balance of bias and variance for a model. Typically, when we increase the bias, the variance decreases, and when we increase the variance, the bias decreases.
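Here is a minimal sketch of that see-saw, using a shrinkage penalty as the knob. The setup is an assumption chosen for illustration, not something from this post: the "model" is a single number that estimates a true mean of 2.0, and the hypothetical parameter `lam` controls how strongly the estimate is pulled toward zero.

```python
import random
import statistics

TRUE_MEAN = 2.0  # hypothetical true value the model should predict

def shrunken_mean(ys, lam):
    # Minimizes sum((y - m)**2) + lam * len(ys) * m**2 over m.
    # lam = 0 is the plain average; larger lam pulls the estimate toward 0.
    return statistics.fmean(ys) / (1.0 + lam)

def bias_and_variance(lam, datasets):
    # Fit on every dataset, then summarize the spread of the estimates.
    estimates = [shrunken_mean(ys, lam) for ys in datasets]
    bias = statistics.fmean(estimates) - TRUE_MEAN
    return bias, statistics.pvariance(estimates)

rng = random.Random(0)
datasets = [[rng.gauss(TRUE_MEAN, 1.0) for _ in range(10)] for _ in range(1000)]

results = {lam: bias_and_variance(lam, datasets) for lam in (0.0, 0.5, 2.0)}
for lam, (bias, variance) in results.items():
    print(f"lam={lam}: bias={bias:+.3f}  variance={variance:.4f}")
```

As `lam` grows, the size of the bias grows while the variance shrinks, so neither extreme is ideal and the useful setting lies somewhere in between.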
So, How do we get an optimal bias and variance?
How we get an optimal bias and variance depends on the training algorithm. These are some of the ways to do it:
Reducing the dimensionality of the data - this means removing some features that add to the variance
Regularization in linear regression and artificial neural networks helps reduce variance.
Using mixture models and ensemble methods is a well-proven way to balance bias and variance.
In the K-Nearest Neighbor algorithm, choosing the value of K carefully can balance bias and variance: a small K gives low bias and high variance, while a large K gives high bias and low variance.
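As one example from the list above, the choice of K in K-Nearest Neighbor can be sketched as follows. This is a hedged illustration in plain Python, not a production implementation: the data (a noisy sine wave), the candidate values of K, and the helper names `make_data`, `knn_predict` and `validation_mse` are all assumptions. The idea is to try several values of K and keep the one with the lowest error on a held-out validation set.

```python
import math
import random

def make_data(rng, n=200, noise_sd=0.3):
    # Hypothetical 1-D regression data: a sine wave plus Gaussian noise.
    xs = [rng.random() for _ in range(n)]
    return [(x, math.sin(2 * math.pi * x) + rng.gauss(0.0, noise_sd)) for x in xs]

def knn_predict(train, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(train, key=lambda point: abs(point[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def validation_mse(train, valid, k):
    # Mean squared error on data the model did not see during training.
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in valid) / len(valid)

rng = random.Random(0)
train, valid = make_data(rng), make_data(rng)

# K = 1 follows the noise (high variance); K = 200 uses every training
# point and so predicts the overall average everywhere (high bias).
errors = {k: validation_mse(train, valid, k) for k in (1, 5, 15, 50, 200)}
best_k = min(errors, key=errors.get)
for k, mse in errors.items():
    print(f"K={k:>3}: validation MSE={mse:.3f}")
print("best K:", best_k)
```

In practice a library implementation with cross-validation would usually replace this loop, but the principle is the same: the validation error is what tells us where the balance between bias and variance lies.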
To conclude, the bias-variance trade-off is one of the most important concepts to keep in mind while choosing a training algorithm for any given problem.