Linear Regression:
Linear Regression establishes a relationship between a dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line).
What is Polynomial Regression?
* Polynomial Regression is a form of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an nth-degree polynomial in x. It fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x); the model is still called linear because it is linear in the coefficients being estimated.
* Polynomial Regression models are usually fit with the method of least squares.
* It is a special case of Linear Regression in which we fit a polynomial equation to data that has a curvilinear relationship between the dependent and independent variables.
Suppose we know that our data is correlated, but the relationship doesn’t look linear. Depending on what the data looks like, we can then run a polynomial regression to fit a polynomial equation to it.
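To make the "linear in the coefficients" idea concrete, here is a minimal sketch using NumPy. The data and the true equation y = 1 + 2x + 3x^2 are made up for illustration. The powers of x are treated as separate input columns, and ordinary least squares recovers the coefficients.

```python
import numpy as np

# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
x = np.linspace(-3, 3, 25)
y = 1 + 2 * x + 3 * x**2

# Design matrix with columns [1, x, x^2]: the model is a polynomial in x
# but still linear in the coefficients, so ordinary least squares applies.
X = np.vander(x, N=3, increasing=True)
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

# np.polyfit solves the same least-squares problem (highest power first).
polyfit_coeffs = np.polyfit(x, y, deg=2)

print(coeffs)          # intercept, x and x^2 coefficients
print(polyfit_coeffs)  # the same values in reverse order
```

Both calls solve the same least-squares problem, so they should agree up to floating-point error.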
Why do we need Polynomial Regression?
* We fit a Linear Regression model and notice that it performs badly.
* When we plot the best-fit line against the actual values, we see that the actual values follow a curve.
* That’s where Polynomial Regression comes into play: it predicts a best-fit curve that follows the pattern of the data, as shown in the picture below.
* Polynomial Regression is generally used when the pattern of the data points is not captured by the Linear Regression model.
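The points above can be checked numerically with a small sketch. The curved data here is hypothetical (a quadratic trend plus random noise), and R^2 is used as one possible measure of fit. A straight line and a second-degree polynomial are fitted to the same data and compared.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical curved data: a quadratic trend plus a little noise.
x = np.linspace(-3, 3, 50)
y = 0.5 * x**2 + x + 2 + rng.normal(scale=0.5, size=x.size)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - RSS / TSS."""
    rss = np.sum((y_true - y_pred) ** 2)
    tss = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - rss / tss

line = np.polyfit(x, y, deg=1)   # straight line
curve = np.polyfit(x, y, deg=2)  # second-degree polynomial

r2_line = r_squared(y, np.polyval(line, x))
r2_curve = r_squared(y, np.polyval(curve, x))
print(f"R^2 straight line: {r2_line:.3f}")
print(f"R^2 quadratic:     {r2_curve:.3f}")
```

On data like this, the quadratic fit should explain noticeably more of the variation than the straight line.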
As we increase the degree of the model, it tends to fit the training data more closely. However, a degree that is too high increases the risk of over-fitting, while a degree that is too low leaves the model under-fitting the data.
Linear Regression is basically a first-degree polynomial. I hope the image below makes it clear.
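A rough way to see this trade-off is to hold out part of the data and compare errors as the degree grows. The sketch below assumes made-up data (a cubic trend plus noise) and a simple every-other-point split. Neither comes from a real dataset.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: a cubic trend plus noise, split into two halves.
x = np.linspace(-1, 1, 40)
y = x**3 - 0.5 * x + rng.normal(scale=0.1, size=x.size)
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the points (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

train_err, test_err = {}, {}
for degree in range(1, 11):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_err[degree] = mse(coeffs, x_train, y_train)
    test_err[degree] = mse(coeffs, x_test, y_test)
    print(f"degree {degree:2d}  train MSE {train_err[degree]:.4f}  "
          f"test MSE {test_err[degree]:.4f}")
```

The training error can only go down as the degree rises, because each higher-degree model contains the lower-degree ones. The held-out error typically stops improving once the degree passes what the data supports, which is the usual sign of over-fitting.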
How to find the right degree of the equation?
To find the right degree for the model and avoid over-fitting or under-fitting, we can use:
1. Forward Selection: Start from a low degree and keep increasing it for as long as each added term is statistically significant.
2. Backward Selection: Start from a high degree and keep decreasing it until the highest-order term that remains is statistically significant.
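Forward selection could be sketched as follows. This is one possible interpretation, not a standard library routine. The function names, the partial F-statistic as the significance measure, and the cutoff `f_threshold=4.0` are all assumptions made for illustration. Backward selection would run the same comparison starting from `max_degree` and stepping down.

```python
import numpy as np

def rss(x, y, degree):
    """Residual sum of squares of a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, deg=degree)
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

def forward_select_degree(x, y, max_degree=8, f_threshold=4.0):
    """Raise the degree while the added term has a large partial F-statistic.

    f_threshold is a hypothetical cutoff (not from the original text); a
    proper test would compare against an F(1, dof) critical value.
    """
    n = len(x)
    degree = 1
    while degree < max_degree:
        rss_small = rss(x, y, degree)
        rss_big = rss(x, y, degree + 1)
        dof = n - (degree + 2)  # residual degrees of freedom, larger model
        f_stat = (rss_small - rss_big) / (rss_big / dof)
        if f_stat < f_threshold:
            break  # the extra term is not significant; keep current degree
        degree += 1
    return degree

# Hypothetical data: a quadratic trend plus noise.
rng = np.random.default_rng(1)
x = np.linspace(-2, 2, 60)
y = 1 - x + 0.8 * x**2 + rng.normal(scale=0.3, size=x.size)
print(forward_select_degree(x, y))
```

A fixed cutoff keeps the sketch dependency-free. With SciPy available, the threshold could instead be taken from the F distribution at a chosen significance level.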
Advantages of using Polynomial Regression:
* A polynomial can provide a good approximation of the relationship between the dependent and independent variables.
* A broad range of functions can be fit with it, so it can capture a wide range of curvature.
Disadvantages of using Polynomial Regression:
* Polynomial models are very sensitive to outliers: the presence of one or two outliers in the data can seriously affect the results of the analysis.
* In addition, there are unfortunately fewer model validation tools for detecting outliers in nonlinear regression than there are for linear regression.
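The sensitivity to outliers can be illustrated with a small experiment. The data is made up: nine points lying exactly on y = x^2, with one of them then corrupted. The same fourth-degree polynomial is fitted to both versions and the predictions are compared at a nearby point.

```python
import numpy as np

# Hypothetical clean data lying exactly on y = x^2.
x = np.linspace(-2, 2, 9)
y_clean = x**2

# Corrupt a single observation (the last point) with a large error.
y_outlier = y_clean.copy()
y_outlier[-1] += 20.0

fit_clean = np.polyfit(x, y_clean, deg=4)
fit_outlier = np.polyfit(x, y_outlier, deg=4)

# Compare predictions at a point close to the corrupted observation.
x_new = 1.75
pred_clean = np.polyval(fit_clean, x_new)
pred_outlier = np.polyval(fit_outlier, x_new)
print(f"prediction without outlier: {pred_clean:.2f}")
print(f"prediction with outlier:    {pred_outlier:.2f}")
```

A single bad point is enough to pull the fitted curve well away from the true relationship in its neighbourhood, and the effect tends to grow with the degree of the polynomial.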
I hope this blog makes clear why we opt for Polynomial Regression over Simple Linear Regression in Machine Learning.