Machine Learning with Linear Regression: Basic Concepts

The first thing that comes to mind on hearing "linear regression" is statistics! But let's get some basic knowledge about using linear regression in machine learning.

Machine Learning (ML)


Machine learning is the task of making computers more intelligent without explicitly teaching them how to behave. It does so by identifying patterns and characteristics in data.

In classic terms, machine learning is a type of artificial intelligence that enables self-learning from data and then applies that learning without the need for human intervention.

Broadly, ML is a subset of computer science which involves applying statistics over observed data to generate some process that can achieve some task. This encompasses both the structure of ML (taking data and learning from it using statistics) and the impact of ML (use cases like facial recognition and recommender systems).


Linear Regression

Linear regression is a supervised machine learning algorithm whose predicted output is continuous and changes at a constant rate (a constant slope) with the input. It is used to predict values within a continuous range (e.g. sales, price) rather than to classify them into categories (e.g. cat, dog). Now let's focus on the simplest form of linear regression: simple linear regression.

Simple Linear Regression

Simple linear regression is a regression technique that models a linear relationship between a single independent variable and a dependent variable. The straight line in the diagram is the best fit line. The main goal of simple linear regression is to take the given data points and find the line that fits them as closely as possible. The model therefore provides a sloped straight line representing the relationship between the variables.


Mathematically, the formula for simple linear regression is:

y = B0 + B1x + e

Here,

y is the predicted value of the dependent variable (y) for any given value of the independent variable (x).

B0 is the intercept, the predicted value of y when x is 0.

B1 is the regression coefficient: how much we expect y to change when x increases by one unit.

x is the independent variable (the variable we expect is influencing y).

e is the error term: the part of y that the line does not explain, that is, the difference between the observed value of y and the value B0 + B1x given by the line.

Linear regression finds the line of best fit through your data by searching for the intercept (B0) and regression coefficient (B1) that minimize the total error of the model, most commonly measured as the sum of the squared errors (e) over all data points.
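As a minimal sketch of how such a search can be carried out, assuming the total error is the sum of squared errors (ordinary least squares), B0 and B1 have a closed-form solution. The function name fit_simple_linear_regression and the sample points below are hypothetical and only serve as an illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = sxy / sxx
    # Intercept: the least-squares line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical points that lie exactly on the line y = 1 + 2x.
b0, b1 = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```

Because the sample points sit exactly on a line, the recovered intercept and slope match that line; with noisy data the same formulas give the line with the smallest sum of squared errors.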

A straight line showing the relationship between the dependent and independent variables is called a regression line. A regression line can show two types of relationship:

Positive Linear Relationship:

If the dependent variable (on the Y-axis) increases as the independent variable (on the X-axis) increases, the relationship is termed a positive linear relationship.

In the figure above, the first diagram shows a positive linear relationship.

Negative Linear Relationship:

If the dependent variable (on the Y-axis) decreases as the independent variable (on the X-axis) increases, the relationship is called a negative linear relationship.

In the figure above, the second diagram shows a negative linear relationship.
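In terms of the formula, the type of relationship is determined by the sign of the regression coefficient B1. A small sketch of that rule follows; the function name relationship_type and the example slopes are hypothetical.

```python
def relationship_type(b1):
    """Classify a fitted slope by its sign."""
    if b1 > 0:
        return "positive linear relationship"
    if b1 < 0:
        return "negative linear relationship"
    return "no linear relationship (flat line)"


# Hypothetical slopes, chosen only to exercise each branch.
print(relationship_type(2.0))
print(relationship_type(-2.5))
print(relationship_type(0.0))
```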


When finding a solution, the aim is to minimize the error between the predicted values and the actual values. The best fit line is the one with the least error.
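To make "least error" concrete, here is a short sketch that scores two candidate lines against the same data. It assumes the error is measured as the sum of squared differences between actual and predicted values, which is one common choice; the helper name sum_squared_error and the observations are hypothetical.

```python
def sum_squared_error(b0, b1, xs, ys):
    """Total squared gap between the actual ys and the line's predictions."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical observations.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 10]

error_a = sum_squared_error(1.0, 2.0, xs, ys)  # candidate line y = 1 + 2x
error_b = sum_squared_error(0.0, 3.0, xs, ys)  # candidate line y = 3x
print(error_a, error_b)
```

The candidate with the smaller total is the better fit for this data; the best fit line is the candidate for which no other choice of B0 and B1 gives a smaller total.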

For a better understanding, let's take an example:

Consider predicting the salary of an employee based on his or her age. We can easily see that there seems to be a correlation between an employee’s age and salary (the higher the age, the higher the salary).


In the linear regression hypothesis, Y represents salary and X is the employee’s age. So, to predict Y (salary) given X (age), we can draw a scatter plot and a linear regression line using http://endmemo.com/statistics/lr.php:


So, the higher the age, the higher the salary: the two are positively correlated. This correlation is displayed by a blue line in the graph, which is called the best fit line because it best captures the relationship among the scattered points.

The intercept we got is -62297.794118 and the slope is 4209.558824. We can conclude that the average salary increases by about ₹4209.56 for each additional year of age. So, the best fit line is y = 4209.558824x - 62297.794118.
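Once the intercept and slope are known, predicting a salary is a single evaluation of the line. The sketch below plugs in the two values reported above; the function name predict_salary and the ages tried are hypothetical choices for illustration.

```python
INTERCEPT = -62297.794118  # B0 reported in the article
SLOPE = 4209.558824        # B1 reported in the article


def predict_salary(age):
    """Predicted salary in rupees for a given age, using the fitted line."""
    return INTERCEPT + SLOPE * age


# Prediction for one age, and the change for one additional year of age.
print(round(predict_salary(30), 2))
print(round(predict_salary(31) - predict_salary(30), 2))
```

Note that the line is only meaningful within the range of ages that were actually observed: for ages below roughly 15 this particular line predicts a negative salary.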

I hope this helps you get a clear understanding of what machine learning and linear regression are from a beginner's point of view.





© Numpy Ninja.