Updated: Dec 24, 2020
What is Machine Learning?
Machine learning is a branch of artificial intelligence (AI) that gives systems the ability to learn automatically from data and improve their accuracy over time without being explicitly programmed.
Applications of Machine Learning
Machine learning is now a hot topic. It is so pervasive that you probably use it dozens of times a day without knowing it. It is transforming many sectors, including education, entertainment, the food industry, transport, social networking platforms and many more. A few examples are listed below.
Facial recognition: Automatic friend-tagging suggestions on Facebook are one of the most common applications of Machine Learning (ML).
Maps: Traffic alerts. Everyone using Google Maps automatically shares their location, average speed and route. This helps Google collect massive amounts of traffic data, which it uses to predict upcoming traffic and adjust your route accordingly.
Self-Driving Cars: This is one of the coolest applications of Machine Learning. I’m sure you guys might have heard of Tesla cars!
Chatbots: Chatbots can converse with us through both text and voice. You may, for instance, have interacted with Amazon's voice assistant.
Fraud Detection: Fraudulent activity has increased with the growing use of credit cards, digital wallets and online payments. Whenever a customer makes a transaction, the ML model examines their profile for suspicious patterns.
Virtual Reality Headset: When we turn our head, the picture moves too! That is an ML algorithm monitoring our movements and linking them to the video.
Healthcare: Machine learning is used in healthcare for faster patient diagnosis. ML algorithms can predict health problems based on factors such as age and genetic history, and they are also used to detect cancer. It helps save lives on a daily basis.
Recommender System: Facebook ads know you better than you know yourself. Amazon and Netflix also use Machine Learning for their recommender systems. The hidden patterns buried in the data can be very useful for business. Product recommendations are based on your browsing history, past purchases, items liked or added to cart, brand preferences and so on. The shopping website or app then recommends a few items that match your taste.
Let’s move on to Machine Learning methods to understand this better.
Machine Learning Methods
We generate an enormous amount of data every day, and the world is turning to data to make decisions. This vast amount of data is of no use unless we tag and analyze it. Businesses increasingly depend on data for every decision.
In Machine Learning, we collect data, clean it, choose an algorithm, train the algorithm on the essential patterns in the data and then expect it to give us the desired result. In general, the more data it processes, the better the algorithm gets.
These are the main Machine Learning methods we follow.
In Supervised learning, we train the machine/model on a labeled dataset. This means the data is already tagged: it consists of a set of observations, each paired with its result/target.
In a Supervised learning algorithm, the labeled data is split into a Training dataset and a Test dataset. The Training dataset is used to train the model, and the Test dataset acts as new, unseen data for checking how well the model predicts outcomes. This cycle of training and testing is repeated until the algorithm gives an acceptable outcome.
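To make the split concrete, here is a minimal sketch in plain Python. The function name `train_test_split`, the 25% test fraction and the tiny transaction dataset are all assumptions made up for illustration; real projects usually rely on a library helper for this step.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle labeled rows and split them into (training, test) lists."""
    shuffled = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled data: (transaction amount, is_fraud)
labeled = [(12.0, 0), (950.0, 1), (30.5, 0), (18.2, 0),
           (1200.0, 1), (45.0, 0), (870.0, 1), (22.0, 0)]

train, test = train_test_split(labeled)
print(len(train), "training rows,", len(test), "test rows")
```

The model only ever sees `train` while learning; `test` is held back so that its predictions can be checked against labels it has never seen.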
Supervised learning is further subdivided into:
Classification is a supervised machine learning technique used to predict categorical values. It is the process of dividing a given dataset into different categories. The algorithm is first trained on data whose categories are already known. The final goal is to predict which category new data will fall into.
Fraud detection: This has 2 categories: fraudulent activity present and no fraudulent activity.
Email Spam Detection: Emails reaching your mailbox fall into 2 categories: Spam and Non-Spam.
The classifier uses the training dataset to understand how input features are related to a particular category. Once the classifier is trained well, it can detect fraudulent activity or spam emails (as in our examples).
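As a rough illustration of the spam example, the sketch below trains a toy classifier by counting which words appear in each category. This word-count scoring is a deliberately simple stand-in chosen for readability, not how production spam filters work, and the training sentences and the function names `train` and `predict` are made up.

```python
from collections import Counter

def train(examples):
    """Count how often each word appears in each category."""
    counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Pick the category whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)   # on a tie, the first category wins

# Hypothetical labeled emails (input feature: the text, output label: the category)
training_emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting at noon tomorrow", "not spam"),
    ("lunch with the team tomorrow", "not spam"),
]

model = train(training_emails)
print(predict(model, "claim your free prize"))   # prints: spam
print(predict(model, "team meeting tomorrow"))   # prints: not spam
```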
Regression is a supervised machine learning technique used to predict continuous values. It learns from the labeled dataset and is then capable of predicting continuous values for new data given to the algorithm.
The algorithm is trained with both input features and output labels. Regression helps establish a relationship among the variables by estimating how one variable affects the other.
Difference Between Classification and Regression
Based on the number of input features, Regression is classified as Linear Regression (also called simple linear regression) and Multiple Regression.
Linear Regression (1 input)
Linear Regression attempts to model the relationship between two variables by fitting a linear equation to observed data. The input variable (input feature) is called the Independent variable and the output variable (output label) is called the Dependent variable. Linear Regression attempts to draw a line that comes close to the data by finding the slope and intercept that define the line and minimize the regression errors.
Example: Let’s say I walk into a supermarket after paying a parking fee of $2. Whether I shop or not, I need to pay the $2 parking ticket (constant, a). My shopping list has only 1 item: apples. Each apple is priced at $1.50, so the shopping cost depends on how many apples I pick. Assume I pick at least 1 apple, and let’s plot a graph using an equation.
Equation used is,
Y = a + bX
Y = Shopping Cost (Dependent variable)
a = Parking ticket (Constant)
b = Price of each Apple (Coefficient of Independent variable)
X = No. of Apples
After plotting all the values of shopping cost (blue line), we see a straight line. But in real life, things are not this simple: data points rarely fall exactly on a line. When this equation is used for similar real-life examples, Linear Regression tries to find the straight line that best fits the data.
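The apple example can be checked with a few lines of Python. This is a minimal sketch that assumes ordinary least squares as the way of finding the best-fitting line; the function name `fit_line` is made up for illustration. Because the shopping costs are generated from Y = 2 + 1.5X, the fit should recover the parking fee as the intercept and the apple price as the slope.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how much Y changes, on average, for each unit change in X
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    # Intercept: the value of Y when X is zero
    a = mean_y - b * mean_x
    return a, b

apples = [1, 2, 3, 4, 5]                  # X = No. of Apples
cost = [2 + 1.5 * x for x in apples]      # Y = $2 parking + $1.50 per apple

a, b = fit_line(apples, cost)
print(f"a = {a:.2f}, b = {b:.2f}")        # prints: a = 2.00, b = 1.50
```

With noisy real-life data the points would not sit exactly on the line, and the same function would return the slope and intercept that keep the squared errors as small as possible.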
Multiple Regression (Many inputs)
Multiple linear regression is used to estimate the relationship between two or more independent input variables and one dependent output variable. More than 1 input feature affects the output variable.
Equation used is,
Y = a0 + b1X1 + b2X2 + … + bnXn
Y = Dependent variable
a0 = Constant (Intercept)
b1, b2, … bn = Coefficients of the Independent variables
X1, X2, … Xn = Independent variables
Housing Price Prediction: The selling price of a house (dependent variable) can depend on the desirability of the location, the number of bedrooms, the number of bathrooms, the year the house was built, the square footage of the lot and a number of other factors.
Weather Forecast: Weather (dependent variable) in a particular location can depend on air temperature, air pressure, humidity of the air, cloud cover, speed and direction of the wind and precipitation.
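To see how the multiple regression equation turns several inputs into one prediction, here is a small sketch based on the housing example. The intercept and coefficients below are invented numbers, not values learned from real data; in practice the algorithm estimates them from the training dataset.

```python
def predict(intercept, coefficients, features):
    """Evaluate Y = a0 + b1*X1 + b2*X2 + ... + bn*Xn."""
    return intercept + sum(b * x for b, x in zip(coefficients, features))

# Hypothetical model, for illustration only
a0 = 50_000.0                            # base price (intercept)
b = [30_000.0, 10_000.0, 100.0]          # per bedroom, per bathroom, per square foot

house = [3, 2, 1500]                     # X1 = bedrooms, X2 = bathrooms, X3 = square feet
price = predict(a0, b, house)
print(f"{price:,.0f}")                   # prints: 310,000
```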
Difference between Linear Regression and Multiple Regression
Now let’s move on to the next Machine Learning method.
In Unsupervised Learning, we do not need to supervise the model; it is trained with unlabeled data. The main task of unsupervised learning is to find patterns in the data. Here, the human eye would not know what to look for in the unlabeled data. Hence the algorithms are left on their own to discover and present the hidden structures in it.
My daughter is growing up with a pet dog. When she was a baby, my friend brought his dog over. My daughter played very comfortably with this new dog even though she had never seen him before. She recognized that many of his features (2 ears, 2 eyes, 4 legs) were just like our pet dog’s. Unsupervised learning is similar: we are not taught, but we learn from the data.
How is Unsupervised learning helpful?
Manual intervention is not required in Unsupervised learning.
This method helps in finding features which can be useful for categorization.
It finds all kinds of unknown patterns in unlabeled data.
Unsupervised learning is further subdivided into:
Clustering is an unsupervised machine learning technique that finds patterns in unlabeled data, groups data points with similar patterns/traits and assigns them to clusters.
Customer Segmentation: Understanding different customer groups to build marketing strategies.
Recommender System: Grouping together users with similar viewing patterns in order to recommend similar content.
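The customer segmentation idea can be sketched with a tiny k-means routine, one common clustering algorithm. Everything here is an assumption made for illustration: the function name `k_means_1d`, the single "monthly spend" feature, the made-up numbers and the naive choice of the first k values as starting centroids.

```python
def k_means_1d(values, k, iterations=10):
    """Group numbers into k clusters by repeatedly moving each centroid
    to the mean of the values currently assigned to it."""
    centroids = list(values[:k])           # naive start: the first k values
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest centroid
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical monthly spend per customer, with no labels attached
spend = [10, 12, 11, 95, 100, 98]

centroids, clusters = k_means_1d(spend, k=2)
print(clusters)
print([round(c, 1) for c in centroids])
```

Nobody told the algorithm which customers are "low spenders" or "high spenders"; the two groups emerge from the data alone.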
2. Dimensionality Reduction
Dimensionality Reduction is an unsupervised machine learning technique in which the number of input variables in the dataset is reduced. Dimensions are represented as columns in a dataset, and the main goal is to reduce them.
In many datasets, we find columns that are correlated. This redundant information creates a lot of noise in the dataset, which affects the performance of ML models. Dimensionality Reduction is performed on the dataset prior to modeling, which in turn improves performance. The technique also helps with data compression and hence reduces storage space and computation time.
Example: This technique can be explained with image compression. We love capturing pictures, but one day we face this issue: No Space! Image compression helps here. It reduces the size of every image while keeping an acceptable level of quality. This means more pictures can be stored in the same space.
There are two methods of Dimensionality Reduction:
Feature Selection: In this technique, a subset of features is selected from the original dataset.
Feature Extraction: In this technique, new features are derived from the information in the original dataset.
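Here is one possible sketch of Feature Selection: drop any column that is almost perfectly correlated with a column already kept. The 0.95 threshold, the function names and the small housing table are assumptions for illustration, and the sketch assumes no column is constant (a constant column would cause a division by zero).

```python
def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

def drop_correlated(columns, threshold=0.95):
    """Keep a column only if it is not strongly correlated with any kept column."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept

# Hypothetical dataset: the two area columns carry the same information
columns = {
    "area_sqft": [1000, 1500, 2000, 2500],
    "area_sqm":  [92.9, 139.4, 185.8, 232.3],
    "bedrooms":  [3, 2, 4, 3],
}

selected = drop_correlated(columns)
print(list(selected))   # prints: ['area_sqft', 'bedrooms']
```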
Now moving on to the last Machine Learning method.
Reinforcement Learning is learning in an interactive environment by trial and error, using feedback from the agent’s own experiences. A Reinforcement agent performs actions in an environment to gain rewards. Every correct decision is rewarded with positive reinforcement, and every incorrect decision is penalized with negative reinforcement. The agent tries to minimize incorrect decisions and maximize the right ones.
This algorithm does not have a labeled dataset or results associated with the data. Hence the only way to perform a given task is to learn from experience.
Self-driving cars: Reinforcement learning could be applied to a few autonomous driving tasks. In self-driving cars, various aspects such as speed limits at various locations, drivable zones and collision avoidance are considered.
Game AI: A Reinforcement agent is trained to be smart so that it keeps players engaged and makes the game fun.
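The trial-and-error loop can be sketched in a few lines. This is a heavily simplified, hypothetical setup: a single situation (a red light), two actions and a fixed reward for each. The action names, the reward values and the epsilon-greedy strategy (mostly pick the best-known action, occasionally try a random one) are assumptions chosen for illustration, not a description of how real self-driving systems are trained.

```python
import random

ACTIONS = ["go", "stop"]                    # hypothetical actions at a red light
REWARDS = {"go": -1.0, "stop": 1.0}         # hypothetical feedback from the environment

def train_agent(steps=200, epsilon=0.1, learning_rate=0.5, seed=0):
    """Learn the value of each action purely from rewards received."""
    rng = random.Random(seed)
    value = {action: 0.0 for action in ACTIONS}   # the agent starts with no knowledge
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore: try something at random
        else:
            action = max(ACTIONS, key=value.get)  # exploit: use the best-known action
        reward = REWARDS[action]
        # Nudge the estimate for this action toward the reward just received
        value[action] += learning_rate * (reward - value[action])
    return value

value = train_agent()
best = max(value, key=value.get)
print(best)   # prints: stop
```

There is no labeled dataset anywhere in this code. The agent is never told that stopping is correct; it finds out because stopping is the action that keeps getting rewarded.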
I end this article with a few differences between the Machine Learning methods.
I hope this article helped you understand Machine Learning, its applications, the different methods of ML and their basic differences. Thank you for reading. Happy Learning!