Machine learning is a core building block of data science and artificial intelligence. Mathematics and statistics are the backbone of machine learning algorithms, which are used to discover correlations, anomalies, and patterns in data that is too large or complex to analyze by hand.
When we talk about security, spam is often the first thing that comes to mind. The internet hooked computers together into an effective and valuable communication network, and this medium, with its broad distribution and free transmission, turned out to be perfectly suited to stealing account credentials and spreading computer viruses, malware, and the like.
Despite enormous progress in security domains such as intrusion detection, malware analysis, web application security, network security, and cryptography, spam remains a major threat in the email and messaging space, one that directly affects the general public.
Technologists saw huge potential in machine learning for dealing with this constantly evolving problem. Email providers and internet service providers (ISPs) have access to email data, so user behavior, email content, and its metadata can be used to build content-based models that recognize spam. The metadata can be extracted and analyzed to predict the likelihood that an email is spam. Some of the best modern email filters can filter out 99.9% of spam.
Indeed, the spam-fighting story has helped researchers appreciate the importance of data, and of using the available data together with machine learning to detect and defeat malicious adversaries.
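To make the idea of a content-based model concrete, here is a minimal sketch of a naive Bayes spam scorer written with only the Python standard library. Naive Bayes is just one common choice for this kind of filter, not necessarily what any particular provider uses. The training messages, the function names, and the whitespace tokenizer are all made up for illustration.

```python
import math
from collections import Counter

# Tiny hand-made training set (illustrative only, not real data).
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("cheap pills free shipping", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch on monday with the team", "ham"),
]

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def train(examples):
    """Count word frequencies per class and the number of messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def spam_probability(text, model):
    """Return P(spam | text) under naive Bayes with add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in ("spam", "ham"):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = score
    # Turn the two log scores into a normalized probability.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

model = train(TRAIN)
print(spam_probability("claim your free prize", model))
print(spam_probability("project meeting on monday", model))
```

The first message should score well above 0.5 and the second well below it. A production filter would train on millions of messages and combine content with metadata and behavioral signals, but the principle is the same.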
Adversaries & Machine Learning
That said, adversaries can also take advantage of machine learning to avoid detection and evade defenses. Attackers can learn about the nature of the defenses just as defenders learn from the attacks. Spammers are known to use polymorphism, that is, changing the appearance of a message without changing its content, to avoid detection.
Adversaries can also use machine learning to learn our interests and personal details from our social media profiles and use that information to craft a personalized phishing message. There is also a growing field called adversarial machine learning, which studies how attackers can cause algorithms to make erroneous predictions or learn the wrong things in order to carry out their attacks.
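A small sketch shows why polymorphism works against simple defenses. The keyword blocklist, the look-alike character map, and the function names below are hypothetical, chosen only to show how a message can change its appearance while a human still reads the same content, and how normalizing the text before matching recovers some of the lost ground.

```python
import re

# Hypothetical keyword blocklist for illustration.
BLOCKLIST = {"free", "prize"}

# Hypothetical map of common look-alike character substitutions.
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "@": "a", "$": "s"}
)

def naive_filter(text):
    """Flag a message if any alphabetic word matches the blocklist exactly."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in BLOCKLIST for word in words)

def normalize(text):
    """Undo simple obfuscation: look-alike characters and inserted separators."""
    text = text.lower().translate(LOOKALIKES)
    # Remove separators placed between letters, e.g. "f.r.e.e" -> "free".
    # This is crude: it would also join "end.start" if a space is missing.
    return re.sub(r"(?<=[a-z])[.\-_*](?=[a-z])", "", text)

def robust_filter(text):
    """Apply the same blocklist, but only after normalizing the text."""
    return naive_filter(normalize(text))

for message in ["Claim your free prize", "Cl4im your fr3e pr1ze", "f.r.e.e p-r-i-z-e"]:
    print(message, naive_filter(message), robust_filter(message))
```

The naive filter catches only the first message, while the normalizing filter catches all three. Of course, a spammer who knows about the normalization step can invent a new disguise, which is exactly the arms race described here.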
Machine Learning use cases in Security
Machine learning use cases in security fall into two broad categories:
Pattern recognition: Here we try to discover characteristics hidden in the data. These characteristics, distilled into feature sets, can be used to teach an ML algorithm to recognize other forms of the data that exhibit the same set of characteristics.
Examples of pattern recognition are spam detection, malware detection, and botnet detection.
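As a rough illustration of pattern recognition with feature sets, the sketch below turns each message into a small numeric feature vector and classifies new messages by the nearest class centroid. The three features, the example messages, and the nearest-centroid rule are all assumptions picked for brevity; real systems use far richer features and stronger learners.

```python
import re

def extract_features(text):
    """Map a message to a numeric feature vector (hypothetical features)."""
    words = text.split()
    n_words = max(len(words), 1)
    return [
        len(re.findall(r"https?://", text)),         # number of links
        sum(w.isupper() for w in words) / n_words,   # share of ALL-CAPS words
        text.count("!"),                             # number of exclamation marks
    ]

def centroid(vectors):
    """Average a list of equal-length vectors component by component."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def train(examples):
    """Compute one centroid per label from labeled example messages."""
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append(extract_features(text))
    return {label: centroid(vectors) for label, vectors in by_label.items()}

def classify(text, centroids):
    """Return the label whose centroid is closest to the message's features."""
    x = extract_features(text)

    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Made-up training messages; example.test is a placeholder domain.
EXAMPLES = [
    ("WIN BIG NOW!!! http://example.test/prize", "spam"),
    ("FREE offer!!! click http://example.test/a http://example.test/b", "spam"),
    ("Can we move the meeting to Tuesday?", "ham"),
    ("Here is the report: https://example.test/report", "ham"),
]

centroids = train(EXAMPLES)
print(classify("URGENT!!! verify your account http://example.test/login", centroids))
print(classify("See you at lunch tomorrow", centroids))
```

The point is that the classifier never sees the words themselves. It learns from characteristics of the message, so an unseen message with the same characteristics lands in the same class.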
Anomaly detection: Here the goal is to establish a notion of normality that describes most of a given dataset (say, 95%). Rather than learning specific patterns in the data, we model what normal looks like. Once normality is determined, any deviation from it is detected as an anomaly.
Examples of anomaly detection are network outlier detection, malicious URL detection, user authentication, access control, and behavior analysis.
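One simple way to picture "a notion of normality that covers 95% of the data" is a percentile band. The sketch below is only an assumption about how this could be done: it fits the 2.5th and 97.5th percentiles of a synthetic requests-per-minute baseline and flags anything outside that band. The data is randomly generated, and the names are invented for the example.

```python
import random
import statistics

# Synthetic "requests per minute" baseline; a real system would use observed traffic.
rng = random.Random(0)
baseline = [rng.gauss(200, 30) for _ in range(1000)]

def fit_normal_band(values):
    """Return the (2.5th, 97.5th) percentiles, a band holding about 95% of values."""
    cuts = statistics.quantiles(values, n=40)  # 39 cut points, 2.5% apart
    return cuts[0], cuts[-1]

def is_anomaly(value, band):
    """A value is anomalous if it falls outside the fitted band."""
    low, high = band
    return value < low or value > high

band = fit_normal_band(baseline)
inside = sum(not is_anomaly(v, band) for v in baseline) / len(baseline)

print("normal band:", band)
print("share of baseline inside band:", inside)
print("200 requests/min anomalous?", is_anomaly(200, band))
print("1000 requests/min anomalous?", is_anomaly(1000, band))
```

Note that nothing here describes what an attack looks like. The model only knows what normal traffic looks like, so by construction about 5% of ordinary traffic is flagged too, and choosing that threshold is a trade-off between missed attacks and false alarms.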
Today, almost every piece of technology used by organizations has security vulnerabilities. Driven by some core motivations, malicious actors can pose a security risk to almost all aspects of modern life. A motivated adversary is constantly trying to attack a system, and each side races to fix or exploit the flaws in design and technique before the other uncovers them.
Machine learning algorithms are often not designed with security in mind, so they are vulnerable to attacks by a motivated adversary. Hence, it is very important to understand the threat models when designing a machine learning system for security purposes.
Thanks for Reading!
References: Machine Learning & Security by Clarence Chio & David Freeman