The meteoric rise of malware has put us all at risk. We are engaged in a never-ending race with cybercriminals to protect systems, plug gaps, and eradicate vulnerabilities before attackers can exploit them. The front line grows by the day as we share more data and connect more devices through the Internet of Things.
Keeping up with the fast pace of new malicious threats is a real challenge. If it takes longer to scan for malware than it takes the malware to gain a foothold or exfiltrate data, then we are stuck in a detect-and-remediate pattern. Prevention would be preferable. One possible path to accurate prediction and real-time prevention is the development of machine learning algorithms.
Threats evolve quickly
Whichever source you consult for malware statistics, the numbers are frightening. The AV-Test Institute registers more than 250,000 new malicious programs every day. A new malware specimen emerged every 4.6 seconds in 2016, according to G Data, and that interval shrank to 4.2 seconds in the first quarter of 2017. As many as 72 million unique URLs were documented as malicious in the last quarter alone.
We’re in the midst of a serious malware epidemic and we need new weapons in the fight against cybercriminals. The traditional approach of blacklisting URLs either requires frequent updates to be pushed out to users or requires their security systems to query cloud-based blacklists. Both options can hurt performance, and both leave a delay between the discovery of a malicious URL and protection against it.
It’s possible to ramp up security by blocking entire domains rather than individual URLs, but that leads to false positives. We could enjoy complete protection, but if it means blocking access to everything, it’s obviously not a viable solution. So, what’s the answer?
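To make the trade-off concrete, here is a minimal Python sketch of the two blacklisting styles. The blocklist entries, the URLs, and the function names blocked_by_url and blocked_by_domain are all hypothetical and chosen for illustration; a real product would work differently in the details.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entries, for illustration only.
BLOCKED_URLS = {"http://files.example.com/invoice.exe"}
BLOCKED_DOMAINS = {"example.com"}


def blocked_by_url(url: str) -> bool:
    """Exact match: precise, but misses any URL not yet on the list."""
    return url in BLOCKED_URLS


def blocked_by_domain(url: str) -> bool:
    """Domain match: blocks a listed domain and every one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


if __name__ == "__main__":
    for url in (
        "http://files.example.com/invoice.exe",   # listed threat
        "http://files.example.com/invoice2.exe",  # new variant, not yet listed
        "http://docs.example.com/manual.pdf",     # harmless page, same domain
    ):
        print(url, blocked_by_url(url), blocked_by_domain(url))
```

The exact-match list lets the new variant through until the list is updated, while the domain-wide rule catches the variant but also blocks the harmless page: a false positive.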
Machine learning models
There has been plenty of buzz about the potential of machine learning for countless industries, including cybersecurity, but there isn’t much clarity on precisely what it can do. One popular approach, deep learning, loosely emulates the human brain with an artificial neural network that’s able to ingest huge data sets. These models learn from labeled examples supplied by humans, and through trial and error, until they can accurately recognize suspicious URLs or probable malware.
As the machine learning model improves, the hope is that it can reach the point where it correctly predicts what is malware and automatically blocks access to it. There’s a great deal of work in crafting a predictive model like this. The data must be properly prepared, the model must be designed, and then you must train and validate it, before evaluating its effectiveness.
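The steps above (prepare the data, design the model, train, validate, evaluate) can be sketched in a few dozen lines of plain Python. This is only a toy under loud assumptions: the URLs and labels are invented, the four hand-picked features are hypothetical, and the model is a simple logistic regression rather than the neural networks a vendor would deploy on millions of samples.

```python
import math
from urllib.parse import urlsplit

# Hypothetical, hand-labelled toy data (1 = malicious, 0 = benign).
TRAIN = [
    ("http://192.168.10.5/update-flash-player.exe", 1),
    ("http://secure-login-paypa1-account.example/verify", 1),
    ("http://free-prizes-4u-2017.example/claim.php?id=99812", 1),
    ("https://www.example.org/about", 0),
    ("https://news.example.com/world", 0),
    ("https://docs.example.net/guide/install", 0),
]
VALIDATION = [
    ("http://10.0.0.7/invoice-0042.exe", 1),
    ("https://blog.example.org/archive", 0),
]


def features(url):
    """Step 1, data preparation: turn a URL into a small numeric vector."""
    host = urlsplit(url).hostname or ""
    is_ip = all(part.isdigit() for part in host.split("."))
    return [
        1.0,                                          # bias term
        min(len(url), 100) / 100,                     # scaled URL length
        min(sum(c.isdigit() for c in url), 10) / 10,  # scaled digit count
        min(host.count("-"), 5) / 5,                  # scaled hyphens in host
        1.0 if is_ip else 0.0,                        # host is a raw IP address
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, url):
    """Step 2, model design: logistic regression giving P(malicious | url)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features(url))))


def log_loss(weights, data):
    """Average log loss of the model over a labelled data set."""
    eps = 1e-12
    total = 0.0
    for url, y in data:
        p = min(max(predict(weights, url), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train(data, epochs=300, lr=0.5):
    """Step 3, training: full-batch gradient descent on the log loss."""
    weights = [0.0] * 5
    for _ in range(epochs):
        grad = [0.0] * 5
        for url, y in data:
            x = features(url)
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi / len(data)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def accuracy(weights, data, threshold=0.5):
    """Steps 4 and 5, validation and evaluation on held-out URLs."""
    hits = sum((predict(weights, url) >= threshold) == bool(y) for url, y in data)
    return hits / len(data)


if __name__ == "__main__":
    w = train(TRAIN)
    print("training loss:", round(log_loss(w, TRAIN), 3))
    print("validation accuracy:", accuracy(w, VALIDATION))
```

Keeping the validation URLs out of training is the point of the exercise: a model that only scores well on data it has already seen tells you nothing about the threats that arrive tomorrow.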
For a deep dive into deep learning, read up on how Sophos has been developing just such a threat detection model through machine learning.
Challenges for machine learning
There are many prerequisites for an effective machine learning model. You obviously want to strike a good balance between high detection rates and minimal false positives. The model needs to be fed a stream of relevant, quality data on a vast scale, but it also has to be lean enough to make real-time decisions and be truly proactive.
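The balance between detection rate and false positives usually comes down to where you set the decision threshold on the model's score. The short Python sketch below illustrates this with invented scores and labels; the function name rates is hypothetical, and it assumes the data contains at least one malicious and one benign sample.

```python
def rates(scores, labels, threshold):
    """Return (detection rate, false positive rate) at a given threshold.

    Assumes labels holds at least one 1 (malicious) and one 0 (benign).
    """
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


# Hypothetical model scores and true labels (1 = malicious, 0 = benign).
SCORES = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
LABELS = [1, 1, 0, 1, 0, 0]

if __name__ == "__main__":
    for threshold in (0.35, 0.5, 0.7):
        detection, false_pos = rates(SCORES, LABELS, threshold)
        print(f"threshold={threshold}: detection={detection:.2f}, "
              f"false positives={false_pos:.2f}")
```

On this toy data, the lowest threshold catches every malicious sample but flags one benign sample in three, while the highest threshold flags no benign samples but misses one threat in three.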
The potential rewards are so great that the whole industry is moving towards tackling these challenges. Security systems that can leverage big data to consistently and accurately predict and shut out threats in a sea of shifting variables could dramatically reduce the impact of cybercrime. No wonder that ABI Research predicts that machine learning in cybersecurity will boost big data, intelligence, and analytics spending to $96 billion by 2021.
On the horizon
As exciting as developments like machine learning and genetic algorithms are, it’s important to remember that we’re talking about supplementary technologies here. We still need security strategies, proper training for staff, stringent security testing, and a host of tools to protect our networks and data. The guiding hand of cybersecurity experts is essential for these models to continue improving.
These technologies won’t replace humans; they’ll simply empower us with more accurate information. It may be years before machine learning leads to highly accurate autonomous systems, but there’s no doubt that these models will prove a valuable ally as we strive for greater security in cyberspace.