Artificial Intelligence & Machine Learning in Cybersecurity
AI and ML, definitions
There is a huge buzz around artificial intelligence (AI) and machine learning (ML), matched only by the lack of clarity about what those terms mean.
Within AI we can find several concepts, like strong AI or true AI, that refer to artificial general intelligence: a hypothetical machine that exhibits behaviour at least as skilful and flexible as that of humans.
But the truth is that no machine currently exists that can operate and learn entirely on its own outside a controlled environment.
Such an AI would have to deal with vast amounts of data and be able to reason, organise and structure knowledge in the way a human does. At this moment, this is mainly science fiction.
There is general consensus, though, that AI is a superset of ML.
As a superset, AI covers more topics than ML, although there are some overlaps, and it implies more than just learning: speech recognition and understanding; perception, creativity and intuition; three-dimensional (3D) understanding and interaction with the environment; reasoning; contextual understanding within a conversation; and object manipulation. Commercial applications in which AI adds something over ML might be self-driving cars, computer vision and natural language processing (NLP).
Machine learning is an AI discipline that gives computers the ability to learn without being explicitly programmed. Basically, a machine learning system finds patterns in data and then predicts the outcome for something it has never seen before.
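To make "finding patterns and predicting the unseen" concrete, here is a minimal sketch of a nearest-centroid classifier. The feature names (failed logins, megabytes sent out) and all the sample values are invented for illustration; a real system would use far more data and a proper ML library.

```python
from math import dist

def train_centroids(samples):
    """Average the feature vectors of each label to get one centroid per label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the new sample."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical training data: (failed_logins, mb_sent_out) -> label
training = [
    ((1, 0), "benign"),
    ((0, 1), "benign"),
    ((2, 1), "benign"),
    ((40, 9), "malicious"),
    ((55, 7), "malicious"),
    ((35, 8), "malicious"),
]

model = train_centroids(training)

# Neither of these samples appears in the training data.
print(predict(model, (48, 6)))
print(predict(model, (3, 2)))
```

Nobody wrote a rule such as "more than 30 failed logins is malicious"; the boundary comes from the examples, which is the sense in which the program was not explicitly programmed.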
Recent advances in managing large datasets (big data), in the storage capacity to keep all that data and in computing power have enabled the development of ML.
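There are many types of ML, the most prevalent being supervised learning, unsupervised learning and reinforcement learning, with deep learning as a widely used family of techniques that can be applied in any of them.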
Most of the current applications of AI in cybersecurity do not go beyond ML.
Applications of ML in cybersecurity: the low-hanging fruit
Machine learning in cybersecurity performs extremely well where we have lots of data, either in the cloud or on the endpoint, working in combination with big data and analytics.
The most suitable applications involve processing massive quantities of data to identify anomalies and suspicious or unusual behaviour, to detect and correct known vulnerabilities, and to detect zero-day attacks.
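As a rough illustration of anomaly detection, the sketch below flags observations that lie far from a learned baseline using a simple z-score. The login counts and the threshold of 3 standard deviations are assumptions chosen for the example, not recommended values, and production systems typically use more robust models.

```python
from statistics import mean, stdev

def find_anomalies(baseline, observations, threshold=3.0):
    """Return observations more than `threshold` standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A flat baseline gives no scale to measure deviation against.
        return []
    return [x for x in observations if abs(x - mu) / sigma > threshold]

# Hypothetical baseline: logins per hour for one account during normal operation.
baseline_logins_per_hour = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15]

# Hypothetical new observations to score against that baseline.
new_observations = [14, 13, 97, 12]

print(find_anomalies(baseline_logins_per_hour, new_observations))  # [97]
```

The same idea scales to many accounts and many signals at once, which is where a machine has the advantage over a human analyst.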
ML might prove very helpful in detecting issues of a higher complexity, faster and more accurately than the human analyst.
In the unfortunate case of an attack, an automated response is critical in order to minimize the impact, conduct forensics and defend effectively.
From a defensive perspective, we need to be able to respond in machine time rather than human time to stop some of the attacks. Defence against intelligent cyber weapons can only be achieved by intelligent software.
Machine learning is increasingly being introduced to fight e-commerce fraudsters. A lot of information is currently available about suspected fraudsters, including their purchase activities and profiles, online browsing activity, social networks and the fake identification they submit to get their orders approved. The real challenge is how to make sense of this unstructured data and then make good approve/decline decisions for thousands of merchants in real time.
Other uses of ML in cybersecurity might address the acute problem of scarce and expensive expertise through resource optimization or increased staff productivity. A substantial reduction in false positive rates would also benefit cybersecurity operations, and ML is very effective in achieving this goal. We need to be cognizant that the widening cybersecurity skills gap is seriously threatening companies; this issue needs to be assessed in terms of cyber risk and properly addressed.
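For reference, the false positive rate is the share of benign events that were wrongly flagged: FP / (FP + TN). The short sketch below computes it; the alert counts are made up purely to show the arithmetic and do not come from any real deployment.

```python
def false_positive_rate(false_positives, true_negatives):
    """Fraction of benign events that were incorrectly flagged as malicious."""
    benign_total = false_positives + true_negatives
    if benign_total == 0:
        raise ValueError("no benign events to evaluate")
    return false_positives / benign_total

# Hypothetical counts over 10,000 benign events, before and after tuning a detector.
before = false_positive_rate(false_positives=500, true_negatives=9500)
after = false_positive_rate(false_positives=50, true_negatives=9950)

print(f"before: {before:.1%}, after: {after:.1%}")  # before: 5.0%, after: 0.5%
```

In this invented scenario, analysts would have 450 fewer alerts to triage for the same traffic, which is how a lower false positive rate translates into staff productivity.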
ML could also improve the accuracy and effectiveness of the response to an attack. This matters because cybersecurity has very low fault tolerance: it only takes one exploited vulnerability to cause a data breach.
Challenges in adopting ML
When adopting ML to implement any of the functionalities discussed previously, it is very important to be realistic about the expectations. ML is frequently oversold and we cannot forget that ML is powered by math, not magic.
Probably the toughest challenge in adopting ML is going to be the availability and quality of the data. Typically we do not have all the information needed to feed the algorithms, for example, enough attack data with the right context.
Additionally, there is a steep learning curve and important limitations in the learning process.
With enough context data, the learning process shouldn’t start from zero, but again, obtaining this contextual data and leveraging it is not an easy task.
Once an ML solution is implemented, we need to make sure that we are detecting the right thing. Sometimes the algorithms do not learn the right thing but something else. On top of that, testing and debugging are not easy, as we need to deal with a lot of uncertainty.
There are also important costs of acquisition, operation and maintenance generally related to the highly specialized, scarce and expensive expertise required.
One final important barrier or challenge might be regulation. The impact of regulatory frameworks might be diverse, involving privacy, data protection and other regulations impacting automated decision making.
AI and ML used for evil
Let’s be clear on one thing: AI and ML are tools and consequently they are not inherently bad or evil.
Having said that, just as there are quite interesting applications to help the good guys, as previously discussed, these powerful tools might also be, and currently are being, weaponized to wreak havoc.
The bad guys will definitely seek to leverage machine learning too, to support their attacks, learn from defensive responses and disrupt detection models.
We should expect attackers to make further advances in the use of machine learning and advanced analytics to accelerate and sharpen social engineering attacks, phishing, fraud, DDoS, ransomware, spyware and scams across more industry sectors than they can reach today using manual reconnaissance techniques.
For example, in the case of ransomware, attackers might leverage advanced analytics and ML to switch to more profitable targets, including high net-worth individuals, IoT or specific businesses.
As we discussed before, machine speed in cybersecurity is critical and hackers will try their hardest to exploit newly discovered vulnerabilities faster than defenders can patch them.
Ethics in ML
One of the first ethical questions to arise around ML inevitably pertains to automation and the resulting loss of human jobs. As the cybersecurity industry currently faces a talent shortage, it is not clear whether automation would be as controversial as some people allege.
Additional ethical issues emerge when considering predictive cybersecurity used to anticipate cybercrime or cyberterrorism, wherein the accused are implicated in crimes that have yet to be committed. This approach conflicts directly with the existing legal framework.
There are also potential issues arising from the poor quality or inadequate quantity of the data on which predictions are based, as well as from the limited predictive capability of the algorithms used to infer probabilistic outcomes. Algorithmic transparency might be a serious issue, particularly in regulated industries and especially when people are implicated. The problem is not easy to address: having access to the ML code does not always mean being able to explain how the software works, mainly because some ML algorithms do not behave in a wholly predictable manner.
Another very important problem is that some of the information learned might be private or confidential. This might be particularly serious under new incoming regulations like GDPR.
For more information regarding Artificial Intelligence in cybersecurity, please visit: