The good, the bad and the generative AI
Sponsored Feature Change in the tech industry is usually evolutionary, but perhaps more interesting are the exceptions to this rule – the microprocessor in 1971, the IBM PC in 1981, the web in 1989, the smartphone in 2007. These are the technologies whose appearance began new eras that completely reshaped the industry around them.
Generative AI, the result of decades of research into neural networks and, more recently, architectures such as Generative Adversarial Networks (GANs) and large language models, is widely seen as the next candidate on this list. The idea that AI is a big deal is nothing new, and the generative AI that has made headlines is only one subsector of AI development. But there’s no doubt that its very public arrival through chatbots such as OpenAI’s ChatGPT and Google’s Bard has fired a jolt of destabilizing energy into computing as a whole, and into cybersecurity as a discipline.
With microprocessors, you can build small computers. With the PC you can put an affordable one on everyone’s desk. With the web you can connect the PC to a global information network. With the smartphone, that network can go anywhere and everywhere. What, then, will be the role for AI?
The high-level answer is that it will allow automation and advanced decision-making without the need to consult human beings. Humans make mistakes that machines don’t. They also do things slowly and expensively. At a stroke, with generative AI many of these issues appear to vanish. Data can be processed in seconds as new insights multiply and automated decision-making accelerates.
Phishing gets ChatGPT makeover
There is, of course, also a darker side to generative AI, which researchers have been busily investigating since ChatGPT’s public launch on the GPT-3.5 large language model (LLM) last November. This has generated a surprising amount of doom-laden publicity for chatbots, starting with their effect on the building block of cyber-criminality, phishing emails.
This author demonstrated this by feeding ChatGPT real phishing ‘security alert’ emails to see how it might improve them. Not only did it correct grammatical mistakes, it added sections that made them sound even more authoritative. In language at least, they were impossible to distinguish from well-composed, genuine support emails written by a native speaker.
Now imagine that many genuine support emails are already being written by chatbots to save time, and it’s not hard to see that the real and the simulated could quickly become indistinguishable. This new generation of LLM-driven chatbots is good at language – any language. Phishing has never looked so easy.
Phishing emails are far from the only potentially malevolent use of generative AI, but they are emblematic of a deeper issue. AI can be used to do a wide range of things, some positive, some far less so. How can we distinguish one from the other? Many hypothetical malevolent AI creations – the code used in ransomware, for example – comprise functions that have perfectly legitimate uses. What makes them ransomware is not the code itself but the way they are combined and the purpose behind them.
Stunt hacking ahead
This presents a huge detection challenge. But if today’s defensive technologies will soon be outclassed, then perhaps, logically, the antidote lies in AI itself. One company that has invested in this idea from the start is Darktrace, which has spent years building a defensive platform based on unsupervised machine learning. This includes Darktrace PREVENT and DETECT, two modules of a platform that its makers say uses AI to analyze threats in a way that feeds back into real-time detection and response.
Right now, according to Darktrace chief product officer Max Heinemeyer, the world is still at an early stage in understanding how AI might be misused.
“It’s still at the research stage and is like stunt hacking. Any LLM at the moment is great at augmenting experts to speed them up but you wouldn’t fully trust it to do the most critical applications.”
Heinemeyer raises the important issue of measurement – how can we quantify what effect, if any, AI is having on cyberattacks beyond speculation and inference? Here, standard measurements such as the number of emails created, or of their links and attachments, are a blunt tool. These figures fluctuate naturally and miss the crucial issue of quality and effectiveness. Nobody will see the direct effect of AI, which forces researchers to look for second-order effects.
To address this, Darktrace recently analyzed phishing emails sent to its customers, scoring them on a combination of linguistic complexity, semantic structure, and the content itself. The analysis indicated that the average linguistic complexity of phishing emails has risen by 17 percent since ChatGPT appeared.
This doesn’t sound like much of a change, but it happened in a matter of weeks. Heinemeyer believes that a shift in techniques is underway, although he stresses that it’s too early to make definitive judgments as to why. Nevertheless, it seems plausible to assume that LLMs have lowered the barriers to high-quality social engineering.
“Since ChatGPT reached one million users in December, we’ve seen a decrease in the use of phishing links and attachments and an increase in the sophistication of how these are written,” he observes.
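Darktrace hasn’t published the exact scoring behind that 17 percent figure, so the following Python sketch is only an illustration of what a crude ‘linguistic complexity’ measure for a batch of emails might look like; the features and weights below are assumptions made for this article, not the company’s method.

```python
import re
from statistics import mean

def complexity_score(text: str) -> float:
    # Hypothetical stand-in metric: longer sentences, longer words, and a richer
    # vocabulary all push the score up. The weights are arbitrary.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        return 0.0
    avg_sentence_len = len(words) / len(sentences)   # words per sentence
    avg_word_len = mean(len(w) for w in words)       # characters per word
    vocab_richness = len(set(words)) / len(words)    # type-token ratio
    return 0.5 * avg_sentence_len + 2.0 * avg_word_len + 10.0 * vocab_richness

def percentage_shift(before: list[str], after: list[str]) -> float:
    # Change in mean complexity between two samples of emails,
    # e.g. those captured before and after a given date.
    b = mean(complexity_score(t) for t in before)
    a = mean(complexity_score(t) for t in after)
    return (a - b) / b * 100.0
```

Comparing the mean score of emails captured before and after a chosen date would yield a rough percentage shift of the kind described above, independent of how many emails were sent.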
Targeted attacks go beyond phishing
Beyond simply improving the language of phishing, the obvious next step would be to make each attack more targeted. The threat here is that AI will be used to scrape data on specific people as a way of impersonating them. AI will also make it much easier for attackers to analyze large volumes of stolen data, sifting it for sensitive topics at a speed that would be impossible by hand.
“At the moment it takes a long time to go through the data,” adds Heinemeyer. “But they could push this data through existing language models and find sensitive content.”
Today’s ransomware attackers attempt extortion on the basis that they have stolen a certain amount of data. The next evolutionary step could be to attempt extortion on the basis of how sensitive that data is. For example, within seconds criminals might infer the connections between streams of sensitive emails and files to build more damaging insights.
The question is whether defensive systems can adapt to this rise in sophistication. On the face of it, this looks like a big ask. Today’s systems already struggle, depending on security awareness training to fill the gaps opened by clever social engineering in a technology + people = security model.
It’s become fashionable to believe that technology can’t defend users on its own, but Heinemeyer disagrees. He sees no alternative to technology-led defense, but argues that the detection model needs to evolve, citing the security awareness component of Darktrace’s PREVENT system, which protects users by profiling the daily email environment of each user.
Darktrace PREVENT builds an understanding of the normal communication patterns of each user, making it possible to spot anomalous activity as deviations from that profile. The concept of anomaly detection has been around in security for a long time but hasn’t always been a success; defining what is and isn’t ‘normal’ turns out to be much harder than anyone thought it would be. So, what makes Heinemeyer sure that a new generation of AI will do a better job than conventional anomaly detection?
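To make the baselining idea concrete, here is a deliberately simple sketch of conventional per-user anomaly detection; the email features and the threshold are made-up assumptions, not Darktrace’s model, and serve only as a reference point for what ‘deviation from normal’ means.

```python
from statistics import mean, pstdev

class UserProfile:
    """Learns a per-user baseline over simple email features, then flags deviations."""
    def __init__(self, history: list[dict]):
        # Mean and standard deviation per feature; a floor of 1.0 avoids division by zero.
        self.baseline = {
            key: (mean(e[key] for e in history), pstdev(e[key] for e in history) or 1.0)
            for key in history[0]
        }

    def anomaly_score(self, email: dict) -> float:
        # Mean absolute z-score across features: higher means further from 'normal'.
        return mean(abs(email[k] - mu) / sigma for k, (mu, sigma) in self.baseline.items())

# Illustrative history of one user's outbound email behavior.
history = [
    {"recipients": 2, "links": 0, "sent_hour": 9},
    {"recipients": 3, "links": 1, "sent_hour": 10},
    {"recipients": 2, "links": 0, "sent_hour": 11},
]
profile = UserProfile(history)

suspect = {"recipients": 40, "links": 6, "sent_hour": 3}
if profile.anomaly_score(suspect) > 3.0:   # threshold chosen arbitrarily here
    print("flag for review")
```

The weakness of this static approach is exactly the one the article describes: the features and thresholds are fixed in advance, so anything the designer didn’t anticipate as ‘abnormal’ slips through.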
Algorithm v algorithm race for ascendancy
Success in this context, he explains, comes not from defining what is dangerous in a static way, but from tailoring detection dynamically to each individual. For example, the training feature of PREVENT can spoof people very precisely as a way of testing the ability of those around them to resist more advanced phishing attacks.
“It learns your unique communication style such as how you speak to your colleagues and the external world. It can even learn characteristic spelling mistakes or specific phrases and words.”
The point of this exercise, Heinemeyer adds, isn’t to show how easy it is to trick users, but to underline that humans alone can’t resist this approach. Stopping AI-generated phishing will require even better AI in an algorithm v algorithm race for ascendancy.
“We have a unique data set because of our AI. We think very hard about how to measure this.”
Darktrace and others have argued this for years and, it’s fair to say, have faced an uphill battle. To some extent, ChatGPT has changed this. There are some difficult implementation challenges ahead, but for now awareness of the risk, and of how AI might help, has been given a boost.
“ChatGPT is a good thing,” says Heinemeyer. “The way to solve the cybersecurity problem is not through more manual penetration tests, more security awareness training, or more security ninjas, but through automation.”
The shift into what is starting to look like the AI cybersecurity era is likely to happen slowly over the course of the next year. While attacks won’t look that different from those of 2022, there might be more malware mutations. Some attacks might become harder to detect. The scale of attacks might increase, targeting organizations more frequently and more widely. Many changes will be subtle.
To counter this, traditional notions might need to be challenged, starting with the idea that threat intelligence and indicators of compromise (IoCs) can be perfected to detect sophisticated attacks. Similarly, security awareness training based on handing out generic advice may struggle to cope.
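A simplified illustration helps show why static indicators struggle: a matcher that checks file hashes and sender domains against a known-bad feed is blind to anything that mutates. The hash and domain values below are placeholders for this sketch, not real threat intelligence.

```python
import hashlib

# Placeholder 'known-bad' feed: the SHA-256 of the bytes b"foo" and a dummy domain.
KNOWN_BAD_HASHES = {"2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae"}
KNOWN_BAD_DOMAINS = {"malicious-example.test"}

def matches_ioc(payload: bytes, sender_domain: str) -> bool:
    # Classic indicator-of-compromise lookup: exact hash or exact domain match.
    digest = hashlib.sha256(payload).hexdigest()
    return digest in KNOWN_BAD_HASHES or sender_domain in KNOWN_BAD_DOMAINS

# One changed byte in the payload produces a completely different hash, and a
# freshly registered domain isn't on any list, so the mutated attack slips past.
print(matches_ioc(b"foo", "malicious-example.test"))   # True  (indicator match)
print(matches_ioc(b"foo-mutated", "new-domain.test"))  # False (nothing to match)
```

If generative AI makes cheap mutation the norm, detection that depends on exact matches of this kind will keep arriving one step behind.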
Heinemeyer warns against the sort of utopianism that would posit defensive AI as a simple fit-and-forget protection. Attackers still face lower levels of complexity than defenders and have less to automate, so for them automation is a powerful force multiplier. Defenders, by contrast, will have to invest more time, effort, and money to keep up. For this reason, he says, it won’t be enough to train defensive AI on historical data.
“Learn from the environment on a continuous basis,” he says. “Have machine learning that knows about the entities it is protecting and not simply the outside world.”
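A minimal sketch of that idea, assuming nothing about Darktrace’s actual models, might look like the following: a per-entity baseline that folds in every new observation instead of being trained once on historical data. The smoothing factor and the example metric are illustrative assumptions.

```python
class OnlineBaseline:
    """Continuously updated notion of 'normal' for one protected entity."""
    def __init__(self, alpha: float = 0.05):
        self.alpha = alpha   # how quickly the baseline adapts to the live environment
        self.mean = None
        self.var = 0.0

    def update(self, value: float) -> float:
        """Fold a new observation into the baseline and return its deviation score."""
        if self.mean is None:
            self.mean = value
            return 0.0
        deviation = value - self.mean
        score = abs(deviation) / ((self.var ** 0.5) or 1.0)
        # Exponentially weighted mean and variance: 'normal' tracks recent behavior.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return score

baseline = OnlineBaseline()
for outbound_emails_per_hour in [4, 5, 6, 5, 4, 120]:
    print(round(baseline.update(outbound_emails_per_hour), 1))  # spike scores highest
```

The point of the exponential weighting is that gradual changes in behavior shift the baseline along with the entity being protected, while a sudden spike still stands out – which is the continuous, environment-aware learning Heinemeyer is describing.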
Sponsored by Darktrace.