Add ‘writing malware’ to the list of things generative AI is not very good at doing
Analysis Despite the hype around criminals using ChatGPT and various other large language models to ease the chore of writing malware, it seems this generative AI technology isn’t terribly good at helping with that kind of work.
That’s our view after seeing research this week indicating that while some crooks are interested in using source-suggesting ML models, the technology isn’t actually being widely used to create malicious code. Presumably that’s because these generative systems are not up to the job, or have sufficient guardrails to make the process tedious enough that cybercriminals give up.
If you want useful, reliable exploits and post-intrusion tools, you’ll either have to pay top dollar for them, grab them for free from somewhere like GitHub, or have the programming skills, patience, and time to develop them from scratch. AI isn’t going to provide the shortcut a miscreant might hope for, and its take-up among cybercriminals is on a par with the rest of the technology world, we’re told.
Studies
In two reports published this week, Trend Micro and Google’s Mandiant weigh in on the buzzy AI tech, and both reach the same conclusion: internet fiends are interested in using generative AI for nefarious purposes, though in reality, usage remains limited.
“AI is still in its early days in the criminal underground,” Trend Micro researchers David Sancho and Vincenzo Ciancaglini wrote on Tuesday.
“The advancements we are seeing are not groundbreaking; in fact, they are moving at the same pace as in every other industry,” the two said.
Meanwhile, Mandiant’s Michelle Cantos, Sam Riddell, and Alice Revelli have been tracking criminals’ AI use since at least 2019. In research published Thursday, they noted that the “adoption of AI in intrusion operations remains limited and primarily related to social engineering.”
The two threat intel teams came to similar conclusions about how crims are using AI for illicit activities. In short: generating text and other media to lure marks to phishing pages and similar scams, and not so much automating the development of malware.
“ChatGPT works best at crafting text that seems believable, which can be abused in spam and phishing campaigns,” Trend Micro’s team wrote, noting that some products sold on criminal forums have begun incorporating a ChatGPT interface that allows buyers to create phishing emails.
“For example, we have observed a spam-handling piece of software called GoMailPro, which supports AOL Mail, Gmail, Hotmail, Outlook, ProtonMail, T-Online, and Zoho Mail accounts, that is mainly used by criminals to send out spammed emails to victims,” Sancho and Ciancaglini said. “On April 17, 2023, the software author announced on the GoMailPro sales thread that ChatGPT was allegedly integrated into the GoMailPro software to draft spam emails.”
In addition to helping craft phishing emails or other social engineering scams — especially in languages the criminals don’t speak — AI is also good at producing content for disinformation campaigns, including deep-fake audio and images.
Fuzzy LLMs
One thing AI is good at, according to Google, is fuzzing, aka fuzz testing: the practice of automating vulnerability detection by injecting random and/or carefully crafted data into software to trigger and unearth exploitable bugs.
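In its simplest form, a fuzzer just hammers a target with mangled input and watches for crashes. The toy below sketches that idea in Python, with a made-up parse_record() function standing in for the software under test; real fuzzers, including those run by OSS-Fuzz, are coverage-guided and far more sophisticated.

```python
import random

def parse_record(data: bytes) -> dict:
    """Hypothetical target: parse "key=value;key=value" records."""
    text = data.decode("utf-8")  # may blow up on invalid UTF-8
    return dict(item.split("=") for item in text.split(";"))

def mutate(seed: bytes) -> bytes:
    """Randomly overwrite, insert, or delete bytes in a known-good input."""
    buf = bytearray(seed)
    for _ in range(random.randint(1, 8)):
        roll = random.random()
        if roll < 0.4 and buf:
            buf[random.randrange(len(buf))] = random.randrange(256)
        elif roll < 0.7:
            buf.insert(random.randrange(len(buf) + 1), random.randrange(256))
        elif buf:
            del buf[random.randrange(len(buf))]
    return bytes(buf)

if __name__ == "__main__":
    seed = b"user=alice;role=admin"
    for i in range(100_000):
        sample = mutate(seed)
        try:
            parse_record(sample)
        except Exception as exc:  # any unhandled exception is a potential bug
            print(f"crash after {i} iterations: {exc!r} on input {sample!r}")
            break
```

A serious fuzzer also keeps any input that reaches new code paths and mutates that, which is where the coverage guidance, and the LLM-written harnesses Google describes, come in.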
“By using LLMs, we’re able to increase the code coverage for critical projects using our OSS-Fuzz service without manually writing additional code,” Dongge Liu, Jonathan Metzman, and Oliver Chang of Google’s Open Source Security Team wrote on Wednesday.
“Using LLMs is a promising new way to scale security improvements across the over 1,000 projects currently fuzzed by OSS-Fuzz and to remove barriers to future projects adopting fuzzing,” they added.
While this process did involve quite a bit of prompt engineering and other work, the team said they eventually saw code coverage gains of between 1.5 percent and 31 percent across projects.
And during the next few months, the Googlers say they’ll open source the evaluation framework so that other researchers can test their own automatic fuzz target generation.
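For context, the fuzz targets being generated here are small harness functions that pipe the fuzzer’s byte stream into a library entry point. The snippet below is a hedged illustration of what such a harness can look like for a Python project using Google’s Atheris engine; parse_config() is a hypothetical stand-in for the code under test, not anything taken from Google’s research.

```python
import sys
import atheris


@atheris.instrument_func
def parse_config(text: str) -> dict:
    """Hypothetical library entry point: parse "key: value" lines."""
    config = {}
    for line in text.splitlines():
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def TestOneInput(data: bytes) -> None:
    # Convert the fuzzer's raw bytes into a string and feed the target.
    fdp = atheris.FuzzedDataProvider(data)
    text = fdp.ConsumeUnicodeNoSurrogates(len(data))
    try:
        parse_config(text)
    except ValueError:
        # Expected for lines with no ":" separator; anything else is a finding.
        pass


if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```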
Mandiant, meanwhile, separates image-generation capabilities into two categories: generative adversarial networks (GANs) that can be used to create realistic headshots of people, and generative text-to-image models that can produce customized images from text prompts.
While GANs tend to be more commonly used, especially by nation-state threat groups, “text-to-image models likely also pose a more significant deceptive threat than GANs” because these can be used to support deceptive narratives and fake news, according to the Mandiant trio.
This includes pro-China propaganda operation Dragonbridge, which also uses AI-generated videos, for example to produce short “news segments.”
Both reports acknowledge that criminals are curious about using LLMs to make malware, but that doesn’t necessarily translate into actual code in the wild.
As legitimate developers have also found, AI can help refine code, develop snippets of source and boilerplate functions, and make it easier to pick up unfamiliar programming languages. However, the fact remains that you need some level of technical proficiency to use AI to write malware, and the output will probably still need a human coder to check it and make corrections.
Ergo, anyone using AI to write realistic, usable malware can probably write that code themselves anyway. The LLM would mainly be there to speed up development, potentially, rather than drive an automated assembly line of ransomware and exploits.
What could be holding miscreants back? Partly the restrictions put on LLMs to prevent them from being used for evil, which is why security researchers have spotted some criminals advertising services to their peers that bypass models’ safeguards.
Plus, as Trend Micro points out, there’s a whole lot of chatter about ChatGPT jailbreak prompts, especially on the “Dark AI” section on Hack Forums.
As criminals are willing to pay for these services, some speculate that, “in the future, there might be so-called ‘prompt engineers,’” according to Sancho and Ciancaglini, who do add: “We reserve our judgment on this prediction.” ®