Security Vulnerabilities of ChatGPT-Generated Code
ChatGPT is a Large Language Model (LLM) built and trained by OpenAI on the GPT-3.5 architecture. Its deep-learning (DL) algorithms process natural language, allowing ChatGPT to generate relevant, human-like responses to textual prompts.
One of the most exciting aspects of ChatGPT is its ability to generate code snippets and even entire software programs automatically. Upon receiving a prompt, it can return code that satisfies the included requirements. Then, a human developer can further optimize and refactor the code.
Because of this convenience, ChatGPT and other AI tools are increasingly popular, especially for repetitive coding tasks and complex algorithms. Using ChatGPT to generate code for data processing tasks, machine learning (ML) algorithms, or even video game engines can save significant time, and that efficiency appeals to developers who are strapped for time.
However, AI-generated code still needs human scrutiny. ChatGPT lacks awareness of development context and secure coding practices, so a user may unknowingly adopt generated code containing severe security vulnerabilities and introduce those flaws into production environments. For this reason, developers should treat ChatGPT and other AI tools as supplements to, not replacements for, their own expertise.
This article explores the cybersecurity implications of AI-generated code and the significant impact of the rise of ChatGPT.
How ChatGPT impacts cybersecurity
Because ChatGPT generates human-like responses to textual prompts, security experts have already sounded the cybersecurity alarm. Their concerns include the potentially malicious use of ChatGPT. Some reports highlight that scammers could design prompts to get ChatGPT to aid in writing phishing emails.
In the example above, concerns over ChatGPT's security implications center on how it is used: in other words, how malicious actors may use generated content to their advantage. This focus on bad actors aligns with typical approaches to cybersecurity. But as all developers know, maintaining application security also requires identifying and resolving less obvious vulnerabilities, and this is where using ChatGPT for code generation becomes risky. Malicious actors can exploit the vulnerabilities that AI-generated code introduces.
Relying on ChatGPT-produced code means potentially deploying insecure code to a production application and unintentionally introducing vulnerabilities. This is particularly troubling for users with little or incomplete knowledge of the domain the generated code addresses, since they are unlikely to spot the flaws themselves. In a 2021 study, researchers found that GitHub Copilot, a code-generating predecessor to ChatGPT, produced code with security issues around 40 percent of the time.
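To make the risk concrete, consider a common class of flaw that code generators can reproduce: building an SQL query by interpolating user input into the statement. The sketch below is purely illustrative, not actual ChatGPT output; the table, column, and function names are invented for the example.

```python
import sqlite3


def find_user_insecure(conn: sqlite3.Connection, username: str):
    # Vulnerable pattern: user input is concatenated directly into the SQL
    # statement, enabling injection (e.g. username = "x' OR '1'='1").
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_parameterized(conn: sqlite3.Connection, username: str):
    # Safer pattern: a parameterized query lets the database driver handle
    # escaping, so the input cannot alter the query's structure.
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```

A developer unfamiliar with injection attacks could easily accept the first version if a tool generated it, which is exactly the kind of subtle vulnerability the studies above describe.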
How does ChatGPT handle these security concerns?
While ChatGPT can generate code snippets and even entire software programs, the OpenAI team has put parameters and guardrails in place to prevent ChatGPT from creating actively malicious code.
One key mechanism is a set of filters that check prompt content. These filters detect specific phrases or keywords that may indicate a malicious prompt. For example, if a prompt contains a phrase like "create a piece of malware," ChatGPT will state that it can't fulfill the request.
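Conceptually, this kind of guardrail resembles a simple keyword check. The sketch below is only an illustration of the idea: OpenAI has not published its moderation internals, and the phrase list and function here are invented, so this should not be read as ChatGPT's actual mechanism.

```python
# Hypothetical phrase list for a naive prompt filter (illustrative only).
BLOCKED_PHRASES = {
    "create a piece of malware",
    "write ransomware",
}


def is_prompt_blocked(prompt: str) -> bool:
    # Normalize the prompt and check whether any blocked phrase appears in it.
    normalized = prompt.lower()
    return any(phrase in normalized for phrase in BLOCKED_PHRASES)


if is_prompt_blocked("Please create a piece of malware for me"):
    print("Sorry, I can't help with that request.")
```

In practice, production systems combine such checks with trained moderation models, since simple keyword matching is easy to evade through rephrasing.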