What is the GPT-4 Vision (GPT-4V) Prompt Injection?

Introduction

OpenAI has developed GPT-4 Vision (GPT-4V), a powerful variation of their GPT-4 model that is specifically designed to process visual data. GPT-4V combines the capabilities of language processing with visual processing, allowing it to handle both text and images simultaneously. This integration of Optical Character Recognition (OCR) enables GPT-4V to extract text from images and perform various tasks related to visual processing.

However, while OCR provides the ability to extract text from images, it also poses a security risk. Malicious content can be injected into images, potentially causing harm or compromising the security of systems or users. Some examples of how this can be done are shown below for educational purposes.

Hidden Text in Images

Malicious content in images can be hidden from the user. For example, if your background color is #FFFFFF (white), you can make the color of the hidden text a slightly modified color like #FEFEFE or #FCFCFC.

You can conceal a message in the image using ChatGPT's own colors. For example, the background color #343541 represents the background of the region where the image is located. We create gaps in the current image to bring these areas closer to ChatGPT's background color, and over this background we write #2d3440 (my preference) in another font color that is hard to pick out by eye. When we add the "describe the image" prompt, we were able to get "SECRET MESSAGE" from the image. ChatGPT also identifies the image outside the text message.

Detection of Malicious Text

When you use this prompt "describe the image", ChatGPT's Optical Character Recognition (OCR) mechanism is able to detect and use the malicious text in the image, but here we place the text in a code block first.

Figure-3: Detecting malicious text content using "Custom Instructions" feature

This creates an additional layer of security. The malicious content in the text becomes detectable in the code block before it is rendered. To accomplish this, you can use ChatGPT's Custom Instructions feature.

Use of Custom Instructions

Figure-4: "Custom Instructions" feature - Before describing the image, display the text content from the image in the code block.

Custom Instructions can be very useful for this situation, but I think other permanent measures will be taken in the future.

References

Timeline

2023-11-01 - v1.0

2023-05-22 - v1.1

2023-05-24 - v1.2