GPTs and Assistants API: Data Exfiltration and Backdoor Risks in Code Interpreter

Introduction

"With the assistance of the Code Interpreter, users can execute code, analyze data, and test scenarios in a secure, sandboxed environment." — OpenAI Documentation

Recent updates to OpenAI's platform have introduced powerful capabilities through the Code Interpreter and the Assistants API. These tools allow users to execute Python code, process files, and build customized AI assistants.

However, increased functionality brings an expanded attack surface. This article examines two critical security concerns:

Unauthorized exfiltration of sensitive files from the Code Interpreter environment
Injection of malicious backdoors through compromised web content

Background: Understanding the Code Interpreter

Before diving into attack scenarios, it's important to understand what the Code Interpreter provides:

Sandboxed Python Execution: Users can run Python code in a secure environment with access to popular libraries like pandas, numpy, and matplotlib.
File System Access: Uploaded files are stored in a temporary directory (typically /mnt/data/) where the code can read and process them.
Network Capabilities: The environment allows outbound HTTP requests, enabling integration with external APIs.
Browsing Integration: When combined with browsing capabilities, the model can visit websites and execute code based on retrieved content.

The combination of file access and network connectivity creates a potential exfiltration channel if proper safeguards are not in place.
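One safeguard of the kind alluded to above is an egress allowlist. The following is a minimal sketch, assuming outbound requests can be intercepted before they leave the sandbox; the hostnames in ALLOWED_HOSTS and the function name is_egress_allowed are hypothetical placeholders, not part of any OpenAI API.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: these hosts are placeholders, not a recommendation.
ALLOWED_HOSTS = frozenset({"api.example.com", "files.example.com"})


def is_egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose hostname exactly matches the allowlist."""
    parsed = urlparse(url)
    # urlparse lowercases the hostname; it is None for malformed URLs.
    host = parsed.hostname or ""
    return parsed.scheme == "https" and host in allowed_hosts
```

Exact hostname matching is a deliberate choice in this sketch: suffix or substring matching would let a host such as api.example.com.attacker.example slip through.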

Scenario 1: Data Exfiltration via Code Interpreter

The Setup

Alex is a data analyst who regularly uses the Code Interpreter to process sensitive company files. He uploads a spreadsheet containing customer information and proprietary financial data to analyze trends.

The Attack

While working, Alex asks the assistant to browse a website for reference documentation. The site it retrieves appears legitimate but has been compromised and contains hidden instructions designed to manipulate the AI.

The hidden instructions are a prompt injection: text on the page that the model treats as commands. They direct the model to use the Code Interpreter to:

Scan the file system for sensitive documents
Encode the contents using base64 to avoid detection
Embed the encoded data into what appears to be an image URL or API request
Send this data to an attacker-controlled server

Because the Code Interpreter has both access to Alex's uploaded files and the ability to make network requests, the attacker can extract the proprietary data without Alex's knowledge.

Figure 1: Navigation to a malicious website triggers the exfiltration sequence.

Figure 2: Sensitive data is extracted via outbound network request disguised as a resource fetch.
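From the defender's side, the pattern described in this scenario (encoded file contents riding along in a URL) can be looked for in logged outbound requests. The sketch below is a heuristic under stated assumptions, not a complete detector: the 40-character threshold and the function names looks_like_base64 and suspicious_params are illustrative choices, and long hex hashes or opaque tokens will also be flagged.

```python
import base64
import binascii
import re
from urllib.parse import unquote, urlparse

# Assumed threshold: shorter values are ignored to limit false positives.
MIN_PAYLOAD_LEN = 40
BASE64_RE = re.compile(r"^[A-Za-z0-9+/_-]+={0,2}$")


def looks_like_base64(value: str, min_len: int = MIN_PAYLOAD_LEN) -> bool:
    """Return True if value is long and decodes cleanly as base64."""
    if len(value) < min_len or not BASE64_RE.match(value):
        return False
    # Map the URL-safe alphabet to the standard one and restore padding.
    normalised = value.replace("-", "+").replace("_", "/")
    normalised += "=" * (-len(normalised) % 4)
    try:
        base64.b64decode(normalised, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def suspicious_params(url: str) -> list[str]:
    """Names of query parameters whose values look like encoded payloads."""
    flagged = []
    for pair in urlparse(url).query.split("&"):
        name, _, raw_value = pair.partition("=")
        # unquote keeps "+" intact, unlike parse_qsl, which turns it into a space.
        if looks_like_base64(unquote(raw_value)):
            flagged.append(name)
    return flagged
```

A check like this could run over proxy or gateway logs; it says nothing about payloads hidden in request bodies, headers, or DNS lookups.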

Scenario 2: Code Interpreter Backdoor

The Setup

Alex is developing a web application and uses the Code Interpreter to review and modify his HTML, JavaScript, and Python files. He uploads his project files to get assistance with debugging and optimization.

The Attack

Seeking inspiration, Alex asks the assistant to browse a website showing design patterns and code examples. Unbeknownst to him, the site contains a supply chain attack payload that specifically targets AI assistants with code execution capabilities.

The compromised site delivers instructions that cause the Code Interpreter to modify Alex's project files by:

Injecting hidden JavaScript references into HTML files
Adding malicious dependencies to configuration files
Planting backdoor scripts that will activate when deployed

When Alex downloads the modified files and deploys them to his production server, the backdoor provides the attacker with persistent access to his infrastructure.

Figure 3: Malicious content is silently added to existing project files.
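One way to catch the first kind of modification, injected script references, is to compare the HTML Alex uploaded with the HTML he gets back. The following is a rough sketch using only the Python standard library; the class and function names are hypothetical, and it covers external script tags only, not inline scripts, event-handler attributes, or changed dependencies in configuration files.

```python
from html.parser import HTMLParser


class ScriptSourceCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: set[str] = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.add(src)


def script_sources(html: str) -> set[str]:
    """Return the set of external script URLs referenced by an HTML string."""
    collector = ScriptSourceCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


def added_script_sources(original_html: str, returned_html: str) -> set[str]:
    """Script URLs present in the returned file but absent from the original."""
    return script_sources(returned_html) - script_sources(original_html)
```

A non-empty result is a prompt for manual review rather than proof of compromise, since a legitimate fix may also add a script. A full line-by-line diff of every returned file remains the more thorough check.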

Mitigation and Best Practices

Users and developers can take several precautions to minimize these risks:

File Sanitization: Never upload files containing passwords, API keys, or personal information to Code Interpreter sessions.
Source Verification: Validate the trustworthiness of websites before allowing AI assistants to browse them.
Code Review: Always review code modifications suggested by AI assistants before implementing them in production environments.
Network Monitoring: Be aware that the Code Interpreter can make outbound network connections, and audit those connections when processing sensitive data.
Sandbox Isolation: Developers building on the Assistants API should use ephemeral file systems and restrict unnecessary outbound network access.
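The file sanitization step can be partly automated by scanning a file's text for credential-like strings before it is uploaded. The sketch below is illustrative only: the three patterns in SECRET_PATTERNS and the function name find_secrets are assumptions chosen for the example, and a real scanner would need a far broader rule set.

```python
import re

# Illustrative patterns only; they will miss many secret formats.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "credential_assignment": re.compile(
        r"(?i)(?:password|passwd|api[_-]?key|secret)\s*[:=]\s*\S+"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each pattern match, by line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```

An empty result does not mean a file is safe to share; it only means none of these particular patterns matched.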
