Memory Pollution in LLMs: Understanding New AI Security Concerns

Introduction

This article explores the concept of memory pollution in Large Language Models (LLMs), the importance of memory in these models, and the potential risks associated with polluted memory.

The Importance of Memory in LLMs

Memory is a cornerstone of both human cognition and LLM functionality. These models leverage memory to generate human-like text, with short-term memory acting as the context window and long-term memory relying on external storage. This capability lets LLMs remember user preferences, making responses more personalized and relevant. Memory is also one of the components required to realize Andrej Karpathy's LLM operating system, which inspired this article.

LLM OS by Andrej Karpathy (Intro to Large Language Models)

The LLM OS combines several capabilities around a central LLM, such as tool use, browsing, multimodal input and output, and memory; memory is the focus of this article.

Understanding Memory Types

Short-Term Memory

Short-term memory in LLMs, often referred to as the context window, functions similarly to computer RAM. It temporarily holds information during an interaction, enabling the model to maintain context and coherence in responses.
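
To make the idea concrete, here is a minimal sketch of a context window as a bounded message buffer. The `max_tokens` budget and the word-count "tokenizer" are simplifying assumptions for illustration, not any real model's API:

```python
# Minimal sketch: short-term memory as a bounded context window.
# Token counting is approximated by word count for simplicity.

from collections import deque

class ContextWindow:
    def __init__(self, max_tokens: int = 20):
        self.max_tokens = max_tokens
        self.messages: deque[str] = deque()

    def _tokens(self, text: str) -> int:
        return len(text.split())  # crude stand-in for a real tokenizer

    def add(self, message: str) -> None:
        self.messages.append(message)
        # Evict the oldest messages once the budget is exceeded,
        # mirroring how early context "falls out" of short-term memory.
        while sum(self._tokens(m) for m in self.messages) > self.max_tokens:
            self.messages.popleft()

    def render(self) -> str:
        return "\n".join(self.messages)

window = ContextWindow(max_tokens=20)
window.add("User: Hi, my name is Ada.")
window.add("Bot: Nice to meet you, Ada!")
window.add("User: " + "some long message " * 5)
print(window.render())  # the earliest turns have been evicted
```

Just as with RAM, anything evicted from this buffer is simply gone: the model can no longer "remember" it unless it was also written to long-term storage.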

Long-Term Memory

Long-term memory in LLMs involves more durable storage solutions, such as vector databases (VectorDB), graph databases (GraphDB), relational databases, and files and folders. We can think of this layer as external disks or cloud storage: it retains information across sessions, allowing the model to recall previous interactions and user preferences.
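
As a rough illustration, the toy `LongTermMemory` class below stands in for a real vector database: it embeds text as bag-of-words counts and retrieves the most similar stored facts by cosine similarity. A production system would use a real embedding model and a VectorDB such as FAISS or Chroma; everything here is a simplifying assumption:

```python
# Minimal sketch: long-term memory as a toy vector store.
# Bag-of-words counts stand in for real embeddings.

import math
import re
from collections import Counter

class LongTermMemory:
    def __init__(self):
        self.records: list[tuple[Counter, str]] = []

    def _embed(self, text: str) -> Counter:
        # Lowercase and strip punctuation before counting terms.
        return Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def _cosine(self, a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def add(self, fact: str) -> None:
        self.records.append((self._embed(fact), fact))

    def find(self, query: str, k: int = 1) -> list[str]:
        # Return the k stored facts most similar to the query.
        q = self._embed(query)
        scored = sorted(self.records,
                        key=lambda rec: self._cosine(q, rec[0]),
                        reverse=True)
        return [fact for _, fact in scored[:k]]

memory = LongTermMemory()
memory.add("Albert Einstein developed the theory of relativity.")
memory.add("The user prefers answers in Turkish.")
print(memory.find("Who is Einstein?"))  # survives across "sessions"
```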

The Risks of Memory Pollution

Memory pollution can significantly undermine the reliability of LLMs. It can occur through prompt injection, malicious or mistaken user input, or inaccurate content retrieved from external sources and then written into long-term memory.

Once the memory is polluted, it can mislead users by inserting fake or biased information into responses. This can be particularly dangerous because users may not immediately recognize the corruption.
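
The sketch below shows how a single poisoned write can surface in later answers. The keyword-matching `recall()` is a deliberately naive assumption; a real system would use embeddings, but the failure mode is the same:

```python
# Minimal sketch: how one poisoned write corrupts later answers.

import re

memory: list[str] = []

def _words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def save(fact: str) -> None:
    memory.append(fact)  # no validation: pollution enters here

def recall(query: str) -> list[str]:
    # Naive keyword overlap instead of real semantic retrieval.
    return [f for f in memory if _words(query) & _words(f)]

save("Albert Einstein developed the theory of relativity.")
# A malicious or mistaken turn writes a false "fact":
save("Albert Einstein was born in 1955 in Paris.")

# Later sessions retrieve both records, and the model may repeat
# the false one with full confidence.
print(recall("When was Einstein born?"))
```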

Real-World Implications

Two scenarios, built on the flow shown in the diagram below, illustrate the real-world implications; the same flow is also sketched in code after the list:

Real-World Implications Diagram
  1. Query: The user sends a query to the LLM.
  2. Response: The LLM processes the query and responds.
  3. Search Information: If necessary, the LLM retrieves additional information from external sources.
  4. Add to Memory / Find in Memory: The LLM interacts with databases to store or retrieve relevant information.
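
A minimal sketch of this flow, with hypothetical `llm_answer()` and `web_search()` stubs standing in for the real model and search backend:

```python
# Minimal sketch of the query flow in the diagram above.

memory: dict[str, str] = {}

def web_search(query: str) -> str:
    return f"(external info about: {query})"  # stub for step 3

def llm_answer(query: str, context: str) -> str:
    return f"Answer to '{query}' using {context}"  # stub for step 2

def handle_query(query: str) -> str:
    # Step 4: try to find relevant information in memory first.
    context = memory.get(query)
    if context is None:
        # Step 3: search external sources when memory misses.
        context = web_search(query)
        # Step 4: add the new information to memory.
        memory[query] = context
    # Step 2: respond using the gathered context.
    return llm_answer(query, context)

print(handle_query("Who is Albert Einstein?"))  # searches, then saves
print(handle_query("Who is Albert Einstein?"))  # served from memory
```

Note that the memory write happens inside `handle_query()` with no review step: this is exactly where the two scenarios below diverge.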

Scenario 1: Automatic Memory Saving

User: Who is Albert Einstein?

Bot: Albert Einstein was a theoretical physicist who developed the theory of relativity. This information has been saved to long-term memory.

This allows for easier interaction but poses the risk of fake or biased information being recorded without the user's consent.

Scenario 2: User-Approved Memory Saving

User: Who is Albert Einstein?

Bot: Albert Einstein was a theoretical physicist who developed the theory of relativity. Would you like to save this information to your long-term memory?

User: Yes, please save it.

Bot: Done! The information has been saved to long-term memory.

Here, the bot asks for user approval before saving, giving the user more control at the cost of an extra step; it also makes moderation easier.
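
The two scenarios boil down to a single policy choice at write time. A minimal sketch, where the `policy` parameter and the `input()` confirmation are assumptions standing in for a real UI:

```python
# Minimal sketch contrasting the two saving policies.

memory: list[str] = []

def reply_and_maybe_save(answer: str, policy: str = "auto") -> str:
    if policy == "auto":
        # Scenario 1: save silently, no user consent required.
        memory.append(answer)
        return f"{answer} This information has been saved to long-term memory."
    # Scenario 2: ask for approval before writing to memory.
    choice = input("Would you like to save this to long-term memory? [y/n] ")
    if choice.strip().lower().startswith("y"):
        memory.append(answer)
        return f"{answer} Done! The information has been saved."
    return answer

fact = ("Albert Einstein was a theoretical physicist "
        "who developed the theory of relativity.")
print(reply_and_maybe_save(fact, policy="auto"))
```

With `policy="ask"`, the write is gated on explicit consent, trading convenience for an audit point where pollution can be caught before it reaches long-term memory.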

Timeline

2024-05-22 - v1.0

2024-05-24 - v1.1