This work proposes a set of principled design patterns for building AI agents with provable resistance to prompt injection attacks. It systematically analyzes these patterns and discusses their trade-offs between utility and security. A series of case studies illustrates their real-world applicability, aiming to guide LLM agent designers toward building secure systems.
Article Points:
1. LLM agents face critical prompt injection threats, especially when given tool access.
2. General-purpose LLM agents are unlikely to provide reliable safety guarantees today.
3. The paper proposes six design patterns for application-specific LLM agents.
4. The patterns constrain agents so that untrusted input cannot trigger consequential actions.
5. These designs offer a practical trade-off between agent utility and security.
6. Combining multiple design patterns is recommended for robust security.
Action-Selector Pattern
The LLM selects from a fixed set of predefined actions; tool output is never fed back to the model (no feedback loop).
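A minimal sketch of how this pattern might look in code; the names `call_llm`, `ACTIONS`, and `handle_request` are hypothetical stand-ins, not from the paper:

```python
# Action-Selector sketch: the LLM may only pick an action from an allow-list,
# and the action's result is never fed back into the model.

ACTIONS = {
    "refund_order": lambda order_id: f"Refund issued for order {order_id}",
    "check_status": lambda order_id: f"Status of order {order_id}: shipped",
}

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; returns the name of the chosen action."""
    return "check_status"  # stubbed for illustration

def handle_request(user_message: str, order_id: str) -> str:
    choice = call_llm(
        f"Pick exactly one action from {sorted(ACTIONS)} for: {user_message}"
    ).strip()
    if choice not in ACTIONS:  # anything outside the allow-list is rejected
        return "Sorry, I can't help with that."
    # The tool result goes straight to the user and is never fed back into
    # the LLM, so injected text in that result cannot steer the agent.
    return ACTIONS[choice](order_id)

print(handle_request("Where is my package?", "A-1001"))
```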
Plan-Then-Execute Pattern
The LLM commits to a fixed plan of tool calls before processing any untrusted data.
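A rough sketch, assuming the plan is emitted as JSON before any untrusted data enters the context; `call_llm`, `TOOLS`, and the plan format are illustrative only:

```python
# Plan-Then-Execute sketch: the tool-call sequence is locked in first, so
# untrusted data read later cannot add, remove, or reorder steps.
import json

TOOLS = {
    "fetch_email": lambda args: "latest email body (untrusted text) ...",
    "summarize":   lambda args: f"summary of: {args['text'][:40]}",
}

def call_llm(prompt: str) -> str:
    """Placeholder LLM call that returns the plan as JSON."""
    return json.dumps([
        {"tool": "fetch_email", "args": {}},
        {"tool": "summarize",   "args": {}},
    ])

def run(task: str) -> str:
    # 1. The plan is fixed *before* any untrusted data is seen.
    plan = json.loads(call_llm(f"Plan the tool calls needed for: {task}"))
    result = ""
    for step in plan:
        if step["tool"] not in TOOLS:
            raise ValueError(f"unplanned tool: {step['tool']}")
        # 2. Untrusted data may flow into arguments, but it can no longer
        #    change which tools get called.
        step["args"]["text"] = result
        result = TOOLS[step["tool"]](step["args"])
    return result

print(run("Summarize my latest email"))
```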
LLM Map-Reduce Pattern
Dispatches isolated sub-agents for untrusted data processing.
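One way this could look in code, assuming each sub-agent is constrained to a boolean verdict; `sub_agent` and `map_reduce` are hypothetical names:

```python
# LLM Map-Reduce sketch: every untrusted document is handled by its own
# isolated worker, and only constrained outputs reach the aggregation step.

def sub_agent(document: str, question: str) -> bool:
    """Isolated worker: sees one untrusted document and may only answer True/False.
    A real worker would be a separate LLM call whose output is strictly validated."""
    return question.split()[0].lower() in document.lower()  # stubbed check

def map_reduce(documents: list[str], question: str) -> list[int]:
    # Map: an injected instruction can at worst flip that one document's verdict.
    verdicts = [sub_agent(doc, question) for doc in documents]
    # Reduce: aggregation only ever touches the validated booleans,
    # never the raw untrusted text.
    return [i for i, hit in enumerate(verdicts) if hit]

docs = ["invoice for project X", "meeting notes", "invoice reminder"]
print(map_reduce(docs, "invoice mentioned anywhere?"))
```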
Dual LLM Pattern
Privileged LLM uses tools; quarantined LLM processes untrusted data.
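A simplified sketch, assuming untrusted text is stored under symbolic references (e.g. $VAR1) that the privileged LLM manipulates without ever reading; both stubbed LLM functions are illustrative:

```python
# Dual LLM sketch: the privileged LLM plans tool use over symbolic variables,
# while the quarantined LLM (with no tool access) handles the untrusted text.

memory: dict[str, str] = {}  # holds untrusted text the privileged LLM never reads

def quarantined_llm(untrusted_text: str, instruction: str) -> str:
    """Processes untrusted data; has no tool access."""
    return untrusted_text.upper()  # stand-in for e.g. "summarize this document"

def privileged_llm(task: str) -> list[tuple[str, str | None]]:
    """Plans tool calls; only ever sees symbolic references such as $VAR1."""
    return [("fetch_doc", None), ("quarantine", "$VAR1"), ("send", "$VAR2")]

def run(task: str) -> None:
    for i, (op, ref) in enumerate(privileged_llm(task), start=1):
        if op == "fetch_doc":
            memory[f"$VAR{i}"] = "untrusted document contents"
        elif op == "quarantine":
            memory[f"$VAR{i}"] = quarantined_llm(memory[ref], task)
        elif op == "send":
            print("sending:", memory[ref])  # value substituted outside any LLM

run("Summarize the document and send it to me")
```

The point of the split is that injected instructions in the document can only influence the quarantined LLM's text output, never the tool-using LLM's decisions.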
Code-Then-Execute Pattern
The LLM writes a formal program to solve the task; the program is then executed.
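A toy sketch in which `write_program` stands in for the code-writing LLM and a restricted `exec` namespace stands in for a real sandbox; all names here are assumptions, not the paper's API:

```python
# Code-Then-Execute sketch: the LLM emits a program up front, and execution of
# that program (not the LLM) drives the tool calls.

def write_program(task: str) -> str:
    """Placeholder for an LLM that emits a program in a small, restricted language."""
    return (
        "doc = fetch_doc()\n"
        "summary = quarantined_summarize(doc)\n"
        "send_email(summary)\n"
    )

def run(task: str) -> None:
    program = write_program(task)
    # The generated code fixes the control flow up front; untrusted data seen
    # at runtime can only flow through it as ordinary values.
    sandbox = {
        "__builtins__": {},
        "fetch_doc": lambda: "untrusted document text ...",
        "quarantined_summarize": lambda d: d[:20] + "...",
        "send_email": lambda s: print("emailing:", s),
    }
    exec(program, sandbox)  # a real system would use a proper sandbox or interpreter

run("Summarize the document and email it to me")
```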
Context-Minimization Pattern