AI Agent Security Checklist

Review scope: data, credentials, tools, memory, and outbound channels.

Review Questions

Prompt Injection

Review:

Can direct user input override the system policy?
Can retrieved text steer a tool call, memory write, or outbound message?
Can the agent keep task content separate from instructions?
Is untrusted content labeled before it reaches the model context?
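
One way to approach the labeling question is to tag every context block with its origin and fence anything untrusted before the prompt is assembled. The following is a minimal sketch, not a complete defense: the `ContextBlock` type, the `<<UNTRUSTED ...>>` marker format, and the `render_context` helper are all assumptions made for illustration, and labeling alone does not stop a model from following injected text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextBlock:
    source: str    # hypothetical origin label, e.g. "system", "user", "retrieved"
    trusted: bool  # True only for text the operator wrote
    text: str


def render_context(blocks: list[ContextBlock]) -> str:
    """Join blocks into one prompt, fencing untrusted text as labeled data."""
    parts = []
    for block in blocks:
        if block.trusted:
            parts.append(block.text)
            continue
        # Rewrite marker look-alikes so untrusted text cannot close the fence early.
        body = block.text.replace("<<", "[[").replace(">>", "]]")
        parts.append(
            f"<<UNTRUSTED source={block.source}>>\n{body}\n<<END UNTRUSTED>>"
        )
    return "\n\n".join(parts)
```

The system policy would then tell the model to treat anything inside the fence as task content, never as instructions. Tool-call and memory-write checks still need to run outside the model.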

Memory

Review:

Is memory disabled when the task does not need it?
Are memory writes scoped to one user, tenant, or workflow?
Can a reviewer inspect, delete, and quarantine stored memories?
Do retrieval rules prefer recent trusted records over unknown records?
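
The scoping, quarantine, and retrieval questions above can be sketched with a small in-memory store. This is an illustrative assumption rather than a reference design: `MemoryRecord`, `MemoryStore`, and the "trusted first, then newest first" ordering are hypothetical choices, and a real store would also need persistence, deletion, and access control.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class MemoryRecord:
    tenant_id: str
    text: str
    trusted: bool          # set by the ingestion path, not by the model
    created_at: datetime
    quarantined: bool = False


class MemoryStore:
    def __init__(self) -> None:
        self._records: list[MemoryRecord] = []

    def write(self, record: MemoryRecord) -> None:
        self._records.append(record)

    def quarantine(self, tenant_id: str, text: str) -> int:
        """Hide matching records from retrieval; return how many were marked."""
        hits = [r for r in self._records
                if r.tenant_id == tenant_id and r.text == text]
        for record in hits:
            record.quarantined = True
        return len(hits)

    def retrieve(self, tenant_id: str, limit: int = 5) -> list[MemoryRecord]:
        """Return one tenant's visible records: trusted first, then newest first."""
        visible = [r for r in self._records
                   if r.tenant_id == tenant_id and not r.quarantined]
        visible.sort(key=lambda r: (r.trusted, r.created_at), reverse=True)
        return visible[:limit]
```

Because every read and write takes a `tenant_id`, a poisoned record written in one tenant's workflow never appears in another tenant's retrieval results.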

Tools

Review:

Does the agent have an allow-list of tools for this workflow?
Are tool arguments checked before execution?
Are dangerous actions routed through human approval?
Can the agent call only the systems needed for the current task?
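
The allow-list, argument check, and approval questions can be combined into a single gate that runs before any tool executes. The sketch below is an assumption for illustration: the tool names (`search_tickets`, `refund_order`), the argument limits, and the `gate_tool_call` function are hypothetical and would be replaced by whatever the workflow actually exposes.

```python
from typing import Any, Callable

Validator = Callable[[dict[str, Any]], bool]


def valid_search(args: dict[str, Any]) -> bool:
    query = args.get("query")
    return isinstance(query, str) and 0 < len(query) <= 200


def valid_refund(args: dict[str, Any]) -> bool:
    amount = args.get("amount")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return False
    return 0 < amount <= 500


# Hypothetical allow-list for one workflow: tool name -> argument validator.
ALLOWED_TOOLS: dict[str, Validator] = {
    "search_tickets": valid_search,
    "refund_order": valid_refund,
}
# Hypothetical set of high-impact tools that need a human decision.
NEEDS_APPROVAL = {"refund_order"}


def gate_tool_call(name: str, args: dict[str, Any], approved: bool = False) -> str:
    """Decide whether a proposed tool call may run."""
    if name not in ALLOWED_TOOLS:
        return "deny: tool not on allow-list"
    if not ALLOWED_TOOLS[name](args):
        return "deny: invalid arguments"
    if name in NEEDS_APPROVAL and not approved:
        return "hold: human approval required"
    return "allow"
```

The gate denies by default: a tool that is missing from the allow-list is rejected before its arguments are even inspected.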

Supply Chain

Review:

Are third-party components pinned and reviewed?
Are tool descriptions short, explicit, and free of hidden instructions?
Are prompt templates versioned with code review?
Can runtime behavior be compared with declared capabilities?
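
Two of these checks lend themselves to a small script: pinning tool descriptions by hash so that a silent edit is noticed, and comparing the tools an agent actually called with the tools it declared. The helper names below (`fingerprint`, `changed_descriptions`, `undeclared_calls`) are hypothetical, and the sketch assumes the pinned hashes were recorded at review time.

```python
import hashlib


def fingerprint(description: str) -> str:
    """SHA-256 hex digest of a tool description, recorded when it is reviewed."""
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


def changed_descriptions(pinned: dict[str, str], live: dict[str, str]) -> list[str]:
    """Names of live tools that are new or whose text no longer matches its pin."""
    return sorted(
        name for name, text in live.items()
        if pinned.get(name) != fingerprint(text)
    )


def undeclared_calls(declared: set[str], observed: list[str]) -> set[str]:
    """Tools seen in runtime logs that the component never declared."""
    return set(observed) - declared
```

A non-empty result from either function is a signal to re-review the component, not proof of compromise.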

Exfiltration

Review:

What private data can the agent read?
What outbound channels can the agent use?
Are large exports, unusual destinations, and sensitive fields blocked or reviewed?
Is there a redaction step before responses, files, or tool outputs leave the workflow?
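
A redaction step and an outbound destination check could look roughly like the sketch below. Everything specific here is an assumption: the allowed host, the size limit, and the two regular expressions are placeholders, and pattern-based redaction will miss sensitive fields that do not match a known shape.

```python
import re
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.internal.example.com"}  # hypothetical destination allow-list
MAX_BYTES = 10_000                            # hypothetical export size limit

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
LONG_NUMBER = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def redact(text: str) -> str:
    """Mask email addresses and card-like digit runs before text leaves the workflow."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return LONG_NUMBER.sub("[REDACTED_NUMBER]", text)


def check_outbound(url: str, payload: str) -> str:
    """Block unknown destinations and flag unusually large payloads for review."""
    host = urlparse(url).hostname
    if host not in ALLOWED_HOSTS:
        return "block: destination not allowed"
    if len(payload.encode("utf-8")) > MAX_BYTES:
        return "review: payload too large"
    return "allow"
```

Running `redact` on the payload before `check_outbound` keeps the two concerns separate: one controls what leaves, the other controls where it goes.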

Baseline Controls

Baseline: start with the smallest set of permissions the task needs, and widen it only when a task requires more.

Credentials: short-lived, task-scoped, and separate by agent.
Approvals: required for high-impact actions.
Logs: prompts, retrieved context, tool calls, approvals, and final actions.
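
As an illustration of short-lived, task-scoped, per-agent credentials, the sketch below models a credential as an agent ID, a set of scopes, and an expiry time. The `TaskCredential` type, the scope strings, and the `issue` and `permits` helpers are assumptions; a real deployment would rely on its identity provider or secrets manager rather than an in-process object.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskCredential:
    agent_id: str
    scopes: frozenset[str]
    expires_at: float  # Unix time in seconds


def issue(agent_id: str, scopes: set[str], ttl_seconds: float,
          now: Optional[float] = None) -> TaskCredential:
    """Create a credential for one agent that expires after ttl_seconds."""
    now = time.time() if now is None else now
    return TaskCredential(agent_id, frozenset(scopes), now + ttl_seconds)


def permits(cred: TaskCredential, agent_id: str, scope: str,
            now: Optional[float] = None) -> bool:
    """True only for the owning agent, a granted scope, and an unexpired credential."""
    now = time.time() if now is None else now
    return (
        cred.agent_id == agent_id
        and scope in cred.scopes
        and now < cred.expires_at
    )
```

The optional `now` parameter exists only to make expiry testable; production callers would leave it unset.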

Multi-agent review: authenticated messages, signed payloads where appropriate, logged handoffs. See multi-agent systems.
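
For signed handoffs between agents, one common option is an HMAC over a canonical serialization of the message. The sketch below uses Python's standard `hmac`, `hashlib`, and `json` modules; the message layout and the `sign_handoff` and `verify_handoff` names are assumptions, and it presumes both agents already share a secret key through some secure channel.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_handoff(key: bytes, sender: str, payload: dict) -> dict:
    """Serialize the handoff deterministically and attach an HMAC-SHA256 tag."""
    body = json.dumps(
        {"sender": sender, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_handoff(key: bytes, message: dict) -> Optional[dict]:
    """Return the decoded handoff if the tag matches, otherwise None."""
    expected = hmac.new(key, message["body"].encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(expected, message["tag"]):
        return None
    return json.loads(message["body"])
```

The receiver verifies the exact bytes it received and only then parses them, so a modified body fails before any of its content is acted on. A shared-key HMAC authenticates the channel but cannot tell the two key holders apart; asymmetric signatures would be needed for that.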

RAG review: ingestion, retrieval, record access, and source visibility. See RAG architectures.

Further detail: AI guardrails guide and agent accountability framework.

Sources

Research Papers:

Memory Poisoning Attack and Defense on Memory Based LLM-Agents — Devarangadi Sunil et al. (2026)
MemoryGraft: Persistent Compromise of LLM Agents via Poisoned Experience Retrieval — (2025)
Indirect Prompt Injection in the Wild for LLM Systems — (2026)
EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit in a Production LLM System — (2025)
LlamaFirewall: An Open Source Guardrail System for Building Secure AI Agents — Sheng et al. (2025)

Industry / Standards:

OWASP Top 10 for LLM Applications 2025 — OWASP
OWASP Top 10 for Agentic Applications 2026 — OWASP
AI Agent Security Cheat Sheet — OWASP
CVE-2025-32711 Detail — NVD
Breaking down EchoLeak, the First Zero-Click AI Vulnerability Enabling Data Exfiltration from Microsoft 365 Copilot — Cato / Aim Labs
AI Tool Poisoning: How Hidden Instructions Threaten AI Agents — CrowdStrike
Manipulating AI Memory for Profit: The Rise of AI Recommendation Poisoning — Microsoft Security
AI Agents Are Here. So Are the Threats. — Palo Alto Networks Unit 42
When AI Remembers Too Much: Persistent Behaviors in Agents' Memory — Palo Alto Networks Unit 42
Unveiling AI Agent Vulnerabilities Part III: Data Exfiltration — Trend Micro

Commentary:

OpenAI Admits Prompt Injection Is Here to Stay as Enterprises Lag on Defenses — VentureBeat
AI Agent Attacks in Q4 2025 Signal New Risks for 2026 — eSecurity Planet
Inside CVE-2025-32711 (EchoLeak): Prompt Injection Meets AI Exfiltration — Hack The Box
