LISTEN TO THIS ARTICLE

AI Agent Security Checklist

Review scope: data, credentials, tools, memory, and outbound channels.

Review Questions

Prompt Injection

Review:

  • Can direct user input override the system policy?
  • Can retrieved text steer a tool call, memory write, or outbound message?
  • Can the agent keep task content separate from instructions?
  • Is untrusted content labeled before it reaches the model context?

Memory

Review:

  • Is memory disabled when the task does not need it?
  • Are memory writes scoped to one user, tenant, or workflow?
  • Can a reviewer inspect, delete, and quarantine stored memories?
  • Do retrieval rules prefer recent trusted records over unknown records?

Tools

Review:

  • Does the agent have an allow-list of tools for this workflow?
  • Are tool arguments checked before execution?
  • Are dangerous actions routed through human approval?
  • Can the agent call only the systems needed for the current task?

Supply Chain

Review:

  • Are third-party components pinned and reviewed?
  • Are tool descriptions short, explicit, and free of hidden instructions?
  • Are prompt templates versioned with code review?
  • Can runtime behavior be compared with declared capabilities?

Exfiltration

Review:

  • What private data can the agent read?
  • What outbound channels can the agent use?
  • Are large exports, unusual destinations, and sensitive fields blocked or reviewed?
  • Is there a redaction step before responses, files, or tool outputs leave the workflow?

Baseline Controls

Baseline: small permissions first.

Credentials: short-lived, task-scoped, and separate by agent. Approvals: required for high-impact actions. Logs: prompts, retrieved context, tool calls, approvals, and final actions.

Multi-agent review: authenticated messages, signed payloads where appropriate, logged handoffs. See multi-agent systems.

RAG review: ingestion, retrieval, record access, and source visibility. See RAG architectures.

Further detail: AI guardrails guide and agent accountability framework.

Related: The Agent Project That Should Have Been One

Sources

Research Papers:

Industry / Standards:

Commentary:

Related Swarm Signal Coverage: