Agent Design

How you actually build AI agents that work. Architectures, tool use, memory patterns, and the frameworks worth paying attention to.

Deep Dives and Frameworks

Implementation playbooks, operator patterns, and durable analysis.

Signals, Maps, and Watch Lists

Production-oriented analysis, benchmarks, and market/system intelligence.

External tools

Execution tooling is separate

Swarm Signal keeps the analysis layer. Use BoredTools for reusable production templates and trackers.


Signal Decision Matrix Evidence-first framing

Build vs Buy AI Agents: The Decision That Determines Whether Your Deployment Survives

Gartner predicts that 40% of enterprise...

Signal Field Guides Evidence-first framing

Reward Hacking: When AI Agents Game Their Own Objectives

In June 2025, [METR tasked OpenAI's o3 model](https://metr.org/blog/2025-06-05-recent-reward-hacking/) with speeding up a program's execution. Instead of...

Briefing Briefings Evidence-first framing

Seven Protocols, 1% Adoption: The Agent Economy's Infrastructure-Reality Gap

Visa, Mastercard, PayPal, Stripe, Coinbase, Google, and Shopify all shipped agent payment protocols in the last sixteen months. Seven competing standards...

Signal Signals Evidence-first framing

Your Agent Doesn't Need Human Memory. It Needs Something Weirder.

The AI industry keeps describing agent memory like it's a brain. "Short-term memory," "long-term memory," "episodic recall." The metaphors are intuitive....

Signal Failure Briefs Evidence-first framing

AI Agents in Legal: What Works, What Fails, and What the Sanctions Data Actually Shows

In June 2023, attorneys Steven Schwartz and Peter LoDuca submitted a brief in a federal case citing six cases that did not exist. ChatGPT had invented them. When the opposing party asked for copies, the attorneys submitted fabricated pages. A judge sanctioned them $5,000 and required them to pers...

Signal Decision Matrix Evidence-first framing

When NOT to Use an Agent: The Production Data That Should Change Your Default

Gartner predicts over 40% of agentic AI projects will be canceled by end of 2027, not because AI doesn't work, but because escalating costs, unclear business value, and inadequate risk controls compound faster in agent architectures than in simpler ones. The vendor that profits most from selling...

Briefing Briefings Evidence-first framing

Anthropic's 186-Deal Experiment Shows What the Agent Economy Actually Looks Like

In December 2025, Anthropic gave 69 employees $100 each and told them to let Claude agents trade on their behalf. The agents bought and sold real items (services, digital goods, subscriptions) listed by other employees in a controlled marketplace. The experiment ran for several weeks. When it end...

Signal Field Guides Evidence-first framing

Small Language Model Agents: The 2026 Practical Guide to Sub-10B Deployments

In February 2025, using a small model as an autonomous agent felt like a compromise: you got cheaper inference but accepted meaningful capability loss on planning, tool selection, and multi-step reasoning. That trade-off calculus has flipped.

Signal Benchmark Watch Evidence-first framing

How to Build Agent Evals That Catch Real Failures

Standard LLM benchmarks miss the failures that actually hurt in production. Here's how to build an evaluation system for agents that catches cascading errors, trajectory drift, and policy violations before they reach users.

Signal Failure Briefs Evidence-first framing

Why AI Agent Deployments Fail — And What the Survivors Do Differently

Agent deployments fail for recurring reasons: weak problem framing, brittle long-horizon performance, poor observability, and missing human-in-the-loop controls.