Agent Design

How you actually build AI agents that work. Architectures, tool use, memory patterns, and the frameworks worth paying attention to.

Most Multi-Agent Systems Aren't Cooperating. They're Colliding.
signals

A new benchmark from Tsinghua and Microsoft tests 16 multi-agent frameworks on tasks requiring genuine coordination. The median system spends 74% of its inter-agent messages on redundant state synchronization, and adding a third agent makes most pipelines slower, not faster.

4 min read
Your Agent's System Prompt Is Fighting Itself
signals

A framework called Arbiter treats agent system prompts as auditable code. Applied to Claude Code, Codex CLI, and Gemini CLI, it found 152 interference patterns, including critical contradictions and a structural data-loss bug, for a total cost of $0.27.

3 min read
Agent Benchmarks Won't Sit Still
signals

Static agent benchmarks assume frozen environments. ProEvolve evolved one environment into 200 with 3,000 task sandboxes. Every frontier model failed in structurally different ways when familiar tools disappeared.

3 min read
Most AI Agents Don't Know When They're Wrong
signals

A 4B-parameter model just matched GPT-4o on tool-use tasks by learning to verify its own actions. The CoVe paper shows verification-first training beats the retry-and-pray approach plaguing production...

6 min read
From Clawdbot to OpenAI in 90 Days
signals

OpenClaw hit 100,000 GitHub stars in 48 hours and survived three name changes, a supply chain attack, and three critical CVEs. Then its creator, Peter Steinberger, joined OpenAI.

6 min read
Hierarchical Agents Don't Know Who They're Talking To
signals

Roughly 70% of Earth science datasets hosted in large repositories like PANGAEA go uncited after publication. The data exists. The agents can access it....

7 min read
When Your Agent Stops Using Tools
signals

Reinforcement learning was supposed to teach agents to use tools fluently. Instead, researchers are watching a consistent failure mode: models trained...

7 min read
The Protocol Wars Are Ending. Here's What Actually Happened.
signals

Anthropic's MCP and Google's A2A joined the Linux Foundation. IBM killed its own protocol to back A2A. 146 organizations signed on. The wars are ending.

5 min read
Multi-Agent Reasoning's Memory Problem
signals

Reasoning language models score in the top percentile on math olympiad benchmarks, yet a new study from Stanford found they fail to correctly recall their...

8 min read
Nobody Knows If Deployed AI Agents Are Safe
signals

The 2025 AI Agent Index just cataloged over 100 deployed agentic AI systems, and the finding that should alarm everyone isn't about capability. It's about...

7 min read
Swarm Signal