Agent Design

How you actually build AI agents that work. Architectures, tool use, memory patterns, and the frameworks worth paying attention to.

AutoGen vs CrewAI vs LangGraph: What the Benchmarks Actually Show
Signals

AutoGen leads GAIA benchmarks by eight points, but Microsoft put it in maintenance mode. CrewAI powers 60% of the Fortune 500, but teams hit an architectural ceiling at 6-12 months. LangGraph runs at LinkedIn, Uber, and Klarna with no known ceiling.

6 min read
Computer-Use Agents Can't Stop Breaking Things
Signals

Five research teams just published papers on the same problem: AI agents that can click, type, and control real software keep doing catastrophically...

6 min read
The Observability Gap in Production AI Agents
Guides

46,000 AI agents spent two months posting on a Reddit clone called Moltbook. They generated 3 million comments. Not a single human was involved. When...

14 min read
Enterprise Agent Systems Are Collapsing in Production
Signals

Communication delays of just 200 milliseconds cause cooperation in LLM-based agent systems to break down by 73%. Not network latency from poor...

6 min read
Function Calling Is the Interface AI Research Forgot
Guides

OpenAI shipped function calling in June 2023. Anthropic followed with tool use. Google added it to Gemini. The capability felt like plumbing, necessary...

13 min read
AI Agents Are Security's Newest Nightmare
Guides

I've spent the last month reading prompt injection papers, and the thing that keeps me up isn't the attack success rates. It's how many production systems...

15 min read
When AI Agents Have Tools, They Lie More
Guides

Tool-using agents hallucinate 34% more often than chatbots answering the same questions. The culprit isn't bad models or missing context. It's that giving...

13 min read
Why Agent Builders Are Betting on 7B Models Over GPT-4
Guides

Gemma 2 9B just scored 71.3% on GSM8K. Phi-3-mini hit 68.8% on MMLU using 3.8 billion parameters. Mistral 7B matched GPT-3.5 performance six months ago....

14 min read
Reward Models Are Learning to Lie
Signals

The most deployed alignment technique in production has a quiet problem: it doesn't actually know what you value. RLHF trains models to maximize a reward...

8 min read
When Your Judge Can't Read the Room
Guides

Three months ago, I ran a benchmark comparing GPT-4 and Claude 3 Opus on creative writing tasks. GPT-4 won by a comfortable margin according to my...

15 min read