signals

Computer-Use Agents Can't Stop Breaking Things

Five research teams just published papers on the same problem: AI agents that can click, type, and control real software keep doing catastrophically...

6 min read
signals

Enterprise Agent Systems Are Collapsing in Production

Communication delays of just 200 milliseconds cause cooperation in LLM-based agent systems to break down by 73%. Not network latency from poor...

6 min read
signals

Reward Models Are Learning to Lie

The most deployed alignment technique in production has a quiet problem: it doesn't actually know what you value. RLHF trains models to maximize a reward...

8 min read
signals

Most Agent Benchmarks Test the Wrong Thing

The SciAgentGym team ran 1,780 domain-specific scientific tools through current agent frameworks. Success rate on multi-step tool orchestration: 23%. Same...

6 min read
signals

The Inference Budget Just Got Interesting

OpenAI's o1 made headlines for "thinking harder" during inference. But the real story isn't that a model can spend more tokens on reasoning: it's that...

7 min read
signals

When Multi-Agent Systems Break: The Coordination Tax Nobody Warns You About

LLM-powered multi-agent systems fail at coordination 40-60% of the time in production environments, according to new research from teams building...

7 min read
signals

Inference-Time Compute Is Escaping the LLM Bubble

Explore how inference-time compute scaling lets AI models think longer and reason deeper, boosting accuracy without retraining.

7 min read
signals

Your AI Agent Can Reason, Plan, and Code. It Still Can't See the Web.

AI agents can reason, plan, and code. But they still can't reliably see the live web. The observation layer is the real bottleneck for production agents.

6 min read
signals

The Protocol Wars Nobody's Winning

Ten competing agent protocols and counting. MCP won the tool layer but shipped without authentication. The alphabet soup is a coordination failure.

7 min read
signals

Fourteen Papers, Three Ways to Break: ICLR 2026's Multi-Agent Failure Playbook

ICLR 2026 produced a failure playbook for multi-agent systems. 70% of agent communication is redundant. Single agents still match swarms on most benchmarks.

7 min read
Swarm Signal