signals
Key Guides
The Protocol Wars Are Ending. Here's What Actually Happened.
Anthropic's MCP and Google's A2A joined the Linux Foundation. IBM killed its own protocol to back A2A. 146 organizations signed on. The wars are ending.
LLM-Powered Swarms and the 300x Overhead Nobody Wants to Talk About
SwarmBench tested 13 LLMs on swarm coordination tasks. The results show catastrophic overhead and communication that doesn't actually help.
The Swarm That Fakes Consensus
Twenty-two researchers across four continents show how agent swarms fabricate consensus, infiltrate communities, and poison the training data of future AI models.
Attention Heads Are the New Inference Budget
Models that can technically process 128K tokens routinely fail on tasks requiring reasoning across 32K. That gap isn't a context window problem. It's an...
LLMs Can't Find What's Already In Their Heads
Knowledge graphs have a well-documented lookup problem. When you ask an LLM to traverse a KG and reason over multi-hop paths, it doesn't search the graph...
Multi-Agent Reasoning's Memory Problem
Reasoning language models score in the top percentile on math olympiad benchmarks, yet a new study from Stanford found they fail to correctly recall their...
Small Models Just Got Smarter About When to Think
Reasoning tokens aren't free. Every chain-of-thought step an LLM generates costs inference budget, and most of the time that thinking is wasted on tasks...
Nobody Knows If Deployed AI Agents Are Safe
The 2025 AI Agent Index just cataloged over 100 deployed agentic AI systems, and the finding that should alarm everyone isn't about capability. It's about...
Small Models Just Learned When to Ask for Help
SWE-bench has been the graveyard of small language models. While GPT-4 class systems resolve over 40% of real-world GitHub issues, models under 10 billion...
MoE's Dirty Secret Is Load Balancing
Every frontier lab now ships a sparse Mixture-of-Experts model. Google's Switch Transformer started the trend. DeepSeek-V3 proved it could scale...