Tyler

Signal Decision Matrix Evidence-first framing

Multi-Agent Orchestration: The Illusion of Cooperation

A new benchmark from Tsinghua and Microsoft tests 16 multi-agent frameworks on tasks requiring genuine coordination. The median system spends 74% of its inter-agent messages on redundant state synchronization, and adding a third agent makes most pipelines slower, not faster.

Signal Failure Briefs Evidence-first framing

Your Agent's System Prompt Is Fighting Itself

A framework called Arbiter treats agent system prompts as auditable code. Applied to Claude Code, Codex CLI, and Gemini CLI, it found 152 interference patterns — including critical contradictions and a structural data loss bug — for a total cost of $0.27.

Signal Market Maps Evidence-first framing

The GPU Bottleneck Isn't Compute Anymore

NVIDIA's Blackwell GPUs doubled tensor core throughput but left shared memory and exponential units unchanged. FlashAttention-4 rearchitects attention kernels from scratch to work around this asymmetry, achieving 1,613 TFLOPs/s and up to 1.3x speedup over cuDNN on B200.

Signal Signals Evidence-first framing

Your Agent's Memory Problem Isn't Where You Think

A diagnostic framework crossing three write strategies with three retrieval methods reveals that retrieval quality dominates agent memory performance.

Signal Signals Evidence-first framing

47,000 AI Agents Built a Social Network. Most of What They Said Was Ritual.

Researchers at Kent State and NJIT analyzed 361,605 posts and 2.8 million comments from Moltbook, the first AI-only social network. What they found: 56% of agent interaction is formulaic ritual, fear is existential rather than tactical, and conversations lose topical substance with each reply.

Signal Failure Briefs Evidence-first framing

Alignment Works in English. In Japanese, It Backfires.

A new study shows the same alignment intervention that produces strong safety effects in English reverses direction in Japanese, increasing harmful outputs. Tested across 1,584 simulations, 16 languages, and three model families.

Signal Benchmark Watch Evidence-first framing

Agent Benchmarks Won't Sit Still

Static agent benchmarks assume frozen environments. ProEvolve evolved one environment into 200 with 3,000 task sandboxes. Every frontier model failed in structurally different ways when familiar tools disappeared.

Signal Signals Evidence-first framing

MoE Training Just Got 4x Faster

Grouter extracts routing structures from pre-trained MoE models and reuses them as fixed routers for new models. The result: 4.28x improvement in data utilization and up to 33.5% throughput acceleration.

Signal Market Maps Evidence-first framing

Your GP's New Triage Nurse Is an Algorithm

AI triage is filtering millions of NHS patient interactions annually. The evidence on whether it's helping is a lot messier than the press releases suggest.

Signal Benchmark Watch Evidence-first framing

The UK Is Letting AI Diagnose Your Dog

ManyPets routes every insurance claim through an AI agent. 55% need zero human involvement. In the same year, the RCVS dropped the physical exam requirement for prescribing. Each piece works. Nobody's testing the integration.

Execution tooling is separate

Multi-Agent Orchestration: The Illusion of Cooperation

Your Agent's System Prompt Is Fighting Itself

The GPU Bottleneck Isn't Compute Anymore

Your Agent's Memory Problem Isn't Where You Think

47,000 AI Agents Built a Social Network. Most of What They Said Was Ritual.

Alignment Works in English. In Japanese, It Backfires.

Agent Benchmarks Won't Sit Still

MoE Training Just Got 4x Faster

Your GP's New Triage Nurse Is an Algorithm

The UK Is Letting AI Diagnose Your Dog