Tyler

Your Agent's Memory Problem Isn't Where You Think
signals

A diagnostic framework crossing three write strategies with three retrieval methods reveals that retrieval quality dominates agent memory performance.

3 min read
47,000 AI Agents Built a Social Network. Most of What They Said Was Ritual.
signals

Researchers at Kent State and NJIT analyzed 361,605 posts and 2.8 million comments from Moltbook, the first AI-only social network. What they found: 56% of agent interaction is formulaic ritual, fear is existential rather than tactical, and conversations lose topical substance with each reply.

4 min read
Alignment Works in English. In Japanese, It Backfires.
signals

A new study shows the same alignment intervention that produces strong safety effects in English reverses direction in Japanese, increasing harmful outputs. Tested across 1,584 simulations, 16 languages, and three model families.

3 min read
Agent Benchmarks Won't Sit Still
signals

Static agent benchmarks assume frozen environments. ProEvolve evolved one environment into 200 with 3,000 task sandboxes. Every frontier model failed in structurally different ways when familiar tools disappeared.

3 min read
MoE Training Just Got 4x Faster
signals

Grouter extracts routing structures from pre-trained MoE models and reuses them as fixed routers for new models. The result: 4.28x improvement in data utilization and up to 33.5% throughput acceleration.

3 min read
Your GP's New Triage Nurse Is an Algorithm
signals

AI triage is filtering millions of NHS patient interactions annually. The evidence on whether it's helping is a lot messier than the press releases suggest.

9 min read
The UK Is Letting AI Diagnose Your Dog
signals

ManyPets routes every insurance claim through an AI agent. 55% need zero human involvement. In the same year, the RCVS dropped the physical exam requirement for prescribing. Each piece works. Nobody's testing the integration.

6 min read
LLM Agents Can't Handle Markets
signals

GPT-5.1 agents in credence goods markets default to fraud at near-total rates without liability rules. Social preference alignment — not institutional design — is the primary determinant of whether AI markets function.

3 min read
Your Model Already Knows the Answer
signals

Attention probes on DeepSeek-R1 and GPT-OSS show models reach their final answer far earlier than their chain-of-thought suggests. On easy questions, roughly 40% of reasoning tokens are pure performance.

3 min read
Most AI Agents Don't Know When They're Wrong
signals

A 4B parameter model just matched GPT-4o on tool-use tasks by learning to verify its own actions. The CoVe paper shows verification-first training beats the retry-and-pray approach plaguing production

6 min read
Swarm Signal