Your AI Inherited Your Biases: When Agents Think Like Humans (And That's Not a Compliment)
New research shows AI agents don't just learn human capabilities; they systematically inherit human cognitive biases. The implications for deploying agents as objective decision-makers are uncomfortable.
Agents That Rewrite Themselves: The Self-Modifying Stack Is Here
Three independent papers demonstrate agents rewriting their own training code, generating their own knowledge structures, and refining their reasoning at test time. Self-improvement has moved from theory to working engineering.
The Benchmark Trap: When High Scores Hide Low Readiness
AI benchmarks measure performance in sanitized environments that bear little resemblance to conditions where these systems will actually operate.
Open Weights, Closed Minds: The Paradox of 'Open' AI
Models you can download but can't verify, use but can't fully trust, deploy but can't completely understand. The paradox of 'open' AI.
Tools That Think Back: When AI Agents Learn to Build Their Own Interfaces
The first generation of agents treated tools as static functions. The emerging generation reasons about tools, remembers usage patterns, and adapts to heterogeneous interfaces.
The Prompt Engineering Ceiling: Why Better Instructions Won't Save You
On frontier models, sophisticated prompting underperforms zero-shot queries. The techniques that made mid-tier models usable are now making frontier models worse.
When Models See and Speak: The Multimodal Agent Arrives
Multimodal agents are navigating websites, controlling robots, and generating 3D scenes. But perception is the bottleneck, and bridging it requires rethinking how models attend to the world.
Robots With Reasoning: When Language Models Meet the Physical World
A robot arm completes 84.9% of manipulation tasks without a single demonstration, not through months of reinforcement learning but through pure language model reasoning. The line between software agents and physical robots is blurring.
Interpretability as Infrastructure: Why Understanding AI Matters More Than Controlling It
Mechanistic interpretability has moved from describing what models do to engineering how they work. If you can identify the neurons responsible for a specific behavior, you don't need to control the entire system.
From Answer to Insight: Why Reasoning Tokens Are a Quiet Revolution in AI
OpenAI's o1 reached the 89th percentile on competitive programming, where GPT-4o had sat at the 11th. The difference wasn't better data or more parameters; it was reasoning tokens, invisible chains of thought that let models think before they answer.