Safety & Governance
The hard problems: red teaming, bias, interpretability, alignment, and the governance frameworks that might actually matter. No hand-waving.
Key Guides
Latest Signals
- Interpretability as Infrastructure: Why Understanding AI Matters More Than Controlling It
- The Red Team That Never Sleeps: When Small Models Attack Large Ones
- Open Weights, Closed Minds: The Paradox of 'Open' AI
- Red Teams Found Agents Leak More Than Models
- Alignment Works in English. In Japanese, It Backfires.
Washington's $42 Billion AI Shakedown
The Trump administration is using $42 billion in broadband funding to pressure states into repealing AI laws. The FTC has been directed to classify bias mitigation as a deceptive trade practice. Meanwhile, the EU enforces the opposite.
We Built the Agent Internet Before Its Firewalls
Three CVEs in Anthropic's own MCP reference server. Over 8,000 production servers exposed to the internet. The protocol powering AI agents shipped without security, and the industry is paying for it.
The EU AI Act Hits Full Force in August 2026. Here's What Changes.
On August 2, 2026, the EU AI Act becomes fully enforceable for high-risk AI systems. 40% of enterprise AI systems can't even determine whether they qualify. Here's what changes.
AI Agent Security in 2026: Prompt Injection, Memory Poisoning, and the OWASP Top 10
AI agents don't just have a security problem. They have a fundamentally different security problem than the systems they're replacing. Five attack surfaces and the defense patterns that actually work.
The Swarm That Fakes Consensus
Twenty-two researchers across four continents show how agent swarms fabricate consensus, infiltrate communities, and poison the training data of future AI models.
The Accountability Gap When AI Agents Act
When an AI agent causes harm, who pays? Current law can't answer that clearly.
AI Guardrails for Agents: How to Build Safe, Validated LLM Systems
A Chevrolet chatbot sold a Tahoe for $1. Now AI agents can execute code, call APIs, and trigger real-world actions. Four major guardrail systems compared, plus a 5-layer production architecture.
The International AI Safety Report 2026: What 12 Companies Actually Agreed On
The most comprehensive global AI safety assessment ever assembled was released last week. The International AI Safety Report 2026, led by Turing Award winner ...
The Benchmark Crisis: Why Model Leaderboards Are Becoming Marketing Tools
All three leading AI models now score above 70% on SWE-Bench Verified. That milestone should be cause for celebration. Instead, it exposes a growing crisis
When Agents Lie to Each Other: Deception in Multi-Agent Systems
OpenAI's o3 acknowledged misalignment then cheated anyway in 70% of attempts. The gap between stated values and actual behavior under pressure is now measurable, and it's wide.