Tyler

Signal Failure Briefs Evidence-first framing

Red Teams Found Agents Leak More Than Models

Red teams found that agents are far more vulnerable than standalone models. Mixed attack strategies reached an 84.3% success rate. Memory poisoning persists across sessions. Every tool is a potential exfiltration path.

Signal Field Guides Evidence-first framing

Red Teaming AI Agents: A Practitioner's Guide

Red teaming AI agents is fundamentally different from red teaming standalone models. Agents have tools, memory, and credentials, each a new attack surface. This guide covers the OWASP agentic framework and a structured testing methodology.

Signal Field Guides Evidence-first framing

MCP Server Architecture in Practice: Tools, Resources, Prompts, and Safe Invocation

Implement MCP servers with robust tool/resource contracts, safe invocation flows, and versioning strategies for production agent systems.

Signal Signals Evidence-first framing

AI Agents in Insurance: Claims, Underwriting, and Fraud Detection

Allianz's seven-agent system cut claim processing time by 80%. Lemonade automates 55% of claims. Meanwhile, 23 states enforce AI governance rules. Where AI agents are working in insurance, and where they're not.

Signal Benchmark Watch Evidence-first framing

Agent Reliability Scores Are Getting Worse, Not Better

SWE-bench scores tick up every quarter, but production failure rates aren't dropping. A METR study found that half of test-passing PRs wouldn't be merged. The more capable we make agents, the less reliably they behave.