Tyler

RAG Pipelines Are Silently Dropping Context
signals

Your RAG pipeline retrieves the right documents. The LLM ignores half of them. The RAG-E framework found generators skip the top-ranked passage in 47-67% of cases. The retrieval-utilization gap is the real bottleneck.

4 min read
Multi-Agent Systems for Supply Chain Optimization
Guides

Walmart fulfills 76% of orders from local regions with agent-driven logistics. Maersk saved $300 million. But only 23% of supply chain organizations have a formal AI strategy. Where multi-agent systems are delivering results.

12 min read
Red Teams Found Agents Leak More Than Models
signals

Red teams found agents are far more vulnerable than standalone models. Mixed attack strategies hit 84.3% success rates. Memory poisoning persists across sessions. Every tool is a potential exfiltration path.

3 min read
Red Teaming AI Agents: A Practitioner's Guide
Guides

Red teaming AI agents is fundamentally different from red teaming standalone models. Agents have tools, memory, and credentials — each a new attack surface. This guide covers the OWASP agentic framework and a structured testing methodology.

12 min read
MCP Server Architecture in Practice: Tools, Resources, Prompts, and Safe Invocation
mcp

Implement MCP servers with robust tool/resource contracts, safe invocation flows, and versioning strategies for production agent systems.

1 min read
AI Agents in Insurance: Claims, Underwriting, and Fraud Detection
Guides

Allianz's seven-agent system cut claim processing time by 80%. Lemonade automates 55% of claims. Meanwhile, 23 states enforce AI governance rules. Where AI agents are working in insurance, and where they're not.

13 min read
Agent Reliability Scores Are Getting Worse, Not Better
signals

SWE-Bench scores tick up every quarter, but production failure rates aren't dropping. A METR study found half of test-passing PRs wouldn't be merged. The more capable we make agents, the less reliably they behave.

3 min read
Best Open-Weight Models for Production AI Agents 2026
Guides

Your agent framework doesn't matter if the model underneath it can't call tools reliably. We tested and ranked eight open-weight models specifically for agent use cases: tool calling accuracy, multi-step reasoning, context retention, hosting economics, and licensing terms.

11 min read
Single Agent vs Multi-Agent: When Swarms Actually Help
Guides

Compare single-agent and multi-agent architectures on complexity, cost, debugging, and when orchestration helps.

7 min read
Swarm Signal