Topic

Agent engineering

For the builders: agent architectures and their failure modes, memory design, framework-specific testing guides, and the engineering workflows that make agents shippable.

May 21, 2026 · 6 min read

Eval-Driven Development for AI Agents

Eval-driven development means writing evals before you build, iterating against them, and gating releases on results. How the loop works — and its limits.

AI evals · Agent engineering

May 18, 2026 · 6 min read

Agent Memory: Architectures and How to Test Them

A practical guide to agent memory — context windows, summarization, long-term stores — the failure modes of each, and how to test memory before users do.

Agent engineering · Reliability

May 14, 2026 · 6 min read

How to Test a LangGraph Agent

A layered method for testing a LangGraph agent: unit-test nodes, verify routing and state, then run simulated users against the compiled graph.

Agent engineering · Agent testing

May 11, 2026 · 7 min read

AI Agent Architecture Patterns and Their Failure Modes

Five common AI agent architecture patterns, including single-loop, plan-then-execute, router, and multi-agent, and the distinct ways each one breaks.

Agent engineering · Reliability

May 4, 2026 · 7 min read

You Built an AI Agent. Now Prove It Works.

Building an AI agent got easy. Deploying AI agents to production is where it breaks. What proving your agent works actually requires, step by step.

Agent engineering · Agent testing