Recipes

Named cross-category compositions: real systems compose many patterns at once. Each recipe lists the patterns that together produce a recognisable kind of system. Each member carries a role: 'core' patterns define the recipe, and removing one breaks the shape; 'hardening' patterns add safety, observability, or correctness; 'optional' patterns are common variations or upgrades.
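The recipe/member/role structure described above can be sketched as a small data model. This is an illustrative sketch only: the names Role, Member, Recipe, and parse_members are assumptions made for the example, not part of the catalogue. The member lines are copied from the Multi-Agent Debate recipe in this document.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """The three member roles named in the catalogue."""
    CORE = "core"
    HARDENING = "hardening"
    OPTIONAL = "optional"


@dataclass(frozen=True)
class Member:
    role: Role
    pattern: str  # pattern identifier, e.g. "debate"


@dataclass(frozen=True)
class Recipe:
    name: str
    members: tuple[Member, ...]

    def by_role(self, role: Role) -> list[str]:
        """Return pattern ids for one role, in listing order."""
        return [m.pattern for m in self.members if m.role is role]


def parse_members(lines: list[str]) -> tuple[Member, ...]:
    """Parse 'role pattern-id' lines (hypothetical helper)."""
    members = []
    for line in lines:
        role, pattern = line.split(maxsplit=1)
        members.append(Member(Role(role), pattern))
    return tuple(members)


debate = Recipe(
    name="Multi-Agent Debate",
    members=parse_members([
        "core debate",
        "core inner-critic",
        "core frozen-rubric-reflection",
        "hardening llm-as-judge",
        "hardening best-of-n",
        "optional camel-role-playing",
    ]),
)
```

Calling debate.by_role(Role.CORE) returns the patterns whose removal would break the shape; an unknown role string in a member line raises ValueError during parsing.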

Modern Coding Agent

An agent that reads, writes, and runs code in a sandbox, calling tools and (optionally) sub-agents while a human approves the destructive parts. The shape behind Cursor, Claude Code, OpenHands, Aider, Codex CLI.

core react
core tool-use
core code-as-action
core code-execution
core agent-computer-interface
hardening step-budget
hardening subagent-isolation
hardening sandbox-isolation
hardening approval-queue
hardening decision-log
optional mcp
optional agent-skills
optional todo-list-driven-agent

Production RAG

Retrieval-grounded generation built to be defensible: hybrid retrieval, reranking, contextualised chunks, citations rendered to the user, and verification before the answer ships.

core agentic-rag
core hybrid-search
core cross-encoder-reranking
core contextual-retrieval
core citation-streaming
hardening chain-of-verification
hardening eval-harness
hardening confidence-reporting
optional crag
optional hyde
optional raft

Voice Agent Stack

A low-latency conversational agent over a phone or microphone, with handoff to humans, mid-utterance cancellation, and per-call session boundaries. The shape behind LiveKit, Pipecat, Vapi, Retell.

Sovereign / Regulated Deployment

An agent stack that satisfies data-residency and audit requirements: weights, inference, tools, and logs all sit inside an operator-controlled boundary, with provenance and incident response wired in.

Long-Running Autonomous Agent

An agent that operates over hours to weeks, surviving restarts and accumulating memory while remaining safe. The shape behind Devin, Manus, Sparrot, durable LangGraph runs.

Multi-Agent Debate

Two or more agents argue toward a better answer than any single agent would produce, with a frozen rubric to score the result. The shape behind debate-style alignment work and 'committee of critics' setups.

core debate
core inner-critic
core frozen-rubric-reflection
hardening llm-as-judge
hardening best-of-n
optional camel-role-playing

Browser & Computer-Use Stack

An agent that drives a real GUI: planning a task, grounding actions in pixels or DOM, and asking permission before destructive clicks. The shape behind OpenAI Operator, Anthropic Computer Use, Browser Use, Stagehand, MultiOn.

core computer-use
core browser-agent
core tool-use
hardening dual-system-gui-agent
hardening approval-queue
hardening sandbox-isolation
hardening step-budget
optional session-isolation

Memory Architecture

How long-running agents structure what they remember: a tiered short-to-long-term cascade, compaction across the context window, paging, and reasoning carry-forward across tool calls.

Multi-Agent Coordination

Several agents collaborate under a coordinator, with explicit hand-offs and a shared protocol. The shape behind LangGraph supervisor, OpenAI Swarm, AutoGen group chat, Bedrock multi-agent orchestrators.

Safety Hardening

The minimum set of constraints to put around any production agent before it touches the world: budgets, gates, charters, kill-switches, approvals.
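Two of the constraints named above, a budget and an approval gate, might be combined roughly as follows. This is a minimal sketch under stated assumptions: StepBudget, BudgetExceeded, and guarded_call are hypothetical names, and the approve callback stands in for whatever human-approval mechanism a real system would use.

```python
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised when the agent has used up its allotted steps."""


class StepBudget:
    """Counts steps and refuses to go past a fixed maximum."""

    def __init__(self, max_steps: int) -> None:
        self.max_steps = max_steps
        self.used = 0

    def spend(self) -> None:
        if self.used >= self.max_steps:
            raise BudgetExceeded(f"step budget of {self.max_steps} exhausted")
        self.used += 1


def guarded_call(
    label: str,
    action: Callable[[], str],
    budget: StepBudget,
    approve: Callable[[str], bool],
    destructive: bool = False,
) -> str:
    """Run one action under a budget; destructive actions need approval."""
    budget.spend()  # every attempt costs a step, approved or not
    if destructive and not approve(label):
        return "denied"
    return action()
```

Charging the budget before the approval check is a design choice in this sketch: a denied request still consumes a step, so an agent that keeps asking for the same destructive action eventually stops.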

Eval & Observability

How you keep an agent honest in production: harness, judge, decision log, provenance, shadow rollouts.

core eval-harness
core eval-as-contract
core decision-log
hardening agent-as-judge
hardening llm-as-judge
hardening confidence-reporting
hardening provenance-ledger
hardening lineage-tracking
optional shadow-canary

Structured Output Stack

Get typed, schema-conformant data out of the model and verify it. The shape behind Outlines, Instructor, Pydantic AI, DSPy.
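The "get typed data out and verify it" step can be illustrated with the standard library alone. This sketch does not use or imitate the APIs of the libraries named above; the Invoice type, its fields, and parse_invoice are hypothetical, chosen only to show a parse-then-validate boundary.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """Hypothetical target schema for the example."""
    vendor: str
    total: float


def parse_invoice(raw: str) -> Invoice:
    """Parse model output as JSON and check it against the Invoice shape.

    Raises ValueError on malformed JSON or any schema mismatch
    (json.JSONDecodeError is itself a ValueError subclass).
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != {"vendor", "total"}:
        raise ValueError(f"unexpected keys: {sorted(data)}")
    if not isinstance(data["vendor"], str):
        raise ValueError("vendor must be a string")
    total = data["total"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        raise ValueError("total must be a number")
    return Invoice(vendor=data["vendor"], total=float(total))
```

A caller would typically catch the ValueError and either re-prompt the model with the error message or fail the request, rather than passing unvalidated data downstream.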

Streaming UX Stack

User-perceivable real-time output: tokens stream as they arrive, citations attach as they resolve, the user can stop at any time, and the agent can interrupt the user when something matters.

Planning Loops

Different ways to structure 'think then act': linear ReAct, plan-then-execute, parallel DAG planning, tree search with backtracking, and the outer/inner planner+executor split.

core react
core plan-and-execute
optional rewoo
optional llm-compiler
optional lats
optional outer-inner-agent-loop
optional todo-list-driven-agent
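The difference between the two core loops can be sketched with a toy tool set. Everything here is an assumption made for illustration: TOOLS, react_loop, plan_and_execute, and scripted_decide are hypothetical, and scripted_decide stands in for a model call.

```python
from typing import Callable, Optional

Action = tuple[str, tuple]  # (tool name, positional arguments)

# Hypothetical tool registry for the example.
TOOLS: dict[str, Callable[..., int]] = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def react_loop(
    decide: Callable[[list[int]], Optional[Action]],
    max_steps: int = 5,
) -> list[int]:
    """Linear ReAct: choose one action, observe the result, repeat."""
    observations: list[int] = []
    for _ in range(max_steps):  # max_steps acts as a simple step budget
        action = decide(observations)
        if action is None:  # the decider signals it is done
            break
        name, args = action
        observations.append(TOOLS[name](*args))
    return observations


def plan_and_execute(plan: list[Action]) -> list[int]:
    """Plan-then-execute: the full plan is fixed before any tool runs."""
    return [TOOLS[name](*args) for name, args in plan]


def scripted_decide(observations: list[int]) -> Optional[Action]:
    """Stand-in for a model: the second action depends on the first result."""
    if not observations:
        return ("add", (2, 3))
    if len(observations) == 1:
        return ("mul", (observations[-1], 4))
    return None
```

In react_loop the second action is built from the first observation, whereas a plan passed to plan_and_execute must already contain every argument; that trade between adaptivity and up-front structure is what the other planners in this recipe vary.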

Routing & Fallback

How requests get to the right model or specialist and how the system stays up when one upstream breaks. The shape behind LangChain fallbacks, model routers, provider cascades.
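A provider cascade of the kind described above might look like the following minimal sketch. It does not reproduce LangChain's or any router's API; call_with_fallback, AllProvidersFailed, and the two stub providers are hypothetical names for the example.

```python
from typing import Callable, Sequence

Provider = tuple[str, Callable[[str], str]]  # (name, call function)


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the cascade has errored."""


def call_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def flaky_primary(prompt: str) -> str:
    """Stub upstream that always times out."""
    raise TimeoutError("upstream timed out")


def stable_backup(prompt: str) -> str:
    """Stub upstream that always answers."""
    return f"echo: {prompt}"
```

Returning the provider name alongside the response lets the caller log which upstream actually served the request, which matters when the fallback is a weaker or more expensive model.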

Reflection & Self-Correction

Patterns where the model reviews its own work before shipping it: scoped rubric reflection, self-refine, deterministic post-checks, process rewards.