aka Retrieval-Augmented Generation, Top-K Retrieve-and-Stuff
Condition the generator on top-k chunks retrieved from an external dense index, so that knowledge lives outside the model's parameters.
The agent needs information that lives in a corpus too large to fit in the context window, and that may change faster than the model can be retrained.
Parametric LMs hallucinate, cannot cite sources, and cannot be updated without retraining; external knowledge must be injected at query time.
Chunk the corpus. Embed each chunk with a dense encoder. At query time, embed the query, retrieve top-k by similarity, prepend chunks to the prompt, generate. The simplest production RAG pipeline.
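The steps above can be sketched end to end. This is a minimal, self-contained toy: the bag-of-words `embed` stands in for a real dense encoder (e.g. a sentence-transformer), the chunker is a fixed-size word window, and the "generation" step stops at prompt assembly, which a real pipeline would pass to an LM. All function names here are illustrative, not from any particular library.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding" standing in for a real dense encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(corpus, chunk_size=20):
    # Chunk the corpus (fixed-size word windows) and embed each chunk.
    chunks = []
    for doc in corpus:
        words = doc.split()
        for i in range(0, len(words), chunk_size):
            chunks.append(" ".join(words[i:i + chunk_size]))
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, query, k=2):
    # Embed the query; take the top-k chunks by similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda cv: cosine(q, cv[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def rag_prompt(index, query, k=2):
    # Prepend retrieved chunks to the prompt; a real pipeline would
    # now hand this string to the generator LM.
    context = "\n".join(
        f"[{i + 1}] {c}" for i, c in enumerate(retrieve(index, query, k))
    )
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

Swapping `embed` for a real encoder and the list scan for an ANN index (e.g. FAISS) turns this sketch into the production shape described above; the control flow does not change.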
generalises → hyde
composes-with → cross-encoder-reranking
generalises → contextual-retrieval
alternative-to → graphrag
specialises → agentic-rag
conflicts-with → naive-rag-first — Naive RAG is fine; treating it as the only answer is the anti-pattern.
composes-with → chain-of-verification
generalises → vector-memory
complements → citation-streaming
generalises → raft
generalises → hybrid-search
alternative-to → hallucinated-citations
used-by → app-exploration-phase