AI Fundamentals

RAG — Retrieval-Augmented Generation

The Open-Book Exam


The Analogy

Imagine a student who isn't expected to memorise everything — they have access to the library during the exam.

Instead of relying only on memory, the student searches the shelves for each question, reads the most relevant passages, and writes a well-informed answer. RAG gives AI models the same ability: retrieve relevant documents first, then answer using what was just retrieved. The model doesn't need to have memorised the answer; it can look it up.

In Plain English

RAG is a technique where an AI first searches a knowledge base for relevant information, then uses that retrieved information to generate a more accurate, up-to-date answer — rather than relying purely on its training memory.


The Technical Picture

RAG combines a retrieval component (dense vector search using embeddings over a document corpus) with a generative LLM. Retrieved chunks are injected into the prompt context, enabling the model to ground its output in specific, verifiable source documents.
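The retrieve-then-generate flow above can be sketched in a few lines. This is an illustration only: it uses a toy bag-of-words "embedding" and cosine similarity so it runs self-contained, where a real system would use learned dense embeddings, a vector database, and an LLM for the generation step. All function names and the sample corpus are invented for this sketch.

```python
# Minimal RAG sketch: toy embeddings, cosine-similarity retrieval,
# and injection of retrieved chunks into the prompt context.
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a dense vector)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Rank document chunks by similarity to the query; return the top k."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

def build_prompt(query, corpus):
    """Inject the retrieved chunks into the prompt, grounding the answer."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

corpus = [
    "The refund policy allows returns within 30 days of purchase.",
    "Our office is open Monday to Friday, 9am to 5pm.",
]
print(build_prompt("What is the refund policy?", corpus))
```

In a production pipeline only the two pluggable parts change: `embed` becomes a call to an embedding model, and the assembled prompt is sent to an LLM instead of printed.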

Real-World Examples

  • Perplexity searches the web before generating every answer
  • Enterprise chatbots using RAG over internal company documents
  • Google's NotebookLM applies RAG to your own uploaded PDFs

Key Takeaway

RAG = search first, then answer. It gives AI access to current, external knowledge beyond its training data.