AI Fundamentals

Perplexity — RAG-First Search AI

The Librarian Who Searches Before Answering


The Analogy

Most AI models answer from memory. Perplexity checks the library first.

Ask a regular AI a current affairs question and it might confidently answer from outdated training data. Ask Perplexity and it first searches the web, retrieves the most relevant live sources, and then generates an answer grounded in those sources — with citations. It's less a chat AI and more a research tool that synthesises the web on demand.

In Plain English

Perplexity is an AI-powered search engine that searches the web in real time before generating any answer. It's essentially RAG (retrieval-augmented generation) at scale — retrieving live sources and grounding every response in them, with citations for verification.


The Technical Picture

Perplexity implements a search-augmented generation pipeline: a user query triggers web searches, the results are retrieved and re-ranked, the most relevant chunks are injected into the LLM context (using models such as Claude, GPT-4, and Perplexity's own), and the response is generated with source attribution. This architecture sidesteps the knowledge cutoff limitation of models that answer purely from training data.
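The pipeline above can be sketched in a few lines. This is a toy illustration, not Perplexity's actual implementation: the in-memory "web", the keyword-overlap ranking, and the function names (`search`, `build_prompt`) are all assumptions standing in for a real search index, re-ranker, and LLM call.

```python
# Toy search-augmented generation pipeline: retrieve -> re-rank -> inject
# into an LLM prompt with citations. A real system would call a live web
# index, a learned re-ranker, and an LLM API at each step.

WEB = [
    {"url": "https://example.com/rag",
     "text": "RAG grounds LLM answers in retrieved documents."},
    {"url": "https://example.com/cricket",
     "text": "The cricket match last week ended in a draw."},
    {"url": "https://example.com/llm",
     "text": "LLMs are trained on data up to a fixed cutoff date."},
]

def search(query, k=2):
    """Retrieve and re-rank pages by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(p["text"].lower().split())), p) for p in WEB]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]

def build_prompt(query, sources):
    """Inject retrieved chunks into the LLM context with numbered citations."""
    context = "\n".join(
        f"[{i + 1}] {s['text']} ({s['url']})" for i, s in enumerate(sources)
    )
    return (f"Answer using only these sources, citing [n]:\n"
            f"{context}\n\nQuestion: {query}")

query = "What happened in the cricket match last week?"
sources = search(query)          # the cricket page ranks first
prompt = build_prompt(query, sources)
print(prompt)
```

The key design point is that the LLM never answers from memory: its context is rebuilt from live retrieval on every query, and the numbered citations let the reader verify each claim against its source.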

Real-World Examples

  • Asking Perplexity about last week's cricket match — it searches, reads, then answers
  • Perplexity's Deep Research feature compiles multi-source reports in minutes
  • Widely used by researchers who need cited, current information

Key Takeaway

Perplexity = search engine + LLM. It reads the web, then answers — solving the knowledge cutoff problem.

Related Concepts