Embeddings & Vector Search
The Library Where Similar Books Live Together
8 min read
Imagine a library where books aren't sorted alphabetically, but by how similar their ideas are — so The Alchemist sits next to Siddhartha.
You walk in and ask for "books about finding your life's purpose." The librarian doesn't search by title; they walk to the section where all purpose-related books cluster, regardless of their language or author. Embeddings create this kind of map for text: turning words and sentences into coordinates in a giant space where similar ideas live close together.
In Plain English
Embeddings convert text into lists of numbers (vectors) that represent meaning. Similar meanings end up with similar numbers, so you can find related content by searching for nearby vectors. This is the foundation of semantic search and retrieval-augmented generation (RAG).
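Here is a minimal sketch of that idea in Python, assuming the open-source sentence-transformers library and its all-MiniLM-L6-v2 model (any embedding model would do; the sentences are made up for illustration). Related sentences get vectors that point in similar directions; unrelated ones don't.

```python
# Minimal sketch: text -> vectors -> similarity. Assumes the
# sentence-transformers package; "all-MiniLM-L6-v2" is one common model choice.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I find my life's purpose?",
    "Discovering what gives your life meaning",
    "Recipe for chocolate chip cookies",
]
# encode() returns one dense vector (here, 384 numbers) per sentence
vectors = model.encode(sentences)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors[0], vectors[1]))  # high: related meanings
print(cosine_similarity(vectors[0], vectors[2]))  # low: unrelated topics
```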
The Technical Picture
Embedding models (e.g., OpenAI's text-embedding-3-large, open-source sentence-transformers) map text to dense, high-dimensional vectors in a shared semantic space. Cosine similarity or dot product measures how close two vectors are. Vector databases (Pinecone, Weaviate, Supabase pgvector) make approximate nearest-neighbour search efficient at scale.
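The core lookup is easy to see without a database. The sketch below does exact (brute-force) nearest-neighbour search over a tiny corpus; the corpus, model, and query are illustrative assumptions, and a vector database performs the same search approximately over millions of vectors.

```python
# Toy exact nearest-neighbour search; vector databases like Pinecone or
# pgvector do the same lookup approximately, at much larger scale.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "The Alchemist follows a shepherd seeking his personal legend.",
    "Siddhartha traces one man's search for enlightenment.",
    "A field guide to North American birds.",
]
# Normalising the vectors makes the dot product equal cosine similarity
corpus_vecs = model.encode(corpus, normalize_embeddings=True)

query_vec = model.encode(["books about finding your life's purpose"],
                         normalize_embeddings=True)[0]

scores = corpus_vecs @ query_vec   # one similarity score per document
best = int(np.argmax(scores))      # index of the closest vector
print(corpus[best], scores[best])
```

Normalising the embeddings up front is a common trick: the dot product then equals cosine similarity, so the whole search collapses into a single matrix-vector multiply.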
Real-World Examples
- Semantic search finding relevant documents even without exact keyword matches
- Recommendation engines finding similar products
- RAG systems using embeddings to retrieve relevant document chunks (sketched below)
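To make the RAG example concrete, here is a hypothetical retrieval step. The chunks, the retrieve helper, and the prompt wording are all assumptions for illustration, not a fixed recipe; the assembled prompt would then be sent to an LLM.

```python
# Hypothetical RAG retrieval step: embed the question, fetch the top-k
# closest chunks, and paste them into the prompt for an LLM.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping takes 3-5 business days within the EU.",
    "Support is available by email around the clock.",
]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks whose embeddings sit closest to the question's."""
    q = model.encode([question], normalize_embeddings=True)[0]
    top = np.argsort(chunk_vecs @ q)[::-1][:k]  # highest similarity first
    return [chunks[i] for i in top]

question = "How long do I have to return an item?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```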
Embeddings turn meaning into maths — similar meanings become similar numbers, enabling semantic search.