Embeddings & Vector Search
The Library Where Similar Books Live Together
8 min read
Imagine a library where books aren't sorted alphabetically, but by how similar their ideas are — so The Alchemist sits next to Siddhartha.
You walk in and ask for "books about finding your life's purpose." The librarian doesn't search by title; they walk to the section where all purpose-related books cluster, regardless of their language or author. Embeddings create this kind of map for text: turning words and sentences into coordinates in a giant space where similar ideas live close together.
In Plain English
Embeddings convert text into lists of numbers (vectors) that represent meaning. Similar meanings end up with similar numbers, so you can find related content by searching for nearby vectors. This is the foundation of semantic search and retrieval-augmented generation (RAG).
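Here is a minimal sketch of that idea in Python, assuming the open-source sentence-transformers library and its all-MiniLM-L6-v2 model (any embedding model would do; the sentences are made up for illustration). Related sentences get vectors that point in similar directions; unrelated ones don't.

```python
# Minimal sketch: text -> vectors -> similarity. Assumes the
# sentence-transformers package; "all-MiniLM-L6-v2" is one common model choice.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "How do I find my life's purpose?",
    "Discovering what gives your life meaning",
    "Recipe for chocolate chip cookies",
]
# encode() returns one dense vector (here, 384 numbers) per sentence
vectors = model.encode(sentences)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors[0], vectors[1]))  # high: related meanings
print(cosine_similarity(vectors[0], vectors[2]))  # low: unrelated topics
```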
The Technical Picture
Embedding models (e.g., OpenAI's text-embedding-3-large, open-source sentence-transformers) map text to dense, high-dimensional vectors in a shared semantic space. Cosine similarity or dot product measures how close two vectors are. Vector databases (Pinecone, Weaviate, Supabase pgvector) make approximate nearest-neighbour search efficient at scale.
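The core lookup is easy to see without a database. The sketch below does exact (brute-force) nearest-neighbour search over a tiny corpus; the corpus, model, and query are illustrative assumptions, and a vector database performs the same search approximately over millions of vectors.

```python
# Toy exact nearest-neighbour search; vector databases like Pinecone or
# pgvector do the same lookup approximately, at much larger scale.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "The Alchemist follows a shepherd seeking his personal legend.",
    "Siddhartha traces one man's search for enlightenment.",
    "A field guide to North American birds.",
]
# Normalising the vectors makes the dot product equal cosine similarity
corpus_vecs = model.encode(corpus, normalize_embeddings=True)

query_vec = model.encode(["books about finding your life's purpose"],
                         normalize_embeddings=True)[0]

scores = corpus_vecs @ query_vec   # one similarity score per document
best = int(np.argmax(scores))      # index of the closest vector
print(corpus[best], scores[best])
```

Normalising the embeddings up front is a common trick: the dot product then equals cosine similarity, so the whole search collapses into a single matrix-vector multiply.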
Real-World Examples
- Semantic search finding relevant documents even without exact keyword matches
- Recommendation engines finding similar products
- RAG systems using embeddings to retrieve relevant document chunks (sketched below)
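To make the RAG example concrete, here is a hypothetical retrieval step. The chunks, the retrieve helper, and the prompt wording are all assumptions for illustration, not a fixed recipe; the assembled prompt would then be sent to an LLM.

```python
# Hypothetical RAG retrieval step: embed the question, fetch the top-k
# closest chunks, and paste them into the prompt for an LLM.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping takes 3-5 business days within the EU.",
    "Support is available by email around the clock.",
]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k chunks whose embeddings sit closest to the question's."""
    q = model.encode([question], normalize_embeddings=True)[0]
    top = np.argsort(chunk_vecs @ q)[::-1][:k]  # highest similarity first
    return [chunks[i] for i in top]

question = "How long do I have to return an item?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```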
Embeddings turn meaning into maths — similar meanings become similar numbers, enabling semantic search.