AI Fundamentals

Large Language Models (LLMs)

The World's Most Avid Reader

The Analogy

Imagine reading the entire internet — every book, blog, Wikipedia article, and forum post — and then being able to answer almost any question about it.

An LLM has done exactly that (almost). It read trillions of words and learned to predict what word comes next, millions of times over. Through this process, it picked up grammar, facts, reasoning, and even humour. It doesn't 'know' things the way you do — it predicts the most likely next word, over and over, until an answer appears.

In Plain English

An LLM is an AI trained on massive amounts of text to understand and generate human language. It works by predicting the next word based on all the previous words — do this billions of times and you get something that can write essays, answer questions, and code.
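That loop — predict a word, append it, predict again — can be sketched with a toy model. The lookup table below is invented purely for illustration (a real LLM conditions on *all* previous words via a neural network, not just the last one):

```python
# Toy "language model": a hand-made table of next-word probabilities.
# Everything in this table is made up for demonstration purposes.
NEXT_WORD = {
    "the": [("cat", 0.6), ("dog", 0.4)],
    "cat": [("sat", 0.7), ("ran", 0.3)],
    "sat": [("down", 1.0)],
}

def predict_next(word):
    """Return the most probable next word, or None if we have no entry."""
    candidates = NEXT_WORD.get(word)
    if not candidates:
        return None
    return max(candidates, key=lambda pair: pair[1])[0]

def generate(prompt, max_words=10):
    """Autoregressive generation: predict, append, repeat."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = predict_next(words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

print(generate("the"))  # → the cat sat down
```

Scale the table up to billions of learned parameters conditioned on thousands of previous words, and this same loop is what produces essays and code.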


The Technical Picture

LLMs are Transformer-based neural networks trained on large text corpora using next-token prediction (autoregressive language modelling). The Transformer's attention mechanism allows the model to weigh relationships between all tokens in its context window simultaneously, enabling complex language understanding and generation.
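The attention step at the heart of the Transformer can be sketched in a few lines of NumPy. This is a minimal illustration of scaled dot-product self-attention on random demo data, not a production implementation:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each token scores every other token
    in the context window, then takes a weighted blend of their values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # pairwise token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over each row
    return weights @ V                                # weighted mix of values

# Three tokens, each represented by a 4-dimensional vector (random demo data).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = attention(x, x, x)  # self-attention: Q, K, V all derive from the same tokens
print(out.shape)          # (3, 4) — one updated vector per token
```

Because every token attends to every other token in one matrix multiplication, the model captures long-range relationships without reading the text strictly left to right the way older recurrent models did.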

Real-World Examples

  • ChatGPT (GPT-4), Claude, and Gemini are all LLMs
  • GitHub Copilot uses an LLM to complete your code
  • Grammarly uses LLMs to suggest better phrasing

Key Takeaway

LLMs are autocomplete on steroids — trained on the entire internet to predict the perfect next word.