AI Fundamentals

Foundational Models

The Base Dough

The Analogy

A master baker makes one perfect dough base — then adapts it into naan, pizza, or paratha depending on what's needed.

The base dough takes months to perfect and requires expensive ovens and rare ingredients. But once made, any baker can quickly adapt it into a specialised dish. Foundational Models work the same way — trained on massive data at enormous cost, they serve as a base that any company can fine-tune into a specialised product.

In Plain English

A Foundational Model is a large AI model trained on vast amounts of data that can be adapted for many different tasks. Instead of training a new model from scratch for every use case, companies build on top of these pre-trained foundations.


The Technical Picture

Foundational Models (often called foundation models in the research literature) are large-scale models pre-trained on broad datasets using self-supervised learning. They learn general representations that transfer well to downstream tasks via fine-tuning or prompting. Examples include GPT-4, Claude 3, Gemini Ultra, and DALL-E 3.
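The cheapest form of adaptation is to freeze the pre-trained base and train only a small task-specific head on top of its features. Here is a minimal sketch of that idea in plain NumPy — the "base" is stood in for by a frozen random projection (an assumption for illustration; a real foundational model would be a large trained network), and the downstream "fine-tuning" fits just a linear head on a tiny labelled dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained base": a frozen feature extractor. In reality this is a
# large network trained at enormous cost; here it is a stand-in random
# projection, purely for illustration.
W_base = rng.normal(size=(4, 16))

def base_features(x):
    # Frozen during adaptation — only the head below gets trained.
    return np.tanh(x @ W_base)

# Downstream task: a small labelled dataset, far too little data to
# train a full model from scratch.
X = rng.normal(size=(32, 4))
y = (X[:, 0] > 0).astype(float)

# Lightweight adaptation: fit only a linear head on the frozen features
# (least squares stands in for gradient-based fine-tuning).
H = base_features(X)
head, *_ = np.linalg.lstsq(H, y, rcond=None)

preds = (H @ head) > 0.5
accuracy = (preds == y.astype(bool)).mean()
```

The design point mirrors the dough analogy: the expensive step (`W_base`) happens once, while the per-task step (`head`) is small and fast — the same pattern, at vastly larger scale, behind fine-tuning or prompting a real foundational model.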

Real-World Examples

  • GPT-4 is OpenAI's foundational model — ChatGPT is the product built on it
  • Claude 3 Sonnet powers dozens of enterprise applications via API
  • Stable Diffusion is an open-source foundational model for image generation

Key Takeaway

Foundational Models are the expensive pre-trained base — everything else is built on top of them.