When builders break ground on a skyscraper, they don’t start with the penthouse. They pour a foundation, a massive slab of concrete and steel designed to support whatever gets built on top of it. Office tower, luxury condos, hotel, hospital. The foundation doesn’t care. It just needs to be strong enough and broad enough to hold any of them.
That, as it turns out, is exactly the analogy a group of Stanford researchers had in mind when they needed a name for a new category of AI.
What a Foundation Model Actually Is
A foundation model is a large AI system trained on enormous, diverse datasets so that it develops a broad understanding of language, images, sound, or some combination of the three. It’s not designed for any single task. Instead, it’s designed to be adapted for thousands of different tasks, from writing poetry to generating book covers to transcribing audiobooks.
The official definition, courtesy of Stanford’s Center for Research on Foundation Models, is: “any model trained on broad data (generally using self-supervision at scale) that can be adapted to a wide range of downstream tasks.”
In plainer terms: it’s the raw, general-purpose intelligence that powers all the specific AI tools you actually use.
GPT-4 is a foundation model. Claude is a foundation model. So are Llama, Gemini, Stable Diffusion, and Whisper. When companies like Sudowrite, NovelCrafter, or Jasper build an AI writing tool, they don’t start from scratch. They take an existing foundation model and customize it for their specific audience, adding their own interface, instructions, and fine-tuning on top. The foundation model is the engine. The app is the car.
How 102 Researchers Named a Concept (And Almost Called It “Pluripotent Model”)
The term “foundation model” was born in August 2021, when a team of more than 100 researchers at Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) published a sweeping paper called “On the Opportunities and Risks of Foundation Models.” Led by Rishi Bommasani and Percy Liang, the paper ran to over 200 pages and attempted something ambitious: to define and examine an entire category of AI that was reshaping the field faster than anyone had language for.
The naming process itself took weeks of debate. The team considered “base model,” “broadbase model,” “inframodel,” “polymodel,” “multi-purpose model,” and, memorably, “pluripotent model” (borrowing from biology, where pluripotent cells can develop into almost any cell type). They put it to a vote. “Foundation model” won.
Percy Liang later explained why. The word “foundation” captured two ideas simultaneously. First, incompleteness: a foundation isn’t a building, it’s what you build on. As Liang put it, “it’s just the foundation, it’s not the entire house.” These models still need to be adapted, fine-tuned, and prompted before they’re useful to anyone. Second, critical importance: everything built on top depends on the foundation’s strength. But Liang was honest about the limits of the metaphor, too. “Foundations doesn’t mean that they are good foundations,” he said. “In fact, they are shaky foundations.”
That candor is part of what makes the term useful. It acknowledges that these models are powerful and essential while reminding us they’re not finished products. They’re starting points.
How Foundation Models Differ from What Came Before
Before foundation models, AI development worked differently. You wanted a spam filter? You collected thousands of labeled examples of spam and not-spam, then trained a model on that specific task. You wanted a sentiment analyzer? Same process, different labels. Every task got its own bespoke model, trained on its own curated dataset. It worked, but it was slow, expensive, and limited.
Foundation models flipped the approach. Instead of training narrowly for one task, you train broadly on as much data as possible: books, websites, images, code, audio, conversations, scientific papers. The model learns the deep patterns and structures of language (or images, or sound) without being told what to look for. This is called self-supervised learning, and it’s what lets the model develop such versatile capabilities.
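The core trick of self-supervised learning is that the labels come for free: the text itself supplies them. A toy sketch (simplified to whole words; real models use subword tokens and billions of examples) shows how raw prose becomes training pairs with no human labeling at all:

```python
# Toy illustration of self-supervised learning: the training signal
# comes from the text itself, not from human-written labels.
# Each pair matches a context with the word that actually follows it.

def next_word_pairs(text):
    """Turn raw text into (context, next-word) training examples."""
    words = text.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

sentence = "the detective opened the door slowly"
for context, target in next_word_pairs(sentence):
    print(f"{context!r} -> {target!r}")
```

Feed the model the context, ask it to predict the target, and repeat across trillions of words: that is the entire supervisory signal, which is why no one has to tell the model what to look for.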
Once that broad training is complete, the foundation model can be adapted in two main ways. Fine-tuning involves additional training on a smaller, specialized dataset, like teaching a general-purpose model to understand romance novel conventions or medical terminology. Prompting (which includes prompt engineering) involves giving the model instructions at the moment you use it, without changing its underlying parameters at all. Most of the AI tools authors encounter use some combination of both.
One important distinction worth noting: “foundation model” is a broader term than “large language model.” Every LLM is a foundation model, but not every foundation model is an LLM. Foundation models also include image generators like Stable Diffusion and Midjourney, speech models like Whisper, and multimodal models like GPT-4o and Gemini that work across text, images, and audio simultaneously.
Why This Matters for Your Writing Life
Understanding foundation models gives you a clearer picture of the AI tool landscape, and that clarity saves you time, money, and frustration.
It explains why different tools feel similar. If you’ve ever noticed that Sudowrite, Jasper, and ChatGPT sometimes produce suspiciously similar prose, there’s a reason. Many AI writing tools are built on the same underlying foundation model (often GPT-4 or Claude), with different interfaces and customizations layered on top. Knowing this helps you evaluate what you’re actually paying for: not the AI itself, but the interface, the fine-tuning, and the workflow design.
It helps you understand upgrade cycles. When OpenAI releases a new version of GPT, or Anthropic launches a new Claude model, every tool built on that foundation gets a potential upgrade. That new version of Sudowrite that suddenly seems smarter? It may be running on a newer foundation model under the hood. Understanding the relationship between the foundation and the applications built on it helps you follow what’s actually changing and why.
It demystifies “AI-powered” marketing. Every app claims to be “AI-powered” now. But once you understand foundation models, you can ask sharper questions. Which foundation model does this tool use? How have they adapted it for writers? What’s the actual value they’re adding beyond the raw model? Some tools add genuine value through specialized fine-tuning, thoughtful interfaces, and author-specific features. Others are thin wrappers around an API. Knowing the difference starts with understanding what’s underneath.
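The "thin wrapper" pattern is easy to picture in code. This is a hypothetical sketch, not any real tool's source: `call_foundation_model` is a placeholder for a provider's API, and the entire product amounts to a system prompt plus packaging around it.

```python
# Hypothetical sketch of a thin wrapper around a foundation-model API.
# call_foundation_model() is a placeholder, not a real library call.

SYSTEM_PROMPT = (
    "You are a fiction-writing assistant. Preserve the author's voice "
    "and prefer concrete, sensory detail over abstraction."
)

def call_foundation_model(system, user):
    """Placeholder for the provider API request the wrapper would make."""
    return f"(response shaped by: {system.split('.')[0]})"

def brainstorm_characters(premise):
    # The wrapper's whole contribution: its instructions plus an interface.
    return call_foundation_model(SYSTEM_PROMPT, f"Brainstorm characters for: {premise}")
```

A tool adding genuine value goes well beyond this baseline: specialized fine-tuning, awareness of your manuscript, workflow features built for authors. The sketch is the floor to measure against when you ask what you're actually paying for.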
The AI tools you use every day, from the chatbot that helps you brainstorm characters to the image generator that mocks up your cover, are all built on foundation models. They’re the common ground beneath a wildly diverse ecosystem of applications. You don’t need to understand every technical detail, but knowing that the foundation exists, and what it does, makes the whole landscape a lot less mysterious.