What Is a Context Window?
The maximum amount of text an AI model can process in a single interaction — its working memory.
Definition
The context window is the total amount of text (measured in tokens) that an LLM can process at once — including the system prompt, conversation history, and any documents you pass in. Think of it as the model's working memory: everything inside the context window is "visible" to the model; everything outside it is not. Context window size is one of the most practically important model specifications: larger windows mean you can analyze longer documents, maintain longer conversations, and give the model more context to work with.
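As a rough sketch of this budget (using the common ~4-characters-per-token heuristic for English; real tokenizers vary by model and the function names here are illustrative, not any vendor's API):

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(system_prompt: str, history: list[str],
                    documents: list[str], context_limit: int) -> bool:
    """Everything the model 'sees' — system prompt, conversation
    history, and attached documents — must fit in one window."""
    total = (estimate_tokens(system_prompt)
             + sum(estimate_tokens(m) for m in history)
             + sum(estimate_tokens(d) for d in documents))
    return total <= context_limit
```

Anything that pushes `total` past `context_limit` is simply invisible to the model.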
Why it matters
Context window size determines what you can do with a model. A 4K-token window can handle short conversations. A 200K-token window (e.g., Claude 3.7 Sonnet) can process a 150,000-word book in a single prompt. This opens entirely different use cases: full codebase analysis, complete contract review, meeting transcript synthesis. PMs evaluating AI tools need to understand context windows to know which models can handle their workloads.
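The book claim above is easy to sanity-check with back-of-envelope arithmetic, assuming the common rule of thumb of roughly 0.75 English words per token (about 4/3 tokens per word):

```python
# Does a 150,000-word book fit in a 200K-token context window?
words = 150_000
estimated_tokens = words * 4 // 3  # ~1.33 tokens per English word
print(estimated_tokens)            # 200000
print(estimated_tokens <= 200_000) # True — it just fits
```

The same arithmetic explains why a 4K-token window tops out at roughly 3,000 words of combined prompt and history.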
How it works
Text is converted to tokens (roughly 4 characters per token in English). The entire prompt — system instructions, conversation history, and attached documents — must fit within the model's context limit. When the context fills up, older messages must be truncated (dropped or summarized) or the request fails. Modern models offer context windows ranging from 8K tokens (small, fast models) to 1M tokens (Gemini 1.5 Pro).
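One common truncation strategy is to drop the oldest turns first, keeping the most recent messages that still fit. A minimal sketch, again using the ~4-characters-per-token heuristic (production systems use the model's real tokenizer):

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def truncate_history(messages: list[str], limit: int) -> list[str]:
    """Keep the most recent messages whose total estimated token
    count fits within `limit`, discarding the oldest ones."""
    kept: list[str] = []
    budget = limit
    for msg in reversed(messages):   # walk newest to oldest
        cost = estimate_tokens(msg)
        if cost > budget:
            break                    # everything older is dropped too
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))      # restore chronological order
```

This is why long chat sessions gradually "forget" their beginnings: the earliest messages are the first to fall outside the window.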
Examples in practice
Document analysis
Claude 3.7 Sonnet's 200K token context window can hold a 150,000-word document — roughly a full book — allowing you to ask questions about the entire text in a single conversation.
Codebase review
Claude Code loads an entire medium-sized codebase into context, allowing it to make consistent edits that understand the full architecture, not just the file being edited.
