What is a context window?

The maximum amount of text an AI model can process in a single interaction — its working memory.

Definition

The context window is the total amount of text (measured in tokens) that an LLM can process at once — including the system prompt, conversation history, and any documents you pass in. Think of it as the model's working memory: everything inside the context window is "visible" to the model; everything outside it is not. Context window size is one of the most practically important model specifications: larger windows mean you can analyze longer documents, maintain longer conversations, and give the model more context to work with.
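To make the bookkeeping concrete, here is a minimal Python sketch (not any vendor's API) of how everything you send, from system prompt to attached documents, shares one token budget. The 4-characters-per-token ratio and the 200K limit are rough assumptions, not exact values:

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return len(text) // 4

system_prompt = "You are a contract-review assistant."
history = [
    "User: Summarize the termination clauses.",
    "Assistant: Sure, please paste the contract.",
]
document = "lorem ipsum " * 20_000  # stand-in for a large attached document

used = (
    estimate_tokens(system_prompt)
    + sum(estimate_tokens(m) for m in history)
    + estimate_tokens(document)
)

CONTEXT_LIMIT = 200_000  # assumed limit, e.g. Claude 3.7 Sonnet's window
print(f"{used:,} of {CONTEXT_LIMIT:,} tokens used;",
      "fits" if used <= CONTEXT_LIMIT else "over budget")
```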

Why it matters

Context window size determines what you can do with a model. A 4K-token window can handle short conversations. A 200K-token window (Claude 3.7 and later) can hold roughly a 150,000-word book in a single prompt. This opens entirely different use cases: full-codebase analysis, complete contract review, meeting-transcript synthesis. Product managers evaluating AI tools need to understand context windows to know which models can handle their workloads.

How it works

Text is converted to tokens (roughly 4 characters per token in English). The entire prompt (system instructions, conversation history, and attached documents) must fit within the model's context limit. When the context fills up, either older messages are truncated or the request is rejected. Modern context windows range from 8K tokens (small, fast models) to 1M tokens (Gemini 1.5 Pro).
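The truncation step can be sketched too. Below is a hedged illustration, with a hypothetical fit_history helper, of the common "drop the oldest messages first" strategy; real chat clients do this internally and count tokens with a proper tokenizer rather than a character heuristic:

```python
def fit_history(messages: list[str], budget: int) -> list[str]:
    """Drop the oldest messages until the rest fits the token budget."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):   # walk from newest to oldest
        cost = len(msg) // 4         # same rough ~4 chars/token estimate
        if used + cost > budget:
            break                    # this message and everything older is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))      # restore chronological order

history = [f"message {i}: " + "words " * 200 for i in range(100)]
print(len(fit_history(history, budget=8_000)), "of", len(history), "messages kept")
```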

Examples in practice

Document analysis

Claude 3.7 Sonnet's 200K-token context window can hold a roughly 150,000-word document, close to a full book, allowing you to ask questions about the entire text in a single conversation.

Codebase review

Claude Code can load an entire medium-sized codebase into context, allowing it to make consistent edits informed by the full architecture, not just the file being edited.

Common questions about context windows

What is a context window in AI?
The context window is the maximum amount of text an LLM can process at once, measured in tokens. It includes everything the model can "see": your instructions, the conversation history, and any documents you provide. A larger context window lets the model handle longer documents, retain more conversation history, and draw on richer supporting context.
How many tokens is one page of text?
Roughly 650 to 750 tokens for a single-spaced page of English (about 500 words, at roughly 1.3 tokens per word). A 10-page document is on the order of 7,000 tokens. A full novel (100,000 words) is about 130,000 tokens, well within Claude's 200K context limit.
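These rules of thumb reduce to one multiplication. A quick sketch, assuming the rough 1.3-tokens-per-word ratio for English:

```python
TOKENS_PER_WORD = 1.3  # rough heuristic for English text

def words_to_tokens(words: int) -> int:
    return round(words * TOKENS_PER_WORD)

print(words_to_tokens(500))      # ~650 tokens: one single-spaced page
print(words_to_tokens(5_000))    # ~6,500 tokens: a 10-page document
print(words_to_tokens(100_000))  # ~130,000 tokens: a full novel
```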
Does a larger context window mean better performance?
Not necessarily. Models often recall information at the start and end of the context window better than information buried in the middle (the "lost in the middle" problem). For very long inputs, retrieval-augmented generation (RAG) is often more effective than stuffing everything into the context window; a toy sketch of the retrieval idea follows.
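In this toy illustration, word-overlap scoring stands in for a real embedding model and the chunks are invented; the point is simply that only the best-matching chunks enter the context window:

```python
import re

def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k chunks sharing the most words with the query."""
    return sorted(chunks, key=lambda c: len(words(c) & words(query)),
                  reverse=True)[:k]

chunks = [
    "Termination requires 30 days written notice by either party.",
    "The licensee shall pay royalties quarterly.",
    "Force majeure excuses performance during natural disasters.",
]
print(retrieve(chunks, "What is the notice period for termination?", k=1))
```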
