The AI Internship

What is a Large Language Model (LLM)?

A deep learning model trained on vast text data that can understand and generate human language across a broad range of tasks.

Definition

A Large Language Model (LLM) is a neural network, typically with billions of parameters, trained on massive text datasets to predict the next token in a sequence. Through this training process, LLMs develop surprisingly general capabilities: they can write code, answer questions, summarize documents, translate languages, reason through problems, and converse naturally. Claude, GPT-4, Gemini, and Llama are all LLMs. They are the foundation of the current AI revolution.

Why it matters

LLMs are the core technology underlying every AI product discussed on this site and most of the industry. Understanding what they are, how they work, and where their capabilities and limitations lie enables better product decisions, more effective prompting, and more realistic expectations about what AI can and cannot do.

How it works

LLMs are transformer-based neural networks trained via next-token prediction on trillions of tokens of text (web pages, books, code, scientific papers). During training, the model learns representations of language so rich that general reasoning capabilities emerge. At inference time, the model generates text one token at a time, sampling from a probability distribution over the vocabulary.
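The inference step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any production model is implemented: the five-word vocabulary, the `toy_next_logits` function, and the `temperature` parameter are all assumptions introduced here. In a real LLM, a transformer with billions of parameters produces the scores (logits) over a vocabulary of many thousands of tokens.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores (logits) into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Sample one token from the distribution over the vocabulary."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


def generate(prompt_tokens, next_logits, vocab, max_new_tokens=5,
             temperature=1.0, rng=random):
    """Generate text one token at a time, feeding each output back in."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = next_logits(tokens)  # the model scores every vocabulary entry
        tokens.append(sample_next_token(vocab, logits, temperature, rng))
    return tokens


# Hypothetical stand-in for a trained model: a tiny vocabulary and a rule
# that favors whichever token follows the last one in VOCAB order.
VOCAB = ["the", "cat", "sat", "mat", "."]


def toy_next_logits(tokens):
    last = VOCAB.index(tokens[-1]) if tokens[-1] in VOCAB else -1
    favored = (last + 1) % len(VOCAB)
    return [3.0 if i == favored else 0.0 for i in range(len(VOCAB))]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    print(" ".join(generate(["the"], toy_next_logits, VOCAB,
                            temperature=0.7, rng=rng)))
```

Lowering `temperature` concentrates probability on the highest-scoring token, so output becomes more predictable. Raising it flattens the distribution and makes output more varied.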

Examples in practice

Claude, GPT-4, Gemini

Among the most capable frontier LLMs, used as the reasoning core in AI products, coding agents, enterprise chatbots, and research tools.

Llama 3, Mistral

Open-weight LLMs that can be run on your own infrastructure. Used by teams with data sovereignty requirements or very high inference volumes where API costs are prohibitive.

Common questions about Large Language Models (LLMs)

What is an LLM?
An LLM (Large Language Model) is a neural network trained on vast amounts of text that can understand and generate language. Models like Claude, GPT-4, and Gemini are LLMs. They are trained to predict the next token (roughly a word or word fragment) in a sequence, and at massive scale this training yields general reasoning capabilities.
What is the difference between an LLM and AI?
AI is a broad field covering any system that exhibits intelligent behavior. LLMs are a specific type of AI: generative language models based on the transformer architecture. Not every AI system is an LLM (computer vision, robotics, and traditional ML are also AI), but LLMs are currently the dominant form of general-purpose AI.
