The AI Internship

What is a Vector Database?

A database optimized for storing and searching high-dimensional vector embeddings — the numeric representations that power semantic AI search.

Definition

A vector database stores data as high-dimensional numerical vectors (embeddings) and enables fast similarity search over those vectors. Unlike a traditional database that matches exact values, a vector database can find items that are semantically similar to a query. This is the foundation of RAG systems, semantic search, recommendation engines, and any AI application that needs to retrieve conceptually related content rather than exact keyword matches.

Why it matters

Vector databases act as long-term memory for modern AI systems. Without one, a model can draw only on its training data and whatever fits in its context window. With one, an application can search millions of documents, memories, or data points in milliseconds, typically by using an approximate nearest-neighbor index, which is what makes retrieval practical at enterprise scale.

How it works

An embedding model (e.g., OpenAI text-embedding-3-small) converts text into a vector of numbers (1,536 by default for that model). Documents are embedded and stored in the vector database. At query time, the question is embedded with the same model, and the database returns the stored vectors most similar to it, commonly measured by cosine similarity. Those vectors are the ones semantically closest to the question.
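The query step can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not how any particular vector database is implemented: the 4-dimensional vectors are made-up stand-ins for real embeddings, the function names are hypothetical, and the search is a brute-force scan rather than the indexed search a production system would use.

```python
import numpy as np


def cosine_similarity(query: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and every row of matrix."""
    query_unit = query / np.linalg.norm(query)
    matrix_unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix_unit @ query_unit


def top_k(query: np.ndarray, matrix: np.ndarray, k: int = 2) -> list:
    """Return (row_index, score) pairs for the k most similar rows, best first."""
    scores = cosine_similarity(query, matrix)
    best = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in best]


# Assumption: hand-written toy vectors. A real embedding model would
# produce these from the text, with far more dimensions.
documents = [
    "How to reset your password",
    "Quarterly revenue report",
    "Recovering a locked account",
]
doc_vectors = np.array([
    [0.9, 0.1, 0.0, 0.2],
    [0.0, 0.8, 0.6, 0.1],
    [0.8, 0.0, 0.1, 0.4],
])

# Assumption: stand-in for the embedding of a question like "I can't log in".
query_vector = np.array([0.85, 0.05, 0.05, 0.3])

results = top_k(query_vector, doc_vectors, k=2)
for index, score in results:
    print(f"{score:.3f}  {documents[index]}")
```

With these toy values, the two account-related documents rank above the revenue report, even though none of them shares a keyword with the question.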

Examples in practice

Semantic search over a product catalog

A user searches "comfortable running shoes for wide feet." A vector database finds products matching the semantic intent, not just keyword matches, returning relevant results even if the descriptions don't use those exact words.
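A small Python sketch of the difference between the two kinds of matching: the catalog entries, the 3-dimensional vectors, and the helper names are all invented for illustration, and the vectors stand in for what a real embedding model would produce.

```python
import numpy as np

# Assumption: made-up product descriptions with hand-written toy vectors.
catalog = [
    ("Roomy-toe-box cushioned trainer for jogging", np.array([0.9, 0.8, 0.1])),
    ("Leather dress shoe, narrow fit", np.array([0.1, 0.2, 0.9])),
    ("Stainless steel water bottle", np.array([0.0, 0.1, 0.2])),
]

query_text = "comfortable running shoes for wide feet"
# Assumption: stand-in for the embedding of query_text.
query_vector = np.array([0.8, 0.9, 0.1])


def keyword_matches(query: str, items: list) -> list:
    """Descriptions that contain every query word verbatim."""
    words = query.lower().split()
    return [
        description
        for description, _ in items
        if all(word in description.lower().split() for word in words)
    ]


def semantic_best(query_vec: np.ndarray, items: list) -> str:
    """Description whose vector has the highest cosine similarity to the query."""
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    return max(items, key=lambda item: cosine(query_vec, item[1]))[0]


print("Keyword matches:", keyword_matches(query_text, catalog))
print("Semantic best:  ", semantic_best(query_vector, catalog))
```

The strict keyword filter finds nothing because no description contains the query's exact words, while the vector comparison still picks out the wide-fit running shoe.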

Long-term agent memory

An AI agent stores summaries of past conversations as vectors. When a new query arrives, it retrieves the most relevant past context to inform its response — giving the agent persistent, searchable memory.
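As a rough sketch of that pattern, the class below keeps summaries and their vectors in plain Python lists. `AgentMemory`, `remember`, and `recall` are hypothetical names, the vectors are toy values rather than real embeddings, and a real agent would delegate storage and search to a vector database.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class AgentMemory:
    """Tiny in-memory store; a hypothetical stand-in for a vector database."""

    summaries: list = field(default_factory=list)
    vectors: list = field(default_factory=list)

    def remember(self, summary: str, vector: np.ndarray) -> None:
        # Store unit-length vectors so a dot product equals cosine similarity.
        self.summaries.append(summary)
        self.vectors.append(vector / np.linalg.norm(vector))

    def recall(self, query_vector: np.ndarray, k: int = 1) -> list:
        """Return up to k stored summaries most similar to the query, best first."""
        if not self.vectors:
            return []
        query = query_vector / np.linalg.norm(query_vector)
        scores = np.stack(self.vectors) @ query
        best = np.argsort(scores)[::-1][:k]
        return [self.summaries[i] for i in best]


memory = AgentMemory()
# Assumption: toy vectors standing in for embeddings of each summary.
memory.remember("User prefers vegetarian recipes", np.array([0.9, 0.1, 0.0]))
memory.remember("User is planning a trip to Lisbon", np.array([0.1, 0.9, 0.2]))

# Assumption: stand-in for the embedding of "What should I cook tonight?"
recalled = memory.recall(np.array([0.8, 0.2, 0.1]), k=1)
print(recalled)
```

The recalled summaries would then be added to the model's prompt, which is how the agent appears to remember earlier conversations.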

Common questions about vector databases

What are the most popular vector databases in 2026?
Pinecone (fully managed, easy to start), Weaviate (open-source, feature-rich), Qdrant (performance-focused, open-source), Chroma (lightweight, good for prototyping), and pgvector (a Postgres extension, simplest for teams already on Postgres). For many startups, pgvector on a hosted Postgres service such as Supabase is an easy starting point.

Do I need a vector database for every AI project?
No. If your AI only uses information that fits in its context window (recent conversation, pasted documents), you don't need a vector database. You need one when the information you want the AI to access is too large to fit in a single context window, or when you need fast semantic search across a large dataset.
