Embeddings: How Models Represent Meaning

PM: Read in full (15 min)

The Meaning of Meaning, Numerically

A language model can't work with words directly; it works with numbers. Embeddings are how text gets turned into numbers in a way that preserves meaning.

An embedding is a vector: a list of numbers (typically hundreds to thousands of values) representing the semantic content of a word, sentence, or document. The geometric relationships between vectors capture semantic relationships between the things they represent.
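One way to make "geometric relationships" concrete is cosine similarity, which measures the angle between two vectors. The sketch below uses made-up 4-dimensional vectors purely for illustration; real embeddings come from a trained model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, hand-written for this example (not model output).
cat = [0.9, 0.8, 0.1, 0.0]
kitten = [0.85, 0.75, 0.2, 0.05]
invoice = [0.0, 0.1, 0.9, 0.8]

print("cat vs kitten: ", round(cosine_similarity(cat, kitten), 3))
print("cat vs invoice:", round(cosine_similarity(cat, invoice), 3))
```

With these toy values, the related pair scores much higher than the unrelated pair, which is the property similarity search relies on.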

The Classic Example: Vector Arithmetic on Meaning

The canonical demonstration comes from Word2Vec (Mikolov et al., 2013). Take the embedding for "king," subtract the vector for "man," add the vector for "woman," and the result lands near the embedding for "queen."

king − man + woman ≈ queen

This is not a coincidence. It means the model has learned a direction in embedding space that encodes something like gender, and another that encodes royalty. Semantic relationships become geometric ones.
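The arithmetic can be sketched with toy vectors. The three axes below (roughly "royalty", "maleness", "person-ness") and all the values are assumptions chosen by hand so the analogy works cleanly; a trained model learns such directions from data, and they rarely align with single axes. Excluding the three input words from the candidates is also an assumption of this sketch.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical toy embeddings: [royalty, maleness, person-ness].
vectors = {
    "king":  [0.9, 0.9, 0.7],
    "queen": [0.9, 0.1, 0.7],
    "man":   [0.1, 0.9, 0.7],
    "woman": [0.1, 0.1, 0.7],
    "apple": [0.0, 0.5, 0.0],
}

def analogy(a, b, c, vocab):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vocab[a], vocab[b], vocab[c])]
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

result = analogy("king", "man", "woman", vectors)
print(result)
```

Subtracting "man" removes the maleness component, adding "woman" leaves royalty untouched, and the nearest remaining vector is the one for "queen".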

Visualizing Embedding Space

Embedding vectors typically have 768–4096 dimensions, far too many to visualize directly. Researchers use dimensionality reduction (t-SNE, UMAP) to project them into 2D. The structure that emerges:

Words with similar meanings cluster together. This spatial structure is what makes similarity search work.
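To show what "projecting into 2D" means without extra libraries, the sketch below uses PCA (computed by power iteration) rather than t-SNE or UMAP. PCA is a simpler, linear technique swapped in here for illustration only; the six vectors are made up, and a real workflow would more likely call a library implementation.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(rows):
    """Subtract the per-dimension mean so the data is centered at the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(matrix, iterations=200, seed=0):
    """Power iteration: approximate the dominant eigenvector and its eigenvalue."""
    rng = random.Random(seed)
    v = [rng.random() for _ in matrix]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return v, eigenvalue

def pca_2d(rows):
    """Project rows onto their two highest-variance directions."""
    centered = center(rows)
    cov = covariance(centered)
    v1, lam1 = top_component(cov)
    d = len(cov)
    # Deflation: remove the first component so power iteration finds the second.
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(d)]
                for i in range(d)]
    v2, _ = top_component(deflated, seed=1)
    return [(dot(r, v1), dot(r, v2)) for r in centered]

# Hypothetical 4-dimensional embeddings: three animal words, three finance words.
embeddings = {
    "cat":     [0.90, 0.80, 0.10, 0.00],
    "kitten":  [0.85, 0.75, 0.20, 0.05],
    "dog":     [0.80, 0.90, 0.15, 0.10],
    "invoice": [0.00, 0.10, 0.90, 0.80],
    "receipt": [0.10, 0.05, 0.85, 0.90],
    "payment": [0.05, 0.15, 0.80, 0.85],
}
words = list(embeddings)
points = pca_2d([embeddings[w] for w in words])
for word, (x, y) in zip(words, points):
    print(f"{word:8s} x={x:+.2f} y={y:+.2f}")
```

In the printed coordinates, the animal words should land close together and far from the finance words. The sign of each axis is arbitrary; only the relative positions matter.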

Sentence and Document Embeddings

Word embeddings were the first generation. Modern systems use sentence or document embeddings: a single vector representing an entire passage.

These are produced by passing text through an embedding model and pooling the per-token representations. The resulting vector captures the meaning of the full passage, not individual words.
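Pooling can take several forms. The sketch below assumes mean pooling (averaging the token vectors), which is one common choice; the text above does not say which method a given model uses. The token vectors are made up for illustration.

```python
def mean_pool(token_vectors):
    """Average a list of per-token vectors into a single passage vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    if any(len(v) != dim for v in token_vectors):
        raise ValueError("all token vectors must have the same length")
    n = len(token_vectors)
    return [sum(v[j] for v in token_vectors) / n for j in range(dim)]

# Hypothetical per-token outputs for a three-token passage (3 dimensions each).
token_vectors = [
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
]
passage_vector = mean_pool(token_vectors)
print(passage_vector)  # [2.0, 2.0, 1.0]
```

However many tokens go in, the output has the same fixed length, which is what lets passages of different sizes be compared directly.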

This is the foundation of semantic search:

  1. At index time: embed every document. Store the vectors.
  2. At query time: embed the query using the same model.
  3. Find documents whose vectors are most similar to the query vector (cosine similarity or dot product).
  4. Return those documents.
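The four steps above can be sketched end to end. The `embed` function here is a hypothetical stand-in that maps words onto hand-picked concept axes; it exists only so the example runs without a model. In a real system it would be a call to an embedding model, and the index would live in a vector database rather than a Python list.

```python
import math

# Hypothetical stand-in for an embedding model: each word maps to a concept axis.
CONCEPTS = {
    "cancel": 0, "terminate": 0,
    "subscription": 1, "account": 1,
    "password": 2, "login": 2, "reset": 2,
    "invoice": 3, "billing": 3,
}

def embed(text):
    vec = [0.0] * 4
    for word in text.lower().split():
        axis = CONCEPTS.get(word.strip("?.,!"))
        if axis is not None:
            vec[axis] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Step 1: embed every document and store (document, vector) pairs."""
    return [(doc, embed(doc)) for doc in documents]

def search(query, index, top_k=1):
    """Steps 2-4: embed the query with the same function, rank, return the top hits."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

documents = [
    "steps to terminate my account",
    "how to reset your login password",
    "download a billing invoice",
]
index = build_index(documents)
results = search("How do I cancel my subscription?", index)
print(results)
```

The query and the top result share none of the words the toy lexicon scores; they match only because both land on the same concept axes.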

The difference from keyword search: semantic search finds documents that mean the same thing even if they share no keywords. "How do I cancel my subscription?" matches "steps to terminate my account."

Why Embeddings Matter for AI Products

RAG systems depend entirely on embedding quality. If the embedding model can't distinguish "cancel my order" from "cancel my subscription," your retrieval returns the wrong documents and the LLM gives the wrong answer. Retrieval quality = embedding quality.

Classification and clustering can use embeddings as features. An embedding model plus a simple classifier is often faster, cheaper, and more interpretable than a large generative model for many classification tasks.
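As a rough illustration of "embedding model plus a simple classifier", the sketch below uses a nearest-centroid classifier: average the embeddings for each label, then assign new text to the closest average. The labels and 3-dimensional vectors are made up; in practice the vectors would come from an embedding model and the classifier might be logistic regression or similar.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def centroid(vectors):
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]

def fit(labeled_vectors):
    """labeled_vectors: list of (label, vector). Returns {label: centroid}."""
    by_label = {}
    for label, vec in labeled_vectors:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict(vector, centroids):
    """Pick the label whose centroid is most similar to the vector."""
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))

# Hypothetical embeddings of labeled support tickets.
training = [
    ("billing", [0.9, 0.1, 0.0]),
    ("billing", [0.8, 0.2, 0.1]),
    ("technical", [0.1, 0.9, 0.2]),
    ("technical", [0.0, 0.8, 0.3]),
]
centroids = fit(training)
print(predict([0.7, 0.3, 0.1], centroids))
print(predict([0.2, 0.7, 0.4], centroids))
```

Each prediction can be explained by pointing at the centroid it matched, which is part of why this kind of setup tends to be easier to inspect than a generative model's output.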

Embedding drift: changing your embedding model makes existing vectors incompatible with new ones, because each model defines its own vector space. All stored vectors must be regenerated with the new model. This is a real operational concern for production systems.
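One way to guard against mixing vector spaces is to store the model identifier alongside every vector and check it before querying. The sketch below is a minimal illustration of that idea; the record layout, the model names `embed-v1` and `embed-v2`, and `fake_embed_v2` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class StoredVector:
    doc_id: str
    model_id: str   # which embedding model produced this vector
    vector: list

def find_stale(records, current_model_id):
    """Return IDs of documents whose vectors came from a different model."""
    return [r.doc_id for r in records if r.model_id != current_model_id]

def reembed(documents, embed_fn, model_id):
    """Regenerate every vector with the new model and tag it accordingly."""
    return [StoredVector(doc_id, model_id, embed_fn(text))
            for doc_id, text in documents.items()]

def fake_embed_v2(text):
    # Hypothetical stand-in for a call to the new embedding model.
    return [float(len(text)), float(text.count(" "))]

documents = {"doc-1": "cancel my order", "doc-2": "cancel my subscription"}
records = [
    StoredVector("doc-1", "embed-v1", [0.1, 0.2, 0.3]),
    StoredVector("doc-2", "embed-v1", [0.2, 0.1, 0.4]),
]

stale = find_stale(records, "embed-v2")
print("stale before re-embedding:", stale)
if stale:
    records = reembed(documents, fake_embed_v2, "embed-v2")
print("stale after re-embedding: ", find_stale(records, "embed-v2"))
```

Keeping the model identifier on each record turns a silent quality problem into an explicit check that can run before a migration or at query time.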

Cross-modal embeddings: models like CLIP put images and text in the same vector space, enabling text queries to retrieve images and vice versa. This underlies multimodal search and the image understanding in models like GPT-4V and Claude.

PM Takeaway

Embedding quality is the hidden multiplier on every RAG system. When AI gives wrong answers, check what was retrieved before debugging the prompt. Better prompts don't fix a retrieval problem; better embeddings or better chunking does.

Further Reading

Mikolov, T. et al., arXiv (2013)
Read: Abstract only. Skip: Everything else.
Introduced word embeddings: the idea that words can be represented as vectors that capture semantic meaning.