This paper explores the fundamental theoretical limitations of embedding-based retrieval, demonstrating that the number of retrievable top-k document subsets is inherently limited by the embedding dimension. It introduces the novel LIMIT dataset, revealing that even state-of-the-art models struggle with simple tasks due to these constraints. The work advocates for developing new retrieval methods beyond the current single-vector paradigm to overcome these limitations.
Article Points:
1. Embedding dimension fundamentally limits retrievable top-k document combinations.
2. Theoretical limits are empirically confirmed even with best-case free embedding optimization.
3. The new LIMIT dataset exposes state-of-the-art models' failure on simple tasks.
4. Model performance on complex retrieval tasks is critically dependent on embedding dimension.
5. Existing academic benchmarks often hide these limitations by testing limited query spaces.
6. Alternative architectures like cross-encoders or multi-vector models are needed.
Fundamental Limits
Embedding dimension limits top-k combinations
Bound tied to the sign-rank of the qrel matrix (stated below)
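One compact way to state that connection, with notation assumed here: for a binary qrel matrix A ∈ {0,1}^{m×n} (rows are queries, columns are documents), the smallest embedding dimension d that can realize every query's relevant set through dot-product scores is pinned, up to one, by the sign-rank of the ±1 version of A:

```latex
% A_{ij} = 1 iff document j is relevant to query i; rank_± is the sign-rank,
% i.e. the minimum rank of any real matrix matching the entrywise signs.
\operatorname{rank}_{\pm}\!\left(2A - \mathbf{1}_{m \times n}\right) - 1
  \;\le\; d \;\le\;
\operatorname{rank}_{\pm}\!\left(2A - \mathbf{1}_{m \times n}\right)
```

Sign-rank is hard to compute exactly, which is why the empirical side below relies on direct optimization rather than evaluating the bound.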
Empirical Validation
Free embedding optimization confirms the limits (sketch below)
Critical-n points mark where each dimension d breaks down
A polynomial fit models capacity as a function of dimension
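A minimal sketch of the free-embedding idea, assuming PyTorch and a constant k per query; the function name and loss here are illustrative, not the paper's code. The vectors themselves are trained, with no text encoder, so any failure can only come from the dimension d:

```python
import torch

def free_embedding_test(qrels: torch.Tensor, d: int, steps: int = 2000) -> float:
    """qrels: (m, n) float 0/1 relevance matrix. Returns the fraction of
    queries whose true relevant set is exactly their top-k by dot product."""
    m, n = qrels.shape
    Q = torch.randn(m, d, requires_grad=True)   # free query vectors
    D = torch.randn(n, d, requires_grad=True)   # free document vectors
    opt = torch.optim.Adam([Q, D], lr=0.01)
    for _ in range(steps):
        scores = Q @ D.T                        # (m, n) dot-product scores
        # Margin loss: each query's worst relevant doc should outscore
        # its best irrelevant doc by at least 1.
        rel = scores.masked_fill(qrels == 0, float("inf")).min(dim=1).values
        irr = scores.masked_fill(qrels == 1, float("-inf")).max(dim=1).values
        loss = torch.relu(1.0 - (rel - irr)).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        k = int(qrels[0].sum())                 # assumes the same k per query
        topk = (Q @ D.T).topk(k, dim=1).indices
        hits = torch.zeros_like(qrels).scatter_(1, topk, 1.0)
        return (hits == qrels).all(dim=1).float().mean().item()
```

Sweeping the number of documents at fixed d and recording the largest instance solved exactly locates the critical-n point; the capacity curve mentioned above is a polynomial fit through those points.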
LIMIT Dataset
Realistic, simple, yet difficult task
Stress tests theoretical results
High qrel graph density (toy construction below)
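A toy construction mirroring that density, assuming a small fixed number of relevant documents per query and one query per document subset; names are illustrative:

```python
from itertools import combinations
import numpy as np

def limit_style_qrels(n_docs: int, k: int = 2) -> np.ndarray:
    """Binary qrel matrix with one query (row) per k-subset of documents,
    so every k-combination must be a retrievable top-k set."""
    subsets = list(combinations(range(n_docs), k))
    qrels = np.zeros((len(subsets), n_docs), dtype=np.int8)
    for row, docs in enumerate(subsets):
        qrels[row, list(docs)] = 1
    return qrels

print(limit_style_qrels(50).shape)  # (1225, 50): C(50, 2) distinct top-2 targets
```

The queries themselves can stay trivially simple (e.g. attribute lookups); the difficulty comes entirely from how many distinct top-k sets the matrix demands.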
SOTA Model Performance
Models struggle on the LIMIT dataset
Performance depends strongly on embedding dimension (scorer below)
Failure is not explained by domain shift
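For reference, the evaluation itself is lightweight; a minimal recall@k scorer (recall@100 is a common cutoff for such results), where `scores` is assumed to be any single-vector model's query-document dot products:

```python
import numpy as np

def recall_at_k(scores: np.ndarray, qrels: np.ndarray, k: int = 100) -> float:
    """scores, qrels: (num_queries, num_docs). Mean fraction of each query's
    relevant documents that appear in its top-k by score."""
    topk = np.argsort(-scores, axis=1)[:, :k]        # doc ids, best first
    hits = np.take_along_axis(qrels, topk, axis=1)   # relevance of those docs
    return float((hits.sum(axis=1) / qrels.sum(axis=1)).mean())
```

One simple way to see the dimension dependence directly is to truncate the same model's vectors to smaller d (Matryoshka-style) and re-score.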
Implications
Current benchmarks hide limitations
Challenges for instruction-based retrieval
Future Directions
Alternative architectures needed
Cross-encoders, multi-vector, and sparse models (MaxSim sketch below)
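A sketch of one such alternative, ColBERT-style multi-vector scoring (MaxSim): the representation is a set of per-token vectors rather than a single point, so expressiveness is not capped by one d-dimensional embedding:

```python
import torch

def maxsim_score(query_vecs: torch.Tensor, doc_vecs: torch.Tensor) -> torch.Tensor:
    """query_vecs: (Lq, d) per-token query embeddings; doc_vecs: (Ld, d).
    Late-interaction relevance: each query token takes its best document match."""
    sim = query_vecs @ doc_vecs.T          # (Lq, Ld) token-token similarities
    return sim.max(dim=1).values.sum()     # best match per query token, summed
```

Cross-encoders avoid the bound differently, scoring each query-document pair jointly instead of pre-committing documents to fixed vectors; sparse models trade a low dense dimension for very high-dimensional but mostly-zero representations.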