LEANN: A Low-Storage Vector Index
LEANN is a novel, storage-efficient Approximate Nearest Neighbor (ANN) search index designed for resource-constrained personal devices. It addresses the high storage overhead of traditional embedding-based search by combining a compact graph structure with an efficient on-the-fly recomputation strategy. This approach significantly reduces index size while maintaining high search quality and low latency. ✨
Article Points:
1
LEANN: Low-storage ANN index for resource-constrained devices.
2
Reduces index size to <5% of raw data, 50x smaller than standard.
3
Combines compact graph with on-the-fly embedding recomputation.
4
Two-level search & dynamic batching minimize recomputation latency.
5
High-degree preserving graph pruning reduces metadata storage.
6
Achieves 90% top-3 recall in <2 seconds on real-world benchmarks.
LEANN: A Low-Storage Vector Index
Problem Addressed

High ANN storage overhead

Impractical for personal devices

Need low storage, high accuracy, low latency

Core Solution

Storage-efficient ANN index

Compact graph structure

On-the-fly recomputation

Key Techniques

Efficient Recomputation

- Two-Level Search
- Dynamic Batching

Storage-Optimized Graph

- High-Degree Preserving Pruning
Performance

<5% raw data size

50x smaller storage

90% recall in <2s

Evaluation

RPJ-Wiki dataset

NQ, HotpotQA, TriviaQA, GPQA

NVIDIA A10, M1 Mac

Future Work

Leverage GPU advancements

Smaller embedding models

Storage-efficient index building