LEANN: A Low-Storage Vector Index
LEANN is a novel, storage-efficient Approximate Nearest Neighbor (ANN) search index designed for resource-constrained personal devices. It addresses the high storage overhead of traditional embedding-based search by combining a compact graph structure with an efficient on-the-fly recomputation strategy. This approach significantly reduces index size while maintaining high search quality and low latency. ✨
Article Points:
1
LEANN: Low-storage ANN index for resource-constrained devices.
2
Reduces index size to <5% of raw data, 50x smaller than standard.
3
Combines compact graph with on-the-fly embedding recomputation.
4
Two-level search & dynamic batching minimize recomputation latency.
5
High-degree preserving graph pruning reduces metadata storage.
6
Achieves 90% top-3 recall in <2 seconds on real-world benchmarks.
LEANN: A Low-Storage Vector Index
Problem Addressed
High ANN storage overhead
Impractical for personal devices
Need low storage, high accuracy, low latency
Core Solution
Storage-efficient ANN index
Compact graph structure
On-the-fly recomputation
Key Techniques
Efficient Recomputation
- Two-Level Search
- Dynamic Batching
Storage-Optimized Graph
- High-Degree Preserving Pruning
Performance
<5% raw data size
50x smaller storage
90% recall in <2s
Evaluation
RPJ-Wiki dataset
NQ, HotpotQA, TriviaQA, GPQA
NVIDIA A10, M1 Mac
Future Work
Leverage GPU advancements
Smaller embedding models
Storage-efficient index building
Next Card