Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
Memory Decoder is a novel plug-and-play pretrained memory designed for efficient domain adaptation of Large Language Models (LLMs). It employs a small transformer decoder that learns to imitate the behavior of external non-parametric retrievers. Once trained, Memory Decoder seamlessly integrates with any LLM sharing the same tokenizer, enhancing domain-specific performance without modifying original model parameters or incurring significant inference latency. ✨
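The integration described above can be pictured as two decoders reading the same input, with their next-token distributions combined at each step. Below is a minimal sketch of that idea, assuming a kNN-LM-style interpolation with a mixing weight `lam`; the checkpoint names and the helper `interpolated_next_token_probs` are placeholders, not the paper's released API.

```python
# Minimal sketch: plug-and-play inference with a frozen base LLM and a small
# memory decoder sharing the same tokenizer. Assumes the two output
# distributions are interpolated (kNN-LM-style); names are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "base-llm"          # placeholder: any causal LM
MEMORY_ID = "memory-decoder"  # placeholder: small decoder pretrained on the domain

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base_lm = AutoModelForCausalLM.from_pretrained(BASE_ID).eval()
mem_dec = AutoModelForCausalLM.from_pretrained(MEMORY_ID).eval()  # same tokenizer/vocab

@torch.no_grad()
def interpolated_next_token_probs(prompt: str, lam: float = 0.3) -> torch.Tensor:
    """Return p = lam * p_memory + (1 - lam) * p_base for the next token."""
    inputs = tokenizer(prompt, return_tensors="pt")
    p_base = torch.softmax(base_lm(**inputs).logits[:, -1, :], dim=-1)
    p_mem = torch.softmax(mem_dec(**inputs).logits[:, -1, :], dim=-1)
    return lam * p_mem + (1.0 - lam) * p_base

probs = interpolated_next_token_probs("Patient presents with acute", lam=0.3)
print(tokenizer.decode(probs.argmax(dim=-1)))
```

Because only output distributions are combined, the base model's parameters stay frozen, which is what keeps the approach plug-and-play.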
Article Points:
1. Introduces Memory Decoder, a plug-and-play pretrained memory for LLMs.
2. Enables efficient domain adaptation without modifying original LLM parameters.
3. Replaces traditional non-parametric retrievers with a compact parametric model.
4. A single pretrained Memory Decoder integrates seamlessly across compatible LLMs.
5. Achieves superior performance with minimal inference latency compared to RAG.
6. Preserves general capabilities, avoiding catastrophic forgetting seen in DAPT.
Purpose: Efficient domain adaptation that enhances LLMs with domain-specific knowledge.
Mechanism: A small transformer decoder that imitates non-parametric retrievers, trained with a distribution alignment loss (see the sketch below).
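The distribution alignment loss suggests training the small decoder to reproduce what a non-parametric retriever would predict for the next token. The sketch below assumes the target is a precomputed kNN next-token distribution and that KL divergence is the alignment measure; both choices are illustrative rather than the paper's exact recipe.

```python
# Minimal sketch of a distribution-alignment objective: train the small decoder
# so its next-token distribution matches a precomputed retriever (kNN) target
# distribution. KL divergence is assumed as the alignment measure.
import torch
import torch.nn.functional as F

def distribution_alignment_loss(decoder_logits: torch.Tensor,
                                retriever_probs: torch.Tensor) -> torch.Tensor:
    """KL(retriever || decoder), averaged over all token positions.

    decoder_logits:  [batch, seq_len, vocab] raw logits from the memory decoder
    retriever_probs: [batch, seq_len, vocab] target distribution built from the
                     non-parametric retriever (e.g. kNN over a domain datastore)
    """
    log_p = F.log_softmax(decoder_logits, dim=-1).flatten(0, 1)  # [B*T, V]
    target = retriever_probs.flatten(0, 1)                       # [B*T, V]
    # kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(log_p, target, reduction="batchmean")

# Toy usage with random tensors standing in for real model / retriever outputs.
logits = torch.randn(2, 8, 32000, requires_grad=True)
target = torch.softmax(torch.randn(2, 8, 32000), dim=-1)
loss = distribution_alignment_loss(logits, target)
loss.backward()
```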
Advantages: Plug-and-play integration; no modification of LLM parameters; minimal inference latency; cross-model adaptability (illustrated below); preserves general capabilities.
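Cross-model adaptability hinges on the memory decoder sharing a tokenizer with each base LLM it is paired with. The sketch below shows that reuse pattern with placeholder checkpoint names; the vocabulary-size check is a stand-in for what "compatible" means here, not a documented requirement.

```python
# Minimal sketch of cross-model reuse: one pretrained memory decoder is paired
# with several base LLMs from the same tokenizer family. Checkpoint names are
# placeholders, not released models.
from transformers import AutoModelForCausalLM

MEMORY_ID = "domain-memory-decoder"              # hypothetical domain memory
BASE_IDS = ["family-llm-1.5b", "family-llm-7b"]  # hypothetical same-tokenizer LLMs

mem_dec = AutoModelForCausalLM.from_pretrained(MEMORY_ID).eval()

for base_id in BASE_IDS:
    base_lm = AutoModelForCausalLM.from_pretrained(base_id).eval()
    # Plug-and-play reuse assumes an identical vocabulary, since the two models'
    # next-token distributions are combined over the same token ids.
    assert base_lm.config.vocab_size == mem_dec.config.vocab_size
    # ...interpolated decoding would proceed as in the earlier sketch...
```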
Performance: Significantly reduces perplexity; superior to both DAPT and RAG; excels in knowledge-intensive QA.
Limitations