Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
Memory Decoder is a plug-and-play pretrained memory designed for efficient domain adaptation of Large Language Models (LLMs). It employs a small transformer decoder that learns to imitate the output distributions of external non-parametric retrievers. Once trained, Memory Decoder integrates seamlessly with any LLM sharing the same tokenizer, enhancing domain-specific performance without modifying the original model's parameters or incurring significant inference latency.
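The summary ships no code, but the "plug-and-play" integration can be pictured as mixing the frozen base LLM's output distribution with the Memory Decoder's at inference time, in the spirit of kNN-LM interpolation. The sketch below is a minimal illustration under that assumption; `base_model`, `memory_decoder`, and the weight `lam` are hypothetical stand-ins rather than names from the paper.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def interpolated_next_token_probs(base_model, memory_decoder, input_ids, lam=0.3):
    """Mix a frozen base LLM with a plug-and-play memory decoder.

    Both models are assumed to share a tokenizer, so their output
    distributions live over the same vocabulary and can be combined
    position by position. `lam` (a hypothetical name) controls how much
    weight the domain memory receives.
    """
    base_logits = base_model(input_ids)      # (batch, seq, vocab)
    mem_logits = memory_decoder(input_ids)   # (batch, seq, vocab)

    base_probs = F.softmax(base_logits[:, -1, :], dim=-1)
    mem_probs = F.softmax(mem_logits[:, -1, :], dim=-1)

    # No parameters of either model are updated; adaptation happens
    # purely at the output-distribution level, leaving the base LLM intact.
    return (1.0 - lam) * base_probs + lam * mem_probs
```

Because only output distributions are combined, the same trained memory can, in principle, be paired with any base model that produces logits over the same vocabulary.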
Article Points:
1. Introduces Memory Decoder, a plug-and-play pretrained memory for LLMs.
2. Enables efficient domain adaptation without modifying original LLM parameters.
3. Replaces traditional non-parametric retrievers with a compact parametric model.
4. A single pretrained Memory Decoder integrates seamlessly with any LLM that shares its tokenizer.
5. Achieves superior performance with lower inference latency than RAG.
6. Preserves general capabilities, avoiding the catastrophic forgetting seen in DAPT.
Purpose
- Efficient domain adaptation
- Enhance LLMs' domain-specific performance

Mechanism
- Small transformer decoder
- Imitates non-parametric retrievers
- Trained with a distribution alignment loss (sketched below)
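The outline names a distribution alignment loss but does not spell it out. A common way to align a parametric model's predictions with a retriever's distribution is a KL-divergence objective; the sketch below assumes that formulation, and the tensor names are illustrative rather than taken from the paper.

```python
import torch.nn.functional as F


def distribution_alignment_loss(memory_logits, retrieval_probs):
    """KL divergence between a retriever-derived target distribution and
    the Memory Decoder's predicted distribution.

    memory_logits  : (batch, vocab) raw scores from the memory decoder.
    retrieval_probs: (batch, vocab) next-token distribution produced by a
                     non-parametric retriever (e.g. a kNN datastore),
                     assumed to already be a valid probability distribution.
    """
    log_memory = F.log_softmax(memory_logits, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as the
    # target; "batchmean" averages the divergence over the batch dimension.
    return F.kl_div(log_memory, retrieval_probs, reduction="batchmean")
```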

Advantages
- Plug-and-play integration
- No modification of LLM parameters
- Minimal inference latency
- Cross-model adaptability (any LLM sharing the same tokenizer)
- Preserves general capabilities

Performance
- Reduces perplexity significantly
- Superior to DAPT and RAG
- Excels in knowledge-intensive QA

Limitations
- Computational cost of pre-training the memory
- Requires some additional training for cross-tokenizer transfer