Deep Think with Confidence
Deep Think with Confidence (DeepConf) is a method designed to enhance Large Language Model (LLM) reasoning efficiency and performance at test time. It utilizes model-internal confidence signals to dynamically filter out low-quality reasoning traces during or after generation. DeepConf requires no additional model training and significantly improves accuracy while substantially reducing computational overhead. ✨
Article Points:
1. DeepConf boosts LLM reasoning efficiency and performance at test time.
2. Leverages model-internal confidence signals to filter low-quality traces.
3. Operates in offline and online modes, requiring no additional training.
4. Introduces local confidence measures (group, bottom 10%, tail) for fine-grained assessment.
5. Online mode dynamically terminates unpromising traces, saving tokens.
6. Achieves up to 99.9% accuracy and 84.7% token reduction on benchmarks.
Source:
Deep Think with Confidence
Purpose (Enhances LLM Reasoning)
  Efficiency
  Performance
Mechanism (Confidence-Aware Filtering)
  Filters low-quality traces
  Uses internal confidence
Modes (Offline & Online Operation)
  Offline: Post-generation filtering
  Online: Real-time early stopping
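The online mode can be sketched as a sliding-window check: as tokens stream in, the mean confidence over the most recent window is compared against a threshold, and the trace is terminated as soon as it dips below. A minimal illustration, where the window size, threshold, and confidence stream are hypothetical rather than the paper's exact settings:

```python
from collections import deque

def generate_with_early_stop(token_confidences, window=5, threshold=1.0):
    """Consume a stream of per-token confidence scores and stop as soon
    as the sliding-window mean drops below the threshold.
    Returns (tokens consumed, stopped_early)."""
    recent = deque(maxlen=window)
    consumed = 0
    for conf in token_confidences:
        recent.append(conf)
        consumed += 1
        # Only start checking once a full window is available.
        if len(recent) == window and sum(recent) / window < threshold:
            return consumed, True  # unpromising trace terminated early
    return consumed, False  # trace ran to completion

# A trace whose confidence collapses midway is cut off, saving tokens:
trace = [2.1, 2.0, 1.9, 0.4, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1]
```

In practice the threshold would be calibrated on a warmup set of completed traces rather than fixed by hand.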
Confidence Metrics (Local & Global Signals)
  Group Confidence
  Bottom 10% Group Confidence
  Tail Confidence
  Lowest Group Confidence
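These metrics can all be derived from a trace's per-token confidence scores. A rough sketch of the four measures, with the window size, percentile, and tail length as illustrative assumptions rather than the paper's exact values:

```python
def group_confidences(token_confs, window=3):
    """Group confidence: mean over each sliding window of tokens."""
    return [sum(token_confs[i:i + window]) / window
            for i in range(len(token_confs) - window + 1)]

def bottom_percent_confidence(groups, pct=0.10):
    """Mean of the lowest pct fraction of group confidences."""
    k = max(1, int(len(groups) * pct))
    return sum(sorted(groups)[:k]) / k

def tail_confidence(token_confs, tail=4):
    """Mean confidence over the final tokens, where answers emerge."""
    n = min(tail, len(token_confs))
    return sum(token_confs[-n:]) / n

def lowest_group_confidence(groups):
    """Minimum sliding-window confidence: one weak segment flags the trace."""
    return min(groups)
```

The local measures (group, bottom 10%, tail) localize where a trace goes wrong, rather than averaging uncertainty over the whole generation.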
Key Results (Accuracy & Efficiency Gains)
  Up to 99.9% accuracy
  Up to 84.7% token reduction
  No extra training needed
Implementation (vLLM Integration)
  Minimal vLLM edits
  Adaptive sampling
  Offline warmup phase
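The offline path amounts to filtering completed traces by confidence and then taking a confidence-weighted vote over their final answers. A hypothetical pure-Python sketch of that filter-then-vote step (the function names and the 10% keep ratio are illustrative, not the paper's reference implementation):

```python
from collections import defaultdict

def offline_vote(traces, keep_ratio=0.1):
    """traces: list of (answer, confidence) pairs from parallel sampling.
    Keep the top keep_ratio fraction by confidence, then return the
    answer with the highest total confidence-weighted vote."""
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    kept = ranked[:max(1, int(len(ranked) * keep_ratio))]
    votes = defaultdict(float)
    for answer, conf in kept:
        votes[answer] += conf  # weight each vote by trace confidence
    return max(votes, key=votes.get)
```

Because low-confidence traces never enter the vote, accuracy can improve even as the effective sample budget shrinks.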