Less is More: Recursive Reasoning with Tiny Networks
The Hierarchical Reasoning Model (HRM) uses two small neural networks for recursive reasoning, outperforming Large Language Models (LLMs) on hard puzzle tasks. This paper proposes the Tiny Recursive Model (TRM), a simpler approach that uses a single tiny network with only two layers. TRM generalizes significantly better than both HRM and LLMs while using fewer than 0.01% of the LLMs' parameters, demonstrating the effectiveness of simplified recursive reasoning.
Article Points:
1. TRM simplifies recursive reasoning with a single tiny network.
2. TRM outperforms LLMs and HRM on hard puzzle tasks with far fewer parameters.
3. TRM eliminates HRM's reliance on fixed-point theorems and biological justifications.
4. TRM's simplified ACT requires only one forward pass during training.
5. "Less is more": a 2-layer network and a single network improve generalization.
6. Exponential Moving Average (EMA) of the weights enhances stability and generalization (see the sketch after this list).
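For point 6, here is a minimal sketch of a weight EMA as it is commonly implemented; the class name, decay value, and PyTorch wiring are illustrative assumptions rather than the paper's code.

```python
import copy
import torch

class WeightEMA:
    """Keeps a slowly moving average copy of the model weights for evaluation."""
    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.shadow = copy.deepcopy(model).eval()   # frozen copy holding the averaged weights
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        # shadow <- decay * shadow + (1 - decay) * live weights
        for s, p in zip(self.shadow.parameters(), model.parameters()):
            s.mul_(self.decay).add_(p, alpha=1.0 - self.decay)
```

Calling `update` after each optimizer step and evaluating with the `shadow` copy is the usual pattern; averaging smooths out noisy updates, which is the stability effect the article attributes to EMA.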
Core Idea
Recursive reasoning with a single tiny network
Progressively improves the current answer over repeated passes (see the sketch below)
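A minimal sketch of this recursion, assuming a single tiny network `net` applied to summed embeddings of the question `x`, current answer `y`, and latent state `z`; the names, the loop counts, and backpropagating only through the final recursion are simplifications for illustration, not the paper's exact code.

```python
import torch

def recursive_refine(net, x, y, z, n=6, T=3):
    """Progressively refine the latent state z and the answer embedding y."""
    for step in range(T):
        # Earlier recursions run without gradients; only the last one is backpropagated.
        grad_ctx = torch.enable_grad() if step == T - 1 else torch.no_grad()
        with grad_ctx:
            for _ in range(n):
                z = net(x + y + z)   # update the latent reasoning state from question + answer
            y = net(y + z)           # update the current answer from the latent state
    return y, z
```

The same tiny network is reused for every update, so effective depth comes from recursion rather than from stacking layers.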
Advantages over HRM
Simpler design without fixed-point theorems or biological justifications
Higher generalization with far fewer parameters
Simplified Adaptive Computational Time (ACT) that needs only one forward pass per training step (see the sketch below)
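A hedged sketch of how such a one-pass halting signal can be trained: a scalar head predicts whether the current answer is already correct, so the target comes from the same forward pass and no second pass is needed. The function name and loss wiring are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def halting_loss(halt_logit, y_logits, y_true):
    """Binary cross-entropy on 'is the current prediction already fully correct?'."""
    with torch.no_grad():
        # One 0/1 target per example, derived from the prediction of this same forward pass.
        correct = (y_logits.argmax(dim=-1) == y_true).all(dim=-1).float()
    return F.binary_cross_entropy_with_logits(halt_logit, correct)
```

At test time, the sigmoid of `halt_logit` can be thresholded to decide whether to stop or to keep refining the answer.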
Key Design Choices
Single 2-layer network
Reinterpretation of the latent features: y as the current answer, z as the reasoning state
Exponential Moving Average (EMA) of the weights
Attention-free architecture for small, fixed context lengths (see the sketch below)
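A hedged sketch of what a 2-layer, attention-free block can look like for a fixed context length, in the spirit of an MLP-Mixer: one MLP mixes information across token positions and a second across channels. Layer sizes, normalization, and residual connections here are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class TinyMixerBlock(nn.Module):
    """Attention-free block: token-mixing MLP followed by a channel-mixing MLP."""
    def __init__(self, seq_len, dim, expansion=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(      # mixes across the fixed-length sequence
            nn.Linear(seq_len, expansion * seq_len),
            nn.GELU(),
            nn.Linear(expansion * seq_len, seq_len),
        )
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(    # mixes across feature channels
            nn.Linear(dim, expansion * dim),
            nn.GELU(),
            nn.Linear(expansion * dim, dim),
        )

    def forward(self, x):                    # x: (batch, seq_len, dim)
        x = x + self.token_mlp(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))
        return x
```

Replacing self-attention with a fixed-size token MLP only makes sense because the puzzle grids have a known, constant length; for variable-length inputs, attention remains the natural choice.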
Performance
Beats LLMs on Sudoku, Maze, ARC-AGI
Achieves SOTA on puzzle benchmarks
Limitations & Future Work
Not generative; produces only deterministic answers
Scaling laws needed for optimal parameters
Why recursion helps remains unexplained
Failed Ideas