ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization
ScoreFlow is a framework for optimizing large language model (LLM) multi-agent workflows. It addresses the limitations of existing methods by using efficient gradient-based optimization in a continuous space. The framework introduces Score-DPO, a novel variant of direct preference optimization that incorporates quantitative evaluation feedback. ScoreFlow achieves an 8.2% performance improvement over baselines and enables smaller models to surpass larger ones at lower inference cost.
Article Points:
1. ScoreFlow: Automated, adaptive framework for LLM agent workflow generation.
2. Score-DPO: Novel preference optimization method using quantitative evaluation scores.
3. Achieves 8.2% improvement over baselines across diverse tasks.
4. Enables smaller LLMs to outperform larger models with lower inference costs.
5. Leverages efficient gradient-based optimization in a continuous space.
6. Uses code as a flexible representation for workflow search space.
Problem Addressed
Inflexible existing methods
Limited adaptability & scalability
High manual effort for workflows
Proposed Solution (ScoreFlow)
Automated, adaptive framework
Gradient-based optimization
Code as workflow representation
Cost-efficient with open-source LLMs
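To make the "code as workflow representation" point concrete, here is a minimal sketch in which a workflow is a piece of Python source text that can be loaded and run. Everything named here (`WORKFLOW_SOURCE`, `load_workflow`, `stub_llm`, and the `run(question, llm)` signature) is hypothetical and chosen for illustration; it is not ScoreFlow's actual interface.

```python
from typing import Callable

LLM = Callable[[str], str]

# Hypothetical workflow text, of the kind a generator model might emit.
WORKFLOW_SOURCE = '''
def run(question, llm):
    draft = llm("Solve: " + question)
    verdict = llm("Check: " + draft)
    # Branching like this is straightforward to express in code.
    if verdict.startswith("REVISE"):
        draft = llm("Revise: " + draft)
    return draft
'''


def load_workflow(source: str) -> Callable[[str, LLM], str]:
    """Turn workflow source text into a callable (trusted input only)."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["run"]


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model call."""
    if prompt.startswith("Check:"):
        return "REVISE"
    return prompt.upper()


workflow = load_workflow(WORKFLOW_SOURCE)
result = workflow("2+2", stub_llm)
```

Because the workflow is just text, a generator model can produce many variants of it, and each variant can be executed and scored. A real system would sandbox the `exec` call rather than run untrusted model output directly.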
Key Component (Score-DPO)
Novel DPO variant
Incorporates quantitative scores
Enhances efficiency & stability
Addresses score variance & inaccuracies
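The outline above does not state the Score-DPO objective, so the following sketch is only illustrative. It shows the standard DPO loss for a single preference pair, followed by a hypothetical score-aware variant that scales the loss by the score gap between the two workflows. The weighting rule, the function names, and the assumption that scores lie in [0, 1] are all assumptions made here, not the paper's formulation.

```python
import math


def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one (winner, loser) pair: -log(sigmoid(margin))."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Numerically stable softplus(-margin), which equals -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def score_weighted_dpo_loss(logp_w: float, logp_l: float,
                            ref_logp_w: float, ref_logp_l: float,
                            score_w: float, score_l: float,
                            beta: float = 0.1) -> float:
    """Hypothetical variant: pairs with a larger score gap contribute more.

    Assumes evaluation scores in [0, 1] with score_w > score_l.
    """
    gap = score_w - score_l
    if gap <= 0:
        raise ValueError("winner must have the strictly higher score")
    return gap * dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta)
```

Under this assumed weighting, a pair whose scores are nearly equal (and therefore more likely to be ordered wrongly by noisy evaluation) has little influence on the update, which is one simple way to reflect the "score variance and inaccuracies" concern.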
Methodology
Iterative workflow generation
Evaluation score feedback
Fine-tuning generator with Score-DPO
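The three methodology steps above can be sketched as a loop: sample several workflows per task, score them, turn the scores into preference pairs, and fine-tune the generator on those pairs. The callables `generate`, `evaluate`, and `fine_tune`, along with the parameter names and default values, are hypothetical placeholders; the summary does not specify how pairs are formed, so the all-pairs rule below is an assumption.

```python
from itertools import combinations
from typing import Callable, List, Tuple

Pair = Tuple[str, str, float, float]  # (winner, loser, winner_score, loser_score)


def build_preference_pairs(scored: List[Tuple[str, float]]) -> List[Pair]:
    """Pair up scored workflows; the higher-scoring one is the winner."""
    pairs: List[Pair] = []
    for (wf_a, s_a), (wf_b, s_b) in combinations(scored, 2):
        if s_a == s_b:
            continue  # ties carry no preference signal
        if s_a > s_b:
            pairs.append((wf_a, wf_b, s_a, s_b))
        else:
            pairs.append((wf_b, wf_a, s_b, s_a))
    return pairs


def optimize(generate: Callable[[str], str],
             evaluate: Callable[[str, str], float],
             fine_tune: Callable[[Callable[[str], str], List[Pair]], Callable[[str], str]],
             tasks: List[str],
             iterations: int = 3,
             samples_per_task: int = 4) -> Callable[[str], str]:
    """Iterate: generate workflows, score them, fine-tune on preference pairs."""
    for _ in range(iterations):
        pairs: List[Pair] = []
        for task in tasks:
            candidates = [generate(task) for _ in range(samples_per_task)]
            scored = [(wf, evaluate(wf, task)) for wf in candidates]
            pairs.extend(build_preference_pairs(scored))
        # fine_tune returns the updated generator used in the next round.
        generate = fine_tune(generate, pairs)
    return generate
```

Keeping the scores inside each pair, rather than only the winner and loser labels, is what lets a score-aware objective use them during fine-tuning.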
Experimental Results
8.2% improvement over baselines
Outperforms baselines on 6 benchmarks
Smaller models surpass larger ones
Contributions