ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization
ScoreFlow is a framework for optimizing large language model (LLM) multi-agent workflows. It addresses the inflexibility and limited scalability of existing methods by applying efficient gradient-based optimization in a continuous space. The framework introduces Score-DPO, a variant of direct preference optimization that incorporates quantitative evaluation feedback. The reported result is an 8.2% improvement over baselines, with smaller models able to surpass larger ones at lower inference cost.
Article Points:
1. ScoreFlow: Automated, adaptive framework for LLM agent workflow generation.
2. Score-DPO: Novel preference optimization method using quantitative evaluation scores.
3. Achieves 8.2% improvement over baselines across diverse tasks.
4. Enables smaller LLMs to outperform larger models with lower inference costs.
5. Leverages efficient gradient-based optimization in a continuous space.
6. Uses code as a flexible representation for workflow search space.
Problem Addressed

Inflexible existing methods

Limited adaptability & scalability

High manual effort for workflows

Proposed Solution (ScoreFlow)

Automated, adaptive framework

Gradient-based optimization

Code as workflow representation

Cost-efficient with open-source LLMs
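
To make "code as workflow representation" concrete, here is a minimal sketch of a workflow written as an ordinary Python function. The agent functions `draft_agent` and `review_agent` are hypothetical stand-ins for LLM calls and are not taken from ScoreFlow itself; the point is only that a workflow expressed as code can use normal control flow such as a conditional.

```python
from typing import Callable

# Hypothetical agent type: takes text in, returns text out.
# A real system would call an LLM here; these stubs are for illustration only.
Agent = Callable[[str], str]


def draft_agent(problem: str) -> str:
    """Stub agent that produces a first answer."""
    return f"draft answer to: {problem}"


def review_agent(draft: str) -> str:
    """Stub agent that revises an existing answer."""
    return draft.replace("draft", "reviewed")


def workflow(problem: str,
             drafter: Agent = draft_agent,
             reviewer: Agent = review_agent) -> str:
    """A workflow as plain code: call agents and branch on their output."""
    answer = drafter(problem)
    if "draft" in answer:  # ordinary conditional logic inside the workflow
        answer = reviewer(answer)
    return answer
```

Because the workflow is just source text, a generator model can emit many variants of it, and each variant can be executed and scored.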

Key Component (Score-DPO)

Novel DPO variant

Incorporates quantitative scores

Enhances efficiency & stability

Addresses score variance & inaccuracies
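
As a rough illustration of how quantitative scores might enter a DPO-style objective, the sketch below computes the standard DPO loss for a preference pair and then weights each pair by its evaluation-score gap. The weighting scheme, the dictionary keys, and the function names are assumptions made for this example; they are not the exact Score-DPO formulation.

```python
import math


def dpo_pair_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair of log-probabilities."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) rewritten as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))


def score_weighted_loss(pairs, beta=0.1):
    """Average DPO loss with each pair weighted by its score gap.

    ASSUMPTION: weighting by (score_w - score_l) is one simple way to let
    evaluation scores influence training; the actual method may differ.
    """
    total, weight_sum = 0.0, 0.0
    for p in pairs:
        gap = p["score_w"] - p["score_l"]
        if gap <= 0:
            continue  # skip pairs whose scores give no clear preference
        loss = dpo_pair_loss(p["logp_w"], p["logp_l"],
                             p["ref_logp_w"], p["ref_logp_l"], beta)
        total += gap * loss
        weight_sum += gap
    return total / weight_sum if weight_sum else 0.0
```

Under this assumed weighting, pairs with a large score gap contribute more to the loss, and pairs with no gap are dropped, which is one way to limit the influence of noisy evaluation scores.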

Methodology

Iterative workflow generation

Evaluation score feedback

Fine-tuning generator with Score-DPO
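
The three steps above can be sketched as a loop: generate candidate workflows, score them, and collect score-labelled preference pairs for fine-tuning. Everything here is a hypothetical stand-in (`generate_workflows`, `evaluate`, `build_pairs`, `optimize`, and the random scores), and the fine-tuning step is left as a placeholder comment rather than implemented.

```python
import random
from itertools import combinations


def generate_workflows(task, k, rng):
    """Hypothetical generator: returns k candidate workflows (here, just labels)."""
    return [f"{task}-wf{rng.randint(0, 999)}" for _ in range(k)]


def evaluate(workflow, rng):
    """Hypothetical evaluator: returns a score in [0, 1) for one workflow."""
    return rng.random()


def build_pairs(scored):
    """Turn scored candidates into (winner, loser, score_w, score_l) tuples."""
    pairs = []
    for (wf_a, s_a), (wf_b, s_b) in combinations(scored, 2):
        if s_a == s_b:
            continue  # no preference signal when scores tie
        if s_a > s_b:
            pairs.append((wf_a, wf_b, s_a, s_b))
        else:
            pairs.append((wf_b, wf_a, s_b, s_a))
    return pairs


def optimize(tasks, iterations=2, k=4, seed=0):
    """Run the generate -> evaluate -> collect-pairs loop for a few iterations."""
    rng = random.Random(seed)
    history = []
    for _ in range(iterations):
        pairs = []
        for task in tasks:
            candidates = generate_workflows(task, k, rng)
            scored = [(wf, evaluate(wf, rng)) for wf in candidates]
            pairs.extend(build_pairs(scored))
        # Placeholder: a real system would fine-tune the generator on `pairs`
        # here, so that the next iteration samples from the updated model.
        history.append(pairs)
    return history
```

Calling `optimize(["task-a", "task-b"])` returns one list of preference pairs per iteration; in a full system each list would feed the preference-optimization step before the next round of generation.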

Experimental Results

8.2% improvement over baselines

Outperforms baselines on 6 benchmarks

Smaller models surpass larger ones

Contributions

ScoreFlow framework

Score-DPO optimization method

Extensive evaluations and robustness analysis