Diverse And Private Synthetic Datasets Generation for RAG evaluation: A multi-agent framework
This paper introduces a multi-agent framework for generating diverse, privacy-preserving synthetic QA datasets to evaluate Retrieval-Augmented Generation (RAG) systems. It addresses the need for high-quality evaluation datasets that reflect real-world constraints such as sensitive-information protection while maintaining broad topical coverage. The framework aims to provide a practical and ethically aligned pathway toward more comprehensive RAG system evaluation.
Article Points:
1. RAG evaluation requires diverse, privacy-preserving synthetic QA datasets.
2. Existing RAG benchmarks often lack real-world complexity and topical coverage.
3. Proposed multi-agent framework generates diverse, privacy-aware QA datasets.
4. Diversity agent uses clustering for broad topical coverage and semantic variability.
5. Privacy agent detects and masks sensitive PII across various domains.
6. Framework outperforms baselines in diversity and ensures robust privacy masking.
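The diversity agent's clustering step (point 4) can be sketched roughly as follows. This is a minimal pure-Python k-means over toy document embeddings, assuming cosine similarity as the proximity measure and deterministic farthest-point initialization; the card does not specify the actual clustering algorithm or embedding model, so all of those choices are assumptions for illustration:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def init_centroids(vectors, k):
    # Deterministic farthest-point initialization (an assumption, for reproducibility)
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids

def kmeans(vectors, k, iters=10):
    centroids = init_centroids(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: attach each vector to its most similar centroid
        for j, v in enumerate(vectors):
            labels[j] = max(range(k), key=lambda i: cosine(v, centroids[i]))
        # Update step: recompute each centroid as the mean of its members
        for i in range(k):
            members = [vectors[j] for j in range(len(vectors)) if labels[j] == i]
            if members:
                dim = len(members[0])
                centroids[i] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return labels

# Toy 2-D "document embeddings" covering two distinct topics
docs = [
    ("billing FAQ",  [0.90, 0.10]),
    ("invoice help", [0.85, 0.15]),
    ("hiking guide", [0.10, 0.90]),
    ("trail safety", [0.20, 0.80]),
]
labels = kmeans([v for _, v in docs], k=2)
```

Sampling source passages from every resulting cluster, rather than at random, is what gives the generated QA set broad topical coverage.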
Problem
RAG evaluation challenges
Benchmarks lack diversity & real-world complexity
Retrieval systems face privacy issues
Proposed Solution
Multi-agent framework for synthetic QA
Prioritizes semantic diversity
Ensures privacy preservation
Ethically aligned pathway
Multi-Agent Framework
Diversity Agent
- Clustering for topical coverage
Privacy Agent
- Detects & masks sensitive PII
QA Curation Agent
- Synthesizes private QA pairs
LangGraph orchestration
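The paper orchestrates the three agents with LangGraph; purely to illustrate the data flow, here is a plain-Python stand-in that chains the three stages. The regex masker stands in for the privacy agent's PII detector and the canned functions stand in for LLM calls, so every name and pattern below is hypothetical rather than from the paper:

```python
import re

# Hypothetical patterns; the real privacy agent uses learned PII detection.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def diversity_agent(corpus):
    # Stand-in: the real agent clusters embeddings and samples per cluster.
    return corpus

def privacy_agent(passages):
    # Detect and mask sensitive PII before any QA synthesis sees the text.
    masked = []
    for text in passages:
        for label, pattern in PII_PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        masked.append(text)
    return masked

def qa_curation_agent(passages):
    # Stand-in: the real agent prompts an LLM to write a QA pair per passage.
    return [{"question": f"What does this passage say? {p}", "context": p}
            for p in passages]

def pipeline(corpus):
    # Diversity -> privacy -> QA curation, the order the card describes.
    return qa_curation_agent(privacy_agent(diversity_agent(corpus)))

qa = pipeline(["Contact Jane at jane.doe@example.com or 555-123-4567."])
# qa[0]["context"] == "Contact Jane at [EMAIL] or [PHONE]."
```

Running the privacy stage before QA curation means the downstream LLM never observes raw PII, which is the design property the framework's masking guarantees rest on.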
Evaluation
Diversity assessment
- LLM-as-a-Judge & cosine similarity
Privacy assessment
- AI4Privacy datasets for entity masking
Outperforms baseline methods
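The cosine-similarity side of the diversity assessment can be approximated as follows. This sketch assumes diversity is reported as one minus the mean pairwise cosine similarity of question embeddings; the paper's exact metric definition may differ:

```python
import math
from itertools import combinations

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def diversity_score(embeddings):
    # 1 - mean pairwise cosine similarity; higher means more diverse.
    pairs = list(combinations(embeddings, 2))
    mean_sim = sum(cosine(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim

# Toy question embeddings: near-duplicates vs. a topically spread set
narrow = [[1.0, 0.0], [0.99, 0.01], [0.98, 0.02]]
broad  = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(diversity_score(narrow), diversity_score(broad))
```

An embedding-based score like this complements the LLM-as-a-Judge assessment: the former captures semantic spread cheaply, while the latter can judge qualities (question type, difficulty) that raw cosine distances miss.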
Future Work