Section 01
[Introduction] BARRED Framework: Asymmetric Debate for Synthetic Data Enables Small Models to Excel at Customized Policy Guardrails
The BARRED (Boundary Alignment Refinement through REflection and Debate) framework generates high-quality synthetic training data through dimension decomposition and multi-agent debate validation, and it requires only a task description and a small number of unlabeled samples. It addresses the manual annotation bottleneck in building customized policy guardrails, enabling small fine-tuned models to outperform proprietary large language models on this task.