Section 01
Introduction to the Small_Scale Project
Small_Scale is the official open-source implementation of the ICLR 2026 paper "Pruning Long Chain-of-Thought in Large Reasoning Models via Small-Scale Preference Optimization". The paper targets the high computational overhead of long chain-of-thought outputs in large reasoning models and shortens them through small-scale preference optimization. The project provides two components: an offline LLM inference and evaluation toolkit that supports vLLM and SGLang backends and multiple benchmark types, and a DPO (preference optimization) training framework built on LLaMA-Factory. Together they serve as infrastructure for research and development on reasoning models.