# OPDHub: A Graph and Retrieval Platform for Papers on On-Policy Distillation of Large Language Models

> OPDHub is the first searchable paper graph platform that systematically organizes research in the field of On-Policy Distillation (OPD) for large language models. It is accompanied by an arXiv review paper and provides services such as category filtering, one-click navigation, and continuously updated academic resource aggregation.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-02T10:43:09.000Z
- Last activity: 2026-06-02T10:52:22.887Z
- Popularity: 159.8
- Keywords: On-Policy Distillation, large language models, model distillation, paper graph, academic resources, knowledge distillation, OPD, literature retrieval
- Page link: https://www.zingnex.cn/en/forum/thread/opdhub-f073fd4d
- Canonical: https://www.zingnex.cn/forum/thread/opdhub-f073fd4d
- Markdown source: floors_fallback

---

## Original Authors and Source

- **Original Author/Maintainer**: nick7nlp (Mingyang Song, Mao Zheng)
- **Source Platform**: GitHub
- **Original Title**: OPDHub
- **Original Link**: https://github.com/nick7nlp/OPDHub
- **Publication Time**: 2026-06-02

---

## Background: Evolution and Challenges of Large Model Distillation Technology

Large language models have advanced rapidly, but their capabilities come with heavy computational and storage costs. Model distillation, which transfers knowledge from a large teacher model to a smaller student model, has become a key way to ease this tension. Among the branches of distillation, On-Policy Distillation (OPD) has drawn particular attention because of its training paradigm.

Unlike traditional offline distillation, which trains on a fixed dataset prepared in advance, OPD generates samples during training, so knowledge transfers through ongoing interaction between the teacher and the student. In the usual formulation, the student produces the samples and the teacher supplies feedback on them, so the student learns on the distribution it will actually produce at inference time, which often yields better distillation results. The field is moving quickly, however, and the relevant papers are scattered across conferences and journals without systematic organization or classification, which makes literature surveys difficult for researchers and engineers.
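
To make the contrast with offline distillation concrete, here is a minimal sketch of one on-policy step. It assumes the common formulation in which the student samples a sequence and the teacher is queried on the student's own prefixes, with a per-token reverse KL as the loss. The toy models, the three-token vocabulary, and the function names are all hypothetical and are not taken from OPDHub or the survey.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_probs, teacher_probs):
    """KL(student || teacher) for a single next-token distribution."""
    return sum(
        s * math.log(s / t)
        for s, t in zip(student_probs, teacher_probs)
        if s > 0
    )


def on_policy_step(student_logits_fn, teacher_logits_fn, prompt, max_len, rng):
    """Sample from the student, scoring each prefix with the teacher."""
    tokens = list(prompt)
    losses = []
    for _ in range(max_len):
        s = softmax(student_logits_fn(tokens))
        t = softmax(teacher_logits_fn(tokens))
        losses.append(reverse_kl(s, t))
        # The next token comes from the student, not from a fixed dataset.
        next_token = rng.choices(range(len(s)), weights=s, k=1)[0]
        tokens.append(next_token)
    return tokens[len(prompt):], sum(losses) / len(losses)


def toy_student(tokens):
    """Hypothetical untrained student: uniform over a 3-token vocabulary."""
    return [0.0, 0.0, 0.0]


def toy_teacher(tokens):
    """Hypothetical teacher: prefers the token after the last one, mod 3."""
    last = tokens[-1] if tokens else 0
    logits = [0.0, 0.0, 0.0]
    logits[(last + 1) % 3] = 2.0
    return logits


rng = random.Random(0)
sampled, loss = on_policy_step(toy_student, toy_teacher, [0], 5, rng)
print(sampled, round(loss, 4))
```

In a real training loop the averaged loss would be backpropagated through the student; an offline variant would instead iterate over sequences stored before training began.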

---

## Project Overview: Positioning and Value of OPDHub

OPDHub is a searchable paper graph platform dedicated to research on on-policy distillation of large language models. As the companion website for the arXiv review paper *A Survey of On-Policy Distillation for Large Language Models* (arXiv:2604.00626), it systematically organizes the important research results in this field.

The core value of this platform lies in structuring and aggregating scattered academic resources. Through a clear classification system and convenient retrieval functions, it helps researchers quickly locate relevant literature, understand the evolution of technology, and grasp cutting-edge research directions.

---

## Paper Classification System

OPDHub adopts the classification framework established in the review paper, dividing OPD-related research into multiple dimensions according to methodology:

**Objective Design**: Covers the design of distillation loss functions, including methods based on KL divergence, contrastive learning, and task-specific optimization.

**Signal Source**: Distinguishes the types of supervision signal the teacher model provides, such as logit distributions, hidden-layer representations, attention matrices, and generated text sequences.

**Training Stabilization**: Collects methods that address common instabilities in the on-policy distillation process, including curriculum learning, temperature annealing, and adversarial training.
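
Two of the ideas named above, KL-based objectives and temperature annealing, can be illustrated with a short sketch. The forward and reverse KL definitions are standard; the linear annealing schedule and its default values (`t_start=2.0`, `t_end=1.0`) are assumptions chosen for illustration, not a scheme documented by OPDHub or the survey.

```python
import math


def softmax_t(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature flattens the output."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def annealed_temperature(step, total_steps, t_start=2.0, t_end=1.0):
    """Linearly interpolate the temperature from t_start to t_end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return t_start + frac * (t_end - t_start)


teacher_logits = [3.0, 1.0, 0.0]
student_logits = [1.0, 1.0, 1.0]

for step in (0, 50, 100):
    temp = annealed_temperature(step, total_steps=100)
    teacher = softmax_t(teacher_logits, temp)
    student = softmax_t(student_logits, temp)
    forward = kl(teacher, student)   # KL(teacher || student)
    reverse = kl(student, teacher)   # KL(student || teacher)
    print(step, temp, round(forward, 4), round(reverse, 4))
```

The two divergences differ because KL is asymmetric: forward KL penalizes the student for missing mass where the teacher is confident, while reverse KL penalizes the student for placing mass where the teacher assigns little.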

## Retrieval and Filtering Functions

The platform provides multi-dimensional paper filtering capabilities:

- **Chapter Navigation**: Browse relevant literature according to the chapter structure of the review paper
- **Loss Category**: Filter by the type of distillation loss
- **Publication Year**: Track technology evolution by time dimension
- **One-click Filter**: Combine multiple conditions to quickly locate target literature
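
Combined filtering of this kind can be sketched as a conjunction of optional conditions over paper records. The `Paper` fields, the category labels, and the placeholder entries below are hypothetical; the actual OPDHub data model is not described in the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Paper:
    """Hypothetical paper record; field names are illustrative only."""
    title: str
    year: int
    section: str
    loss_category: str


def filter_papers(
    papers: List[Paper],
    *,
    section: Optional[str] = None,
    loss_category: Optional[str] = None,
    year: Optional[int] = None,
) -> List[Paper]:
    """Keep papers matching every condition that is not None."""
    return [
        p
        for p in papers
        if (section is None or p.section == section)
        and (loss_category is None or p.loss_category == loss_category)
        and (year is None or p.year == year)
    ]


# Placeholder entries, not real papers.
PAPERS = [
    Paper("Placeholder paper A", 2024, "Objective Design", "reverse-kl"),
    Paper("Placeholder paper B", 2025, "Signal Source", "forward-kl"),
    Paper("Placeholder paper C", 2025, "Objective Design", "reverse-kl"),
]

matches = filter_papers(PAPERS, section="Objective Design", year=2025)
print([p.title for p in matches])
```

Treating an unset condition as "match everything" lets a single function serve chapter navigation, loss-category filtering, and year filtering, alone or combined.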

## Visual Design

OPDHub follows the typesetting style of COLM (Conference on Language Modeling), pairing the EB Garamond and Inconsolata fonts to give a comfortable reading experience while keeping an academic look.

---

## Data Source and Update Process

OPDHub's paper metadata comes from the companion Awesome-LLM-On-Policy-Distillation repository, a community-maintained list of selected papers. Updates flow one way, from the repository to the website:

1. Researchers or readers submit an Issue or PR in the Awesome-LLM-On-Policy-Distillation repository
2. Maintainers review and merge the updates
3. The updates are automatically synchronized to the OPDHub website

This design keeps a single source of truth for the data, which ensures consistency and reduces the complexity of maintaining the website.
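
A one-way synchronization of this kind can be sketched as a pure function from the list's Markdown to the site's data file. The entry format (`- [Title](url) (year)`), the output fields, and the function names are assumptions made for illustration; the real repository layout and sync tooling are not described in the source.

```python
import json
import re

# Hypothetical entry format: "- [Title](https://example.org/paper) (2025)"
ENTRY = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)\s*\((?P<year>\d{4})\)\s*$"
)


def parse_list(markdown_text):
    """Extract paper records from list lines, ignoring everything else."""
    papers = []
    for line in markdown_text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            papers.append(
                {
                    "title": match["title"],
                    "url": match["url"],
                    "year": int(match["year"]),
                }
            )
    return papers


def build_site_data(markdown_text):
    """Render the records as JSON; the website only ever reads this output."""
    return json.dumps(parse_list(markdown_text), indent=2, sort_keys=True)


SAMPLE = """\
# Placeholder list

- [Placeholder paper A](https://example.org/a) (2024)
- [Placeholder paper B](https://example.org/b) (2025)
- a line that does not follow the assumed format
"""

print(build_site_data(SAMPLE))
```

Because the output is derived entirely from the list, rerunning the step after each merge regenerates the site data without any edits flowing back to the repository.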