Zing Forum

PoetryQwen: A Specialized Large Model for Classical Chinese Poetry Understanding and Translation

This article introduces PoetryQwen, a model specialized for classical Chinese poetry, built by fine-tuning Qwen2.5-14B with LoRA. Trained on the newly constructed CCPoetry-49K dataset, it improves on its baseline by 9.7% on the CCL25-Eval Task 5 benchmark, with gains in both translation accuracy and emotional understanding of classical poetry.

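Tags: classical poetry, Chinese NLP, LoRA fine-tuning, domain-specific model, emotional understanding, Qwen, CCL evaluation, cultural heritage, instruction fine-tuning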
Published 2026-06-11 01:54 | Recent activity 2026-06-11 11:31 | Estimated read 5 min

Section 01

[Introduction] PoetryQwen: Core Breakthroughs of the Specialized Large Model for Classical Chinese Poetry

PoetryQwen is a model specialized for classical Chinese poetry, obtained by fine-tuning Qwen2.5-14B with LoRA on the newly constructed CCPoetry-49K dataset. On the CCL25-Eval Task 5 benchmark it improves on the baseline by 9.7%, with better accuracy in both translation and emotional understanding of classical poetry.

Section 02

Background: Technical Challenges and Existing Limitations of AI for Classical Chinese Poetry

Classical Chinese poetry is concise in language and profound in artistic conception, which poses unique challenges for NLP. Understanding it requires overcoming obstacles in three dimensions: language (differences between ancient and modern vocabulary, special grammar, rich allusions), literature (imagery systems, metrical requirements, implicit expression), and culture (historical context, the author's life, aesthetic traditions). Existing research has two main limitations: generalized processing ignores what is distinctive about poetry, and high-quality specialized datasets are lacking (existing ones are small, uneven in quality, and carry no emotional annotations).

Section 03

Methodology: Core Technical Strategies of PoetryQwen

  1. Domain Dataset Construction: Build the CCPoetry-49K dataset (49,404 samples spanning word explanation, semantic understanding, and emotional inference, across multiple genres and eras) through multi-source integration, cleaning and alignment, and manual verification.
  2. Efficient LoRA Fine-tuning: Starting from Qwen2.5-14B-Instruct, fine-tune with LoRA at rank 64 and a learning rate of 2e-4 for 3 epochs.
  3. Three-task Joint Training: Train the three tasks jointly with shared underlying representations, task-specific output heads, dynamic weight adjustment, and mixed-sample training.
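
To give a feel for why LoRA at rank 64 counts as "efficient", here is a minimal sketch of the parameter arithmetic. The rank, learning rate, and epoch count come from the article; the 5120 x 5120 projection size and the names `LoraSettings` and `lora_param_count` are assumptions for illustration only, since the article does not say which weight matrices are adapted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hyperparameters as reported in the article."""
    rank: int = 64
    learning_rate: float = 2e-4
    epochs: int = 3


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA keeps the original matrix W frozen and learns a low-rank update
    B @ A, where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


settings = LoraSettings()
d = 5120  # assumed projection width, for illustration only
full = d * d
added = lora_param_count(d, d, settings.rank)

print(f"full matrix:  {full:,} parameters")
print(f"LoRA adapter: {added:,} parameters ({added / full:.2%} of the full matrix)")
```

Under these assumptions the adapter trains only a small fraction of the weights in each adapted matrix, which is what makes fine-tuning a 14B-parameter model tractable on modest hardware.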

Section 04

Evidence: Outstanding Performance of PoetryQwen on CCL25-Eval and Comparative Analysis

In CCL25-Eval Task 5, PoetryQwen scored 0.757, a 9.7% improvement over the baseline Qwen2.5-14B-Instruct (0.690). By sub-task, the gains were word explanation (+9.4%), semantic understanding (+9.3%), and emotional inference (+10.5%, the largest improvement). The specialized 14B PoetryQwen also outperforms several larger general-purpose models, which supports the value of domain specialization.
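
As a quick check on the headline figure, the two scores quoted above are consistent with a relative gain over the baseline rather than an absolute one. The helper name `relative_gain` is an assumption made for this sketch.

```python
def relative_gain(baseline: float, score: float) -> float:
    """Improvement of `score` over `baseline`, as a fraction of the baseline."""
    return (score - baseline) / baseline


baseline = 0.690    # Qwen2.5-14B-Instruct, as reported
poetryqwen = 0.757  # PoetryQwen, as reported

print(f"absolute gain: {poetryqwen - baseline:.3f}")
print(f"relative gain: {relative_gain(baseline, poetryqwen):.1%}")
```

The absolute difference is 0.067 points, and dividing by the baseline of 0.690 gives the reported 9.7%.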

Section 05

Conclusion: Technical Contributions of PoetryQwen and Insights into Domain Specialization

The technical contributions are: 1. A dataset construction methodology (multi-source integration, quality control, task alignment); 2. An efficient fine-tuning strategy (LoRA configuration, multi-task training); 3. Principles for domain specialization (data first, task decomposition, progressive adaptation, evaluation-driven iteration). These lessons can carry over to other vertical domains.
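
The multi-task training in item 2 draws on samples from all three tasks. The article does not describe the mixing or weighting scheme, so the following is only a sketch of one simple approach (uniform shuffling of a combined pool). The function `mixed_batches`, the task labels, and the toy data are all hypothetical.

```python
import random
from collections import Counter

# Task labels follow the three tasks named in the article.
TASKS = ("word_explanation", "semantic_understanding", "emotional_inference")


def mixed_batches(samples_by_task, batch_size, seed=0):
    """Yield batches drawn from one shuffled pool of all tasks' samples.

    Each item is a (task, sample) pair, so a training loop could handle
    it according to its task.
    """
    pool = [(task, s) for task in TASKS for s in samples_by_task[task]]
    random.Random(seed).shuffle(pool)
    for start in range(0, len(pool), batch_size):
        yield pool[start:start + batch_size]


# Hypothetical toy data: placeholder strings, not CCPoetry-49K records.
toy = {task: [f"{task}-{i}" for i in range(4)] for task in TASKS}
batches = list(mixed_batches(toy, batch_size=6))

for batch in batches:
    # Show how many samples of each task landed in this batch.
    print(Counter(task for task, _ in batch))
```

Shuffling one combined pool keeps every sample in exactly one batch per epoch; a weighted sampler would be the natural extension if some tasks need more emphasis.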

Section 06

Application Scenarios: Practical Value and Potential Applications of PoetryQwen

  1. Educational Assistance: Provide annotated translations and analysis of difficult lines for students, and help teachers prepare materials.
  2. Cultural Inheritance: Support poetry appreciation platforms, digitization of ancient books, and knowledge graph construction.
  3. Creative Writing: Assist in poetry creation and cross-media adaptation (modern Chinese, image captioning).

Section 07

Limitations and Outlook: Shortcomings of PoetryQwen and Future Research Directions

Current limitations: incomplete data coverage (obscure works, dialect poetry), narrow task scope (the focus is on understanding, and generation tasks remain to be explored), limited cultural depth, and no multimodal integration. Future directions: expand the dataset to millions of samples, introduce multimodal data, develop generation tasks, integrate historical knowledge bases, and enhance interactivity.