Section 01
TRN-R1-Zero: A New Paradigm for Text-Rich Network Reasoning via Pure Reinforcement Learning (Introduction)
This article introduces TRN-R1-Zero, a framework that trains large language models (LLMs) to reason over text-rich networks using reinforcement learning alone, with no supervised fine-tuning or distillation, and that transfers zero-shot across domains. It targets two gaps in prior work: traditional graph neural networks (GNNs) depend on supervised learning, and existing LLM-based approaches either ignore graph structure or rely on distillation. To close these gaps, the framework introduces a Neighbor-aware Group Relative Policy Optimization (NG-RPO) mechanism. TRN-R1-Zero is reported to perform strongly on multiple benchmarks, which points to general network reasoning capability.
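For readers unfamiliar with the "group relative" part of the name, the following is a minimal sketch of the advantage computation used in standard Group Relative Policy Optimization (GRPO): several responses are sampled for the same prompt, and each reward is normalized against the group's mean and standard deviation. This section does not specify how the neighbor-aware component of NG-RPO works, so it is not modeled here. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the example rewards are all illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    Hypothetical helper illustrating the standard GRPO idea only.
    The neighbor-aware part of NG-RPO is NOT modeled, because this
    section does not describe it.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    # Assumption: population standard deviation; implementations differ.
    sigma = pstdev(rewards)
    # eps keeps the division finite when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for four sampled answers to one prompt
# (1.0 = correct label, 0.0 = incorrect); these values are made up.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group where every answer earns the same reward carries no learning signal.
flat_advantages = group_relative_advantages([1.0, 1.0, 1.0])
```

Responses that beat the group average receive positive advantages and the rest receive negative ones, so no separately trained value model is needed to provide a baseline.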