# SCPRM: A Schema-Aware Cumulative Process Reward Model for Knowledge Graph Question Answering

> This paper proposes SCPRM, a process reward model for evaluating the intermediate reasoning steps of large language models over knowledge graphs. By combining schema distance with a cumulative reward mechanism, it addresses the risk compensation effect and improves Hits@k by an average of 1.18% on medical and legal knowledge graph question answering tasks.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-04T16:56:01.000Z
- Last activity: 2026-05-05T04:20:48.297Z
- Popularity: 121.6
- Keywords: knowledge graph question answering, process reward model, cumulative reward, schema awareness, Monte Carlo Tree Search, multi-hop reasoning, medical knowledge graph, legal AI
- Page link: https://www.zingnex.cn/en/forum/thread/scprm
- Canonical: https://www.zingnex.cn/forum/thread/scprm

---

## [Overview] SCPRM: A Schema-Aware Cumulative Process Reward Model for Knowledge Graph Question Answering

SCPRM targets process reward evaluation for large language models that reason over knowledge graphs. It scores each reasoning step using two signals, schema distance and a cumulative reward, to counter the risk compensation effect. On medical and legal knowledge graph question answering tasks, it improves Hits@k by an average of 1.18%.

## [Background] Existing Challenges in Knowledge Graph Reasoning Evaluation

When evaluating the reasoning of large language models, traditional outcome reward models score only the final answer and therefore cannot guide intermediate steps. Existing process reward models score individual steps but suffer from the risk compensation effect: an incorrect intermediate step can still receive a high reward if the path is corrected later. Knowledge Graph Question Answering (KGQA) adds its own challenges: many paths can lead to an answer, risk sensitivity is high (a wrong path in the medical or legal domain can have serious consequences), and every step must respect the schema constraints of the graph.
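The risk compensation effect can be made concrete with a toy example. The sketch below is illustrative only: the three-hop path, the `valid` flags, and both labeling rules are assumptions made for this example, not the paper's formulation. It contrasts rewards that are inherited from the final outcome with rewards that depend on the whole prefix.

```python
# Toy illustration; the path, the `valid` flags, and both labeling rules are
# assumptions made for this sketch, not taken from the SCPRM paper.

path = [
    {"relation": "has_symptom", "valid": True},
    {"relation": "contraindicated_with", "valid": False},  # risky wrong hop
    {"relation": "treated_by", "valid": True},             # later hop recovers
]
reached_gold_answer = True


def outcome_derived_rewards(path, reached_gold_answer):
    """Every step inherits the final outcome, so a wrong hop can still score 1.0."""
    return [1.0 if reached_gold_answer else 0.0 for _ in path]


def prefix_conditioned_rewards(path):
    """A step scores 1.0 only if it and every earlier step are valid."""
    rewards, prefix_ok = [], True
    for step in path:
        prefix_ok = prefix_ok and step["valid"]
        rewards.append(1.0 if prefix_ok else 0.0)
    return rewards


print(outcome_derived_rewards(path, reached_gold_answer))  # [1.0, 1.0, 1.0]
print(prefix_conditioned_rewards(path))                     # [1.0, 0.0, 0.0]
```

Under the outcome-derived rule the wrong second hop is fully "compensated" by the correct ending, which is the behavior a risk-sensitive domain cannot afford.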

## [Methodology] Core Innovations of the SCPRM Model and Integration with MCTS

SCPRM introduces two key innovations:

1. Cumulative reward mechanism: each step is evaluated conditioned on its reasoning prefix, so the reward reflects how coherent the step is with the history that precedes it.
2. Schema distance awareness: each step is measured by how well it conforms, at the schema level, to the implicit target of the query, which distinguishes legitimate detours from wrong deviations.

SCPRM is then integrated into the Monte Carlo Tree Search (MCTS) framework, yielding SCPRM-MCTS, in which the reward model guides the search process.
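The summary does not give the paper's formulas, so the following is only a minimal sketch of how the two ideas might fit together. Everything here is an assumption made for illustration: the toy `SCHEMA` graph, the use of shortest-path length over entity types as the schema distance, the fixed scores `1.0`, `0.5`, and `0.2`, and the multiplicative accumulation.

```python
from collections import deque

# Hypothetical type-level schema: which entity types are linked by some relation.
SCHEMA = {
    "Disease": {"Symptom", "Drug", "Department"},
    "Symptom": {"Disease"},
    "Drug": {"Disease", "SideEffect"},
    "SideEffect": {"Drug"},
    "Department": {"Disease"},
}


def schema_distance(source_type, target_type, schema=SCHEMA):
    """Fewest schema edges between two entity types (BFS); None if unreachable."""
    if source_type == target_type:
        return 0
    seen, queue = {source_type}, deque([(source_type, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in schema.get(node, ()):
            if nxt == target_type:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def step_score(prev_type, next_type, target_type):
    """Assumed scoring: reward hops that move closer to the target type."""
    before = schema_distance(prev_type, target_type)
    after = schema_distance(next_type, target_type)
    if before is None or after is None:
        return 0.0
    if after < before:
        return 1.0   # progress toward the target type
    if after == before:
        return 0.5   # sideways move
    return 0.2       # moving away from the target type


def cumulative_rewards(type_path, target_type):
    """Each step's reward is conditioned on its prefix via a running product."""
    running, rewards = 1.0, []
    for prev_type, next_type in zip(type_path, type_path[1:]):
        running *= step_score(prev_type, next_type, target_type)
        rewards.append(running)
    return rewards


direct = ["Symptom", "Disease", "Drug", "SideEffect"]
deviating = ["Symptom", "Disease", "Department", "Disease", "Drug"]
print(cumulative_rewards(direct, "SideEffect"))     # [1.0, 1.0, 1.0]
print(cumulative_rewards(deviating, "SideEffect"))  # [1.0, 0.2, 0.2, 0.2]
```

In this sketch the deviation to `Department` keeps lowering the reward of every later step, even after the path returns to `Disease`. In an MCTS setting, per-step values of this kind could serve as the signal that node expansion and backpropagation rely on, though how SCPRM-MCTS does this is not specified in the summary.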

## [Experiments] Performance Verification of SCPRM-MCTS

SCPRM-MCTS was evaluated on medical and legal KG datasets and on the general-domain CWQ (ComplexWebQuestions) dataset. Hits@k improved by an average of 1.18%. The method showed clear advantages in risk-sensitive reasoning scenarios: it reduced the proportion of high-risk wrong steps, which improves reliability in practical applications.
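For reference, Hits@k is commonly computed as the fraction of questions for which at least one gold answer appears among the top-k ranked predictions. The sketch below follows that common definition; the function name `hits_at_k` and the entity IDs are made up for illustration, and the paper's exact evaluation protocol may differ.

```python
def hits_at_k(ranked_predictions, gold_answers, k):
    """Fraction of questions with at least one gold answer in the top-k predictions."""
    hits = 0
    for preds, gold in zip(ranked_predictions, gold_answers):
        if any(p in gold for p in preds[:k]):
            hits += 1
    return hits / len(gold_answers)


# Two hypothetical questions with ranked candidate entities and gold answer sets.
preds = [["e3", "e1", "e7"], ["e2", "e5", "e9"]]
gold = [{"e1"}, {"e4"}]

print(hits_at_k(preds, gold, 1))  # 0.0
print(hits_at_k(preds, gold, 2))  # 0.5
```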

## [Conclusion and Recommendations] Contributions and Application Insights of SCPRM

Technical contributions: SCPRM refines process reward evaluation, uses schema knowledge to improve reasoning, and offers a path toward risk-aware reinforcement learning. Insights: KGQA systems should be built with attention to the quality of the reasoning path, not only the final answer, and introducing process evaluation in high-risk fields can improve both credibility and practicality.