Section 01
PRoSFI: Guide to the New Method for Improving Reasoning Reliability of Large Language Models
Core Guide to PRoSFI
PRoSFI (Process Reward over Structured Formal Intermediates) is a method for improving the reasoning reliability of large language models. Its core idea is to have models at the 7B-parameter scale produce machine-verifiable reasoning chains by combining structured formal intermediate steps with a process reward mechanism. This addresses a weakness of traditional outcome rewards, which score only the final answer and therefore ignore errors in intermediate reasoning. The method balances the reliability of formal verification against what a model of this size can feasibly generate, offering a new path toward trustworthy reasoning models.
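The difference between an outcome reward and a process reward over verifiable steps can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not PRoSFI's actual design: the `Step` record, the tiny `RULES` table standing in for a formal verifier, and the all-or-nothing scoring are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One structured intermediate step (hypothetical format)."""
    premises: tuple[str, ...]  # facts this step claims to rely on
    conclusion: str            # fact this step derives


# Toy stand-in for a formal verifier: the only inferences it accepts.
RULES = {
    (frozenset({"rain"}), "wet_ground"),
    (frozenset({"wet_ground"}), "slippery"),
}


def step_is_valid(step: Step, known: set[str]) -> bool:
    """A step passes if its premises are already established and it matches a rule."""
    premises = frozenset(step.premises)
    return premises <= known and (premises, step.conclusion) in RULES


def outcome_reward(steps: list[Step], target: str) -> float:
    """Scores only the final conclusion; intermediate steps are never checked."""
    return 1.0 if steps and steps[-1].conclusion == target else 0.0


def process_reward(steps: list[Step], facts: set[str], target: str) -> float:
    """Scores 1.0 only if every step verifies and the target is derived."""
    known = set(facts)
    for step in steps:
        if not step_is_valid(step, known):
            return 0.0  # a single unverifiable step invalidates the chain
        known.add(step.conclusion)
    return 1.0 if target in known else 0.0


facts = {"rain"}
sound = [Step(("rain",), "wet_ground"), Step(("wet_ground",), "slippery")]
flawed = [Step(("rain",), "slippery")]  # right answer, but no rule licenses this jump

for name, chain in [("sound", sound), ("flawed", flawed)]:
    print(name, outcome_reward(chain, "slippery"), process_reward(chain, facts, "slippery"))
```

In this sketch the flawed chain earns the full outcome reward because its final conclusion matches the target, while the process reward rejects it at the unverified step. That gap is the kind of intermediate error an outcome-only signal cannot see.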