Zing Forum

Quarry: Achieving Automated Theorem Proving for Rocq via Difficulty-Aware Decomposition Strategies

This article introduces the Quarry framework, which significantly enhances the automation level of interactive theorem provers by combining the high-level planning capabilities of large language models (LLMs) with the local reasoning capabilities of automated proof tools.

Tags: automated theorem proving, Rocq, Coq, large language models, formal verification, CoqHammer, neuro-symbolic, arXiv
Published 2026-06-16 22:33 | Recent activity 2026-06-17 10:33 | Estimated read 5 min

Section 01

Quarry Framework: Enhancing Rocq's Automated Theorem Proving via LLM Planning and Symbolic Reasoning

Quarry targets the automation bottleneck that interactive theorem provers such as Rocq face in formal verification. The framework separates proof planning from proof execution: a large language model (LLM) handles high-level planning, while an automated proof tool such as CoqHammer handles rigorous local reasoning. According to the article, this combination significantly improves Rocq's automated proof success rate. The core innovation is a difficulty-aware decomposition strategy that solves easier subgoals first and allocates the computational budget accordingly.
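As a rough illustration of the "easier subgoals first" idea, here is a minimal Python sketch. It is not Quarry's actual scheduler: the `Subgoal` class, the `schedule` function, the difficulty scores, and the proportional budget rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Subgoal:
    name: str
    difficulty: float  # hypothetical score in [0, 1]; higher = harder for the hammer


def schedule(subgoals: List[Subgoal], total_budget_s: float) -> List[Tuple[str, float]]:
    """Order subgoals easiest-first and split a time budget among them.

    Assumed rule (illustrative only): each subgoal receives a share of the
    budget proportional to 1 + difficulty, so harder goals get more time
    but are attempted later.
    """
    ordered = sorted(subgoals, key=lambda g: g.difficulty)
    weights = [1.0 + g.difficulty for g in ordered]
    total = sum(weights)
    return [(g.name, total_budget_s * w / total) for g, w in zip(ordered, weights)]


plan = schedule(
    [Subgoal("lemma_b", 0.5), Subgoal("lemma_a", 0.0), Subgoal("lemma_c", 1.0)],
    total_budget_s=600.0,  # the 10-minute budget, in seconds
)
for name, seconds in plan:
    print(f"{name}: {seconds:.1f}s")
```

Trying cheap subgoals first means a decomposition that is likely to fail can be abandoned before much of the budget is spent on its hardest part.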

Section 02

Automation Dilemmas in Formal Verification and Limitations of Existing Methods

Formal verification is a key method for ensuring software correctness, but constructing machine-checkable proofs still requires significant manual effort. Each existing automation approach has its own limitation: heuristic tactics (such as Coq's auto) have limited power; hammer tools (such as CoqHammer) lack long-range planning; and LLM-based methods can propose high-level ideas but lack local rigor. How to combine the planning ability of LLMs with the rigor of symbolic tools remains an open problem in the field.

Section 03

Core Methods and Technical Implementation of the Quarry Framework

The core of Quarry is the separation of planning and execution:

1. Planning phase: the LLM proposes a goal decomposition scheme (sub-lemmas + strategies).
2. Verification phase: Rocq type-checks the decomposition to verify its correctness, and a difficulty model evaluates how likely each subgoal is to be solved by the hammer.
3. Execution phase: sub-lemmas are proved recursively in order of difficulty, under a controlled computational budget.

Technically, it integrates SerAPI (for interaction with Rocq), CoqHammer (the automated proof engine), and a difficulty prediction model based on proof state features.
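The plan/verify/execute loop described above can be sketched in Python as follows. This is a toy under stated assumptions, not Quarry's implementation: `Budget`, `prove`, and the `hammer`, `decompose`, and `difficulty` callbacks are hypothetical names. In a real system, `hammer` would call CoqHammer through SerAPI, and `decompose` would return only LLM-proposed sub-lemmas that Rocq has already checked to imply the goal.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Budget:
    hammer_calls: int  # remaining hammer invocations (stand-in for a time budget)


def prove(goal: str,
          budget: Budget,
          hammer: Callable[[str], bool],
          decompose: Callable[[str], List[str]],
          difficulty: Callable[[str], float],
          depth: int = 0,
          max_depth: int = 3) -> bool:
    """Try the hammer first; on failure, decompose and recurse easiest-first."""
    if budget.hammer_calls <= 0:
        return False  # budget exhausted
    budget.hammer_calls -= 1
    if hammer(goal):
        return True
    if depth >= max_depth:
        return False
    # Assumption: decompose() yields sub-lemmas already verified to imply the goal.
    subgoals = decompose(goal)
    if not subgoals:
        return False
    for sub in sorted(subgoals, key=difficulty):
        if not prove(sub, budget, hammer, decompose, difficulty, depth + 1, max_depth):
            return False
    return True


# Toy stand-ins for the LLM planner, the hammer, and the difficulty model.
TOY_PLANS = {"main": ["hard_part", "easy_part"], "hard_part": ["h1", "h2"]}
HAMMER_SOLVABLE = {"easy_part", "h1", "h2"}
attempt_order: List[str] = []


def toy_hammer(goal: str) -> bool:
    attempt_order.append(goal)
    return goal in HAMMER_SOLVABLE


def toy_decompose(goal: str) -> List[str]:
    return TOY_PLANS.get(goal, [])


def toy_difficulty(goal: str) -> float:
    return 0.0 if goal in HAMMER_SOLVABLE else 1.0


budget = Budget(hammer_calls=10)
result = prove("main", budget, toy_hammer, toy_decompose, toy_difficulty)
print(result, attempt_order, budget.hammer_calls)
```

In this toy run, `easy_part` is attempted before `hard_part`, and `hard_part` is decomposed further only after the hammer fails on it directly.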

Section 04

Experimental Evaluation Results and Advantages of Quarry

On Rocq benchmarks, Quarry raised the success rate by 7%-13% over the strongest baseline under a 10-minute budget. Compared with pure LLM methods, its cost is more predictable. It also works with both open-source and commercial LLMs, which indicates strong generality.

Section 05

Technical Contributions and Application Prospects of Quarry

Technical contributions include a new paradigm of neuro-symbolic collaboration (LLMs and symbolic systems each perform their own roles), difficulty-aware resource allocation, and progressive verification strategies. Application prospects cover critical software verification (reducing manual effort), mathematical formalization (assisting theorem transformation), and educational tools (helping students understand proofs).

Section 06

Limitations of Quarry and Future Research Directions

Limitations: decomposition quality depends on the LLM, the difficulty model generalizes insufficiently, and complex proofs can require deep recursion. Future directions: stronger planning models, online learning for the difficulty model, interactive assistants, and porting to other provers (such as Isabelle).
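To make the "online learning for the difficulty model" direction concrete, here is one possible shape for it: a logistic model updated after every hammer attempt. This is purely a sketch and not something the article describes. The `OnlineDifficultyModel` class, the two features (scaled hypothesis count and goal size), and the learning rate are all assumptions.

```python
import math
from typing import List


class OnlineDifficultyModel:
    """Hypothetical online logistic model: P(hammer solves goal | features)."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x: List[float]) -> float:
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x: List[float], solved: bool) -> None:
        # One stochastic gradient step on the log loss for this observation.
        err = self.predict(x) - (1.0 if solved else 0.0)
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


# Assumed features: [scaled hypothesis count, scaled goal size].
model = OnlineDifficultyModel(n_features=2)
small_goal, large_goal = [0.1, 0.1], [0.9, 0.9]
for _ in range(200):
    model.update(small_goal, solved=True)   # hammer succeeded
    model.update(large_goal, solved=False)  # hammer timed out

print(model.predict(small_goal) > model.predict(large_goal))
```

Each proof attempt then doubles as a training example, so the model can adapt to a new project without a separate offline training phase.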