Skip to main navigation Skip to search Skip to main content

MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving

  • Ruida Wang
  • , Rui Pan
  • , Yuxin Li
  • , Jipeng Zhang
  • , Yizhen Jia
  • , Shizhe Diao
  • , Renjie Pi
  • , Junjie Hu
  • , Tong Zhang

Research output: Contribution to journalConference articlepeer-review

Abstract

Solving mathematical problems using computer-verifiable languages like Lean has significantly impacted the mathematical and computer science communities. State-of-the-art methods utilize a single Large Language Model (LLM) to generate complete proof or perform tree search, but they fail to balance these tasks. We pro pose MA-LoT: Model-CollAboration Lean-based Long Chain-of-Thought, a comprehensive frame-work for Lean4 theorem proving to solve this is-sue. It separates the cognition tasks of general NL for whole-proof generation and error analysis for proof correction using the model-collaboration method. We achieve this by structured interac-tion of the LLM and Lean4 verifier in Long CoT. To implement the framework, we propose the novel LoT-Transfer Learning training-inference pipeline, which enables the Long CoT thinking ca-pability to LLMs without special data annotation. Extensive experiment shows that our framework achieves a 61.07% accuracy rate on the Lean4 ver-sion of the MiniF2F-Test dataset, largely outper-forming DeepSeek-V3 (33.61%), single-model tree search (InternLM-Step-Prover, 50.70%), and whole-proof generation (Godel-Prover, 55.33%) baselines. Furthermore, our findings highlight the potential of combining Long CoT with formal verification for a more insightful generation in a broader perspective.

Original languageEnglish (US)
Pages (from-to)63972-64004
Number of pages33
JournalProceedings of Machine Learning Research
Volume267
StatePublished - 2025
Event42nd International Conference on Machine Learning, ICML 2025 - Vancouver, Canada
Duration: Jul 13 2025Jul 19 2025

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving'. Together they form a unique fingerprint.

Cite this