QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning

Alex Sanchez-Stern, Abhishek Varghese, Zhanna Kaufman, Dylan Zhang, Talia Ringer, Yuriy Brun

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Formal verification is a promising method for producing reliable software, but the difficulty of manually writing verification proofs severely limits its utility in practice. Recent methods have automated some proof synthesis by guiding a search through the proof space using a theorem prover. Unfortunately, the theorem prover provides only the crudest estimate of progress, resulting in effectively undirected search. To address this problem, we create QEDCartographer, an automated proofsynthesis tool that combines supervised and reinforcement learning to more effectively explore the proof space. QEDCartographer incorporates the proofs' branching structure, enabling rewardfree search and overcoming the sparse reward problem inherent to formal verification. We evaluate QEDCartographer using the CoqGym benchmark of 68.5 K theorems from 124 open-source Coq projects. QEDCartographer fully automatically proves 21.4% of the test-set theorems. Previous search-based proof-synthesis tools Tok, Tac, ASTactic, Passport, and Proverbot9001, which rely only on supervised learning, prove 9.6%, 9.8%, 10.9%, 12.5%, and 19.8%, respectively. Diva, which combines 62 tools, proves 19.2%. Comparing to the most effective prior tool, Proverbot9001, QEDCartographer produces 26% shorter proofs 27% faster, on average over the theorems both tools prove. Together, QEDCartographer and non-learning-based CoqHammer prove 31.8% of the theorems, while CoqHammer alone proves 26.6%. Our work demonstrates that reinforcement learning is a fruitful research direction for improving proof-synthesis tools' search mechanisms.

Original languageEnglish (US)
Title of host publicationProceedings - 2025 IEEE/ACM 47th International Conference on Software Engineering, ICSE 2025
PublisherIEEE Computer Society
Pages307-320
Number of pages14
ISBN (Electronic)9798331505691
DOIs
StatePublished - 2025
Event47th IEEE/ACM International Conference on Software Engineering, ICSE 2025 - Ottawa, Canada
Duration: Apr 27 2025May 3 2025

Publication series

NameProceedings - International Conference on Software Engineering
ISSN (Print)0270-5257

Conference

Conference47th IEEE/ACM International Conference on Software Engineering, ICSE 2025
Country/TerritoryCanada
CityOttawa
Period4/27/255/3/25

Keywords

  • formal verification
  • proof assistants
  • proof synthesis
  • reinforcement learning

ASJC Scopus subject areas

  • Software

Fingerprint

Dive into the research topics of 'QEDCartographer: Automating Formal Verification Using Reward-Free Reinforcement Learning'. Together they form a unique fingerprint.

Cite this