
TIGER: A Generating-Then-Ranking Framework for Practical Python Type Inference

  • Chong Wang
  • Jian Zhang
  • Yiling Lou
  • Mingwei Liu
  • Weisong Sun
  • Yang Liu
  • Xin Peng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Python's dynamic typing system offers flexibility and expressiveness but can lead to type-related errors, prompting the need for automated type inference to enhance type hinting. While existing learning-based approaches show promising inference accuracy, they struggle with practical challenges in comprehensively handling various types, including complex parameterized types and (unseen) user-defined types. In this paper, we introduce TIGER, a two-stage generating-then-ranking (GTR) framework, designed to effectively handle Python's diverse type categories. TIGER leverages fine-tuned pre-trained code models to train a generative model with a span masking objective and a similarity model with a contrastive training objective. This approach allows TIGER to generate a wide range of type candidates, including complex parameterized types in the generating stage, and accurately rank them with user-defined types in the ranking stage. Our evaluation on the ManyTypes4Py dataset shows TIGER's advantage over existing methods in various type categories, notably improving accuracy in inferring user-defined and unseen types by 11.2% and 20.1% respectively in Top-5 Exact Match. Moreover, the experimental results not only demonstrate TIGER's superior performance and efficiency, but also underscore the significance of its generating and ranking stages in enhancing automated type inference.
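The generating-then-ranking idea described in the abstract can be illustrated with a toy sketch. This is not TIGER's implementation: the generator is stubbed with a fixed candidate list, and the similarity model is replaced by cosine similarity over character-trigram counts. All function names (`embed`, `generate_candidates`, `rank_candidates`) and the example inputs are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def embed(text: str) -> Counter:
    """Toy embedding: character-trigram counts (stand-in for a learned encoder)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def generate_candidates(context: str) -> list[str]:
    """Stage 1 (stubbed): a real generator would decode type strings from a model.

    The fixed list below is an assumption made purely for this sketch.
    """
    return ["str", "List[str]", "Dict[str, int]", "Optional[int]"]


def rank_candidates(context: str, generated: list[str],
                    user_defined: list[str], top_k: int = 5) -> list[str]:
    """Stage 2: pool generated and user-defined types, then rank by similarity."""
    pool = list(dict.fromkeys(generated + user_defined))  # dedupe, keep order
    ctx = embed(context)
    ranked = sorted(pool, key=lambda t: cosine(ctx, embed(t)), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    context = "user_account = load_user_account(uid)"
    top5 = rank_candidates(context, generate_candidates(context),
                           ["UserAccount", "Invoice"])
    print(top5)
```

The point of the two stages in this sketch is that the ranking pool can include user-defined types that a generator never produced, which is how a type absent from the candidate list can still appear in the Top-5 result.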

Original language: English (US)
Title of host publication: Proceedings - 2025 IEEE/ACM 47th International Conference on Software Engineering, ICSE 2025
Publisher: IEEE Computer Society
Pages: 321-333
Number of pages: 13
ISBN (Electronic): 9798331505691
DOIs
State: Published - 2025
Externally published: Yes
Event: 47th IEEE/ACM International Conference on Software Engineering, ICSE 2025 - Ottawa, Canada
Duration: Apr 27 2025 to May 3 2025

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (Print): 0270-5257

Conference

Conference: 47th IEEE/ACM International Conference on Software Engineering, ICSE 2025
Country/Territory: Canada
City: Ottawa
Period: 4/27/25 to 5/3/25

Keywords

  • contrastive learning
  • generating-then-ranking
  • pre-trained code models
  • type inference

ASJC Scopus subject areas

  • Software
