Safety and Trust in Artificial Intelligence with Abstract Interpretation

Research output: Contribution to journal › Review article › peer-review

Abstract

Deep neural networks (DNNs) now dominate the AI landscape and have shown impressive performance in diverse application domains, including vision, natural language processing (NLP), and healthcare. However, both public and private entities have increasingly expressed significant concern about the potential of state-of-the-art AI models to cause societal and financial harm. This lack of trust arises from their black-box construction and their vulnerability to natural and adversarial noise. As a result, researchers have spent considerable time developing automated methods for building safe and trustworthy DNNs. Among the various approaches, abstract interpretation has emerged as the most popular framework for efficiently analyzing realistic DNNs. However, due to fundamental differences in the computational structure of DNNs (e.g., high nonlinearity) compared to traditional programs, developing efficient DNN analyzers has required tackling research challenges significantly different from those encountered for programs. In this monograph, we describe state-of-the-art approaches based on abstract interpretation for analyzing DNNs. These approaches include the design of new abstract domains, synthesis of novel abstract transformers, abstraction refinement, and incremental analysis. We discuss how the analysis results can be used to: (i) formally check whether a trained DNN satisfies desired output and gradient-based safety properties, (ii) guide the model updates during training towards satisfying safety properties, and (iii) reliably explain and interpret the black-box workings of DNNs.

Original language: English (US)
Pages (from-to): 250-408
Number of pages: 159
Journal: Foundations and Trends in Programming Languages
Volume: 8
Issue number: 3-4
Early online date: 2025
State: Published - Jun 26 2025

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Computer Science Applications
