Explaining deep learning based security applications

Wenbo Guo, Jun Xu, Gang Wang, Xinyu Xing

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Many explanation techniques have been developed to provide interpretable explanations for deep learning results. However, these methods are optimized for non-security tasks (e.g., image analysis). Their key assumptions are often violated in security applications, resulting in low explanation fidelity. In this chapter, we introduce LEMNA, a high-fidelity explanation method dedicated to security applications. Given an input data sample, LEMNA generates a small set of interpretable features to explain how the input sample is classified. The core idea is to approximate a local area of the complex deep learning decision boundary using a simple interpretable model. The local interpretable model is specially designed to (1) handle feature dependency to better work with security applications; and (2) handle nonlinear local boundaries to boost explanation fidelity. To demonstrate the utility of LEMNA, we apply it to a popular deep learning application in security: function start detection from binary executables. Extensive evaluations show that LEMNA's explanations have a much higher fidelity level than those of existing methods. In addition, we demonstrate practical use cases of LEMNA that help machine learning developers validate model behavior, troubleshoot classification errors, and automatically patch the errors of the target models.

Original language: English (US)
Title of host publication: AI Embedded Assurance for Cyber Systems
Publisher: Springer
Pages: 219-246
Number of pages: 28
ISBN (Electronic): 9783031426377
ISBN (Print): 9783031426360
State: Published - Dec 12 2023

ASJC Scopus subject areas

  • General Computer Science
  • General Engineering
  • General Social Sciences
