Is It Overkill? Analyzing Feature-Space Concept Drift in Malware Detectors

Zhi Chen, Zhenning Zhang, Zeliang Kan, Limin Yang, Jacopo Cortellazzi, Feargus Pendlebury, Fabio Pierazzi, Lorenzo Cavallaro, Gang Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Concept drift is a major challenge faced by machine-learning-based malware detectors when deployed in practice. While existing works have investigated methods to detect concept drift, the main causes behind the drift are not yet well understood. In this paper, we design experiments to empirically analyze the impact of feature-space drift (new features introduced by new samples) and compare it with data-space drift (data distribution shift over existing features). Surprisingly, we find that data-space drift is the dominant contributor to model degradation over time, while feature-space drift has little to no impact. This is consistently observed for both Android and PE malware detectors, with different feature types and feature engineering methods, across different settings. We further validate this observation with recent online-learning-based malware detectors that incrementally update the feature space. Our results indicate that concept drift may be handled without frequent feature updating, and we further discuss open questions for future research.
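
The two kinds of drift named in the abstract can be illustrated with a small sketch. This is not the authors' experimental method: the function names, the toy feature strings, and the two metrics (share of unseen features, mean absolute change in per-feature frequency) are all assumptions chosen only to make the definitions concrete. Each sample is modeled as a set of binary features.

```python
from collections import Counter


def build_vocabulary(samples):
    """Collect every feature that appears in at least one sample."""
    vocab = set()
    for features in samples:
        vocab.update(features)
    return vocab


def feature_space_drift(train_vocab, new_samples):
    """Share of distinct features in the new samples that the training
    vocabulary has never seen (hypothetical metric, for illustration)."""
    new_vocab = build_vocabulary(new_samples)
    if not new_vocab:
        return 0.0
    return len(new_vocab - train_vocab) / len(new_vocab)


def feature_frequencies(samples, vocab):
    """Fraction of samples containing each feature of a fixed vocabulary."""
    counts = Counter()
    for features in samples:
        counts.update(f for f in set(features) if f in vocab)
    return {f: counts[f] / len(samples) for f in vocab}


def data_space_drift(train_samples, new_samples, vocab):
    """Mean absolute change in per-feature frequency, measured only over
    the existing (training) features (hypothetical metric)."""
    p = feature_frequencies(train_samples, vocab)
    q = feature_frequencies(new_samples, vocab)
    return sum(abs(p[f] - q[f]) for f in vocab) / len(vocab)


# Toy data: the feature names are made up for this example.
train_samples = [
    {"perm.SMS", "api.getDeviceId"},
    {"perm.SMS"},
    {"perm.INTERNET"},
    {"perm.INTERNET", "api.getDeviceId"},
]
new_samples = [
    {"perm.INTERNET", "api.newCall"},
    {"perm.INTERNET"},
    {"perm.INTERNET", "api.getDeviceId"},
    {"perm.INTERNET"},
]

train_vocab = build_vocabulary(train_samples)
fs_drift = feature_space_drift(train_vocab, new_samples)
ds_drift = data_space_drift(train_samples, new_samples, train_vocab)
print(f"feature-space drift: {fs_drift:.3f}")
print(f"data-space drift:    {ds_drift:.3f}")
```

In this toy data, one of the three features in the new samples is unseen, while the frequencies of all three existing features shift. A detector whose feature space is frozen at training time would ignore the unseen feature but would still be exposed to the frequency shift, which is the distinction the abstract draws.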

Original language: English (US)
Title of host publication: Proceeding - 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 21-28
Number of pages: 8
ISBN (Electronic): 9798350312362
State: Published - 2023
Event: 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023 - San Francisco, United States
Duration: May 22 2023 to May 25 2023

Publication series

Name: Proceeding - 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023

Conference

Conference: 44th IEEE Symposium on Security and Privacy Workshops, SPW 2023
Country/Territory: United States
City: San Francisco
Period: 5/22/23 to 5/25/23

Keywords

  • concept-drift
  • machine-learning
  • malware-classifier

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
  • Safety, Risk, Reliability and Quality
