DrSec: Flexible Distributed Representations for Efficient Endpoint Security

Mahmood Sharif, Pubali Datta, Andy Riddle, Kim Westfall, Adam Bates, Vijay Ganti, Matthew Lentz, David Ott

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The increasing complexity of attacks has given rise to varied security applications tackling profound tasks, ranging from alert triage to attack reconstruction. Yet, security products, such as Endpoint Detection and Response, bring together applications that are developed in isolation, trigger many false positives, miss actual attacks, and produce limited labels useful in supervised learning schemes. To address these challenges, we propose DrSec, a system employing self-supervised learning to pre-train foundation language models (LMs) that ingest event-sequence data and emit distributed representations for processes. Once pre-trained, the LMs can be adapted to solve different downstream tasks with limited to no supervision, helping unify the currently fractured application ecosystem. We trained DrSec with two LM types on a real-world dataset containing ∼91M processes and ∼2.55B events, and tested it in three application domains. We found that DrSec enables accurate, unsupervised process identification; outperforms leading methods on alert triage to reduce alert fatigue (e.g., 75.11% vs. ≤64.31% precision-recall area under curve); and accurately learns expert-developed rules, allowing tuning incident detectors to control false positives and negatives.

Original language: English (US)
Title of host publication: Proceedings - 45th IEEE Symposium on Security and Privacy, SP 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3609-3624
Number of pages: 16
ISBN (Electronic): 9798350331301
State: Published - 2024
Event: 45th IEEE Symposium on Security and Privacy, SP 2024 - San Francisco, United States
Duration: May 20 2024 → May 23 2024

Publication series

Name: Proceedings - IEEE Symposium on Security and Privacy
ISSN (Print): 1081-6011

Conference

Conference: 45th IEEE Symposium on Security and Privacy, SP 2024
Country/Territory: United States
City: San Francisco
Period: 5/20/24 → 5/23/24

Keywords

  • alert triage
  • EDR
  • Endpoint security
  • language models
  • process identification
  • self-supervision

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Software
  • Computer Networks and Communications
