A bayesian multi-armed bandit approach for identifying human vulnerabilities

Erik Miehling, Baicen Xiao, Radha Poovendran, Tamer Başar

Research output: Chapter in Book/Report/Conference proceedingConference contribution


We consider the problem of identifying the set of users in an organization’s network that are most susceptible to falling victim to social engineering attacks. To achieve this goal, we propose a testing strategy, based on the theory of multi-armed bandits, that involves a system administrator sending fake malicious messages to users in a sequence of unannounced tests and recording their responses. To accurately model the administrator’s testing problem, we propose a new bandit setting, termed the structured combinatorial multi-bandit model, that allows one to impose combinatorial constraints on the space of allowable queries. The model captures the diversity in attack types and user responses by considering multiple multi-armed bandits, where each bandit problem represents an attack (message) type and each arm represents a user. Users respond to test messages according to a response model with unknown statistics. The response model associates a Bernoulli distribution with an unknown mean with each message-user pair, dictating the likelihood that a user will respond to a given message. The administrator’s problem of identifying the most susceptible users can then be expressed as identifying the set of message-user pairs with means that exceed a given threshold. We adopt a Bayesian approach to solving the problem, associating a (beta) prior distribution with each unknown mean. In a given trial, the system administrator queries a selection of users with test messages, generating query responses which are then used to update posterior distributions on the means. By defining a state as the parameters of the posteriors, we show that the optimal testing strategy can be characterized as the solution of a Markov decision process (MDP). Unfortunately, solving the MDP is computationally intractable. As a result, we propose a heuristic testing strategy, based on Thompson sampling, that focuses queries on message-user pairs that are estimated to have means close to the threshold. The heuristic testing strategy is shown to yield accurate identifications.

Original languageEnglish (US)
Title of host publicationDecision and Game Theory for Security - 9th International Conference, GameSec 2018, Proceedings
EditorsLinda Bushnell, Radha Poovendran, Tamer Basar
Number of pages19
ISBN (Print)9783030015534
StatePublished - 2018
Externally publishedYes
Event9th International Conference on Decision and Game Theory for Security, GameSec 2018 - Seattle, United States
Duration: Oct 29 2018Oct 31 2018

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11199 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349


Other9th International Conference on Decision and Game Theory for Security, GameSec 2018
Country/TerritoryUnited States


  • Dynamic programming
  • Multi-armed bandits
  • Social engineering attacks
  • Sociotechnical systems
  • Thompson sampling

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science


Dive into the research topics of 'A bayesian multi-armed bandit approach for identifying human vulnerabilities'. Together they form a unique fingerprint.

Cite this