Abstract
This article considers general repeated security games with no prior knowledge, i.e., games in which the payoffs and the attacker's behavior model are unknown and observability is limited. In addition to the traditional "regret" criterion, "reallocation times" is introduced as a second criterion, providing a more comprehensive evaluation of defense strategies. For such games, a novel random-walk perturbations with uniform exploration (RWP-UE) algorithm is proposed, and upper bounds on its expected regret and expected reallocation times are derived. Theoretical analysis shows that the RWP-UE algorithm achieves regret of the same order of magnitude as existing results while requiring fewer reallocations. Experiments against four types of attackers illustrate that the RWP-UE algorithm achieves superior performance.
Original language | English (US) |
---|---|
Pages (from-to) | 2156-2168 |
Number of pages | 13 |
Journal | IEEE Transactions on Cognitive and Developmental Systems |
Volume | 15 |
Issue number | 4 |
State | Published - Dec 1 2023 |
Keywords
- Low reallocation times
- no prior knowledge and limited observability
- random-walk perturbations with uniform exploration (RWP-UE)
- repeated security games
- theoretical upper bound
ASJC Scopus subject areas
- Software
- Artificial Intelligence