Abstract
Eigendecomposition methods have been shown to generate sets of useful options that improve learning speed when used in hierarchical reinforcement learning. However, these methods focus on navigation by learning reward-agnostic representations, and they struggle in environments with dynamic reward structures, such as those containing adversarial agents. Taking inspiration from mammals, which are known to maintain specialized groupings of cells to perform complex planning, we propose leveraging the linear feature space of the successor features framework to independently encode the spatial and reward information of an environment. We show that subsequent decomposition of this representation yields options that relate separately to spatial or rewarding features, allowing for complex spatial planning around dynamic objects, such as adversaries. We also propose the use of principal component analysis to perform this decomposition, due to its use of a clustering basis, which we show identifies options for spatial planning better than common methods. We combine these ideas in our Specialized Neurons and Clustering Architecture (SNAC), which uses a split successor feature encoding and cluster-based decomposition, and empirically demonstrate that this architecture produces options that are sensitive to adversarial agents, thus improving learning speed and performance in challenging and dynamic spatial planning tasks.
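The decomposition idea described above can be illustrated with a minimal sketch (this is not the authors' code; the 8-state ring environment, random-walk dynamics, and all variable names are illustrative assumptions): build a tabular successor representation (SR), then apply principal component analysis to its rows, with the top components serving as reward-agnostic spatial features from which options could be derived.

```python
import numpy as np

# Illustrative sketch (assumed setup, not the SNAC implementation):
# PCA over a tabular successor representation of a small ring-world.

n_states, gamma = 8, 0.9

# Random-walk transition matrix on a ring: step left/right with prob 0.5.
P = np.zeros((n_states, n_states))
for s in range(n_states):
    P[s, (s - 1) % n_states] = 0.5
    P[s, (s + 1) % n_states] = 0.5

# Tabular SR: Psi = (I - gamma * P)^(-1); row s holds the discounted
# expected future occupancy of every state when starting from s.
Psi = np.linalg.inv(np.eye(n_states) - gamma * P)

# PCA: center the rows, then eigendecompose the covariance of the SR.
Psi_centered = Psi - Psi.mean(axis=0)
cov = Psi_centered.T @ Psi_centered / n_states
eigvals, eigvecs = np.linalg.eigh(cov)

# The components with the largest eigenvalues capture the dominant
# spatial structure; each could define the intrinsic reward of one option.
top_components = eigvecs[:, np.argsort(eigvals)[::-1][:2]]
print(top_components.shape)  # → (8, 2)
```

Under this sketch, replacing the plain SR with a representation that encodes spatial and reward information in separate feature blocks, as the abstract proposes, would let the decomposition produce spatial and reward-related options independently.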
| Original language | English (US) |
|---|---|
| State | Published - 2022 |
| Event | Adaptive and Learning Agents Workshop, ALA 2022 at AAMAS 2022, Auckland, New Zealand |
| Duration | May 9 2022 → May 10 2022 |
Conference
| Conference | Adaptive and Learning Agents Workshop, ALA 2022 at AAMAS 2022 |
|---|---|
| Country/Territory | New Zealand |
| City | Auckland |
| Period | 5/9/22 → 5/10/22 |
Keywords
- Hierarchical Reinforcement Learning
- Options Learning
- Reinforcement Learning
ASJC Scopus subject areas
- Artificial Intelligence
- Software