Abstract

Eigendecomposition methods have been shown to generate sets of useful options that improve learning speed when used in hierarchical reinforcement learning. However, these methods focus on navigation by learning reward-agnostic representations, and they struggle in environments with dynamic reward structures, such as those containing adversarial agents. Taking inspiration from mammals, which are known to maintain specialized groupings of cells to perform complex planning, we propose leveraging the linear feature space of the successor features framework to independently encode the spatial and reward information of an environment. We show that subsequent decomposition of this representation results in options that separately relate to spatial or rewarding features, allowing for complex spatial planning around dynamic objects, such as adversaries. We also propose the use of principal component analysis to perform this decomposition, due to its use of a clustering basis, which we show better identifies options for spatial planning than common methods. We combine these ideas in our Specialized Neurons and Clustering Architecture (SNAC), which uses a split successor feature encoding and cluster-based decomposition, and empirically demonstrate that this architecture produces options that are sensitive to adversarial agents, thus improving learning speed and performance in challenging and dynamic spatial planning tasks.
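The two ideas named in the abstract, a split successor feature encoding followed by a principal component decomposition of each part, can be illustrated with a small tabular sketch. This is not the SNAC implementation: the 1-D chain environment, the uniform random-walk policy, the single `adversary_state`, and all function names are assumptions made for illustration only.

```python
import numpy as np

def random_walk_transitions(n_states):
    """Uniform random walk on a 1-D chain with reflecting ends (assumed toy environment)."""
    P = np.zeros((n_states, n_states))
    for s in range(n_states):
        for nxt in (max(s - 1, 0), min(s + 1, n_states - 1)):
            P[s, nxt] += 0.5
    return P

def successor_features(P, Phi, gamma=0.9):
    """Closed-form successor features Psi = (I - gamma * P)^{-1} Phi for a fixed policy."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, Phi)

def pca_components(X, k):
    """Top-k principal directions of the rows of X, computed via SVD of the centered data."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]

n_states = 10
adversary_state = 7  # hypothetical location of a penalizing adversary

P = random_walk_transitions(n_states)

# Split feature vector: one block for position, one block for reward-related signals.
phi_spatial = np.eye(n_states)            # one-hot position features
phi_reward = np.zeros((n_states, 1))
phi_reward[adversary_state, 0] = -1.0     # negative feature at the adversary

Phi = np.hstack([phi_spatial, phi_reward])
Psi = successor_features(P, Phi)

# Because the features are linear, the two blocks can be separated again afterwards.
psi_spatial = Psi[:, :n_states]
psi_reward = Psi[:, n_states:]

# Decompose each block on its own, so directions relate to space or to reward.
spatial_dirs = pca_components(psi_spatial, k=2)
reward_dirs = pca_components(psi_reward, k=1)

# Per-state scores along each spatial direction; in eigenoption-style methods
# such scores are typically used to define intrinsic rewards for options.
spatial_scores = psi_spatial @ spatial_dirs.T
```

Keeping the two blocks separate before decomposition is what lets a direction be read as either spatial or reward-related; decomposing the joint matrix `Psi` would mix the two.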

Original language: English (US)
State: Published - 2022
Event: Adaptive and Learning Agents Workshop, ALA 2022 at AAMAS 2022 - Auckland, New Zealand
Duration: May 9 2022 to May 10 2022

Conference

Conference: Adaptive and Learning Agents Workshop, ALA 2022 at AAMAS 2022
Country/Territory: New Zealand
City: Auckland
Period: 5/9/22 to 5/10/22

Keywords

  • Hierarchical Reinforcement Learning
  • Options Learning
  • Reinforcement Learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software

Title: Feature Specialization and Clustering Improves Hierarchical Subtask Learning