Improving predictive state representations via gradient descent

Nan Jiang, Alex Kulesza, Satinder Singh

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Predictive state representations (PSRs) model dynamical systems using appropriately chosen predictions about future observations as a representation of the current state. In contrast to the hidden states posited by HMMs or RNNs, PSR states are directly observable in the training data; this gives rise to a moment-matching spectral algorithm for learning PSRs that is computationally efficient and statistically consistent when the model complexity matches that of the true system generating the data. In practice, however, model mismatch is inevitable and while spectral learning remains appealingly fast and simple it may fail to find optimal models. To address this problem, we investigate the use of gradient methods for improving spectrally-learned PSRs. We show that only a small amount of additional gradient optimization can lead to significant performance gains, and moreover that initializing gradient methods with the spectral learning solution yields better models in significantly less time than starting from scratch.

Original languageEnglish (US)
Title of host publication30th AAAI Conference on Artificial Intelligence, AAAI 2016
PublisherAmerican Association for Artificial Intelligence (AAAI) Press
Pages1709-1715
Number of pages7
ISBN (Electronic)9781577357605
StatePublished - 2016
Externally publishedYes
Event30th AAAI Conference on Artificial Intelligence, AAAI 2016 - Phoenix, United States
Duration: Feb 12 2016Feb 17 2016

Publication series

Name30th AAAI Conference on Artificial Intelligence, AAAI 2016

Other

Other30th AAAI Conference on Artificial Intelligence, AAAI 2016
Country/TerritoryUnited States
CityPhoenix
Period2/12/162/17/16

ASJC Scopus subject areas

  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Improving predictive state representations via gradient descent'. Together they form a unique fingerprint.

Cite this