Theory for automatic learning under partially observed Markov-dependent noise

Sid Yakowitz, Thusitha Jayawardena, Shu Li

Research output: Contribution to journal › Article › peer-review


A vigorous branch of automatic learning is directed at the task of locating a global minimum of an unknown multimodal function f(θ) on the basis of noisy observations L(θ(i)) = f(θ(i)) + W(θ(i)) taken at sequentially chosen control points {θ(i)}. In all preceding convergence derivations known to us, the noise is postulated to depend on the past only through control selection. The present paper contributes to the literature by allowing the observation noise sequence to be stochastically dependent, being a function of an unknown underlying Markov decision process, the observations being the stage-wise losses. In a sense to be made precise, the algorithm offered here is shown to attain asymptotically optimal performance, and rates are assured. A motivating example from queueing theory is offered, and connections with classical problems of Markov control theory and other disciplines are mentioned.

Original language: English (US)
Pages (from-to): 1316-1324
Number of pages: 9
Journal: IEEE Transactions on Automatic Control
Issue number: 9
State: Published - Sep 1992
Externally published: Yes

