T1 - Solutions to a Class of Nonstandard Stochastic Control Problems with Active Learning

AU - Basar, Tamer

Funding Information:
Manuscript received August 21, 1987; revised April 12, 1988. Paper recommended by Associate Editor, D. A. Castanon. This work was supported in part by the U.S. Air Force Office of Scientific Research under Grant AFOSR 84-0056.

PY - 1988/12

Y1 - 1988/12

N2 - We formulate and solve a dynamic stochastic optimization problem of a nonstandard type, whose optimal solution features active learning. The proof of optimality and the derivation of the corresponding control policies are indirect: they relate the original single-person optimization problem to a sequence of nested zero-sum stochastic games. Existence of saddle points for these games implies the existence of optimal policies for the original stochastic control problem, which, in turn, can be obtained from the solution of a nonlinear deterministic optimal control problem. The paper also studies the existence of stationary optimal policies when the time horizon is infinite and the objective function is discounted.

JO - IEEE Transactions on Automatic Control

