TY - JOUR

T1 - Solutions to a Class of Nonstandard Stochastic Control Problems with Active Learning

AU - Basar, Tamer

N1 - Funding Information:
Manuscript received August 21, 1987; revised April 12, 1988. Paper recommended by Associate Editor, D. A. Castanon. This work was supported in part by the U.S. Air Force Office of Scientific Research under Grant AFOSR 84-0056.

PY - 1988/12

Y1 - 1988/12

AB - We formulate and solve a dynamic stochastic optimization problem of a nonstandard type, whose optimal solution features active learning. The proof of optimality and the derivation of the corresponding control policies is an indirect one, which relates the original single-person optimization problem to a sequence of nested zero-sum stochastic games. Existence of saddle points for these games implies the existence of optimal policies for the original stochastic control problem, which, in turn, can be obtained from the solution of a nonlinear deterministic optimal control problem. The paper also studies the problem of existence of stationary optimal policies when the time horizon is infinite and the objective function is discounted.

UR - http://www.scopus.com/inward/record.url?scp=0024122882&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0024122882&partnerID=8YFLogxK

U2 - 10.1109/9.14434

DO - 10.1109/9.14434

M3 - Letter

AN - SCOPUS:0024122882

VL - 33

SP - 1122

EP - 1129

JO - IEEE Transactions on Automatic Control

JF - IEEE Transactions on Automatic Control

SN - 0018-9286

IS - 12

ER -