Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion

J. Minjárez-Sosa

Affiliations

Departamento de Matemáticas, Universidad de Sonora, Rosales s/n Col. Centro, C.P. 83000, Hermosillo, Son., México

Abstract

We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1} = F(x_t, a_t, \xi_t)$, $t = 1, 2, \ldots$, with i.i.d. $\mathbb{R}^k$-valued random vectors $\xi_t$, which are observable but whose density $\varrho$ is unknown.
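As an illustration of the setting only, the following minimal sketch simulates a system of the form x_{t+1} = F(x_t, a_t, ξ_t) and builds a nonparametric estimate of the disturbance density from the observed ξ_t. Everything specific in it is an assumption made for the example and is not taken from the paper: the scalar case k = 1, the hypothetical dynamics `F`, the fixed placeholder action, the exponential disturbance law, and the Gaussian kernel estimator with a hand-picked bandwidth. The estimator and policy construction actually used in the paper may differ.

```python
import math
import random


def F(x, a, xi):
    """Hypothetical dynamics (an assumption): a storage-type recursion."""
    return max(x + a - xi, 0.0)


def kernel_density_estimate(samples, bandwidth):
    """Return a Gaussian kernel estimate of the density of `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def rho_hat(s):
        return norm * sum(
            math.exp(-0.5 * ((s - z) / bandwidth) ** 2) for z in samples
        )

    return rho_hat


def simulate(horizon, seed=0):
    """Run the system and record the observed disturbances xi_t."""
    rng = random.Random(seed)
    x = 0.0
    observed = []
    for _ in range(horizon):
        a = 0.5                    # placeholder action, not an optimal policy
        xi = rng.expovariate(1.0)  # "true" density, unknown to the controller
        x = F(x, a, xi)
        observed.append(xi)
    return x, observed


x_final, xis = simulate(2000)
rho_hat = kernel_density_estimate(xis, bandwidth=0.2)
print(rho_hat(1.0), math.exp(-1.0))  # estimate vs. true density at s = 1
```

In an adaptive scheme of this kind, the estimate `rho_hat` would stand in for the unknown density when the controller selects actions, and it would be refined as more disturbances are observed.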

Keywords

Markov control process, discounted and average cost criterion, adaptive policy

