On adaptive control of a partially observed Markov chain

Di Masi, Giovanni B.; Stettner, Łukasz

Article details

Journal

Applicationes Mathematicae

1993-1995 | 22 | 2 | 165-180

Article title

On adaptive control of a partially observed Markov chain

Authors

Giovanni B. Di Masi, Łukasz Stettner

Content

Full texts:

http://matwbn.icm.edu.pl/ksiazki/zm/zm22/zm2223.pdf [remote]

Publication languages

EN

Abstracts

EN

A control problem for a partially observable Markov chain depending on a parameter with long run average cost is studied. Using uniform ergodicity arguments it is shown that, for values of the parameter varying in a compact set, it is possible to consider only a finite number of nearly optimal controls based on the values of actually computable approximate filters. This leads to an algorithm that guarantees nearly selfoptimizing properties without identifiability conditions. The algorithm is based on probing control, whose cost is additionally assumed to be periodically observable.
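The filters mentioned in the abstract are conditional distributions of the hidden state given the observations. As a rough illustration only, the sketch below shows one step of the standard exact Bayes filter for a finite-state chain with a known parameter value and a fixed control; it is not the approximate filter or the adaptive algorithm constructed in the paper. The function name `filter_step` and all numerical values are hypothetical.

```python
def filter_step(belief, transition, likelihood):
    """One Bayes update of the filter for a finite-state Markov chain.

    belief[i]        : current conditional probability of state i
    transition[i][j] : probability of moving from state i to state j
                       (for the chosen control and a fixed parameter value)
    likelihood[j]    : probability of the new observation given new state j
    """
    n = len(belief)
    # Prediction: push the current belief through the transition matrix.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correction: weight by the observation likelihood, then normalize.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical two-state example (all values are assumptions).
transition = [[0.9, 0.1], [0.2, 0.8]]
likelihood = [0.7, 0.1]  # likelihood of the observed symbol in each state
belief = filter_step([0.5, 0.5], transition, likelihood)
print(belief)
```

In an adaptive setting one would typically run such an update for each candidate parameter value, which is one reason a finite family of candidates is convenient.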

Keywords

EN

uniform ergodicity, long run average cost, filtering process, adaptive control, approximate filter, partially observed systems

Publisher

Institute of Mathematics Polish Academy of Sciences

Journal

Applicationes Mathematicae

Year

1993-1995

Volume

22

Issue

2

Pages

165-180

Physical description

Dates

published

1994

received

1992-11-02

Contributors

author

Giovanni B. Di Masi

Dipartimento di Matematica Pura ed Applicata and CNR-Ladseb, Università di Padova, I-35100 Padova, Italy

author

Łukasz Stettner

Institute of Mathematics, Polish Academy of Sciences, Śniadeckich 8, 00-950 Warszawa, Poland

Bibliography

[1] A. Arapostathis and S. I. Marcus, Analysis of an identification algorithm arising in the adaptive estimation of Markov chains, Math. Control Signals Systems 3 (1990), 1-29.
[2] V. V. Baranov, A recursive algorithm in Markovian decision processes, Cybernetics 18 (1982), 499-506.
[3] D. P. Bertsekas, Dynamic Programming and Stochastic Control, Academic Press, New York, 1976.
[4] J. L. Doob, Stochastic Processes, Wiley, New York, 1953.
[5] W. Feller, An Introduction to Probability Theory and Its Applications II, Wiley, New York, 1971.
[6] E. Fernández-Gaucherand, A. Arapostathis and S. I. Marcus, On the adaptive control of a partially observable Markov decision process, in: Proc. 27th IEEE Conf. on Decision and Control, 1988, 1204-1210.
[7] E. Fernández-Gaucherand, A. Arapostathis and S. I. Marcus, On the adaptive control of a partially observable binary Markov decision process, in: Advances in Computing and Control, W. A. Porter, S. C. Kak and J. L. Aravena (eds.), Lecture Notes in Control and Inform. Sci. 130, Springer, New York, 1989, 217-228.
[8] L. G. Gubenko and E. S. Shtatland, On discrete-time Markov decision processes, Theory Probab. Math. Statist. 7 (1975), 47-61.
[9] O. Hernández-Lerma, Adaptive Markov Control Processes, Springer, New York, 1989.
[10] O. Hernández-Lerma and S. I. Marcus, Adaptive control of Markov processes with incomplete state information and unknown parameters, J. Optim. Theory Appl. 52 (1987), 227-241.
[11] O. Hernández-Lerma and S. I. Marcus, Nonparametric adaptive control of discrete-time partially observable stochastic systems, J. Math. Anal. Appl. 137 (1989), 312-334.
[12] A. H. Jazwinski, Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.
[13] N. W. Kartashov, Criteria for uniform ergodicity and strong stability of Markov chains in general state space, Theory Probab. Math. Statist. 30 (1985), 71-89.
[14] P. R. Kumar and P. Varaiya, Stochastic Systems: Estimation, Identification and Adaptive Control, Prentice-Hall, Englewood Cliffs, 1986.
[15] H. J. Kushner and H. Huang, Approximation and limit results for nonlinear filters with wide bandwidth observation noise, Stochastics 16 (1986), 65-96.
[16] G. E. Monahan, A survey of partially observable Markov decision processes: theory, models and algorithms, Management Sci. 28 (1982), 1-16.
[17] W. J. Runggaldier and Ł. Stettner, Nearly optimal controls for stochastic ergodic problems with partial observation, SIAM J. Control Optim. 31 (1993), 180-218.
[18] Ł. Stettner, On nearly self-optimizing strategies for a discrete-time uniformly ergodic adaptive model, J. Appl. Math. Optim. 27 (1993), 161-177.

YADDA identifier

bwmeta1.element.bwnjournal-article-zmv22z2p165bwm
