Article
Original scientific text

Title

Recursive self-tuning control of finite Markov chains

Authors 1

Affiliations

  1. Department of Computer Science and Automation, Indian Institute of Science, Bangalore-560012, India

Abstract

A recursive self-tuning control scheme for finite Markov chains is proposed in which the unknown parameter is estimated by a stochastic approximation scheme that maximizes the log-likelihood function, and the control is obtained via a relative value iteration algorithm. The analysis uses the asymptotic ordinary differential equations (o.d.e.s) associated with these two algorithms.
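To make the control half of the abstract concrete, here is a minimal sketch of relative value iteration for average-cost control of a finite chain whose transition law is fully known. It follows the standard textbook form and is not necessarily the variant analysed in the article, where the iteration runs recursively alongside the parameter estimates. The function name, the array layout `P[u][i][j]` and `c[i][u]`, the reference state `ref`, and the two-state example are all assumptions made for illustration.

```python
def relative_value_iteration(P, c, ref=0, tol=1e-10, max_iter=10_000):
    """Textbook relative value iteration (illustrative sketch only).

    P[u][i][j] : probability of moving from state i to j under action u
    c[i][u]    : one-step cost of taking action u in state i
    ref        : reference state whose value is pinned to zero
    Returns (gain estimate, relative values h, greedy policy).
    """
    n_actions = len(P)
    n_states = len(c)
    h = [0.0] * n_states
    for _ in range(max_iter):
        # q[i][u] = c(i,u) + sum_j p(j | i,u) h(j)
        q = [[c[i][u] + sum(P[u][i][j] * h[j] for j in range(n_states))
              for u in range(n_actions)]
             for i in range(n_states)]
        t = [min(row) for row in q]
        gain = t[ref]                      # estimate of the optimal average cost
        h_new = [ti - gain for ti in t]    # subtract the offset so h stays bounded
        diff = max(abs(a - b) for a, b in zip(h_new, h))
        h = h_new
        if diff < tol:
            break
    # Greedy policy with respect to the last computed q-values.
    policy = [min(range(n_actions), key=lambda u: q[i][u])
              for i in range(n_states)]
    return gain, h, policy


# Hypothetical two-state, two-action chain (all transition probabilities positive).
P = [
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.6, 0.4]],   # action 1
]
c = [[1.0, 1.5],                # costs in state 0 for actions 0, 1
     [3.0, 2.0]]                # costs in state 1 for actions 0, 1

gain, h, policy = relative_value_iteration(P, c)
print(gain, h, policy)
```

For this example the minimising stationary policy takes action 0 in state 0 and action 1 in state 1. Its stationary distribution is (6/7, 1/7), so the optimal average cost is 8/7. In an adaptive setting the known `P` would be replaced by the transition law evaluated at the current parameter estimate.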

Keywords

controlled Markov chains, stochastic approximation, relative value iteration, self-tuning control, adaptive control

Bibliography

  1. D. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, Englewood Cliffs, N.J., 1987.
  2. V. S. Borkar, Identification and adaptive control of Markov chains, Ph.D. Thesis, Dept. of Electrical Engrg. and Computer Science, Univ. of California, Berkeley, 1980.
  3. V. S. Borkar, Topics in Controlled Markov Chains, Pitman Res. Notes in Math. 240, Longman Scientific and Technical, Harlow, 1991.
  4. V. S. Borkar, The Kumar-Becker-Lin scheme revisited, J. Optim. Theory Appl. 66 (1990), 289-309.
  5. V. S. Borkar, On Milito-Cruz adaptive control scheme for Markov chains, J. Optim. Theory Appl. 77 (1993), 385-393.
  6. V. S. Borkar and K. Soumyanath, A new analog parallel scheme for fixed point computation I: theory, submitted.
  7. V. S. Borkar and P. P. Varaiya, Adaptive control of Markov chains I: finite parameter case, IEEE Trans. Automat. Control AC-24 (1979), 953-957.
  8. V. S. Borkar and P. P. Varaiya, Identification and adaptive control of Markov chains, SIAM J. Control Optim. 20 (1982), 470-488.
  9. Y.-S. Chow and H. Teicher, Probability Theory: Independence, Interchangeability, Martingales, Springer, New York, 1979.
  10. B. Doshi and S. Shreve, Randomized self-tuning control of Markov chains, J. Appl. Probab. 17 (1980), 726-734.
  11. Y. El Fattah, Recursive algorithms for adaptive control of finite Markov chains, IEEE Trans. Systems Man Cybernet. SMC-11 (1981), 135-144.
  12. Y. El Fattah, Gradient approach for recursive estimation and control in finite Markov chains, Adv. Appl. Probab. 13 (1981), 778-803.
  13. M. Hirsch, Convergent activation dynamics in continuous time networks, Neural Networks 2 (1989), 331-349.
  14. A. Jalali and M. Ferguson, Adaptive control of Markov chains with local updates, Systems Control Lett. 14 (1990), 209-218.
  15. P. R. Kumar and A. Becker, A new family of adaptive optimal controllers for Markov chains, IEEE Trans. Automat. Control AC-27 (1982), 137-142.
  16. P. R. Kumar and W. Lin, Optimal adaptive controllers for Markov chains, IEEE Trans. Automat. Control AC-27 (1982), 756-774.
  17. H. Kushner and D. Clark, Stochastic Approximation for Constrained and Unconstrained Systems, Springer, Berlin, 1978.
  18. P. Mandl, Estimation and control in Markov chains, Adv. Appl. Probab. 6 (1974), 40-60.
  19. R. Milito and J. B. Cruz Jr., An optimization oriented approach to adaptive control of Markov chains, IEEE Trans. Automat. Control AC-32 (1987), 754-762.
  20. J. Neveu, Discrete-Parameter Martingales, North-Holland, Amsterdam, 1975.
  21. B. Sagalovsky, Adaptive control and parameter estimation in Markov chains: a linear case, IEEE Trans. Automat. Control AC-27 (1982), 414-417.
  22. Ł. Stettner, On nearly self-optimizing strategies for a discrete-time uniformly ergodic adaptive model, Appl. Math. Optim. 27 (1993), 161-177.
  23. T. Yoshizawa, Stability Theory by Liapunov's Second Method, The Mathematical Society of Japan, 1966.
Pages
169-188
Main language of publication
English
Received
1995-10-04
Accepted
1996-04-02
Published
1997
Exact and natural sciences