Finite-state approximation of Markov decision processes

Name: Finite-state approximation of Markov decision processes
Author: Saldı, Naci, Linder, T., Yüksel, S.

İsim	Finite-state approximation of Markov decision processes
Yazar	Saldı, Naci, Linder, T., Yüksel, S.
Basım Tarihi:	2018
Basım Yeri	- Springer
Tür	kitap
Dil	İngilizce
Dijital	Evet
Yazma	Hayır
Kütüphane:	Özyeğin Üniversitesi
Demirbaş Numarası	978-3-319-79032-9
Kayıt Numarası	bc8e21af-ac8d-40f0-b049-399ca7f5f409
Lokasyon	Natural and Mathematical Sciences
Tarih	2018
Örnek Metin	In this chapter we study the finite-state approximation problem for computing near optimal policies for discrete-time MDPs with Borel state and action spaces, under discounted and average costs criteria. Even though existence and structural properties of optimal policies of MDPs have been studied extensively in the literature, computing such policies is generally a challenging problem for systems with uncountable state spaces. This situation also arises in the fully observed reduction of a partially observed Markov decision process even when the original system has finite state and action spaces. Here we show that one way to compute approximately optimal solutions for such MDPs is to construct a reduced model with a new transition probability and one-stage cost function by quantizing the state space, i.e., by discretizing it on a finite grid. It is reasonable to expect that when the one-stage cost function and the transition probability of the original model has certain continuity properties, the cost of the optimal policy for the approximating finite model converges to the optimal cost of the original model as the discretization becomes finer. Moreover, under additional continuity conditions on the transition probability and the one stage cost function we also obtain bounds on the accuracy of the approximation in terms of the number of points used to discretize the state space, thereby providing a tradeoff between the computation cost and the performance loss in the system. In particular, we study the following two problems.
DOI	10.1007/978-3-319-79033-6_4

Kaynağa git Özyeğin Üniversitesi

Aramaya Dön

Özyeğin Üniversitesi

Kaynağa git

Finite-state approximation of Markov decision processes

Yazar Saldı, Naci, Linder, T., Yüksel, S.

Basım Tarihi 2018

Basım Yeri - Springer

Tür kitap

Dil İngilizce

Dijital Evet

Yazma Hayır

Kütüphane Özyeğin Üniversitesi

Demirbaş Numarası 978-3-319-79032-9

Kayıt Numarası bc8e21af-ac8d-40f0-b049-399ca7f5f409

Lokasyon Natural and Mathematical Sciences

Tarih 2018

Örnek Metin In this chapter we study the finite-state approximation problem for computing near optimal policies for discrete-time MDPs with Borel state and action spaces, under discounted and average costs criteria. Even though existence and structural properties of optimal policies of MDPs have been studied extensively in the literature, computing such policies is generally a challenging problem for systems with uncountable state spaces. This situation also arises in the fully observed reduction of a partially observed Markov decision process even when the original system has finite state and action spaces. Here we show that one way to compute approximately optimal solutions for such MDPs is to construct a reduced model with a new transition probability and one-stage cost function by quantizing the state space, i.e., by discretizing it on a finite grid. It is reasonable to expect that when the one-stage cost function and the transition probability of the original model has certain continuity properties, the cost of the optimal policy for the approximating finite model converges to the optimal cost of the original model as the discretization becomes finer. Moreover, under additional continuity conditions on the transition probability and the one stage cost function we also obtain bounds on the accuracy of the approximation in terms of the number of points used to discretize the state space, thereby providing a tradeoff between the computation cost and the performance loss in the system. In particular, we study the following two problems.

DOI 10.1007/978-3-319-79033-6_4