Tavakol Aghaei, Vahid and Onat, Ahmet and Yıldırım, Sinan (2018) A Markov chain Monte Carlo algorithm for Bayesian policy search. Systems Science and Control Engineering, 6 (1). pp. 438-455. ISSN 2164-2583
This is the latest version of this item.
PDF (submitted version)
A_Markov_Chain_Monte_Carlo_Algorithm_for_Bayesian_Policy_Search_Taylor_Francis_2nd_Revision.pdf
Restricted to Repository staff only
Download (3MB) | Request a copy
Official URL: https://doi.org/10.1080/21642583.2018.1528483
Abstract
Policy search algorithms have facilitated the application of reinforcement learning (RL) to dynamic systems, such as the control of robots. Many policy search algorithms are based on the policy gradient and may therefore suffer from slow convergence or from convergence to local optima. In this paper, we take a Bayesian approach to policy search under the RL paradigm, for the problem of controlling a discrete-time Markov decision process with continuous state and action spaces and a multiplicative reward structure. For this purpose, we assume a prior over the policy parameters and target the ‘posterior’ distribution in which the ‘likelihood’ is the expected reward. We propose a Markov chain Monte Carlo algorithm for generating samples of the policy parameters from this posterior. The proposed algorithm is compared with several well-known policy gradient-based RL methods and exhibits better performance in terms of time response and convergence rate when applied to a nonlinear model of a Cart-Pole benchmark.
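
The idea in the abstract, sampling policy parameters from a 'posterior' proportional to a prior times the expected reward, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy one-dimensional dynamics, the linear feedback policy, the Gaussian prior, the per-step reward `exp(-x^2)`, and all function names are hypothetical. The sketch also swaps in a plain pseudo-marginal Metropolis-Hastings sampler with simple Monte Carlo rollouts, which is not necessarily the algorithm the authors propose.

```python
import math
import random


def estimate_expected_reward(theta, rng, n_rollouts=20, horizon=20):
    """Monte Carlo estimate of J(theta) = E[prod_t exp(-x_t^2)].

    Hypothetical toy system: x_{t+1} = x_t + 0.1 * u_t + noise, with the
    linear feedback policy u_t = -theta * x_t. The product of per-step
    rewards is accumulated in log space and exponentiated once.
    """
    total = 0.0
    for _ in range(n_rollouts):
        x = 1.0
        log_reward = 0.0
        for _ in range(horizon):
            u = -theta * x
            x = x + 0.1 * u + 0.05 * rng.gauss(0.0, 1.0)
            log_reward -= x * x
        total += math.exp(log_reward)
    return total / n_rollouts


def log_prior(theta, scale=5.0):
    """Log of an (assumed) zero-mean Gaussian prior, up to a constant."""
    return -0.5 * (theta / scale) ** 2


def pseudo_marginal_mh(n_iters=500, step=1.0, theta0=5.0, seed=0):
    """Random-walk Metropolis-Hastings targeting prior(theta) * J(theta).

    J is replaced by its non-negative Monte Carlo estimate; the estimate
    for the current state is stored and reused, never recomputed.
    """
    rng = random.Random(seed)
    theta = theta0
    j_hat = estimate_expected_reward(theta, rng)
    samples = []
    for _ in range(n_iters):
        proposal = theta + step * rng.gauss(0.0, 1.0)
        j_prop = estimate_expected_reward(proposal, rng)
        if j_prop > 0.0:
            # Symmetric proposal, so the ratio is prior times reward estimate.
            ratio = math.exp(log_prior(proposal) - log_prior(theta)) * j_prop / j_hat
            if rng.random() < min(1.0, ratio):
                theta, j_hat = proposal, j_prop
        samples.append(theta)
    return samples


if __name__ == "__main__":
    draws = pseudo_marginal_mh()
    burned_in = draws[100:]
    print("posterior mean of theta:", sum(burned_in) / len(burned_in))
```

Because the expected reward plays the role of a likelihood, gains that keep the toy state near zero receive most of the posterior mass, and the spread of the samples gives a rough picture of which gains perform comparably well.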
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Reinforcement learning; Markov chain Monte Carlo; particle filtering; risk sensitive reward; policy search; control |
Subjects: | Q Science > QA Mathematics |
Divisions: | Faculty of Engineering and Natural Sciences > Academic programs > Industrial Engineering; Faculty of Engineering and Natural Sciences > Academic programs > Mechatronics; Faculty of Engineering and Natural Sciences; Faculty of Engineering and Natural Sciences > Academic programs > Manufacturing Systems Eng. |
Depositing User: | Sinan Yıldırım |
Date Deposited: | 27 May 2019 15:38 |
Last Modified: | 12 Jun 2023 15:16 |
URI: | https://research.sabanciuniv.edu/id/eprint/37093 |
Available Versions of this Item
- A Markov chain Monte Carlo algorithm for Bayesian policy search. (deposited 16 Aug 2018 09:42)
- A Markov chain Monte Carlo algorithm for Bayesian policy search. (deposited 27 May 2019 15:38) [Currently Displayed]