Prioritized experience deep deterministic policy gradient method for dynamic systems

Cebeci, Serhat Emre (2019) Prioritized experience deep deterministic policy gradient method for dynamic systems. [Thesis]

Abstract

This thesis addresses the problem of learning to control a dynamic system through reinforcement learning. Two important problems arise in this setting: a correlated sample space and the curse of dimensionality. The first means that samples taken sequentially from the plant are correlated and fail to provide a rich data set to learn from. The second means that plants with a large state dimension become intractable if the states are quantized for the learning algorithm. Recently, these problems have been addressed by a state-of-the-art algorithm, the Deep Deterministic Policy Gradient (DDPG) method. In this thesis, we propose a new algorithm, Prioritized Experience DDPG (PE-DDPG), which improves the sample efficiency of DDPG by integrating a Prioritized Experience Replay mechanism into the original algorithm. This mechanism allows the agent to replay some samples more frequently, depending on their novelty. PE-DDPG is tested on OpenAI Gym's Inverted Pendulum task. The experimental results show that the proposed algorithm can reduce training time and exhibits lower variance, which implies a more stable learning process.
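
The replay mechanism described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the common proportional scheme in which a sample's "novelty" is approximated by the magnitude of its TD error, and the class name, the `alpha`, `beta` and `eps` parameters, and their default values are all hypothetical choices made for illustration.

```python
import numpy as np


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay buffer (assumed design)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6, seed=0):
        self.capacity = capacity
        self.alpha = alpha  # assumed exponent; alpha = 0 gives uniform sampling
        self.eps = eps      # keeps every priority strictly positive
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0
        self.rng = np.random.default_rng(seed)

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # so they are likely to be replayed soon after being stored.
        max_p = self.priorities[:len(self.data)].max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition  # overwrite the oldest entry
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def probabilities(self):
        # Sampling probability is proportional to priority ** alpha.
        p = self.priorities[:len(self.data)] ** self.alpha
        return p / p.sum()

    def sample(self, batch_size, beta=0.4):
        probs = self.probabilities()
        idx = self.rng.choice(len(self.data), size=batch_size, p=probs)
        # Importance-sampling weights compensate for non-uniform sampling.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update(self, idx, td_errors):
        # Assumption: novelty is measured by the absolute TD error.
        self.priorities[idx] = np.abs(td_errors) + self.eps
```

In a DDPG-style training loop, one would call `sample` to draw a minibatch, scale the critic loss by the returned weights, and then pass the new TD errors to `update`. Because sampling is no longer purely sequential or uniform, the same buffer also helps break the correlation between consecutive samples.
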
Item Type: Thesis
Uncontrolled Keywords: Deep reinforcement learning. -- Neural networks. -- Reinforcement learning. -- Dynamic systems. -- Deep learning.
Subjects: T Technology > TJ Mechanical engineering and machinery > TJ163.12 Mechatronics
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Mechatronics
Faculty of Engineering and Natural Sciences
Depositing User: IC-Cataloging
Date Deposited: 21 Oct 2019 13:53
Last Modified: 26 Apr 2022 10:32
URI: https://research.sabanciuniv.edu/id/eprint/39357
