SSPQL: Stochastic Shortest Path-based Q-learning

Reinforcement learning (RL) is widely used as a mechanism for autonomous robots to learn state-action pairs by interacting with their environment. However, most RL methods converge slowly when deriving an optimal policy in practical applications. To address this problem, stochastic shortest path-based Q-learning (SSPQL) is proposed, which combines a stochastic shortest path-finding method with Q-learning, a well-known model-free RL method. The rationale is that if a robot has an internal state-transition model that is learned incrementally, it can infer a locally optimal policy by applying a stochastic shortest path-finding method to that model. By increasing the values of the state-action pairs that make up these locally optimal policies, the robot can reach the goal quickly, which in turn speeds up convergence. To demonstrate the validity of the proposed learning approach, several experimental results are presented in this paper.
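The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the path-finding method or the rule for raising values, so everything below is an assumption made for illustration. The sketch assumes a hypothetical 1-D corridor environment with a `slip` probability, stands in Dijkstra's algorithm over `-log` of the estimated transition probabilities for the stochastic shortest path-finding method, and raises each state-action value on the inferred path to at least the discounted goal reward. The names `most_likely_path` and `train` are hypothetical.

```python
import heapq
import math
import random
from collections import defaultdict


def most_likely_path(counts, start, goal):
    """Dijkstra on the learned model. Edge cost is -log(estimated transition
    probability), so the cheapest path is the most probable one."""
    dist, prev = {start: 0.0}, {}
    frontier = [(0.0, start)]
    while frontier:
        d, s = heapq.heappop(frontier)
        if s == goal:
            break
        if d > dist[s]:
            continue  # stale heap entry
        for a, nexts in counts.get(s, {}).items():
            total = sum(nexts.values())
            for s2, n in nexts.items():
                nd = d - math.log(n / total)
                if nd < dist.get(s2, math.inf):
                    dist[s2], prev[s2] = nd, (s, a)
                    heapq.heappush(frontier, (nd, s2))
    if goal not in prev:
        return []  # goal not reachable in the model learned so far
    path, s = [], goal
    while s != start:
        s, a = prev[s]
        path.append((s, a))
    return path[::-1]


def train(n_states=8, episodes=30, alpha=0.5, gamma=0.9, eps=0.2, slip=0.1, seed=0):
    """Tabular Q-learning on an assumed corridor, plus the value boost."""
    rng = random.Random(seed)
    goal, actions = n_states - 1, (-1, 1)
    q = defaultdict(float)  # Q-values keyed by (state, action)
    counts = {}             # learned model: counts[s][a][s2] = visits
    for _ in range(episodes):
        s = 0
        for _ in range(200):  # step cap per episode
            best = max(q[(s, b)] for b in actions)
            greedy = [b for b in actions if q[(s, b)] == best]
            a = rng.choice(actions if rng.random() < eps else greedy)
            move = -a if rng.random() < slip else a  # assumed slip dynamics
            s2 = min(max(s + move, 0), goal)
            r = 1.0 if s2 == goal else 0.0
            # Incrementally update the internal state-transition model.
            nexts = counts.setdefault(s, {}).setdefault(a, {})
            nexts[s2] = nexts.get(s2, 0) + 1
            # Standard Q-learning update.
            future = 0.0 if s2 == goal else gamma * max(q[(s2, b)] for b in actions)
            q[(s, a)] += alpha * (r + future - q[(s, a)])
            s = s2
            if s == goal:
                break
        # Assumed boost rule: raise values along the inferred path to at
        # least the discounted goal reward (k = steps remaining after the pair).
        path = most_likely_path(counts, 0, goal)
        for k, (ps, pa) in enumerate(reversed(path)):
            q[(ps, pa)] = max(q[(ps, pa)], gamma ** k)
    return q, counts
```

Calling `train()` returns the Q-table and the learned transition counts. The boost only takes effect once the goal appears in the learned model, so early episodes behave like plain Q-learning; an additive bonus or a different path cost would be equally plausible readings of the abstract.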
