lab3: Dummy Q-learning (table) code
code0xff
2017. 10. 24. 23:00
This is source code implementing the Q-learning exercise from lab3 of Professor Sung Kim's (김성훈) Reinforcement Learning lectures.
For anyone who wants to watch the lecture, here is the link.
https://www.youtube.com/watch?v=yOBKtGU6CG0&feature=youtu.be
```python
import gym
import numpy as np
import matplotlib.pyplot as plt
from gym.envs.registration import register
import random as pr


def rargmax(vector):
    """Return the index of a maximum entry, breaking ties at random."""
    m = np.amax(vector)
    indices = np.nonzero(vector == m)[0]
    # Pass a plain list so random.choice sees an ordinary sequence.
    return pr.choice(indices.tolist())


# Register a deterministic (non-slippery) 4x4 FrozenLake variant.
register(
    id='FrozenLake-v3',
    entry_point='gym.envs.toy_text:FrozenLakeEnv',
    kwargs={'map_name': '4x4', 'is_slippery': False}
)
env = gym.make('FrozenLake-v3')

# Q-table: one row per state, one column per action, initialized to zero.
Q = np.zeros([env.observation_space.n, env.action_space.n])
num_episodes = 2000

# Total reward collected in each episode.
rList = []
for i in range(num_episodes):
    # Written for the classic gym API: reset() returns the state and
    # step() returns (state, reward, done, info).
    state = env.reset()
    rAll = 0
    done = False

    while not done:
        # Greedy action with random tie-breaking.
        action = rargmax(Q[state, :])
        new_state, reward, done, _ = env.step(action)

        # Dummy Q-learning update: no learning rate, no discount.
        Q[state, action] = reward + np.max(Q[new_state, :])

        rAll += reward
        state = new_state

    rList.append(rAll)

print("Success rate: " + str(sum(rList) / num_episodes))
print("Final Q-Table Values")
print("LEFT DOWN RIGHT UP")
print(Q)
plt.bar(range(len(rList)), rList, color="blue")
plt.show()
```
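To see how the update `Q[state,action] = reward + np.max(Q[new_state,:])` pushes the goal reward backward one state per episode, here is a minimal sketch that uses only the standard library. The corridor environment and the names `N_STATES`, `step`, and `train` are made up for illustration and are not part of the lecture or of gym. Only the update rule and the random tie-breaking argmax mirror the code above.

```python
import random

# Hypothetical 1-D corridor (not FrozenLake): states 0..4, goal at state 4.
N_STATES = 5
LEFT, RIGHT = 0, 1
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic move; reward 1.0 only when the goal is reached."""
    if action == LEFT:
        new_state = max(state - 1, 0)
    else:
        new_state = min(state + 1, GOAL)
    done = new_state == GOAL
    reward = 1.0 if done else 0.0
    return new_state, reward, done


def rargmax(values, rng):
    """Index of a maximum entry, with ties broken at random."""
    m = max(values)
    indices = [i for i, v in enumerate(values) if v == m]
    return rng.choice(indices)


def train(num_episodes=20, seed=0):
    """Run the dummy Q-learning update (no learning rate, no discount)."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(num_episodes):
        state, done = 0, False
        while not done:
            action = rargmax(Q[state], rng)
            new_state, reward, done = step(state, action)
            # Same rule as above: reward plus best value of the next state.
            Q[state][action] = reward + max(Q[new_state])
            state = new_state
    return Q


if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, row)
```

In this toy setup the first episode is a random walk that ends by setting the RIGHT entry of state 3 to 1, and each later episode copies that value one state further back. After four episodes every non-goal state prefers RIGHT and the LEFT entries stay at 0. Because there is no discount and no exploration noise, the table only records one path that works, which is why the lecture calls this version "dummy".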