The Kyoto College of Graduate Studies for Informatics (京都情報大学院大学) / Assistant Professor
Deep Reinforcement Learning Nanodegree Projects
• Deep Q-Learning on Banana Navigation Project: Implemented a deep Q-learning model with an experience replay buffer to train the agent.
• Deep Deterministic Policy Gradient (DDPG) for Reacher environment: Implemented DDPG, a model-free actor-critic algorithm, for a continuous action space.
• Multi-Agent DDPG (MADDPG) for Tennis environment: Implemented MADDPG for the Tennis environment; used the first agent’s experience to train the second agent.