Reducing the Burden of Supervision in Deep Reinforcement Learning
Reinforcement learning approaches have shown considerable success in enabling a variety of applications in robotic control, bypassing the need for accurate models and controllers. The paradigm of reinforcement learning depends heavily on the definition of rewards. Reward functions are often quite hard to define, and require extensive instrumentation of the environment and the robotic system, or substantial human interaction. As tasks become harder and involve more environmental interaction, this reward supervision becomes increasingly difficult to provide. In this talk, I will cover a number of approaches that reduce the burden of reward supervision in reinforcement learning, making it easier to specify rewards and to instrument the system for reward specification, moving towards learning systems that are more applicable to real-world use. I will describe approaches that use imitation learning, meta-learning, and unsupervised exploration to tackle the problem of supervision in deep reinforcement learning, and show several robotic applications.
Abhishek Gupta is a third-year Ph.D. student at UC Berkeley, working with Professor Sergey Levine and Professor Pieter Abbeel. Abhishek's research interests focus on deep reinforcement learning in robotics, with an emphasis on multi-task learning, transfer learning, imitation learning, and dexterous manipulation. Abhishek received a B.S. in Electrical Engineering and Computer Science from UC Berkeley, where he worked with Professor Pieter Abbeel on apprenticeship learning and hierarchical planning. Abhishek is a recipient of the NSF Graduate Research Fellowship as well as the NDSEG Fellowship.