Combining Planning and Learning
Building agents that can reason about the consequences of their actions and control the world requires rich priors that capture how to plan forward into the future. Imagining those consequences in raw pixels is unlikely to scale to large environments, and it gives the agent no incentive to ignore aspects of the raw sensory stream that are irrelevant to the task at hand. I will talk about ways to circumvent this issue by introducing explicit differentiable planning inside the policy's computation graph, and show that the learned priors generalize across different robot morphologies and capture a generic notion of the underlying task in their representation.
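To make the idea of planning inside a differentiable computation graph concrete, here is a minimal sketch, not the talk's actual method: it plans in a toy linear latent space (the matrices A, B, the quadratic goal cost, and all hyperparameters are assumed stand-ins for a learned dynamics model) by running gradient descent over an action sequence, the kind of inner-loop optimization that can itself be backpropagated through.

```python
import numpy as np

def rollout(z0, actions, A, B):
    """Unroll the toy latent dynamics z_{t+1} = A z_t + B a_t."""
    z = z0
    traj = [z]
    for a in actions:
        z = A @ z + B @ a
        traj.append(z)
    return traj

def plan(z0, z_goal, A, B, horizon, steps=500, lr=0.05):
    """Optimize an action sequence by gradient descent on the terminal
    cost ||z_T - z_goal||^2. Because the model is linear, the gradient
    d z_T / d a_t = A^(horizon-1-t) B is available in closed form here,
    standing in for backprop through a learned dynamics network."""
    actions = [np.zeros(B.shape[1]) for _ in range(horizon)]
    for _ in range(steps):
        traj = rollout(z0, actions, A, B)
        err = traj[-1] - z_goal  # gradient of the cost w.r.t. z_T (up to 2x)
        for t in range(horizon):
            J = np.linalg.matrix_power(A, horizon - 1 - t) @ B
            actions[t] = actions[t] - lr * 2 * J.T @ err
    return actions

# Toy double-integrator-style latent space: position and velocity.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [1.0]])
z0 = np.zeros(2)
z_goal = np.array([1.0, 0.0])

acts = plan(z0, z_goal, A, B, horizon=10)
z_final = rollout(z0, acts, A, B)[-1]
print(np.linalg.norm(z_final - z_goal))  # small after the inner-loop descent
```

The design point this illustrates is that the planner is just ordinary differentiable computation: nothing stops an outer learning loop from differentiating through these planning steps to shape the latent space itself.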
Aravind is a second-year Ph.D. student at UC Berkeley, advised by Prof. Pieter Abbeel, and a member of the Berkeley AI Research lab. He has spent time at OpenAI and is interested in learning representations from raw sensory data for general intelligence.