Ofir Nachum

Learning Abstractions with Hierarchical Reinforcement Learning

Hierarchical RL has long held the promise of enabling deep RL to solve more complex and temporally extended tasks by abstracting away lower-level details from a higher-level agent. In this talk, we describe how to turn this promise into a reality. We present a hierarchical design in which a higher-level agent solves a task by iteratively directing a lower-level policy to reach certain goals. We describe how both levels may be trained concurrently in a highly efficient, off-policy manner. Furthermore, we present a provably optimal technique for learning abstract notions of "goals" without explicit supervision. Our resulting method achieves excellent performance on a suite of difficult navigation tasks.
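
The interaction pattern described above, a high-level policy that periodically proposes goals and a low-level goal-conditioned policy that acts to reach them, can be sketched in a few lines of Python. The sketch below is illustrative only and is not the speaker's implementation: the placeholder policies, the toy environment, the goal horizon, and the negative-distance intrinsic reward are all assumptions chosen for clarity.

import numpy as np

class RandomPolicy:
    """Stand-in for a learned policy; returns random actions or goals."""
    def __init__(self, out_dim, scale=1.0):
        self.out_dim = out_dim
        self.scale = scale

    def __call__(self, *inputs):
        return self.scale * np.random.uniform(-1.0, 1.0, self.out_dim)

def intrinsic_reward(next_state, goal):
    # Low-level reward: how close the agent got to the commanded goal
    # (a simple negative-distance choice, assumed here for illustration).
    return -np.linalg.norm(next_state - goal)

def rollout(env_step, state_dim=4, action_dim=2, goal_horizon=10, episode_len=100):
    high_policy = RandomPolicy(out_dim=state_dim)   # proposes goals in state space
    low_policy = RandomPolicy(out_dim=action_dim)   # acts to reach the current goal

    state = np.zeros(state_dim)
    goal = high_policy(state)
    task_return = 0.0

    for t in range(episode_len):
        if t % goal_horizon == 0:
            goal = high_policy(state)               # high level re-plans every few steps
        action = low_policy(state, goal)
        next_state, task_reward = env_step(state, action)
        r_low = intrinsic_reward(next_state, goal)  # signal for the lower level
        task_return += task_reward                  # signal for the higher level
        # An off-policy method would store (state, goal, action, r_low, next_state)
        # and the corresponding high-level transitions in replay buffers here.
        state = next_state
    return task_return

if __name__ == "__main__":
    # Toy linear dynamics standing in for a navigation environment.
    def env_step(state, action):
        next_state = state + np.pad(action, (0, state.size - action.size))
        return next_state, -np.linalg.norm(next_state - 1.0)
    print("episode return:", rollout(env_step))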

Key Takeaways:

  • Hierarchy can multiply the capabilities of an RL agent
  • The key to good hierarchical RL is using goal-conditioned policies
  • Recent research provides the tools to train these models efficiently

Ofir Nachum is a Research Scientist at Google Brain. His research focuses on reinforcement learning, with notable work including PCL (path consistency learning) and HIRO (hierarchical reinforcement learning with off-policy correction). He received his Bachelor's and Master's degrees from MIT. Before joining Google, he was an engineer at Quora, leading machine learning efforts on the feed, ranking, and quality teams.
