Learning to Reason, Summarize and Discover with Deep Reinforcement Learning
Recently, deep reinforcement learning (RL) has seen remarkable success in complex simulated environments, such as single- and multi-agent video games. However, applying deep RL to real-life problems remains challenging due to several key obstacles, such as sample inefficiency, weak generalization, and the inherent structural complexity of tasks. In this talk, I will showcase approaches that augment and apply RL to address these challenges.
First, I will show applications of RL to natural language tasks, such as multi-hop reasoning in knowledge graphs and abstractive summarization. Second, I will demonstrate how hierarchical RL can be accelerated through structured exploration with world-graphs derived from unsupervised learning. Finally, I will highlight recent theoretical and algorithmic innovations to decompose complex tasks, stabilize meta-learning, and provide bounds on generalization error across tasks.
- Learning the structure of (collections of) tasks can improve the speed and generalization of RL
- RL can improve performance on real-world tasks, like summarization and reasoning in knowledge graphs
- World-graphs are a powerful way to abstract an agent's environment and accelerate RL
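The summarization takeaway above refers to training text generators with RL, where a sampled output is scored by a summarization metric and the policy is updated with that score as reward. A minimal sketch of this idea (not the speaker's actual method) is REINFORCE with a toy ROUGE-1-style reward; the vocabulary, reward function, and per-position logit policy below are illustrative assumptions standing in for a real decoder:

```python
import math
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "mat", "dog", "ran"]   # toy vocabulary (assumption)
REFERENCE = ["the", "cat", "sat"]                    # toy reference summary
SUMMARY_LEN = 3

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def rouge1_reward(tokens, reference):
    """Toy ROUGE-1 recall: fraction of reference unigrams covered."""
    return len(set(tokens) & set(reference)) / len(set(reference))

# Policy: independent logits per output position (a stand-in for a decoder).
logits = [[0.0] * len(VOCAB) for _ in range(SUMMARY_LEN)]

def sample_summary():
    tokens, positions = [], []
    for pos in range(SUMMARY_LEN):
        probs = softmax(logits[pos])
        idx = random.choices(range(len(VOCAB)), weights=probs)[0]
        tokens.append(VOCAB[idx])
        positions.append(idx)
    return tokens, positions

# REINFORCE with a running-mean baseline to reduce gradient variance.
baseline, lr = 0.0, 0.5
for step in range(300):
    tokens, positions = sample_summary()
    reward = rouge1_reward(tokens, REFERENCE)
    advantage = reward - baseline
    baseline = 0.9 * baseline + 0.1 * reward
    for pos, idx in enumerate(positions):
        probs = softmax(logits[pos])
        for v in range(len(VOCAB)):
            # Gradient of log pi(idx | pos) with respect to each logit.
            grad = (1.0 if v == idx else 0.0) - probs[v]
            logits[pos][v] += lr * advantage * grad

# Greedy decode from the learned logits.
greedy = [VOCAB[max(range(len(VOCAB)), key=lambda v: logits[p][v])]
          for p in range(SUMMARY_LEN)]
print(greedy, rouge1_reward(greedy, REFERENCE))
```

The key design point, which carries over to real systems, is that the reward (here a crude unigram recall) is non-differentiable, so the policy gradient bypasses it entirely by weighting log-probability gradients with the scalar reward.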
Stephan Zheng is a research scientist at Salesforce Research, where he focuses on deep reinforcement learning and multi-agent learning. He has also worked on improving the robustness of deep learning, detecting adversarial examples, and building hierarchical models of human behavioral and spatiotemporal data.
Stephan obtained his PhD in 2018 in the Machine Learning group at Caltech, advised by Yisong Yue. Before that, he completed an MSc in Theoretical Physics and BSc in Physics/Mathematics at Utrecht University, Part III Mathematics at the University of Cambridge, and was a visiting student at Harvard University. He received the 2011 Lorenz Prize in Theoretical Physics from the Dutch Academy of Arts and Sciences for his thesis on exotic dualities in topological quantum field theory, and was twice a research intern with Google Research and Google Brain.