Teaching Artificial Agents to Understand Language by Modelling Reward
Recent progress in Deep Reinforcement Learning has shown that agents can be taught complex behaviour and can solve difficult tasks, such as playing video games from pixel observations or mastering the game of Go without observing human games, with relatively little prior information. Building on these successes, researchers such as Hermann and colleagues have sought to apply these methods to teach agents, in simulation, to complete a variety of tasks specified by combinatorially rich instruction languages. In this talk, we discuss some of these highlights and some of the limitations which inhibit the scalability of such approaches to more complex instruction languages (including natural language). Following this, we introduce a new approach, inspired by recent work in adversarial reward modelling, which constitutes a first step towards scaling instruction-conditional agent training to “real world” language.
Edward Grefenstette is a Research Scientist at Facebook AI Research, and Honorary Associate Professor at UCL. He was previously, in reverse chronological order, a Staff Research Scientist at DeepMind, the CTO of Dark Blue Labs, and a Junior Research Fellow within Oxford’s Department of Computer Science and Somerville College. He completed his DPhil (PhD) at the University of Oxford in 2013 under the supervision of Profs Coecke and Pulman, and Dr Sadrzadeh, working on applying category-theoretic tools (initially developed to model quantum information flow) to model compositionality of distributed representations in natural language semantics. His recent research has covered topics at the intersection of deep learning and machine reasoning, addressing questions such as how neural networks can model or understand logic and mathematics, infer implicit or human-readable programs, or learn to understand instructions from simulation.