Understanding How Value Predictions Shape Deep Representations
A reinforcement learning agent is only as good as its internal representation of the environment. Indeed, a large part of the success of deep reinforcement learning (deep RL) is due to the ease with which its algorithms adapt their state representations; improving our control over this process is a necessary step towards taking reinforcement learning into everyday use. This talk presents some of our recent work on demystifying the mechanisms by which deep RL algorithms acquire their representations, and on explaining why some methods are more successful than others. In particular, I will show how a certain class of auxiliary predictions, derived from the notion of an adversarial value function, helps shape good representations. I will illustrate these findings with visualizations of the representation learning process in the context of Atari game-playing and on synthetic environments.
- Deep learning and reinforcement learning interact in strange ways
- Deep RL is getting more stable/usable by the day
- Distributional reinforcement learning is one way to get stable deep RL
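To make the last takeaway concrete: in distributional RL the agent predicts a full distribution over returns rather than a single expected value. A minimal sketch (not from the talk; the function name and the 5-atom support in the example are illustrative) of the categorical projection step used in C51-style distributional RL, where the Bellman update r + γz is applied to each atom and the resulting mass is redistributed onto the two nearest atoms of the fixed support:

```python
def project_distribution(next_probs, reward, gamma, support):
    """Categorical projection (C51-style): apply the distributional
    Bellman update r + gamma*z to each atom, then split each atom's
    probability between the two nearest atoms of the fixed support."""
    v_min, v_max = support[0], support[-1]
    delta_z = support[1] - support[0]          # assumes an evenly spaced support
    projected = [0.0] * len(support)
    for p, z in zip(next_probs, support):
        tz = min(max(reward + gamma * z, v_min), v_max)  # clip to the support
        b = (tz - v_min) / delta_z             # fractional atom index
        lo = int(b)
        hi = min(lo + 1, len(support) - 1)
        if lo == hi:                           # atom lands exactly on the last grid point
            projected[lo] += p
        else:
            projected[lo] += p * (hi - b)      # mass to the lower neighbour
            projected[hi] += p * (b - lo)      # mass to the upper neighbour
    return projected

# Example: with gamma = 0 the target collapses onto the reward's atom.
dist = project_distribution([0.2] * 5, reward=0.5, gamma=0.0,
                            support=[-1.0, -0.5, 0.0, 0.5, 1.0])
```

The learned distribution is then trained towards this projected target (in C51, by cross-entropy), which is one source of the stability the bullet above refers to.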
Marc G. Bellemare is a research scientist at Google Brain in Montreal, Canada; a CIFAR Learning in Machines & Brains Fellow; an adjunct professor at McGill University; and was recently awarded a Canada CIFAR AI Chair, held at the Montreal Institute for Learning Algorithms (Mila). He received his Ph.D. from the University of Alberta, where he studied the concept of domain-independent agents and built the highly successful Arcade Learning Environment, the platform for AI research on Atari 2600 games. From 2013 to 2017 he was a research scientist at DeepMind, where he made important contributions to the field of deep reinforcement learning. He is known for his work on reinforcement learning, including approximate exploration, representation learning, and the distributional method.