-
THIS SCHEDULE TAKES PLACE ON DAY 2
-
08:00
WELCOME & OPENING REMARKS - 8am PST | 11am EST | 4pm GMT
Shivani Poddar - Tech Lead (Machine Learning/Artificial Intelligence) - Facebook
Shivani is a machine learning engineer on the Facebook Assistant team, working on both the product and research arms of machine learning reasoning for assistants and the multi-modal assistants of the future. Before Facebook, she was at Carnegie Mellon University, where she helped build the CMU Magnus system for social chit-chat from the ground up for the first wave of the Amazon Alexa Prize Challenge. She has also published work on modeling user psychology and on building argumentation systems that help in negotiation. Her research background spans disciplines such as computer science, psychology and machine learning.
-
REINFORCEMENT LEARNING ADVANCEMENTS
-
08:10
Learning Representations for Reinforcement Learning
Martha White - Associate Professor - University of Alberta
Learning Representations for Reinforcement Learning
The learning performance of a reinforcement learning (RL) agent is highly dependent on its data representation—the features. In this talk, I will discuss several reasons why the representation is so critical in RL, related to the fact that the agent typically learns online, needs to explore, constantly sees data in new parts of the environment and often uses algorithms that bootstrap off their own value estimates. I will describe some strategies for learning representations suitable for this setting, particularly highlighting the utility of sparse or orthogonal representations.
Key takeaways: 1. It is important to consider the role of the representation for your RL agent.
2. The choice of representation is not just about accuracy; it interacts with the stability of the update, the ability to explore, and interference in online updating.
3. There is much more to be done to understand the types of representations currently learned and what properties we want. (A minimal illustrative sketch of a sparsity-regularized feature update follows below.)
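To make the sparsity point concrete, here is a minimal, illustrative sketch (assumed names, sizes, and step sizes; not code from the talk): a semi-gradient TD(0) update on ReLU features, with an L1 penalty on the feature weights as one simple proxy for pushing the learned representation toward sparsity.

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 16))   # hypothetical feature weights (state dim 8 -> 16 features)
v = np.zeros(16)                          # linear value weights on top of the features
alpha, beta, gamma, lam = 0.05, 0.001, 0.99, 0.1   # step sizes, discount, L1 strength (all illustrative)

def features(s):
    return np.maximum(0.0, s @ W)         # ReLU features; the L1 term below encourages sparsity

def td_step(s, r, s_next):
    global W, v
    phi, phi_next = features(s), features(s_next)
    delta = r + gamma * (phi_next @ v) - (phi @ v)              # TD error (bootstrapped target)
    v = v + alpha * delta * phi                                 # semi-gradient TD(0) update of the value weights
    grad_phi = -delta * v * (phi > 0)                           # TD-loss gradient w.r.t. features, through the ReLU mask
    W = W - beta * (np.outer(s, grad_phi) + lam * np.sign(W))   # feature update plus L1 (sparsity) penalty

for _ in range(100):                      # toy usage on random transitions
    s, s_next = rng.normal(size=8), rng.normal(size=8)
    td_step(s, r=float(rng.normal()), s_next=s_next)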
Martha White is an Associate Professor of Computing Science at the University of Alberta. Before joining the University of Alberta in 2017, she was an Assistant Professor of Computer Science at Indiana University. Martha is a PI of AMII---the Alberta Machine Intelligence Institute---which is one of the top machine learning centres in the world, and a director of RLAI---the Reinforcement Learning and Artificial Intelligence Lab at the University of Alberta. She holds a Canada CIFAR AI Chair and has authored more than 40 papers in top journals and conferences. Her research focus is on developing algorithms for agents continually learning on streams of data, with an emphasis on representation learning and reinforcement learning.
-
08:35
Emergent Complexity and Zero-Shot Transfer via Unsupervised Environment Design
Natasha Jaques - Research Scientist - Google Brain
Emergent Complexity and Zero-Shot Transfer via Unsupervised Environment Design
How can we move deep RL beyond games, without having to hand-build a simulator that covers real-world complexity? We train an RL adversary to generate a curriculum of challenging environments. To ensure the adversary cannot create impossible environments, we constrain it using the performance of a second agent. The adversary is trained to maximize the regret, defined as the difference between the performance of the pair of agents. This motivates the adversary to generate environments that are solvable, but challenging. The resulting algorithm, PAIRED (Protagonist Antagonist Induced Regret Environment Design), produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in challenging, novel environments.
Key Takeaways: 1. RL agents train in a simulated environment, but for many real-world problems we can't program a simulator to cover every possible test case. 2. Instead, we can learn to automatically generate environments that exploit weaknesses in our agent, using a second, adversary agent. 3. We propose a new technique for adversarial environment generation which optimizes minimax regret. This produces a curriculum of environments by adjusting the difficulty level to be feasible, but outside the agent's current skill level. (A minimal sketch of the regret computation follows below.)
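As a rough illustration of the regret objective in takeaway 3, here is a minimal sketch (assumed names and interfaces, not the authors' code): a proposed environment is scored by the gap between an antagonist agent's best return and the protagonist's average return, and the adversary is trained to maximize that gap.

def estimate_regret(env_params, protagonist, antagonist, rollout, n_episodes=4):
    # Regret for one proposed environment: best antagonist return minus the
    # protagonist's mean return. The adversary is rewarded with +regret, so it
    # favours environments that are solvable (the antagonist succeeds) but
    # still beyond the protagonist's current skill level.
    antagonist_returns = [rollout(env_params, antagonist) for _ in range(n_episodes)]
    protagonist_returns = [rollout(env_params, protagonist) for _ in range(n_episodes)]
    return max(antagonist_returns) - sum(protagonist_returns) / n_episodes

# Toy usage with stub callables (purely illustrative):
toy_rollout = lambda env, agent: agent(env)
regret = estimate_regret(env_params=3.0,
                         protagonist=lambda e: 1.0 / (1.0 + e),   # weaker on harder environments
                         antagonist=lambda e: 1.0,                # can solve the environment
                         rollout=toy_rollout)
print(regret)   # positive: solvable, but outside the protagonist's current ability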
Natasha Jaques recently finished her PhD at MIT, which focused on improving the social and affective intelligence of deep learning and deep reinforcement learning. She is now a Research Scientist at Google Brain and Berkeley working with Sergey Levine and Doug Eck. Her work has received an honourable mention for best paper at ICML 2019, a best paper award at the NeurIPS ML for Healthcare workshop, and was part of the team that received Best Demo at NeurIPS 2016. She has interned at DeepMind and Google Brain, and was an OpenAI Scholars mentor. Her work has been featured in Quartz, the MIT Technology Review, Boston Magazine, and on CBC radio. Natasha earned her master's degree from the University of British Columbia, and undergraduate degrees in Computer Science and Psychology from the University of Regina.
-
IMPROVING REINFORCEMENT LEARNING
-
09:00
Towards Safe, Interpretable, and Moral Reinforcement Learning Agents
Joel Lehman - Research Scientist - OpenAI
Towards Safe, Interpretable, and Moral Reinforcement Learning Agents
Reinforcement learning is a powerful paradigm of machine learning, in which agents (like robots) are trained through rewards to perform tasks. While such an approach has proven successful in solving closed-world video games, it is difficult to apply in the real world. One reason is the challenge of creating safe agents, i.e. agents that do not unintentionally damage themselves or the environment. This talk describes the challenges of AI safety and reviews three research projects intended as steps towards agents that are safer, more interpretable, and that respect moral rules.
Joel Lehman is a research scientist at OpenAI, and previously was a founding member of Uber AI Labs and an assistant professor at the IT University of Copenhagen. His research focuses on open-endedness, reinforcement learning, and AI safety. His PhD dissertation introduced the novelty search algorithm, which inspired a popular science book co-written with Ken Stanley on what search algorithms imply for individual and societal objectives, called “Why Greatness Cannot Be Planned.”
-
09:25
COFFEE & NETWORKING BREAK
-
09:35
Predictability Maximization: Empowerment As An Intelligence Measure
Shane Gu - Research Scientist - Google Brain
Predictability Maximization: Empowerment As An Intelligence Measure
Intelligence is often associated with the ability to optimize the environment for maximizing one's objectives (e.g. survival). In particular, the ability to predictably change the environment -- empowerment -- is an essential skill that allows agents to efficiently achieve many goals. In this talk, I will discuss empowerment from multiple perspectives, including model-based and classic goal-based RL, and relate it to classic and recently-proposed definitions and measures of intelligence.
Key Takeaways:
Empowerment = mutual information between actions and future states (restated in standard notation below)
Maximizing empowerment = maximizing diversity of futures achievable given all actions + maximizing predictability of the future given each possible action
Empowerment could be a more direct measure of general intelligence
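Restating the first two takeaways in standard notation (a hedged sketch in LaTeX, not the speaker's slides): empowerment at a state s is the channel capacity from actions to resulting future states, and the mutual information splits into a diversity term and a predictability term.

\[
  \mathcal{E}(s) \;=\; \max_{\omega(a \mid s)} I(A; S' \mid s)
               \;=\; \max_{\omega(a \mid s)} \Big[ H(S' \mid s) \;-\; H(S' \mid A, s) \Big]
\]

The first term rewards a diverse set of reachable future states; subtracting the second rewards futures that are predictable once an action is chosen, matching the decomposition above.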
Shane Gu is a Research Scientist at Google Brain, where he mainly works on problems in deep learning, reinforcement learning, robotics, and probabilistic machine learning. His recent research focuses on sample-efficient RL methods that could scale to solve difficult continuous control problems in the real world, which have been covered by the Google Research blog and MIT Technology Review. He completed his PhD in Machine Learning at the University of Cambridge and the Max Planck Institute for Intelligent Systems in Tübingen, where he was co-supervised by Richard E. Turner, Zoubin Ghahramani, and Bernhard Schölkopf. During his PhD, he also collaborated closely with Sergey Levine at UC Berkeley/Google Brain and Timothy Lillicrap at DeepMind. He holds a B.A.Sc. in Engineering Science from the University of Toronto, where he did his thesis with Geoffrey Hinton on distributed training of neural networks using evolutionary algorithms.
-
10:00
Human and Multi-Agent Collaboration in a Human-AI Teaming Framework
Neda Navidi - Postdoctoral Researcher - Quebec University of Montreal
Human and AI Agent Collaboration in a Highly Dynamic Environment
The main focus of this talk is "human-AI teaming", specifically the mode of "human-AI collaboration" in which humans and AI agents accomplish tasks together in a complex system. The objective cannot be achieved by a human or an agent alone, and the responsibilities in the environment are partitioned and/or shared between humans and agents. Collaborative multi-agent reinforcement learning (MARL), a specific category of reinforcement learning (RL), provides effective results, with agents learning from their observations, received rewards, and interactions with other agents. However, centralized learning methods with a joint global policy in a highly dynamic environment present unique challenges in dealing with large amounts of information. This study proposes innovative solutions to address the complexities of collaboration between humans and RL agents where the goals pursued cannot be achieved by either alone.
Dr. Neda Navidi is an expert AI researcher with more than fifteen years of experience in designing and developing optimization systems, signal processing, practical AI, and theoretical ML/DL/RL algorithms. Neda has leveraged her extensive experience to harness the potential of new technologies and implement them across industrial solutions and services related to human-AI collaboration. She has also been a guest lecturer at the Quebec University of Montreal. She has published more than 30 scientific papers in journals and conferences. Dr. Navidi holds a Ph.D. in AI (autonomous driving) from École de Technologie Supérieure (ÉTS) and completed postdoctoral research at HEC Montréal, McGill University, and Polytechnique Montréal.
-
10:25
BREAKOUT SESSIONS
Roundtable Discussions & Demos with Speakers - AI EXPERTS
Join a roundtable discussion hosted by AI experts to get your questions answered on a variety of topics.
You are free to come in and out of all sessions to ask your questions, share your thoughts, and learn more from the speakers and other attendees.
Roundtable Discussions 28th January:
• ‘Multiple Clouds. One Cluster’ hosted by Nathan Reid, Staff Solutions Architect, CAST.AI
Roundtable Discussions 29th January:
• ‘Curriculum Generation for Reinforcement Learning’ hosted by Natasha Jaques, Research Scientist, Google Brain
• ‘The AI Economist’ hosted by Stephan Zheng, Lead Research Scientist, Salesforce Research
• ‘A Win-Win in Precision Ag’ hosted by Jennifer Hobbs, Director of Machine Learning, IntelinAir
-
10:45
COFFEE & NETWORKING BREAK
-
REINFORCEMENT LEARNING APPLICATIONS
-
10:55
Augmenting Automated Game Testing with Deep Reinforcement Learning
Linus Gisslén - Senior Research Engineer - Machine Learning - Electronic Arts
Augmenting Automated Game Testing with Deep Reinforcement Learning
Testing games is generally a slow and expensive process that becomes more and more crucial as games grow in size and complexity. The previous standard approach involves scripting bots to automatically play and explore the game. This approach is effective in certain areas but lacks the dynamics and learnability needed to fully test modern AAA games. Therefore, we at SEED and EA are looking into how we can use ML as a tool to further extend that capacity. In this talk we describe our efforts to use machine learning, specifically reinforcement learning, to improve automated testing of games.
SEED is an advanced R&D group at Electronic Arts. Our goal is to explore the future of games and game creation.
Linus Gisslén is a Senior Research Engineer in Machine Learning at SEED, an advanced R&D group at Electronic Arts (EA). His current research focus is on Reinforcement Learning (RL) and Procedural Content Generation (PCG). He is the project lead on their effort to use machine learning to improve automated testing of games. Previous experience includes a PhD from TU München, Germany, and a postdoc position at Jürgen Schmidhuber's AI lab in Switzerland, where his main research focus was on Reinforcement Learning.
-
11:20
The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies
Stephan Zheng - Lead Research Scientist - Salesforce Research
The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies
Designing sound economic policy is hard in practice, given a lack of high-quality economic data and limited opportunity to experiment. Bridging these gaps, the AI Economist is a two-level deep RL framework to learn economic policy in economic simulations with agents and a planner who both learn and co-adapt. This approach yields tax policies that improve the equality and productivity trade-off by at least 16%, compared to the Saez tax, US federal tax, and the free market. These results show that the AI Economist can overcome many limitations of traditional economics and provides an exciting new approach to economic design.
Stephan Zheng is a Lead Research Scientist and heads the AI Economist team at Salesforce Research. He currently works on using deep reinforcement learning and economic simulations to design economic policy. His work has been widely covered in the media, including the Financial Times, Axios, Forbes, Zeit, Volkskrant, MIT Tech Review, and others. He holds a Ph.D. in Physics from Caltech (2018), where he worked on imitation learning of NBA basketball players and neural network robustness, amongst others. He was twice a research intern with Google. Before machine learning, he studied mathematics and theoretical physics at the University of Cambridge, Harvard University, and Utrecht University. He received the Lorenz graduation prize from the Royal Netherlands Academy of Arts and Sciences for his master's thesis on exotic dualities in topological string theory and was twice awarded the Dutch national Huygens scholarship. www.stephanzheng.com.
-
11:45
MAKE CONNECTIONS: Meet with Attendees Virtually for 1:1 Conversations and Group Discussions over Similar Topics and Interests
-
12:00
END OF SUMMIT