BOB-ROS: A Deep RL Simulation Environment for ROS and Gazebo
Deep Reinforcement Learning (Deep RL) is presently one of the hottest and fastest-paced application areas of deep learning, and of machine learning as a whole. This intense focus is driven by the eye-watering potential of importing the ground-breaking accuracy improvements deep neural networks have achieved on large-scale supervised learning benchmarks into the world of optimal decision making and control.

Arguably one of the most important components of Deep RL is the simulation environment. Simulation environments play the key role of providing a benchmarking platform for comparing the cornucopia of different RL algorithms, giving researchers and practitioners crucial feedback on how effective their ideas are. To date, most simulation environments have been gaming environments, thanks to their fast physics engines, semi-realistic rendering pipelines, and the ease with which a game's points system can serve as an out-of-the-box reward function that typically isn't too delayed.

Ultimately, however, the end goal of RL research is to build agents and robots that can interact effectively with real-world environments. The applications are limitless, ranging from autonomous vehicles to drones that deliver packages. Viewed through this lens, the gaming-as-a-benchmark approach suffers from several shortcomings. The first is somewhat obvious: a game environment, by design, has its own rules and isn't completely governed by the rules of the physical world a robot must operate in, and as such has limited applicability. We argue that the physical world presents plenty of highly challenging environments to navigate; there is simply no need for the additional rules imposed by games to test RL systems, and indeed these extra rules can be a distraction from building agents that are actually useful to society. The second shortcoming is the lack of realistic agent-centric input channels. Almost all real-world agents have multiple sensors recording stimuli in parallel.
A good example is an autonomous vehicle, which receives real-time data from odometry sensors, GPS, IMU, LIDAR, RADAR, SONAR, and cameras. Data from all these sensors must be fused effectively to form a state representation useful for the task at hand. Gaming agents, by contrast, have a very limited array of sensors (if any): one simply learns from pixels showing the state of the environment and the agent from a third-person point of view.
To address these shortcomings we introduce a new RL benchmarking tool: the Benchmark Of Behavior in the Robot Operating System (BOB-ROS). BOB-ROS is a simulation environment made for the sole purpose of actually building robots. This makes it a highly pertinent environment for evaluating the utility of deep RL systems for robotics specifically.
We present results from using several standard deep RL tools to train a drone to fly from A to B in a maze-like office environment without hitting any obstacles. We will discuss the challenges of using deep RL in slower, more complex simulation environments like those built upon ROS, and the solutions we used to overcome these challenges.
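To make the setup concrete, the sketch below shows how a ROS/Gazebo task like ours can be exposed behind the familiar OpenAI Gym-style `reset`/`step` interface, so that standard deep RL tools can train against it. Everything here is an illustrative placeholder, not the actual BOB-ROS API: the class name, goal coordinates, reward shaping, and toy dynamics are all assumptions, and a real implementation would publish velocity commands to ROS topics and read fused sensor messages back from Gazebo instead of updating a 2-D position in-process.

```python
import math


class DroneMazeEnv:
    """Gym-style interface sketch for a drone A-to-B navigation task.

    Hypothetical stand-in for a ROS/Gazebo-backed environment: the
    dynamics below are a toy 2-D point model used purely to show the
    reset/step contract an RL agent would train against.
    """

    def __init__(self, goal=(5.0, 5.0)):
        self.goal = goal
        self.pos = (0.0, 0.0)

    def reset(self):
        # Real version: reset the Gazebo world and re-spawn the drone.
        self.pos = (0.0, 0.0)
        return self._observe()

    def step(self, action):
        # Real version: send `action` as a velocity command over a ROS
        # topic, wait for the next sensor update, then score the result.
        vx = max(-1.0, min(1.0, action[0]))
        vy = max(-1.0, min(1.0, action[1]))
        self.pos = (self.pos[0] + 0.1 * vx, self.pos[1] + 0.1 * vy)
        dist = math.hypot(self.goal[0] - self.pos[0],
                          self.goal[1] - self.pos[1])
        done = dist < 0.5
        # Sparse bonus at the goal, small shaped penalty elsewhere.
        reward = 10.0 if done else -0.01 * dist
        return self._observe(), reward, done, {}

    def _observe(self):
        # Placeholder for fused sensor data (odometry, LIDAR, camera, ...).
        return (self.pos[0], self.pos[1],
                self.goal[0] - self.pos[0], self.goal[1] - self.pos[1])


env = DroneMazeEnv()
obs = env.reset()
for _ in range(3):
    obs, reward, done, info = env.step((1.0, 1.0))
```

The point of keeping to this interface is that any agent written against Gym conventions can be pointed at the ROS-backed environment unchanged; the slow, asynchronous physics simulation stays hidden behind `step`.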
Prerequisites: Fundamentals of reinforcement learning; basic knowledge of probability and statistics; (optional) some familiarity with OpenAI Gym.
Jane received her PhD from MIT and the Broad Institute, where she worked in Anne Carpenter’s Imaging Platform. While applying deep learning-based computer vision models to biological problems like malaria detection, she became interested in software that bridges the gap between research and real-world use cases. At Uber AI Labs, she has been working with product teams such as Elevate on machine learning models for improved planning, and Freight on improved price forecasting. She is also working on combining reinforcement learning algorithms with realistic simulations.