Habitat: A Platform for Embodied AI Research
We present Habitat, a new platform for the development of embodied artificial intelligence (AI). Training robots in the real world is slow, dangerous, expensive, and not easily reproducible. We aim to support a complementary paradigm – training embodied AI agents (virtual robots) in a highly photorealistic 3D simulator before transferring the learned skills to reality.
The ‘software stack’ for training embodied agents involves datasets providing 3D assets, simulators that render these assets and simulate agents, and tasks that define goals and evaluation metrics, enabling us to benchmark scientific progress. We aim to standardize this entire stack by contributing specific instantiations at each level: unified support for scanned and designed 3D scene datasets, a new simulation engine (Habitat-Sim), and a modular API (Habitat-API).
The Habitat architecture and implementation combine modularity and high performance. For example, when rendering a realistic scanned scene from the Matterport3D dataset, Habitat-Sim achieves several thousand frames per second (FPS) running single-threaded and can reach over 10,000 FPS multi-process on a single GPU! Finally, we describe the Habitat Challenge, an autonomous navigation challenge that aims to benchmark and advance efforts in embodied AI.
- Photo-realistic simulators are the future
Dhruv Batra is an Assistant Professor in the School of Interactive Computing at Georgia Tech and a Research Scientist at Facebook AI Research (FAIR).
His research interests lie at the intersection of machine learning, computer vision, natural language processing, and AI, with a focus on developing intelligent systems that can concisely summarize their beliefs about the world with diverse predictions, that integrate information and beliefs across different sub-components or ‘modules’ of AI (vision, language, reasoning, dialog), and that are interpretable, providing explanations and justifications for why they believe what they believe.
In the past, he has also worked on topics such as interactive co-segmentation of large image collections, human body pose estimation, action recognition, depth estimation, and distributed optimization for inference and learning in probabilistic graphical models.
He is a recipient of the Office of Naval Research (ONR) Young Investigator Program (YIP) award (2017), the Early Career Award for Scientists and Engineers (ECASE-Army) (2015), the National Science Foundation (NSF) CAREER award (2014), the Army Research Office (ARO) Young Investigator Program (YIP) award (2014), Outstanding Junior Faculty awards from the Virginia Tech College of Engineering (2015) and the Georgia Tech College of Computing (2018), two Google Faculty Research Awards (2013, 2015), an Amazon Academic Research award (2016), a Carnegie Mellon Dean's Fellowship (2007), several best paper awards (EMNLP 2017, ICML workshop on Visualization for Deep Learning 2016, ICCV workshop Object Understanding for Interaction 2016), and teaching commendations at Virginia Tech. His research is supported by NSF, ARO, ARL, ONR, DARPA, Amazon, Google, Microsoft, and NVIDIA. Research from his lab has been extensively covered in the media (with varying levels of accuracy) at CNN, BBC, CNBC, Bloomberg Business, The Boston Globe, MIT Technology Review, Newsweek, The Verge, New Scientist, and NPR.
From 2013 to 2016, he was an Assistant Professor in the Bradley Department of Electrical and Computer Engineering at Virginia Tech, where he led the VT Machine Learning & Perception group and was a member of the Virginia Center for Autonomous Systems (VaCAS) and the VT Discovery Analytics Center (DAC). From 2010 to 2012, he was a Research Assistant Professor at Toyota Technological Institute at Chicago (TTIC), a philanthropically endowed academic computer science institute located on the University of Chicago campus. He received his M.S. and Ph.D. degrees from Carnegie Mellon University in 2007 and 2010, respectively, advised by Tsuhan Chen. In the past, he has held visiting positions at the Machine Learning Department at CMU, CSAIL MIT, Microsoft Research, and Facebook AI Research.