Deep Incremental Scene Understanding
This talk presents recent advances in deep learning and computer vision for scene understanding from RGB sensors. It first introduces recent work that uses SLAM for real-time scene understanding, object detection, and 3D pose estimation. It then shows how deep learning can be used for simultaneous reconstruction and semantic segmentation in the absence of a depth sensor. The outcome is a technique that performs accurate real-time semantic mapping and SLAM from a single RGB camera. Finally, it demonstrates how modeling this problem as an ambiguous prediction task, in which multiple hypotheses are estimated, can improve depth estimation and reconstruction accuracy.
Federico Tombari is a Senior Research Scientist and Team Leader at the CAMPAR Chair, Technical University of Munich (TUM). He has more than 10 years of research experience in computer vision and machine learning. He has co-authored more than 120 refereed papers on topics such as visual data representation, RGB-D object recognition, 3D reconstruction and matching, stereo vision, and deep learning for computer vision. He received his Ph.D. in 2009 from the University of Bologna, where he was an Assistant Professor from 2013 to 2016. In 2008 he was an intern at Willow Garage, California. He is a Senior Scientist volunteer for the Open Perception Foundation and a developer for the Point Cloud Library, for which he served as mentor in 2012 and as administrator in 2014 in the Google Summer of Code. In 2015 he received a Google Faculty Research Award. His work has received awards at conferences and workshops such as 3DIMPVT'11, MICCAI'15, and ECCV-R6D'16. He is a research partner of Google, BMW, Toyota, and Zeiss.