Visual Recognition beyond 2D
2D visual recognition has undoubtedly seen unprecedented success, with the state of the art advancing every conference cycle. But as we develop sophisticated machines that predict 2D object masks and object classes, we tend to ignore the fact that the world is not 2D and objects don't live on a 2D grid. Meanwhile, the focus of 3D object recognition is dramatically different from that of its 2D counterpart, with benchmarks that lack the complexity of COCO or ImageNet and models that cannot handle the diversity in appearance and shape across object instances. In this talk, I will present some of our efforts to marry the advances in 2D recognition with 3D shape inference in the wild. I will also introduce our new library of 3D operators, which builds on PyTorch and contains highly optimized 3D operations (including a differentiable renderer!) that are essential when designing and training deep learning models with 3D data sources. Lastly, I will present our new project on novel view synthesis for real, complex scenes, in an effort to convince you that 3D representations can be quite impactful in a variety of tasks!
Georgia Gkioxari is a research scientist at Facebook AI Research (FAIR). She received a PhD in computer science and electrical engineering from the University of California at Berkeley under the supervision of Jitendra Malik in 2016. Her research interests lie in computer vision, with a focus on object and person recognition from static images and videos. In 2017, Georgia received the Marr Prize at ICCV for "Mask R-CNN".