Towards Perceptual Machines That See, Converse, and Reason
A successful autonomous system needs not only to understand the visual world but also to communicate its understanding to humans. To make this possible, language can serve as a natural link between high-level semantic concepts and low-level visual perception. In this talk, I'll discuss recent work in the domain of vision and language, covering topics such as image and video captioning, retrieval, and question answering. I'll also talk about our recent work on task execution via language instructions.
Sanja Fidler is an Assistant Professor in the Department of Computer Science, University of Toronto. Previously she was a Research Assistant Professor at TTI-Chicago, a philanthropically endowed academic institute located on the campus of the University of Chicago. She completed her PhD in computer science at the University of Ljubljana in 2010, and was a postdoctoral fellow at the University of Toronto during 2011-2012. She has served on the program committees of numerous international conferences, and has received three outstanding reviewer awards. Together with Rich Zemel and Raquel Urtasun, she received the NVIDIA Pioneer of AI award. Her main research interests are object detection, 3D scene understanding, and the intersection of language and vision.