Natural Language for Explainable AI
Natural language is an effective medium through which most humans communicate. Enabling natural language interfaces in Artificial Intelligence (AI) systems would make them more intuitive and useful to end users. In this talk, I will focus on two projects where we rely on natural language to support AI systems. First, I will cover our work on robust change captioning, where we learn to analyze pairs of images to detect significant semantic changes, which we also summarize in natural language. Second, I will present our work on explainable and advisable driving models. Here, we develop models that can both generate textual explanations of their actions and incorporate user advice in the form of observation-action rules.
Key Takeaways:
- We can learn to localize changes between two images without spatial supervision by learning to explain what has changed using natural language.
- Making deep models explain their decisions in natural language does not lead to performance degradation and may in fact improve performance.
- Incorporating human knowledge or advice in deep models leads to better performing, more interpretable models that also gain higher human trust.
I am a Research Scientist at UC Berkeley, working with Prof. Trevor Darrell. I completed my PhD at the Max Planck Institute for Informatics under the supervision of Prof. Bernt Schiele. My research is at the intersection of vision and language. I am interested in a variety of tasks, including image and video description, visual grounding, and visual question answering. Recently, I have been focusing on building explainable models and addressing bias in existing vision and language models.