Blind Spots in Neural Networks
Deep neural networks are highly expressive models that have recently achieved state-of-the-art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn solutions that are hard to interpret.
We find that deep neural networks learn input-output mappings that are significantly discontinuous. We can cause the network to misclassify an image by applying a hardly perceptible perturbation, which is found by maximizing the network's prediction error. Moreover, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, trained on a different subset of the dataset, to misclassify the same input.
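The idea of maximizing the network's prediction error can be sketched with a toy stand-in model. The following is a minimal, illustrative example using a logistic-regression "network" (the weights, input, and step size are all made up for illustration): it takes a step on the input in the direction that locally increases the model's loss, which is enough to flip the prediction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy model and input (illustrative values, not from any trained network).
d = 100
w = np.full(d, 0.05)   # weights
b = 0.0
x = np.full(d, 0.3)    # input the model classifies correctly
y = 1.0                # true label

p = sigmoid(w @ x + b)         # confident, correct prediction (p > 0.5)

# Gradient of the cross-entropy loss w.r.t. the INPUT (not the weights):
# for logistic regression, dL/dx = (p - y) * w.
grad_x = (p - y) * w

# Step each coordinate in the loss-increasing direction.
eps = 0.35
x_adv = x + eps * np.sign(grad_x)

p_adv = sigmoid(w @ x_adv + b)  # prediction drops below 0.5: misclassified
```

In this low-capacity toy the step is large relative to the input; in high-dimensional deep networks a hardly perceptible per-pixel step suffices, because many small per-coordinate changes accumulate in the network's activations.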
Wojciech Zaremba is a PhD student at New York University and a scientist at Facebook AI Research. His expertise lies in deep learning, where he has experience with computer vision and natural language processing problems. He is interested in solving symbolic manipulation tasks, including reasoning about mathematical formulas and the properties of computer programs.
Wojciech has worked as a member of Google Brain under the supervision of Prof. Geoffrey Hinton and Ilya Sutskever. He holds a Master's degree summa cum laude from Ecole Polytechnique in Paris, and he received a silver medal at the International Mathematical Olympiad in 2007.