Deep Learning With Flickr Tags
Members of the Flickr community manually tag photos to make them searchable and discoverable. With the advent of mobile phone cameras and auto-uploaders, photo uploads have become more numerous and asynchronous, and manual tagging is cumbersome for most users. However, recent advances in deep learning now let us identify the content of photos accurately and automatically. Progress has been driven largely by training deep neural networks (DNNs) on datasets such as ImageNet that were built using manual annotators. In this talk, we show how it is possible to train DNNs directly on user-generated tags. Although these tags are not always accurate, they are plentiful, and they allow us to train DNNs on an order of magnitude more data than has previously been used. Furthermore, they capture how the Flickr community tags its photos, which is what an automated system should emulate. Training DNNs on hundreds of millions of user tags requires new tools. We also describe Caffe-On-Spark, our infrastructure for large-scale distributed deep learning on Hadoop clusters.
Pierre Garrigues is a researcher in machine perception and learning. As a graduate student in the Redwood Center for Theoretical Neuroscience at UC Berkeley, he developed computational models of human visual processing. He then applied the technology from his research to practical applications at IQ Engines, a Berkeley startup providing an image recognition platform that powered mobile visual search as well as the organization of large photo collections. He is currently a research engineer at Flickr. He holds a PhD from the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley, and an undergraduate degree from the École Polytechnique in France.