Massively Improving Data-efficiency of Supervised Learning Systems using Self-supervision from Unlabeled Data
I will talk about some recent work on improving the data efficiency of supervised learning models by learning rich representations of unlabeled data. By using self-supervised learning methods that predict missing information from a given context, a model can learn useful features that transfer to downstream tasks with very few labels. It is now possible to match the performance of powerful image recognition systems such as AlexNet and VGG using as little as 2% and 10% of the labeled data, respectively, on the widely benchmarked ImageNet dataset. The benefits of pre-training also persist in the high-data regime, with performance improving to 80% top-1 accuracy on ImageNet.
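To make the idea of self-supervision concrete, here is a minimal sketch of one common pretext task, rotation prediction, in which pseudo-labels are manufactured from unlabeled images with no human annotation. This is an illustrative example of the general approach, not the specific method discussed in the talk; the function name and shapes are assumptions.

```python
import numpy as np

def make_rotation_pretext_batch(images):
    """Build a rotation-prediction pretext dataset from unlabeled images.

    Each image in the (N, H, W) batch is rotated by 0, 90, 180, and 270
    degrees; the pseudo-label is the rotation index (0-3). A network
    trained to predict the rotation must learn features describing
    object shape and orientation, which can then be reused downstream.
    """
    rotated, labels = [], []
    for img in images:
        for k in range(4):  # k quarter-turns
            rotated.append(np.rot90(img, k))
            labels.append(k)
    return np.stack(rotated), np.array(labels)

# Usage: random arrays stand in for a batch of unlabeled grayscale images.
unlabeled = np.random.rand(8, 32, 32)
x, y = make_rotation_pretext_batch(unlabeled)
print(x.shape, y.shape)  # (32, 32, 32) (32,)
```

A classifier pre-trained on such pseudo-labels can then be fine-tuned on the small labeled subset, which is the data-efficiency setting the abstract describes.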
Aravind is a Ph.D. student at UC Berkeley advised by Prof. Pieter Abbeel where he co-created and taught the first edition of the Deep Unsupervised Learning class. He has spent time at OpenAI and DeepMind and is broadly interested in unsupervised representation learning.