The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving the computational performance of inference without compromising accuracy. However, the sparse architectures produced by pruning were long believed to be difficult to train from the start, which would otherwise improve training performance in the same way. In this talk, I will discuss a series of experiments that contradicted this received wisdom, showing that sparse neural networks are indeed trainable, provided they are given the same initialization they received at or near the start of training. These observations culminate in a new conjecture about opportunities to improve neural network training: the lottery ticket hypothesis.
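The procedure the abstract alludes to, prune a trained network by weight magnitude, rewind the surviving weights to their original initialization, and retrain the resulting sparse subnetwork, can be sketched on a toy linear model. This is an illustrative one-shot version only, not code from the talk; the model, data, and hyperparameters are all placeholders chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = X @ w_true, standing in for a real training set.
X = rng.normal(size=(200, 20))
w_true = rng.normal(size=20)
y = X @ w_true

def train(w, mask, steps=200, lr=0.01):
    """Gradient descent on MSE; pruned (masked-out) weights stay frozen at zero."""
    w = w * mask
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad * mask  # pruned weights receive no updates
    return w

# 1. Draw a random initialization (the "lottery ticket").
w_init = rng.normal(size=20)
dense_mask = np.ones(20)

# 2. Train the dense network to completion.
w_trained = train(w_init, dense_mask)

# 3. One-shot magnitude pruning: remove the 80% of weights with the
#    smallest trained magnitude.
k = int(0.8 * len(w_trained))
threshold = np.sort(np.abs(w_trained))[k - 1]
mask = (np.abs(w_trained) > threshold).astype(float)

# 4. Rewind the surviving weights to their original initialization --
#    the key step distinguishing lottery tickets from random reinitialization.
w_rewound = w_init * mask

# 5. Retrain the sparse subnetwork from that original initialization.
w_sparse = train(w_rewound, mask)
```

In practice the hypothesis is studied with iterative (not one-shot) pruning, repeating steps 2-4 over several rounds and removing a fraction of the remaining weights each time; the sketch above keeps a single round for brevity.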
Jonathan Frankle is a fourth-year PhD student at MIT, advised by Prof. Michael Carbin, where he studies empirical deep learning with the hope of improving our understanding of neural networks and making training more efficient. His dissertation explores the "Lottery Ticket Hypothesis" (for which he received a Best Paper award at ICLR 2019) and its implications. He has spent summers at Google Brain and FAIR, and he expects to be seeking full-time employment in the near future. Jonathan is also deeply involved in technology and AI policy: he advises policymakers, lawyers, journalists, and advocates on topics of contemporary relevance, and he created a "Programming for Lawyers" course that he teaches at Georgetown Law.