Beating Perils of Non-convexity: Guaranteed Training of Neural Networks using Tensor Methods
Neural networks have revolutionized performance across multiple domains such as computer vision and speech recognition. However, training a neural network is a highly non-convex problem and the conventional stochastic gradient descent can get stuck in spurious local optima. We propose a computationally efficient method for training neural networks that also has guaranteed risk bounds. It is based on tensor decomposition which is guaranteed to converge to the globally optimal solution under mild conditions. We explain how this framework can be leveraged to train feedforward and recurrent neural networks.
Hanie Sedghi is a Research Scientist at Allen Institute for Artificial Intelligence (AI2). Her research interests include large-scale machine learning, high-dimensional statistics and probabilistic models. More recently, she has been working on inference and learning in latent variable models. She has received her Ph.D. from University of Southern California with a minor in Mathematics in 2015. She was also a visiting researcher at University of California, Irvine working with professor Anandkumar during her Ph.D. She received her B.Sc. and M.Sc. degree from Sharif University of Technology, Tehran, Iran.