Deep Learning in Speech Recognition
While neural networks had been used in speech recognition in the early 1990s, they did not outperform the traditional machine learning approaches until 2010, when Alex’s team members at Microsoft Research demonstrated the superiority of deep neural networks (DNNs) for large-vocabulary speech recognition systems. The speech community rapidly adopted deep learning, followed by image processing and many other disciplines. In this talk, I will explain the transition to deep learning, what the speech recognition field has accomplished, and the challenges that remain.
Alex Acero (PhD, Carnegie Mellon, 1990) is a Sr. Director on the Siri team, in charge of speech recognition, speech synthesis, and machine translation. Prior to joining Apple, he spent 20 years at Microsoft Research managing teams in speech, audio, multimedia, computer vision, natural language processing, machine translation, machine learning, and information retrieval. Dr. Acero is an IEEE Fellow and ISCA Fellow. He has served as President of the IEEE Signal Processing Society and is currently a member of the IEEE Board of Directors. He is the author of the textbook “Spoken Language Processing”, has published over 250 technical papers, and holds over 150 US patents.