From Word Embeddings to Pre-trained Models: A New Age in NLP
In computer vision, the trend for a few years now has been to pre-train vision models on the huge ImageNet corpus to achieve state-of-the-art results. With the latest innovations in the natural language processing world like ULMFiT, BERT, GPT-2, etc., pre-trained models based on language modeling have been dubbed NLP's 'ImageNet moment'. The standard approach to NLP projects has been to initialize the first layer of a neural network with vanilla (context-independent) word embeddings like Word2Vec and GloVe and then train the rest of the network from scratch on task-specific data. However, this is now changing, and many of the current state-of-the-art models for supervised NLP tasks are models pre-trained on language modeling and then fine-tuned on task-specific data. In this talk, we will explore some of these techniques that have taken the NLP world by storm.
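To make the "context independent" limitation concrete, here is a minimal sketch (with a toy vocabulary and made-up 2-dimensional vectors; real Word2Vec/GloVe embeddings have hundreds of dimensions) showing that a vanilla embedding lookup assigns the word "bank" the exact same vector regardless of its context:

```python
import numpy as np

# Hypothetical toy vocabulary and pre-trained embedding matrix.
vocab = {"the": 0, "bank": 1, "river": 2, "money": 3}
embeddings = np.array([
    [0.1, 0.2],   # the
    [0.5, -0.3],  # bank
    [0.4, 0.9],   # river
    [-0.2, 0.7],  # money
])

def embed(sentence):
    """Look up one static vector per token -- no context is used."""
    return np.stack([embeddings[vocab[w]] for w in sentence])

# "bank" gets an identical vector in both sentences, even though
# it means a riverbank in one and a financial institution in the other.
v1 = embed(["the", "river", "bank"])[2]
v2 = embed(["the", "money", "bank"])[2]
assert np.array_equal(v1, v2)
```

Contextual models like BERT and ULMFiT, by contrast, produce a different representation for each occurrence of a word depending on the surrounding sentence, which is a large part of why fine-tuning them outperforms the static-embedding approach.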
Shreya is a Data Scientist and ML practitioner at Amazon, where she spends her time making Alexa smarter and more productive for her customers by working on some very challenging ML problems ranging from personalization to relevance to text classification and natural language understanding. Before joining Amazon, Shreya was at the University of Cincinnati, where she earned her master's degree in Analytics and thoroughly enjoyed diving deep into data mining and applied machine learning.