From Word2vec to ELMo: Using Context to Improve Word Vectors for NLU
Word vectors such as word2vec are ubiquitous in natural language processing (NLP) systems because they allow models to leverage large amounts of unlabeled text. However, they have several shortcomings, notably that they produce a single vector for each word regardless of context. This is especially problematic for words with many senses, since understanding the syntactic and semantic roles of these words requires examining the broader context in which they are used. In this talk, I will show how to overcome these limitations and learn contextual representations of word meaning from unlabeled text. When added to existing NLP systems, these ELMo representations provide a significant increase in overall performance across a wide range of tasks, including question answering and sentiment classification. I will also provide some intuition for what the ELMo representations encode and why they are empirically successful.
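The single-vector limitation, and how context can resolve it, can be sketched with a toy example. This is not ELMo (which uses deep bidirectional language models); it is a minimal illustration in which a "contextual" vector for a word is formed by mixing its static vector with the average of its neighbors' vectors. All names and numbers here are hypothetical.

```python
# Toy illustration (not ELMo): a static, word2vec-style lookup table
# assigns one vector per word type, so "bank" looks identical in
# "river bank" and "money bank". A context-sensitive function can
# tell the two occurrences apart.

# Hypothetical static embeddings, one vector per word type.
STATIC = {
    "the":   [0.1, 0.1],
    "river": [0.0, 1.0],
    "money": [0.0, -1.0],
    "bank":  [1.0, 0.0],
}

def static_vector(word, sentence):
    # Context is ignored: "bank" gets the same vector everywhere.
    return STATIC[word]

def contextual_vector(word, sentence):
    # Crude stand-in for a contextual model: mix the word's static
    # vector with the average of its neighbors' static vectors.
    neighbors = [STATIC[w] for w in sentence if w != word]
    avg = [sum(dim) / len(neighbors) for dim in zip(*neighbors)]
    return [0.5 * b + 0.5 * a for b, a in zip(STATIC[word], avg)]

s1 = ["the", "river", "bank"]
s2 = ["the", "money", "bank"]

# Static vectors cannot distinguish the two senses of "bank" ...
print(static_vector("bank", s1) == static_vector("bank", s2))      # True
# ... but the context-mixed vectors differ between the two sentences.
print(contextual_vector("bank", s1) == contextual_vector("bank", s2))  # False
```

ELMo achieves the same effect far more powerfully by computing each word's representation as a function of the entire sentence, using the internal states of a pretrained bidirectional language model.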
Matthew Peters is a Research Scientist at AI2 exploring applications of deep neural networks to fundamental questions in natural language processing. Prior to joining AI2, he was the Director of Data Science at a Seattle start-up, a research analyst in the finance industry, and a post-doc investigating cloud-climate feedback. He has a PhD in Applied Math from the University of Washington.