Utilizing Sentence Embeddings to Extract High Coverage Key-Phrases
Key-phrase extraction is a Natural Language Processing task that has been studied for decades. The task is difficult because it is intrinsically subjective and because we cannot rely on labelled data. Traditionally, it has been tackled with graph-based algorithms. This presentation introduces an alternative technique that builds on recent research. Instead of graph metrics, the technique relies on sentence embeddings to capture the main points of a document. The embeddings are formed with a robust unsupervised technique that uses compositional n-gram features. Finally, the method has been incorporated into one of our core products with much success.
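To make the idea concrete, the following is a minimal sketch of one plausible reading of embedding-based key-phrase extraction: embed the document and each candidate phrase in the same space, then rank candidates by cosine similarity to the document. It is not the method presented in the talk. The embedding here is a toy bag of word unigrams and bigrams standing in for a trained sentence-embedding model, and the names `toy_embed`, `candidate_phrases`, `rank_key_phrases`, and the stopword list are all hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the talk).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "in", "is", "for", "on", "with"}


def toy_embed(text):
    """Stand-in embedding: sparse counts of word unigrams and bigrams.

    A real system would use a trained sentence-embedding model here.
    """
    words = re.findall(r"[a-z]+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return Counter(words + bigrams)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_phrases(text, max_len=3):
    """Word n-grams (within one sentence) that do not start or end with a stopword."""
    candidates = set()
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                candidates.add(" ".join(gram))
    return sorted(candidates)


def rank_key_phrases(text, top_k=5):
    """Return the top_k (phrase, score) pairs closest to the document embedding."""
    doc_vec = toy_embed(text)
    scored = [(p, cosine(toy_embed(p), doc_vec)) for p in candidate_phrases(text)]
    # Sort by descending score, breaking ties alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


if __name__ == "__main__":
    doc = ("Sentence embeddings capture meaning. "
           "Sentence embeddings help rank key phrases. "
           "Key phrases summarise a document.")
    for phrase, score in rank_key_phrases(doc, top_k=3):
        print(f"{score:.3f}  {phrase}")
```

Ranking purely by similarity to the document tends to return near-duplicate phrases, so a system aiming for high coverage would likely add a diversity step on top of this ranking; that step is left out here because the abstract does not describe it.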
Sotirios Fokeas is a Machine Learning specialist working for Swisscom. He is located at the Innovation Park in Lausanne, where he remains in close contact with EPFL's research teams. Sotirios obtained his master's degree in Computer Science from EPFL with a focus on Machine Learning. Prior to Swisscom, he worked as a data scientist in the banking industry, where he developed a new approach for countering money laundering. Sotirios's research is now focused on unsupervised techniques for analysing and extracting information from large volumes of textual data.