Zero-To-Hero: Solving the NLP Cold Start Problem
Mailchimp is the world's largest marketing automation platform. Its users send over a billion emails through the platform every day. This volume of marketing text creates many opportunities to use natural language processing (NLP) to improve and create content for users. Like many NLP practitioners, data scientists at Mailchimp have found annotating text data to be costly, time-consuming, and in some cases legally prohibited. So how do they work around it? We'll do a deep dive into how Mailchimp uses state-of-the-art NLP models and unlabeled data to cold start NLP products. We'll cover its data-centric (rather than model-centric) approach and how it positions its products to facilitate a data flywheel.
Muhammed Ahmed is a Senior Data Scientist at Mailchimp who specializes in natural language processing and computer vision. At Mailchimp, he has been a major contributor to the implementation and deployment of several AI-assisted products, including multimodal classification, preview text generation, stock photo recommendation, campaign engagement scoring, and semi-supervised topic clustering using large transformer models (T5, BART, RoBERTa, UNITER, and similar). Most recently, his focus has been on developing a systematic approach to using zero-shot learning to extract arbitrary information from text.