Data Collection and Synthesis
Data science practitioners identify a problem space where a machine learning model can advance their business goals. The first step is finding the right data, and data scientists collaborate with people from different disciplines to obtain it. Because the process is complex, scientists have to design the data collection pipeline so that the data the team collects is representative of the problem and free of bias.
Lakshmi is an Applied Scientist at Amazon. She has been working with Amazon Machine Learning teams for the last 4.5 years. She has been part of Alexa's NLP team, Behavior Analytics (a causal inference division at Amazon), and the Amazon Music teams (improving the voice experience in Alexa).