The Goldilocks Principle of Data Engineering
Before teams can use big-data technologies like Spark and Hadoop to power their deep learning models, raw data often needs significant pre-processing and restructuring. Learn valuable pre-processing lessons from Freebird, which turns terabytes of dynamic travel data each day into files whose size and structure are “just right” for MapReduce-based technologies.
Sam is the CTO and co-founder of Freebird. In that role, he leads the data science team that develops the data systems and predictive analytics behind the Freebird travel intelligence and rebooking solution. Freebird dynamically predicts the impact of flight disruptions and the expected rebooking costs using a diverse range of data science, statistical analysis, and machine-learning techniques. Sam has extensive experience in the commercial application of machine-learning algorithms. Previously, he worked as a quantitative risk analyst in the currency markets and as a team lead automating a large-scale data classification problem for an energy intelligence company. Sam is a Duke University graduate and works on a grant with MIT’s Computational Cognitive Science group to extend decision theory using advancements in machine learning and artificial intelligence.