Fairness and Bias in Machine Learning
Machine learning models rely on data for training and prediction, so any biases present in their training datasets are reflected in the models themselves. In addition, many production ML systems are trained in a streaming fashion, continually adapting on a history of previous traffic data. This creates a feedback loop that reinforces whatever biases are already present in the system: the model works well for groups with abundant data and fails for groups where data is scarce.
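To make the feedback-loop dynamic concrete, here is a toy simulation of my own (an illustration, not anything from the talk itself): a model is repeatedly retrained on the traffic it attracts, and users from a group return only in proportion to how well the model serves them, so a group that starts under-represented collects ever less training data.

```python
def simulate_feedback_loop(rounds=20):
    """Toy model of a bias-reinforcing feedback loop (illustrative only).

    Training-data counts per group start skewed. Each round, a group's
    accuracy is approximated by its share of the training data, and new
    traffic from that group arrives in proportion to that accuracy
    (satisfied users come back). The minority's data share shrinks.
    """
    data = {"majority": 90.0, "minority": 10.0}  # assumed starting skew
    minority_shares = []
    for _ in range(rounds):
        total = data["majority"] + data["minority"]
        for group in data:
            # accuracy for a group grows with its share of training data
            accuracy = data[group] / total
            # base traffic volume per group (assumed sizes)
            arrivals = 50 if group == "majority" else 10
            # only well-served users return and contribute new data
            data[group] += arrivals * accuracy
        minority_shares.append(
            data["minority"] / (data["majority"] + data["minority"])
        )
    return minority_shares

shares = simulate_feedback_loop()
# The minority group's share of training data shrinks round after round.
print(f"start: {shares[0]:.3f}, end: {shares[-1]:.3f}")
```

The exact numbers are arbitrary; the point is the monotone drift, which is why evaluating such a system once, on a static snapshot, can miss the degradation entirely.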
In this talk, I will highlight some of the current challenges in identifying and quantifying these biases, and the need for good stress-test datasets to probe for blind spots in these models. I will also argue for moving away from reporting performance metrics on a static dataset and towards continuous evaluation of such systems, taking into account the effects of bias-reinforcing feedback loops.
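One concrete form of the evaluation the abstract argues for is slicing metrics by group rather than reporting a single aggregate. A minimal sketch (my own assumed interface, not code from the talk):

```python
from collections import defaultdict

def sliced_accuracy(examples):
    """Report accuracy per group slice instead of one aggregate number,
    so blind spots on under-represented groups stay visible.

    `examples` is an iterable of (group, prediction, label) triples
    (a hypothetical format chosen for this sketch).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, prediction, label in examples:
        total[group] += 1
        correct[group] += int(prediction == label)
    return {g: correct[g] / total[g] for g in total}

# A single aggregate score hides the failure on group "B":
examples = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),  # group A: 4/4
    ("B", 1, 0), ("B", 0, 1),                            # group B: 0/2
]
print(sliced_accuracy(examples))
```

Here the aggregate accuracy is 4/6, which looks acceptable, while the sliced view shows the model failing completely on group "B"; rerunning such sliced evaluation on fresh traffic over time is one way to catch feedback-loop drift.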
Pallavi Baljekar is a Software Engineer on the Google Brain team in Cambridge, where her main research focus is making Google services and products more inclusive and less biased. She previously obtained her PhD from the School of Computer Science at Carnegie Mellon University, working with Dr. Alan Black on building speech synthesis systems for low-resource languages.