Last week at the Deep Learning Summit in San Francisco, I spoke with Shubha Nabar, Senior Director of Data Science at Salesforce Einstein, about her current work at Salesforce and her views on ethical AI. Shubha and her team make machine learning technologies accessible to the hundreds of thousands of businesses that use Salesforce every day. In 2017, she was featured as one of 20 Incredible Women Advancing AI Research by Forbes Magazine. She has over a decade of experience building data products and data science teams at Microsoft, LinkedIn, and Salesforce.
At the summit, we chatted about Shubha's career. Before joining Salesforce, she built data products and data science teams at Microsoft and LinkedIn, and prior to that she received her PhD in Computer Science from Stanford University. She went on to explain that at Salesforce Einstein, 'we work with some of the most diverse data sets in the world.' Hundreds of thousands of businesses and millions of people use Salesforce every day to manage all kinds of customer touch-points. As a result, Shubha and her team deal with sales, service, and marketing data from Fortune 500 companies, web browsing and purchase data from large e-commerce companies, IoT data from connected devices, patient care data from large hospitals and clinics, and so forth. She explained that their 'goal is to democratize data science so that every business, even one without a sizeable data science or engineering team, should be able to harness all its data to make smarter decisions. The platform that we are building for this purpose, and the specific data products that we are building on top of it, have to deal with tremendous scale and variance in data in order to service the hundreds of thousands of Salesforce-driven businesses. We work extensively with Spark and SparkML, and are the team behind the top-rated Spark-based open source machine learning projects -- Apache PredictionIO, as well as the recently open-sourced TransmogrifAI, part of the AI engine that fuels Salesforce’s Einstein AI platform'.
Keen to learn more about everyday life at Salesforce, I asked Shubha what a typical day looks like for her and the team:
A typical day involves wearing many different hats in order to guide planning and technical decisions, product roadmap and messaging, recruiting and career development, and most importantly, blocking and tackling in order to get stuff done!
She explained that one of the team's biggest achievements last year was to open source their automated ML library, TransmogrifAI, to empower other developers and data scientists to build ML solutions faster and at scale. I asked Shubha why this was an important feat for her team and where she saw it going:
When we set out to deliver AI to our customers, we ran into unique challenges around delivering AI at enterprise-scale on structured data. The diversity of data and of use cases across our customers necessitated automation of the typical data science processes, and this is why we built TransmogrifAI. It has enabled our data scientists to deploy thousands of models in production with minimal hand tuning and reduced the average turn-around time for training a performant model from weeks to just a couple of hours. As the power of machine learning automation became evident to us, we wanted to share TransmogrifAI with the broader community since every business today has more machine learning use cases than it has data scientists. Machine learning has the potential to transform how these businesses operate, and barriers to adoption can only be lowered through an open exchange of ideas and code.
Q: What are some of the recent advancements in AI and DL and how have they impacted your work?
A: The most impactful advancements for us have been in compute paradigms. Frameworks such as Apache Spark allow our data scientists to manipulate large volumes of data without worrying about the details of distributed computing. Advances in operationalization of machine learning models allow our data scientists to deploy models in production and not worry about serving and scale. At Salesforce, we’re taking this a step further, and enabling our business users to build and use machine learning models on their data without the need for armies of engineers and data scientists to solve the machine learning and integration problems. With the number of models we automatically deploy in production, our compute infrastructure needs become all the more important.
Q: What are some of the other industries that could benefit from the work you’re doing at the moment?
A: There are hundreds of thousands of businesses across industries that use Salesforce every day to manage a wide variety of customer data across sales, service, marketing and more. They range from NGOs to Fortune 500 companies. The work we do has the potential to transform how each of these businesses operates. As an example, one of our customers is an NGO that helps get low-income students through college. Prior to using Salesforce Einstein, they had dabbled with hiring a data scientist to build predictive machine learning models on students who were likely to drop out. This year of work ended up in a spreadsheet that nobody really used. By using Salesforce Einstein they were able to get a fully integrated machine learning model producing live predictions on their data with just a few clicks. That translates to so many more at-risk students getting the attention they need in a timely manner. Now this was an example of an NGO with a fraction of the resources of a Fortune 500 company. But the truth is that every business today has more machine learning use cases than data scientists it can hire and the work we do can help all of them.
Q: There’s been a lot of discussion around the ethics of AI and how systems can be built to be fair or unbiased - how do you think this can be done, and are we close to this?
A: The problem of ensuring ethical AI needs to be tackled at multiple levels in an organization. To start off, any organization that is ready to build or deploy AI-based automation needs to acknowledge the potential problems and create an organizational culture where it is okay to speak up, even if at the cost of short-term profit. (At Salesforce, we recently announced the creation of the Office of Ethical and Humane Use of Technology to bring together various internal and external experts to ensure we’re having constant discussions and a consistent way to evaluate ourselves).
Hand-in-hand with this, the teams that build or deploy AI-based automation need to reflect the diversity of the populations they cater to. History is rife with examples where a lack of diversity resulted in poor outcomes. At Salesforce, Trust and Equality are two of our core values and guiding factors as we build and use these tools. Ethics is a mindset, not a checklist, and so it is a consistent lens that we use to evaluate everything we do around these innovations.
Teams should also understand their data well and understand the types of biases they seek to avoid. There are emerging algorithmic techniques for detecting and eliminating undesirable bias, and we have features to raise awareness of potential biases that may be present in a dataset. The tools we use to build AI should incorporate these techniques and also provide explainability for individual automated decisions.
And finally, we should measure, measure, measure. An organization cannot begin to fix that which it does not measure! Ultimately, I feel optimistic about where we’re headed -- with emerging algorithmic techniques for detection and elimination of bias, as well as advances in model explainability and transparency, we stand to introduce a lot more fairness, consistency, and auditability around decision-making, if human and machine only work together!
Q: What’s next for you in your work?
A: My focus continues to be on democratization of machine learning technologies. Supporting new use cases, better usability, and greater scale!
If you're interested in watching Shubha's panel discussion from the summit, 'Harnessing Automation for a Just World', register your information to receive presentation access.