Enabling Real-time Model Scoring and Serving at Scale
The online world of Walmart generates millions of user events every second. Multiple machine learning and deep learning models, deployed across different channels, are built on customers' online and offline data. At WalmartLabs, I'm part of the Customer data group that runs a horizontal platform known as "Customer BackBone (CBB)", which processes these events and scores the models seamlessly in real time.
The underlying architecture is built on Kafka Streams, RocksDB and the MLeap runtime, and has the following main components:
1. Customer feed acquisition, ingestion & preprocessing pipeline
2. Online servable Customer graph, which is the sink for all the customer data/feed
3. Hosted model inferencing and post-processing pipeline
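To make the flow through the three components concrete, here is a minimal, single-process sketch in Python. It is not the CBB implementation: a plain dictionary stands in for the RocksDB-backed customer store, an ordinary function stands in for a model hosted on the MLeap runtime, and every name in it (`CustomerProfile`, `CustomerStore`, `preprocess`, `score`, `toy_model`, and the `customer_id` and `amount` event fields) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CustomerProfile:
    """Hypothetical per-customer state kept in the servable store."""
    customer_id: str
    event_count: int = 0
    total_spend: float = 0.0


def preprocess(raw_event: dict) -> dict:
    """Component 1: normalize a raw feed event (field names are assumptions)."""
    return {
        "customer_id": str(raw_event["customer_id"]).strip(),
        "amount": float(raw_event.get("amount", 0.0)),
    }


class CustomerStore:
    """Component 2: in-memory stand-in for the online servable customer graph."""

    def __init__(self) -> None:
        self._profiles: Dict[str, CustomerProfile] = {}

    def upsert(self, event: dict) -> CustomerProfile:
        # Create the profile on first sight, then fold the event into it.
        profile = self._profiles.setdefault(
            event["customer_id"], CustomerProfile(event["customer_id"])
        )
        profile.event_count += 1
        profile.total_spend += event["amount"]
        return profile


def toy_model(features: List[float]) -> float:
    """Placeholder model: average spend per event (not a real trained model)."""
    event_count, total_spend = features
    return total_spend / event_count


def score(profile: CustomerProfile, model: Callable[[List[float]], float]) -> dict:
    """Component 3: build features, run the model, post-process the output."""
    features = [float(profile.event_count), profile.total_spend]
    return {"customer_id": profile.customer_id, "score": round(model(features), 4)}


def run_pipeline(
    raw_events: List[dict],
    store: CustomerStore,
    model: Callable[[List[float]], float],
) -> List[dict]:
    """Push each event through preprocess -> store update -> scoring."""
    return [score(store.upsert(preprocess(e)), model) for e in raw_events]
```

In a streaming deployment each of these steps would typically run as a stage of a stream-processing topology with a persistent state store, rather than as function calls in one process; the sketch only shows the order in which an event passes through the components.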
I am a lead Big Data Engineer at WalmartLabs, responsible for building a scalable, reliable and intelligent Customer data platform that processes customer activity across markets, programs, devices, apps and, to some extent, the wider Internet in close to real time, to offer superior insight into our customers. I have a solid background in the full life cycle of data that enables critical data-driven business decisions. Currently, I am building the Customer Identity Graph, with 30+ billion nodes, to enable Walmart to identify its customers irrespective of the channel that brings them to Walmart. Machine learning is helping me solve graph data quality issues, which would otherwise be a near-impossible task. Previously, I worked at JPMorgan Chase, where I built and managed machine learning pipelines that solved critical business challenges.