Using AI to Transform Informational Videos and Our Watching Behavior
Videos account for about 75% of internet traffic, and enterprises increasingly use them for informational purposes, including marketing, internal communication, and the training of customers, partners and employees. However, most viewers do not have the patience to watch these videos end-to-end, and the video watching experience has not evolved much in over a decade. We present an AI-based approach that automatically indexes a video in the form of a table of contents, a phrase cloud and a searchable transcript. Together these summarize the key topics in the video and let viewers navigate directly to the topics of interest. The approach combines visual classification, object detection, automated speech recognition, text summarization, and domain classification, and we show the results achieved on a range of informational videos. We conclude with some thoughts on the promise of transforming how informational videos are consumed, as well as open problems and future directions.
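To make the idea of a searchable transcript concrete, here is a minimal sketch in Python. It is not the system described in the talk. It assumes that speech recognition has already produced timestamped text segments, and the sample data and the names `SEGMENTS`, `tokenize`, `build_index` and `search` are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical transcript segments: (start time in seconds, recognized text).
SEGMENTS = [
    (0.0, "Welcome to the onboarding course"),
    (42.5, "First we install the client software"),
    (118.0, "Next we configure the client for your network"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(segments):
    """Map each word to the sorted start times of the segments containing it."""
    index = defaultdict(list)
    for start, text in segments:
        for word in set(tokenize(text)):
            index[word].append(start)
    for starts in index.values():
        starts.sort()
    return index


def search(index, query):
    """Return start times of segments that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)
```

A player could seek to any of the returned start times, for example `search(build_index(SEGMENTS), "configure client")`, which is how a transcript search can double as navigation within the video.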
Dr. Manish Gupta is the co-founder and CEO of VideoKen Inc., a video technology startup. He has served as the Vice President and Director of Xerox Research Centre India and has held various leadership positions with IBM, including that of Director, IBM Research - India and Chief Technologist, IBM India/South Asia. As a Senior Manager at the IBM T.J. Watson Research Center in Yorktown Heights, New York, Manish led the team developing system software for the Blue Gene/L supercomputer. In 2009, IBM was awarded a National Medal of Technology and Innovation for Blue Gene by then US President Barack Obama. Manish holds a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign. He has co-authored about 75 papers and has been granted 19 US patents. While at IBM, Manish received two Outstanding Technical Achievement Awards, an Outstanding Innovation Award, and the Lou Gerstner Team Award for Client Excellence.