Bryan Russell

Research Scientist
Adobe Research

Learning from Video: Recognizing Actions and Localizing Moments with Natural Language

This talk will describe two works in video understanding. In the first part, I will describe ActionVLAD, a new video representation for action classification that aggregates local convolutional features across the entire spatio-temporal extent of a video. In the second part, I will describe an approach that retrieves a specific temporal segment (moment) from a video given a natural language text description. We address the lack of video datasets for this task by collecting the Distinct Describable Moments (DiDeMo) dataset, which consists of over 10,000 unedited, personal videos in diverse visual settings with pairs of localized video segments and referring expressions.

Bryan Russell is currently a Research Scientist at Adobe Research in San Francisco, CA. He received his Ph.D. from MIT, in the Computer Science and Artificial Intelligence Laboratory, and was a post-doctoral fellow in the INRIA Willow team in Paris, France. He was a Research Scientist with Intel Labs as part of the Intel Science and Technology Center for Visual Computing (ISTC-VC) and was Affiliate Faculty at the University of Washington.


