DEEP LEARNING THEORY & APPLICATION
Ziming Zhang - Mitsubishi Electric Research Laboratories
BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
Understanding the global optimality in deep learning (DL) has been attracting more and more attention recently. Conventional DL solvers, however, have not been developed intentionally to seek such global optimality. In this paper we propose a novel approximation algorithm, BPGrad, towards optimizing deep models globally via branch and pruning. Our BPGrad algorithm is based on the assumption of Lipschitz continuity in DL, and as a result it can adaptively determine the step size for the current gradient given the history of previous updates, wherein theoretically no smaller step can achieve the global optimality. We prove that, by repeating such a branch-and-pruning procedure, we can locate the global optimality within finitely many iterations. Empirically, an efficient solver based on BPGrad for DL is proposed as well, and it outperforms conventional DL solvers such as Adagrad, Adadelta, RMSProp, and Adam in the tasks of object recognition, detection, and segmentation.
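The pruning idea can be illustrated with a toy one-dimensional example (an illustration of the Lipschitz bound only, not the authors' full BPGrad algorithm): if f is L-Lipschitz, every evaluated point certifies a lower bound on f everywhere, and any region whose bound already exceeds the best value seen so far cannot contain the global minimum.

```python
# Toy 1-D illustration of Lipschitz-based pruning: if f is L-Lipschitz, then
# f(y) >= f(x) - L*|y - x| for every evaluated point x, so the evaluations
# certify a lower bound on f everywhere.

def lipschitz_lower_bound(evals, y, L):
    """Best provable lower bound on f(y) given evaluated points (x, f(x))."""
    return max(fx - L * abs(y - x) for x, fx in evals)

f = lambda x: (x - 0.3) ** 2      # toy objective on [0, 1]
L = 2.0                           # a valid Lipschitz constant of f on [0, 1]

evals = [(x / 10, f(x / 10)) for x in range(11)]   # "branch": sample points
best = min(fx for _, fx in evals)                  # incumbent best value

# "Prune": any y whose certified lower bound exceeds the incumbent cannot
# contain the global minimum and is discarded.
pruned = [y / 100 for y in range(101)
          if lipschitz_lower_bound(evals, y / 100, L) > best]
```

The true minimizer (x = 0.3) is never pruned, because the lower bound there can never exceed the incumbent value.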
Dr. Ziming Zhang is a research scientist at Mitsubishi Electric Research Laboratories (MERL). Before joining MERL he was a research assistant professor at Boston University. He received his PhD degree in 2013 from Oxford Brookes University, UK, under the supervision of Prof. Philip H. S. Torr. His research areas include object recognition and detection, machine learning, optimization, large-scale information retrieval, visual surveillance, and medical imaging analysis.
DEEP LEARNING MODELS & FRAMEWORKS
Nikhil Thorat - Google Brain
Deeplearn.js: A Hardware Accelerated Machine Intelligence Library for the Web
Nikhil is a software engineer at Google Brain, working on interpretability, visualization, and the democratization of machine learning. Some of his projects include the Graph visualizer and the Embedding Projector, which are part of TensorBoard, as well as new saliency techniques for neural networks. Recently, together with Daniel Smilkov, he created deeplearn.js, a hardware-accelerated browser-based machine learning library.
Daniel Smilkov - Google Brain
Deeplearn.js: A Hardware Accelerated Machine Intelligence Library for the Web
Daniel is a software engineer at Google Brain, working on interpretability, visualization, and the democratization of machine learning. Some of his projects include the Graph visualizer and the Embedding Projector, which are part of TensorBoard, as well as new saliency techniques for neural networks. Recently, together with Nikhil Thorat, he created deeplearn.js, a hardware-accelerated browser-based machine learning library.
William Moses - MIT CSAIL
Deep learning models analyze massive amounts of data, with applications in everything from translation to image generation. Competing deep learning frameworks explore different tradeoffs between usability and expressiveness, and operate on a DAG of computational operators that wrap high-performance libraries such as CUDNN or NNPACK. When such libraries are unable to efficiently represent a computation, users need to build custom operators, often at high engineering cost. This is typically required when researchers invent new operators, which then suffer a severe performance penalty, limiting innovation. Furthermore, even existing runtime calls often do not offer optimal performance, missing optimizations across operators as well as optimizations on the size and shape of data. Our contributions include (1) an easy-to-use language called Tensor Comprehensions, (2) a polyhedral Just-In-Time compiler that converts a mathematical description of a deep learning DAG into a high-performance CUDA kernel, providing optimizations such as operator fusion and specialization, and (3) a compilation cache populated by an autotuner. We demonstrate the suitability of the polyhedral framework for constructing a domain-specific optimizer effective on state-of-the-art GPU deep learning. Our flow reaches up to 4× speedup over NVIDIA libraries on machine-learning kernels and on one of Facebook’s production models.
William S. Moses is a PhD student in Computer Science at MIT, where he also received his Bachelor’s and Master’s degrees in EECS and Physics in 2017. He works at MIT's Computer Science and Artificial Intelligence Lab, where he researches the intersection of computer systems and machine learning with the goal of creating systems that allow anyone to automatically produce high-quality, efficient, and correct code. As an undergraduate, his work on the Tapir compiler extensions for parallel programming won best paper at the 2017 Symposium on Principles and Practice of Parallel Programming. He’s previously worked at Facebook’s AI Research Lab, SpaceX, and the U.S. Naval Research Lab.
Fabio Buso - Logical Clocks
Hyperscale Deep Learning for the Masses
State-of-the-art deep learning systems at hyperscale AI companies attack the toughest problems with distributed deep learning. Distributed deep learning systems enable AI researchers and practitioners to be more productive, and allow the training of models that would be intractable on a single GPU server. In this talk, we will introduce the latest developments in distributed deep learning (synchronous stochastic gradient descent) and show how distribution can both massively reduce training time and enable parallel experimentation, using large-scale hyperparameter optimization. We will introduce different distributed architectures, including the parameter server and Ring-AllReduce models. In particular, we will describe open-source TensorFlow frameworks that leverage Apache Spark to manage distributed training, such as Yahoo’s TensorflowOnSpark, Uber’s Horovod platform, and Hops’ TfSpark. We will introduce the different programming models supported and highlight the importance of cluster support for managing GPUs as a resource. To this end, we will also introduce Hops, an open-source distribution of Hadoop with support for GPUs as a resource, and show how TensorFlow/Spark applications can be easily run from a Jupyter notebook. We will also show that on-premise distributed deep learning is gaining traction, as both enterprise and commodity GPUs can be integrated into a single platform.
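The Ring-AllReduce pattern mentioned above can be sketched in a single process (a didactic simulation with plain Python lists; real implementations such as Horovod exchange the chunks between GPUs over NCCL or MPI):

```python
def ring_allreduce(grads):
    """Sum equal-length gradient vectors across n simulated workers."""
    n = len(grads)
    size = len(grads[0])
    assert size % n == 0, "vector length must be divisible by worker count"
    c = size // n
    buf = [list(g) for g in grads]         # each worker's local copy

    def chunk(k):                          # index range of chunk k (mod n)
        start = (k % n) * c
        return slice(start, start + c)

    # Phase 1 (reduce-scatter): in step s, worker w sends its running sum of
    # chunk (w - s) to worker (w + 1), which accumulates it.
    for s in range(n - 1):
        sent = [buf[w][chunk(w - s)] for w in range(n)]
        for w in range(n):
            src = (w - 1) % n
            sl = chunk(src - s)
            for i, v in zip(range(sl.start, sl.stop), sent[src]):
                buf[w][i] += v
    # Now worker w holds the fully reduced chunk (w + 1).

    # Phase 2 (all-gather): each step forwards the freshly completed chunk.
    for s in range(n - 1):
        sent = [buf[w][chunk(w + 1 - s)] for w in range(n)]
        for w in range(n):
            src = (w - 1) % n
            sl = chunk(src + 1 - s)
            buf[w][sl] = sent[src]
    return buf

grads = [[1, 2, 3, 4, 5, 6],
         [10, 20, 30, 40, 50, 60],
         [100, 200, 300, 400, 500, 600]]
result = ring_allreduce(grads)             # every worker ends with the sum
```

Each worker transmits roughly 2·(n−1)/n of the vector regardless of the number of workers, which is why the ring pattern scales better than funnelling all gradients through a single parameter server.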
Fabio Buso is the head of engineering at Logical Clocks AB, focusing mainly on the Machine Learning service of the HopsHadoop platform. He is currently leading the development of a scalable model serving infrastructure over Hops and Kubernetes. He is also involved in the development of a Feature Store for machine learning on Hops which is integrated with the TensorFlow framework. Fabio has an international background, holding a master's degree in Cloud Computing and Services, with a focus on data intensive applications, awarded by a joint program between KTH Stockholm and TU Berlin. During his Master's Thesis at RISE SICS AB, he implemented a strongly consistent metastore for Apache Hive on Hops.
DEEP CONVOLUTIONAL NEURAL NETWORKS
Michael Sollami - Salesforce
The Future of NLP in e-Commerce: Generative Multimodal Language Models
Recent advances in deep learning have resulted in a new generation of natural language technologies. In the e-commerce setting, new transformer-based models have enabled enhancements across a vast array of product features and services. For instance, in online shopping, product descriptions, calls-to-action, and blogs are all primary ways to inform and attract customers, playing crucial roles in conversion rates and SEO. At Einstein, we have developed new multimodal conditional natural language models trained to automatically craft unique, interesting, contextualized copy. We will present these new methods for generating new, and enhancing existing, e-commerce text, e.g. in product catalogs, merchant sites, and other marketing channels.
Michael received a doctorate in mathematics from the University of Wyoming. Since 2012 he has led research and development teams at a number of successful Boston-based startups. Currently a lead data scientist on Salesforce's Einstein team, he enjoys designing and building deep learning systems with applications to e-commerce and computer vision.
Sergul Aydore - Amazon
Neural Networks for Forecasting Demand
Large e-commerce companies such as Amazon sell millions of physical products on a daily basis. Demand forecasting for those products is needed to enable better in-stock positions. It is therefore crucial to provide forecasts with adequate accuracy in a reasonable time. In this work, we focus on the problem of product-level demand forecasting at weekly and multi-weekly aggregates. Optimal buying decisions require specific quantiles of the forecast demand distributions. As a result, forecasts are generated and consumed as full predictive distributions in addition to point forecasts. We generate a full predictive distribution for a specific product, lead time, span (duration of demand aggregation), and forecast creation date (FCD), given all available information at the FCD, such as past demand (essentially sales), item details, and past traffic. We propose a neural network (NN) method as an alternative to a benchmark model based on decision trees and k-nearest-neighbor clustering. In this presentation, I will describe our approach, concentrating on feature selection and optimization.
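The role of quantiles in buying decisions can be made concrete with the standard pinball (quantile) loss; this is a generic illustration, not Amazon's actual model. Minimizing the pinball loss at level q over a constant predictor recovers the empirical q-quantile of demand, which is why a network trained on this loss outputs the quantile the buying system needs.

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss at level q."""
    err = y_true - y_pred
    return float(np.mean(np.maximum(q * err, (q - 1) * err)))

rng = np.random.default_rng(0)
demand = rng.poisson(lam=20, size=10_000).astype(float)   # toy weekly demand

# Brute-force search over constant predictors: the minimizer of the pinball
# loss at q = 0.9 lands at the empirical 0.9-quantile of the demand sample.
grid = np.arange(0.0, 40.0, 0.25)
best_c = float(grid[np.argmin([pinball_loss(demand, c, 0.9) for c in grid])])
```

In the real system a network predicts this quantile per product, lead time, and span from the features available at the forecast creation date, rather than a single constant.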
Sergul has been a Machine Learning Scientist at Amazon’s demand forecasting team since 2016. She builds neural network models to predict demand for millions of products to enable better in-stock positions. Sergul’s expertise lies in time series data modeling for accurate prediction and classification. She received her PhD degree from the Signal and Image Processing Institute at the University of Southern California in 2014. Her PhD work was on developing robust connectivity measures for neuroimaging data. Prior to Amazon, Sergul was a postdoctoral researcher at Columbia University, where she implemented machine learning models for EEG data. She has also made contributions to the open-source machine learning library scikit-learn, and she is an active collaborator of the Parietal team at Inria, Paris, where she works on novel regularization methods for machine learning.
NATURAL LANGUAGE UNDERSTANDING
Anoop Deoras - Netflix
Latent Models (Shallow and Deep) for Recommender Systems
In this talk, we will survey latent models, starting with shallow and progressing towards deep, as applied to personalization and recommendations. After providing an overview of the Netflix recommender system, we will discuss research at the intersection of deep learning, natural language processing and recommender systems and how they relate to traditional collaborative filtering techniques. We will discuss techniques for embedding discrete user action events into continuous but latent space for building a context aware collaborative filtering model for personalization and recommendations. Finally, we will highlight promising new directions in this space.
Anoop Deoras is a Lead Researcher at Netflix, where he leads the algorithmic innovation and productization of deep learning based recommender system models. He is interested in building the next generation of Machine Learning algorithms to drive the Netflix experience. Before that, he was a Lead Researcher at Microsoft, working on Cortana, an AI based virtual personal assistant for Windows OS. He holds a PhD from Johns Hopkins University where he proposed innovative algorithms for the first ever successful integration of Recurrent Neural Network based language models in Large Vocabulary Continuous Speech Recognition and Statistical Machine Translation.
Bjarke Felbo - MIT
Transferring Knowledge From Emojis
Emotional content is an important part of language, making emotional analysis an interesting and useful task. The classic use case is companies wanting to make sense of what their customers are saying about them, but there are many other applications as AI becomes an increasingly important part of many products. We show how emojis can be used to teach our deep learning models very rich representations of emotional content in text. The method is based on the basic idea that if a model can predict which emoji was included with a given sentence, then it has a good understanding of the emotional content of that sentence. We train our model to predict emojis on a dataset of 1.2B tweets (filtered from 55B tweets) and can then transfer this knowledge to many other tasks, such as sarcasm and racism detection, using only small annotated datasets for these tasks. With this approach we obtain state-of-the-art performance on 8 benchmark datasets spanning sentiment, emotion, and sarcasm detection using a single pretrained model. To conclude, I’ll describe how you can use our freely available model for your own task with only a minimal amount of coding.
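The transfer recipe can be sketched as follows (a toy stand-in, not the talk's actual architecture: the random matrix `W_pre` plays the role of the emoji-pretrained encoder, which is kept frozen, and only a small logistic-regression head is trained on the tiny annotated target dataset, e.g. sarcasm labels):

```python
import numpy as np

# Illustrative transfer-learning sketch: freeze a "pretrained" encoder and
# train only a small classifier head on a tiny target dataset.

rng = np.random.default_rng(0)
W_pre = rng.normal(size=(20, 8))          # stands in for pretrained weights
encode = lambda X: np.tanh(X @ W_pre)     # frozen "pretrained" encoder

X_small = rng.normal(size=(30, 20))       # tiny annotated target dataset
y_small = (X_small @ W_pre[:, 0] > 0).astype(float)   # toy sarcasm labels

H = encode(X_small)                       # features from the frozen encoder
w = np.zeros(8)                           # only the head is trained
for _ in range(500):                      # logistic-regression gradient steps
    p = 1.0 / (1.0 + np.exp(-(H @ w)))
    w -= 0.5 * H.T @ (p - y_small) / len(y_small)

acc = float(((H @ w > 0) == (y_small > 0.5)).mean())
```

Because the encoder already captures the relevant structure from the large pretraining task, a handful of labeled examples suffices to fit the head, which is the core of the transfer argument above.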
Bjarke Felbo is a graduate student at the MIT Media Lab working under the supervision of Iyad Rahwan in the Scalable Cooperation research group. His research interests lie in the intersection of machine learning and computational social science with much of his work focused on finding insights related to human behavior in large-scale datasets. Bjarke is one of three Marvin Minsky Fellows supported for their promising research within AI. Before becoming a researcher he worked at various companies including Gjensidige Insurance and the Boston Consulting Group (BCG).
David Harwath - MIT CSAIL
Jointly Learning to Hear and See
Humans learn language at an early age by simply observing the world around them. Why can't computers do the same? This talk describes our ongoing work to develop methodologies for grounding continuous speech signals at the raw waveform level to natural image scenes. I will present models capable of learning to recognize spoken words and visual objects, without requiring supervision in either modality. I will show that these models can be applied across multiple languages, and that the visual domain can function as an "interlingua" that may be able to serve as the basis for unsupervised speech-to-speech translation systems.
David Harwath is a research scientist at the Spoken Language Systems group in the MIT Computer Science and Artificial Intelligence Lab (CSAIL). He holds a B.S. in electrical engineering from the University of Illinois at Urbana-Champaign, an S.M. in computer science from MIT, and a Ph.D. in computer science from MIT. In his Ph.D. thesis, he developed unsupervised models for the joint perception of speech and vision. His long-term research goal is to imbue computers with the ability to learn language in more human-like ways, without the need for the enormous amount of supervision upon which current methodologies rely.
Kelly Davis - Mozilla
Childhood's End: Maturation of Deep Speech and Common Voice
We’ll talk about the blossoming of Deep Speech, an open deep-learning-based speech-to-text engine, and Common Voice, an open crowd-sourced speech corpus. We will cover recent Deep Speech advancements (streaming, small-platform support, and product integrations) as well as Common Voice advancements (multi-language support, multi-language corpora, and profiles). We’ll also give an overview of future plans and how to get involved.
Kelly Davis has many irons in the fire. He studied mathematics and physics at MIT, then went on to do graduate work in superstring theory/M-theory. He then jumped ship, coding at a startup that eventually went public in the late 90's. When the bubble burst, he jumped back into an academic setting and joined the Max Planck Institute for Gravitational Physics, where he worked on software systems used to help simulate black hole mergers. Jumping ship yet again, he went back into industry, writing 3D rendering software at Mental Images/NVIDIA. When that lost its charm, he founded an NLU startup, 42, that created a system, based on IBM's Watson, able to answer general knowledge questions. After a brief stint as the Director of Machine Learning at another Berlin startup, he joined Mozilla, where he now leads the machine learning group.
GENERATIVE ADVERSARIAL NETWORKS
Miriam Cha - Harvard University
Adversarial Learning for Text-to-Image Synthesis
Recent approaches in generative adversarial networks (GANs) can automatically synthesize realistic images from descriptive text. Despite their overall fair quality, the generated images often expose visible flaws that lack structural definition for an object of interest. In this talk, I will present various GAN-based methods for text-to-image synthesis and extend the state of the art by improving the perceptual quality of generated images. Differentiated from previous work, our synthetic image generator optimizes perceptual loss functions that measure pixel, feature activation, and texture differences against a natural image. We present visually more compelling synthetic images generated from text descriptions in comparison to some of the most prominent existing work.
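The three loss terms can be sketched as follows (an illustrative formulation, not the talk's exact networks or weightings; the random projection `features` is a stand-in for a fixed pretrained network such as VGG):

```python
import numpy as np

def gram(feat):
    """Texture statistics: channel-by-channel correlations of a feature map."""
    f = feat.reshape(feat.shape[0], -1)
    return f @ f.T / f.shape[1]

def perceptual_loss(gen, ref, features, w_pix=1.0, w_feat=1.0, w_tex=1.0):
    """Weighted sum of pixel, feature-activation, and texture differences."""
    fg, fr = features(gen), features(ref)
    pixel = np.abs(gen - ref).mean()                 # pixel difference
    feat = ((fg - fr) ** 2).mean()                   # feature-activation difference
    tex = ((gram(fg) - gram(fr)) ** 2).mean()        # texture (Gram) difference
    return w_pix * pixel + w_feat * feat + w_tex * tex

rng = np.random.default_rng(1)
P = rng.normal(size=(4, 3))                          # toy "pretrained" filters
features = lambda img: np.einsum('oc,chw->ohw', P, img)

real = rng.random((3, 16, 16))                       # natural reference image
fake = real + 0.1 * rng.normal(size=real.shape)      # imperfect generated image
```

During training the generator would minimize this loss (alongside the adversarial term) with respect to the generated image, pushing its pixels, activations, and texture statistics toward those of natural images.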
Miriam Cha is a fourth year Computer Science PhD candidate at Harvard University. Her research centers around shared representation learning for multimodal data and synthesis of one modality conditioned on another. Before joining Harvard, she was a research scientist at MIT Lincoln Laboratory. She received the B.S. and M.S. degrees in Electrical and Computer Engineering from Carnegie Mellon University. She was a recipient of National Science Foundation Graduate Research Fellowship, a National Defense Science and Engineering Graduate Fellowship, and a Lincoln Scholar Fellowship.
DEEP LEARNING & ROBOTICS
Sudeep Pillai - Toyota Research Institute
Self-Supervision in Mobile Robots in the Deep Learning Era
With the unreasonable effectiveness of data in the deep-learning era, most state-of-the-art computer vision solutions today require large amounts of training data and ever-increasing training resources. Furthermore, amassing large amounts of labeled data for task-specific needs becomes increasingly tedious and expensive. While gathering large amounts of cross-modal robot data poses a challenge in itself, we envision that robots will be able to self-supervise in certain tasks by transferring or bootstrapping capabilities with the rich set of cross-modal information that these robots typically collect. In this talk, we show that this bootstrap mechanism can also leverage spatio-temporal constraints that are implicitly maintained in robots via techniques like Simultaneous Localization and Mapping (SLAM).
We envision that self-supervised solutions to task learning will have far-reaching implications especially in the context of life-long learning in autonomous systems, while alleviating the need to procure large amounts of labeled data. To conclude, I will talk about some of the recent machine learning efforts at Toyota Research Institute, and how self-supervision hopes to be at the core of our vision of petabyte-scale learning from robot data.
Sudeep is a Machine Learning Research Scientist at Toyota Research Institute. He recently received his PhD in Computer Science from MIT, where he focused on enabling self-supervised perception and learning in SLAM-aware mobile robots. Prior to MIT, he was a software developer working on real-time computer-vision related technologies. He completed his Bachelors in Mechanical Engineering at the University of Michigan - Ann Arbor. He has also had the opportunity to work as a research intern at exciting companies such as Mitsubishi Electric Research Labs (MERL) and Segway.
DEEP LEARNING & SOCIETAL IMPACTS
PANEL: Is the Biggest Challenge Facing AI an Ethical One?
Simon Mueller - The Future Society
Simon Mueller is Co-founder and Vice President of The Future Society, Inc (TFS). Founded as a student organization at the Harvard Kennedy School of Government, The Future Society has since evolved into a “think-and-do-tank” run out of Paris, Dubai, London, and Boston. The Society’s goal is to design tech policy that serves humanity, and it does so by sponsoring cutting-edge research and organizing summits to foster best-practice sharing between policy makers and practitioners.
Cansu Canca - AI Ethics Lab
Cansu is the founder and director of the AI Ethics Lab, where she leads teams of computer scientists and legal scholars to provide ethics analysis and guidance to researchers and practitioners. She has a Ph.D. in philosophy specializing in applied ethics. She works on ethics of technology and population-level bioethics with an interest in policy questions. Prior to the AI Ethics Lab, she was a lecturer at the University of Hong Kong, and a researcher at the Harvard Law School, Harvard School of Public Health, Harvard Medical School, Osaka University, and the World Health Organization.
Gabriele Fariello - Harvard University / University of Rhode Island
Gabriele Fariello is a Harvard instructor in ML and AI, a researcher in neuroinformatics, and Chief Information Officer at the University of Rhode Island. He is the former Head of Neuroinformatics at Harvard's Center for Brain Science, Assistant Dean for Computing and CIO at Harvard's engineering school, and Director of Clinical Research Informatics at Massachusetts General Hospital. Gabriele first implemented a neural network in 1991 and has been doing computational science, including bioinformatics, neuroinformatics, and the use of machine learning and artificial intelligence, for several decades. He consults for businesses interested in ML and AI.
Kathy Pham - Harvard Berkman Klein Center
Kathy Pham is a computer scientist, software engineer, and product manager who has spent over 12 years building products at Google, IBM, and the government at the United States Digital Service where she was a founding product and engineering member. Kathy is currently researching the Ethics and Governance of Artificial Intelligence and Software Engineering at the Harvard Berkman Klein Center and MIT Media Lab. She leads the Ethical Tech working group, where she explores topics across computer science ethics curricula, industry norms and ethics, user and community voices in product development, and how to infuse tech back into other disciplines like law, policy, and social science. Kathy holds a B.S. and M.S. in Computer Science from the Georgia Institute of Technology and Supelec in Metz, France.
Conversation & Drinks
Bill Aronson - Artificial Intelligence Research Group
The Matrix – Programmable Matter
Artificial Intelligence Research Group will introduce the Matrix, a world first in 'programmable matter'. Internal pulse patterns in the material cause constructive and destructive interference, thereby enabling controlled manipulation of supramolecular elements. Key attributes are:
1. Intrinsic security, since every instance is unique
2. Trainability, since it can function as a reservoir computer
3. High performance, since the reservoir computer is not simulated but implemented directly in analog hardware
Immediate applications include hacker-proof authentication based on Physical Unclonable Functions, as well as fast machine learning in areas such as adaptive signal processors, object recognition, speech recognition, and financial modeling.
Bill Aronson is the UK-based CEO of Artificial Intelligence Research Group Ltd (AIRG). The London-based start-up has expertise in neuromorphic computing and artificial intelligence and undertakes research in areas that have the potential to transform and disrupt. Bill has an MA from Cambridge University and over forty years of experience in senior management, coaching, consulting, and strategy. In his spare time, Bill is the author of nine non-fiction business books, and twenty-five thousand students have taken his online courses.
Neil Yager - Phrasee
Using Deep Learning to Generate and Assess Natural Language
Deep learning has impacted many fields of natural language processing. In this talk, we will take a look at how deep learning and recurrent neural networks can be used to generate text. Furthermore, we'll also see how it can predict the impact of language on its audience.
Dr. Neil Yager is the Chief Scientist of Phrasee and the architect of the Phrasee method. A 20-year veteran of the tech industry, Dr. Yager has worked in various digital roles for prominent, innovative, and forward-thinking tech brands, including Canon Information Systems Research Australia (CiSRA), BT Imaging, and Biometix. Dr. Yager has written over a dozen academic publications, authored a book on data mining, and holds several patents. One of the world’s leading experts in the commercialization of artificial intelligence, he holds a PhD in Computer Science from the University of New South Wales in Australia. Phrasee’s market-leading learning engine is the culmination of Dr. Yager’s vast data experience, digital expertise, and boundless passion for all aspects of the machine learning field.
Charles Ahmadzadeh - Bunch.ai
Helping Humans Understand Each Other - Emma, Your AI Psychologist
Ever wanted to scratch the surface and understand the inner motivations of the people you encounter in your professional life? Behind a CV or an online profile there is a person with many drives that most humans won't be able to see. That is why we created Emma, an AI-powered psychologist that tells you what's behind the surface of your business partners, a potential new hire, or your boss. But the road to designing a machine that can dig deeper than any human mind could go was definitely not an easy one: join this session to discover how to make relationships more human thanks to AI.
Charles is a co-founder at Bunch.ai and runs their engineering team. Bunch.ai is the team success platform for managing culture with data. A full-stack software generalist with a knack for data, Charles is passionate about software craftsmanship, reactive architectures, and audio synthesis. In addition to growing the engineering team at Bunch, Charles mentors developers at askadev.org.
Yibiao Zhao - iSee
Engineering Common Sense
Artificial intelligence has beaten the best human player at Go, and has also achieved superhuman performance in many video games. However, current AI systems utilizing advanced deep learning techniques still cannot reliably navigate a car in the real world, even with millions of miles of driving data. A human does not need significant driving experience to become a driver. Instead, humans learn to drive using a commonsense understanding of physical objects and intentional agents. At iSee, inspired by computational cognitive science, we are developing algorithms that model the way humans understand and learn about the physical world. Our technique equips self-driving vehicles to better deal with unfamiliar situations and complex interactions on the road.
iSee, an MIT spin-off, is paving the way for level 4 autonomous vehicles that can deal with unfamiliar situations and complex interactions on the road. Inspired by computational cognitive science, our humanistic artificial intelligence will seamlessly integrate into society and benefit human lives.
Yibiao Zhao is the co-founder and CEO of iSee AI, a startup developing humanistic AI for autonomous driving. He completed his PhD at UCLA, studying computer vision, and his postdoc at MIT, studying cognitive robots. A pioneer in the computer vision field, he has produced a series of works that engineer common sense for reasoning about visual scenes. Yibiao also co-chaired a series of workshops at the CVPR and CogSci conferences, which influenced an innovative research direction in the computer vision and cognitive science fields.
APPLICATIONS OF DEEP LEARNING IN INDUSTRY
Jeshua Bratman - Twitter Cortex
Deep User-History Aggregates With Application to Twitter Timelines Ranking
Standard practice for representing entities such as users for the purpose of making predictions is to engineer a set of features that summarize and aggregate all past events associated with these entities. For example, a representation of a Twitter user might include features that summarize the past Tweets with which they engaged. Our recent NIPS paper from Twitter challenges this paradigm with a pure deep learning approach. As part of this research we modeled Tweet engagement using raw historical user events directly, resulting in better performance on the timelines ranking problem than relying on hand-engineered aggregates. This research has wide-ranging implications for how we extract signal from large-scale event data, providing better predictive performance and lower feature engineering overhead.
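The contrast between the two representations can be sketched as follows (hypothetical event names and embedding sizes for illustration, not the paper's actual features): the hand-engineered route collapses a user's history into fixed counts, while the deep route embeds the raw events and lets the network learn its own summary.

```python
import numpy as np

EVENTS = ['like', 'retweet', 'reply', 'follow']      # hypothetical event types
history = ['like', 'like', 'reply', 'retweet']       # one user's raw events

# Hand-engineered aggregate: reduce the history to fixed count features.
aggregate = np.array([history.count(e) for e in EVENTS], dtype=float)

# Deep approach: embed the raw events and pool them. In a real model the
# embedding table and the pooling/sequence model are trained end to end
# against the engagement-prediction loss instead of being fixed here.
rng = np.random.default_rng(0)
emb = {e: rng.normal(size=8) for e in EVENTS}        # embedding table
user_repr = np.mean([emb[e] for e in history], axis=0)
```

The aggregate discards event order and co-occurrence; the learned representation can, in principle, preserve whatever structure in the raw events is predictive of engagement.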
Jeshua Bratman is a staff engineer on the Twitter Cortex team. Cortex is responsible for building Twitter's AI platform and developing deep learning models to power the Twitter product. His background is in reinforcement learning and deep learning from the University of Michigan. Prior to Twitter, he led machine learning and realtime ad-bidding algorithms for TellApart, a startup focused on using ML for predictive marketing. After Twitter acquired TellApart, he built out Twitter's direct response ad technology before moving into Cortex, where he has applied deep learning techniques to abuse detection and timelines ranking, and has led several AI platform initiatives.
Chris Lott - Qualcomm Technologies
The Path to a Personalized, On-Device Virtual Assistant
Artificial intelligence (AI) is reshaping our lives. In fact, machine learning has ignited the voice UI and virtual assistant revolution as machine speech recognition approaches the accuracy of humans. The AI powering key voice UI components has traditionally run in the cloud due to computing, storage, and power constraints. However, on-device processing of voice UI provides unique benefits, such as instant response, reliability, and privacy. And fusing multiple on-device sensor inputs adds a level of personalization that will take us closer to a true personal assistant.
Chris has worked at Qualcomm Research since 2001. His initial focus was on wireless network system design and standards, from the first 3G data systems through 4G/LTE. Branching into more general mobile device topics, he has led projects on minimal computation, SoC resource management and control, on-device learning, planning and robotics, and efficient machine learning on mobile platforms. He has a MS from Stanford and a PhD in EE/Systems from U of Michigan, with specialties in stochastic control, machine learning, optimization, and algorithms.
PANEL: How Can You Redefine Your Industry With the Application of Deep Learning?
Aditya Kaul - Tractica
Aditya Kaul is a research director at Tractica, with a primary focus on artificial intelligence and robotics. He also covers blockchain and wearables as part of his research. Kaul has more than 12 years of experience in technology market research and consulting. He is based in London. Prior to Tractica, Kaul was a practice director at ABI Research, where he led the firm’s Mobile Networks research group. Kaul has also worked as an analyst and team leader at firms including Pioneer Consulting and Evalueserve, and has provided independent consulting services in the areas of Internet of Things, wearables, and smart cities. Kaul started his career as an electrical engineer designing chipsets and wireless networks with stints at Qualcomm and Siemens. Kaul has been a prolific speaker, moderator, and panelist at industry conferences and events, and has appeared frequently in the media including The Wall Street Journal, The Financial Times, Forbes, CNBC, The Motley Fool, VentureBeat, Unstrung, ZDNet, Wireless Week, EE Times, and CommsDesign, among others.
Tom Wilde - indico
Tom brings 25 years of experience in solving the complex problems of digital content to the role of CEO of indico, which focuses on making deep learning practical in the enterprise. Prior to indico, Tom was the Chief Product Officer at Cxense (see-sense), a leading Data Management provider, founder of Ramp, an enterprise video content management company, and held senior roles at Fast Search, Miva Systems, and Lycos. Tom has extensive experience with company building and venture backed startups and holds an MBA in Entrepreneurial Management from Wharton.
David Nydam - Business Intelligence Advisors (BIA)
David Nydam is the CEO of Business Intelligence Advisors (BIA), which applies behavioral analysis originally developed by the Central Intelligence Agency to discover what others miss in corporate and personal disclosures. Immediately prior to BIA, he was the CEO of American Biomass, a renewable energy company, and he remains an Executive in Residence at .406 Ventures, a Boston-based venture capital firm. Before those roles, he was President of BCC Research, a market research and forecasting firm specializing in analysis of high-technology markets. He has also held positions at EF Education and Deloitte Consulting. He has an MBA from the Tuck School at Dartmouth and a B.S. in Astrophysics from Yale.
Sumedh Mehta - Putnam Investments
Mr. Mehta is Chief Information Officer for Putnam Investments. He is responsible for the overall strategic direction and execution of Putnam’s global technology solutions. In addition, he is a member of Putnam’s Operating Committee. Mr. Mehta has over 20 years of experience in managing information technology systems in the investment industry, where he has led transformational change in the areas of software development, information technology, and business operations. Mr. Mehta joined Putnam in 2015 and has been in the investment industry since 1988.
Greg Amis - TripAdvisor
Improving TripAdvisor Photo Selection With Deep Learning
The newly redesigned TripAdvisor.com emphasizes traveler photos throughout the site, but not all of these photos make the best first impression. Deep learning networks provide an excellent opportunity for us to improve our users’ experience by highlighting the most attractive and useful photos for varying presentation contexts. This talk will discuss our approach for gathering training data, developing a model, and scaling it up to 150+ million photos and 7+ million places of interest. Technologies discussed: Keras, TensorFlow, PySpark, Python multiprocessing, siamese networks, and to a lesser degree, S3, Hadoop/Hive/HDFS, and Kubernetes.
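The abstract lists siamese networks among the technologies used. As a rough illustration of how a siamese ranker can learn to prefer one photo over another (the linear scorer and toy features below are hypothetical stand-ins, not TripAdvisor's actual model), here is a minimal NumPy sketch of a shared scoring branch trained with a pairwise margin ranking loss:

```python
import numpy as np

def shared_scorer(x, W):
    """Both branches of a siamese network share the same weights W;
    here a single linear layer stands in for the CNN tower."""
    return x @ W

def margin_ranking_loss(score_pos, score_neg, margin=1.0):
    """Penalize pairs where the 'better' photo does not outscore
    the 'worse' photo by at least `margin`."""
    return np.maximum(0.0, margin - (score_pos - score_neg))

# Toy example: two photos described by 3 features each.
W = np.array([0.5, -0.2, 1.0])
better = np.array([1.0, 0.0, 2.0])   # labeled more attractive
worse  = np.array([0.2, 1.0, 0.1])

loss = margin_ranking_loss(shared_scorer(better, W), shared_scorer(worse, W))
```

Because both photos pass through the same scorer, the network learns a single attractiveness score that can rank any photo at inference time, even though training is driven only by pairwise comparisons.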
Greg Amis is a Principal Software Engineer on the Machine Learning team at TripAdvisor, which focuses on very pragmatic projects: ML that will quickly and directly improve the business. He’s been at TripAdvisor for over 3.5 years, working on machine vision, text processing (e.g., catching inappropriate content), and metadata processing (e.g., catching fraudulent reviews). Prior to TripAdvisor, he worked on government contracts, doing everything from adaptive radar jamming to forecasting Navy personnel needs. Greg has a PhD from Boston University in Cognitive and Neural Systems, where he studied a type of neural network called Adaptive Resonance Theory and its application to semi-supervised learning and remote sensing.
Miguel Campo - Twentieth Century Fox Film Corp
Geometric Deep Learning in Multigraphs for Movie Recommender Systems
Movie studios face a complex landscape. AI isn’t going to make movies better, but it may help make them more successful. Product recommendation systems are becoming important during the movie greenlight process and as part of machine learning personalization pipelines. Collaborative Filtering (CF) and generative natural language models applied to movie scripts have been shown to increase performance, but they tend to underperform for movies that cross genres or are novel. Geometric Deep Learning (GDL) can increase recommender performance in extreme ‘cold start’ situations, the kind a studio faces when deciding whether to greenlight a movie. GDL models are equipped to naturally propagate ‘neighbors’ data to improve the prediction, and to identify and isolate patterns in the training signal that are intrinsic to the local geometry of the multigraph structure. In this talk we’ll discuss our current GDL implementation, how we apply the model at different stages of the production process, how we use the graph spectrum and the movie harmonics to better understand movie positioning before the movie is made, and which related promising areas we are actively exploring.
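As a generic illustration of the neighbor-propagation idea the abstract describes (not Fox's actual GDL model), the NumPy sketch below runs one symmetrically normalized graph-convolution step on a tiny movie graph; note how the cold-start movie, whose own feature is zero, picks up signal from its neighbor:

```python
import numpy as np

# Toy multigraph collapsed to one weighted adjacency matrix:
# 3 movies; edges connect movies sharing, say, actors or genres.
A = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])
X = np.array([[1.0],   # per-movie feature, e.g. a popularity signal
              [0.5],
              [0.0]])  # cold-start movie: no signal of its own

# One propagation step of a graph convolution:
# add self-loops, symmetrically normalize, then mix neighbor features.
A_hat = A + np.eye(3)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
H = D_inv_sqrt @ A_hat @ D_inv_sqrt @ X
```

After the step, row 2 of `H` is nonzero even though `X[2]` was zero: the cold-start movie has inherited information from its graph neighborhood, which is exactly what makes graph-based models attractive at greenlight time.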
Miguel Campo is SVP of Data Science & Analytics at Twentieth Century Fox Film Corp. Miguel’s expertise is at the intersection of machine learning and mathematical modeling of customer behavior. At Fox, he heads up the team developing the machine learning pipeline to support all aspects of the film business, from greenlight to theatrical release and home entertainment. Prior to Fox, Miguel led the data science practice at EY Media Advisory, Convertro (now part of Verizon’s Oath), and Disney Science. Miguel is an electronics engineer with a PhD in Information Systems from NYU Stern and a Post-Doc from Dartmouth College. He holds a number of patents in algorithmic modeling, has published in peer-reviewed journals, and is actively working with academics in AI and the social sciences to understand key AI governance issues and to articulate feasible industry solutions. He lives in LA with his wife and two daughters, and in his free time enjoys trail running and surfing.
Nitin Sharma - PayPal
Deep Learning Architectures for Large-Scale Online Payments Fraud Detection
The talk will cover applications and use cases of deep neural network architectures for payments fraud detection. Given multi-fold objectives, such as maximizing the fraud catch rate while reliably and quickly approving good-user volume, the talk will cover the underlying problem formulations and the considerations that apply to large-scale online payment transaction data, such as dimensionality reduction, sparsity, high cardinality, and temporality. An assortment of deep learning methodologies applied to each problem formulation will be presented, along with empirical comparisons and results. The talk will conclude with some high-level aspects of run-time performance benchmarking for training/inference processes and model deployment at PayPal.
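One standard way to handle the high-cardinality, sparse categorical features the abstract mentions (merchant IDs, device fingerprints, and the like) is the hashing trick, which maps an unbounded vocabulary into a fixed index space for an embedding layer. The sketch below is a generic illustration of that technique, not PayPal's actual pipeline:

```python
import hashlib

def hash_bucket(value, n_buckets=1000):
    """Deterministically map an unbounded categorical value
    (e.g. a merchant ID) into a fixed index space, so a
    high-cardinality feature can feed a fixed-size embedding table."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets

# The same value always lands in the same bucket, and values never
# seen in training still get a valid index at inference time
# (no out-of-vocabulary failure on new merchants).
b1 = hash_bucket("merchant_12345")
b2 = hash_bucket("merchant_12345")
b3 = hash_bucket("merchant_never_seen_before")
```

The trade-off is controlled collisions: distinct values can share a bucket, but with enough buckets the embedding layer typically absorbs this noise, and memory stays bounded regardless of how many distinct IDs the transaction stream contains.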
Nitin Sharma is a Distinguished Scientist in the AI research group in PayPal Risk Sciences, where he focuses on the end-to-end design and development of AI algorithms, particularly deep learning, for large-scale real-time payments fraud detection. His research develops the next generation of fraud detection capabilities by designing novel fraud problem formulations and utilizing PayPal’s exhaustive data assets to improve fraud detection accuracy while continuing to enhance the experience of good users. Prior to his current role, he built large-scale machine learning frameworks for detecting stolen-identity and stolen-financial-instrument fraud at PayPal. He has several years of research and teaching experience in machine learning and mathematical optimization.
Narine Kokhlikyan - Facebook
Narine is a Research Scientist at Facebook AI focusing on explainable AI. She is the main creator of Captum, the PyTorch library for model interpretability. Narine studied at the Karlsruhe Institute of Technology in Germany and was a Research Visitor at Carnegie Mellon University. Her research focuses on explainable AI, cognitive systems, and natural language processing. She is also an enthusiastic contributor to open-source software packages such as scikit-learn and Apache Spark.
END OF SUMMIT