Schedule

08:30

COFFEE

08:50

John Hershey

John Hershey, Mitsubishi Electric Research Labs

Cracking the Cocktail Party Problem: Deep Clustering for Speech Separation

The human auditory system gives us the extraordinary ability to converse above the chatter of a lively cocktail party. Selective listening in such conditions is an extremely challenging task for computers, and has been the holy grail of speech processing for more than 50 years. Previously, no practical method existed in the case of single channel mixtures of speech, especially when the speakers are unknown. We present a breakthrough in this area using a new type of neural network we call deep clustering. Our deep clustering network assigns embedding vectors to different sonic elements of the noisy signal. When the embeddings are clustered the constituent sources are revealed. The system is able to extract clean speech from single channel mixtures of unknown speakers, with an astounding 10 dB improvement in signal to noise ratio -- a level of improvement previously unobtainable even in simpler speech enhancement tasks. Amazingly, the system can even generalize between two- and three-speaker mixtures. We believe this technology is on the verge of solving the general audio separation problem, opening up a new era in spontaneous human-machine communication.
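
As a rough illustration of the clustering step described above (a minimal sketch, not MERL's actual implementation), the code below assumes a trained network has already produced an embedding vector for every time-frequency bin of a two-speaker mixture; clustering those embeddings with k-means then yields one binary mask per speaker.

    import numpy as np
    from sklearn.cluster import KMeans

    def masks_from_embeddings(embeddings, n_speakers=2):
        """embeddings: (n_frames, n_freq, emb_dim) array assumed to come
        from a trained deep clustering network."""
        n_frames, n_freq, emb_dim = embeddings.shape
        flat = embeddings.reshape(-1, emb_dim)
        # Cluster every time-frequency bin; each cluster corresponds to one speaker.
        labels = KMeans(n_clusters=n_speakers, n_init=10).fit_predict(flat)
        labels = labels.reshape(n_frames, n_freq)
        # One binary mask per speaker, to be applied to the mixture spectrogram downstream.
        return [(labels == k).astype(np.float32) for k in range(n_speakers)]

    # Random embeddings standing in for real network output, for illustration only.
    masks = masks_from_embeddings(np.random.randn(100, 129, 40))
    print([m.shape for m in masks])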

Prior to joining MERL in 2010, John spent 5 years at IBM's T.J. Watson Research Center in New York, where he led a team in noise robust speech recognition. He also spent a year as a visiting researcher in the speech group at Microsoft Research, after obtaining his PhD from UCSD in the area of multi-modal machine perception. He is currently working on machine learning for signal separation, speech recognition, language processing, and adaptive user interfaces.

09:10

Daniel McDuff

Daniel McDuff, Affectiva

Emotion Intelligence to our Digital Experiences

Turning Everyday Devices into Health Sensors

Today's electronics have very sensitive optical and motion sensors that can capture subtle signals resulting from cardiorespiratory activity. I will present how webcams can be used to measure important physiological parameters without contact with the body. In addition, I will show how an ordinary smartphone can be turned into a continuous physiological monitor. Both of these techniques reveal the surprising power of the devices around us all the time. I will show how deep learning is helping us create highly scalable and low-cost applications based on these sensor measurements.
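
As a toy illustration of the camera-based measurement idea (a hedged sketch, not Affectiva's pipeline), the code below assumes the mean green-channel value of the face region has already been extracted for each video frame; the dominant frequency of that signal in a plausible pulse band gives a rough heart-rate estimate.

    import numpy as np

    def estimate_heart_rate(green_means, fps=30.0):
        """green_means: 1-D array of per-frame mean green values over the face
        region (assumed to be produced by a separate face tracker)."""
        signal = green_means - green_means.mean()           # remove DC component
        spectrum = np.abs(np.fft.rfft(signal))
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
        band = (freqs > 0.7) & (freqs < 4.0)                # plausible pulse range, Hz
        peak = freqs[band][np.argmax(spectrum[band])]
        return peak * 60.0                                   # beats per minute

    # Synthetic 10-second clip with a 72 bpm pulse plus noise, for illustration.
    t = np.arange(0, 10, 1 / 30.0)
    fake = 0.05 * np.sin(2 * np.pi * 1.2 * t) + 0.01 * np.random.randn(len(t))
    print(round(estimate_heart_rate(fake), 1), "bpm")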

Daniel McDuff is Principal Research Scientist at Affectiva. He is building and utilizing scalable computer vision and machine learning tools to enable the automated recognition and analysis of emotions and physiology. At Affectiva Daniel is building state-of-the-art facial expression recognition software and leading analysis of the world's largest database of human emotions (currently with 8B+ data points). Daniel completed his PhD in the Affective Computing Group at the MIT Media Lab in 2014 and has a B.A. and Masters from Cambridge University. His work has received nominations and awards from Popular Science magazine as one of the top inventions in 2011, South-by-South-West Interactive (SXSWi), The Webby Awards, ESOMAR and the Center for Integrated Medicine and Innovative Technology (CIMIT). His work has been reported in many publications including The Times, the New York Times, The Wall Street Journal, BBC News, New Scientist and Forbes magazine. Daniel is also a Research Affiliate at the MIT Media Lab.

09:30

Tony Jebara

Tony Jebara, Netflix

Double-cover Inference in Deep Belief Networks

Personalized Content/Image Selection

A decade ago, Netflix launched a challenge to predict how each user would rate each movie in our catalog. This accelerated the science of machine learning and matrix factorization. Since then, our learning algorithms and models have evolved with multiple layers, multiple stages and nonlinearities. Today, we use machine learning and deep variants to rank a large catalog by determining the relevance of each of our titles to each of our users, i.e. personalized content selection. We also use machine learning to find how to best present the top ranked items for the user. This includes selecting the best images to display for each title just for you, i.e. personalized image selection.
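
For readers unfamiliar with the matrix-factorization starting point mentioned above, here is a minimal SGD sketch (illustrative only, not Netflix's production system): each user and each item gets a latent vector, and their dot product approximates the observed rating.

    import numpy as np

    def factorize(ratings, n_users, n_items, k=10, lr=0.01, reg=0.05, epochs=20):
        """ratings: list of (user, item, value) triples."""
        rng = np.random.default_rng(0)
        U = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
        V = 0.1 * rng.standard_normal((n_items, k))   # item latent factors
        for _ in range(epochs):
            for u, i, r in ratings:
                err = r - U[u] @ V[i]                 # prediction error on this rating
                U[u] += lr * (err * V[i] - reg * U[u])
                V[i] += lr * (err * U[u] - reg * V[i])
        return U, V

    # Tiny illustrative example: 3 users, 4 items, a handful of ratings.
    data = [(0, 0, 5), (0, 1, 3), (1, 1, 4), (2, 2, 2), (2, 3, 5)]
    U, V = factorize(data, n_users=3, n_items=4)
    print(round(float(U[0] @ V[0]), 2))   # should be close to 5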

Tony directs machine learning research at Netflix and is sabbatical professor at Columbia University. He serves as general chair of the 2017 International Conference on Machine Learning. He has published over 100 scientific articles in the field of machine learning and has received several best paper awards.

09:50

Yoshua Bengio

Yoshua Bengio, Université de Montréal

Keynote: Deep Learning Frameworks

Yoshua Bengio (PhD in CS, McGill University, 1991), post-docs at M.I.T. (Michael Jordan) and AT&T Bell Labs (Yann LeCun), CS professor at Université de Montréal, Canada Research Chair in Statistical Learning Algorithms, NSERC Chair, CIFAR Fellow, member of NIPS foundation board and former program/general chair, co-created ICLR conference, authored two books and over 300 publications, the most cited being in the areas of deep learning, recurrent networks, probabilistic learning, natural language and manifold learning. He is among the most cited Canadian computer scientists and is or has been associate editor of the top journals in machine learning and neural networks.

10:10

PANEL: What can be Done to Make Deep Learning as Impactful as Possible in the Near-Term?

10:30

Vivienne Sze

Vivienne Sze, MIT

Building Energy-Efficient Accelerators for Deep Learning

As deep learning becomes more ubiquitous in our lives, we need better hardware infrastructure to support the foreseeable growth in computation. In particular, the high energy/power consumption of current CPU and GPU systems prevents the deployment of deep learning at a larger scale, and dedicated deep learning accelerators will be key to solving this problem. In this talk, I will give an overview of our work to build an energy-efficient accelerator, called Eyeriss, for deep convolutional neural networks (CNNs), which are currently the cornerstone of many deep learning algorithms. Eyeriss is reconfigurable to support state-of-the-art deep CNNs. By focusing on minimizing data movement between the accelerator and main memory, as well as within the computation fabric of the accelerator, we are able to achieve 10 times higher energy efficiency than modern mobile GPUs.
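
To see why data movement dominates, here is a back-of-the-envelope calculation (all numbers are illustrative assumptions, not Eyeriss's measured figures): a DRAM access costs orders of magnitude more energy than a multiply-accumulate, so reusing weights and activations on-chip has an outsized effect on total energy.

    # Rough energy model for one conv layer; every constant below is an assumed, illustrative value.
    ENERGY_MAC_PJ  = 1.0      # energy of one multiply-accumulate, picojoules
    ENERGY_DRAM_PJ = 200.0    # energy of one DRAM access, picojoules

    macs        = 100e6       # multiply-accumulates in the layer
    weights     = 500e3       # unique weight values
    activations = 1e6         # unique input/output activation values

    # Naive: every operand of every MAC is fetched from / written to DRAM.
    naive_dram = 4 * macs
    # With on-chip reuse: each weight and activation ideally touches DRAM once.
    reuse_dram = weights + 2 * activations

    for name, dram in [("naive", naive_dram), ("with reuse", reuse_dram)]:
        total = macs * ENERGY_MAC_PJ + dram * ENERGY_DRAM_PJ
        print(f"{name:10s}: {total / 1e6:8.1f} microjoules")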

Vivienne Sze is an Assistant Professor at MIT in the Electrical Engineering and Computer Science Department. Her research interests include energy-aware signal processing algorithms, and low-power circuit and system design for multimedia applications. In 2011, she was awarded the Jin-Au Kong Outstanding Doctoral Thesis Prize in electrical engineering at MIT for her thesis on “Parallel Algorithms and Architectures for Low Power Video Decoding”. She is a recipient of the 2016 3M Non-tenured Faculty Award, the 2014 DARPA Young Faculty Award, the 2007 DAC/ISSCC Student Design Contest Award and a co-recipient of the 2008 A-SSCC Outstanding Design Award.

10:50

Hugo Larochelle

Hugo Larochelle, Twitter

Applied Deep Learning: Now and Beyond

I will start by discussing the reasons that I believe explain deep learning's current success in industry, as it becomes the most popular tool for learning representations (features) of data. To do so, I will describe how we use deep learning at Twitter to learn representations of tweets, images, videos and users. Then, I will share my views on some of the most promising future directions for deep learning research.

Hugo Larochelle is a Research Scientist at Twitter Cortex and an Assistant Professor at the Université de Sherbrooke (UdeS). Before that, he spent two years in the machine learning group at the University of Toronto as a postdoctoral fellow under the supervision of Geoffrey Hinton. He obtained his Ph.D. at the Université de Montréal under the supervision of Yoshua Bengio. He is the recipient of two Google Faculty Awards. His professional involvement includes associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), member of the editorial board of the Journal of Artificial Intelligence Research (JAIR), and program chair for the International Conference on Learning Representations (ICLR) in 2015 and 2016.

11:10

LUNCH

11:30

Honglak Lee

Honglak Lee, University of Michigan

Deep Learning with Disentangled Representations

Over recent years, deep learning has emerged as a powerful method for learning feature representations from complex input data, and it has been greatly successful in computer vision, speech recognition, and language modeling. These recent successes typically rely on a large amount of supervision (e.g., class labels). While many deep learning algorithms focus on a discriminative task and extract only task-relevant features that are invariant to other factors, complex sensory data is often generated from intricate interactions between underlying factors of variation (for example, pose, morphology and viewpoint for 3D object images). In this work, we tackle the problem of learning deep representations that disentangle the underlying factors of variation and allow for complex reasoning and inference involving multiple factors. Specifically, we develop deep generative models with higher-order interactions among groups of hidden units, where each group learns to encode a distinct factor of variation. We present several successful instances of deep architectures and their learning methods, in both supervised and weakly-supervised settings. Our models achieve strong performance in emotion recognition, face verification, data-driven modeling of 3D objects, and video game prediction. I will also present other related ongoing work.
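
A minimal structural sketch of the "groups of hidden units" idea, written here as a plain PyTorch autoencoder rather than the probabilistic models used in the talk: the latent code is split into named groups (the names and sizes below are illustrative assumptions), and group-specific supervision or losses would encourage each group to encode only its own factor of variation.

    import torch
    import torch.nn as nn

    class GroupedAutoencoder(nn.Module):
        """Latent code split into two groups; dimensions are illustrative."""
        def __init__(self, input_dim=784, identity_dim=32, pose_dim=8):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU(),
                                         nn.Linear(256, identity_dim + pose_dim))
            self.decoder = nn.Sequential(nn.Linear(identity_dim + pose_dim, 256),
                                         nn.ReLU(), nn.Linear(256, input_dim))
            self.identity_dim = identity_dim

        def forward(self, x):
            code = self.encoder(x)
            identity = code[:, :self.identity_dim]     # group 1: e.g. identity
            pose = code[:, self.identity_dim:]         # group 2: e.g. pose
            # Group-specific losses or supervision would be attached here.
            return self.decoder(code), identity, pose

    recon, identity, pose = GroupedAutoencoder()(torch.randn(4, 784))
    print(recon.shape, identity.shape, pose.shape)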

I am an Assistant Professor of Computer Science and Engineering at the University of Michigan, Ann Arbor. I received my Ph.D. from the Computer Science Department at Stanford University in 2010, advised by Prof. Andrew Ng. My primary research interests lie in machine learning, spanning deep learning; unsupervised, semi-supervised, and supervised learning; transfer learning; graphical models; and optimization. I also work on application problems in computer vision, audio recognition, robot perception, and text processing. My work received best paper awards at ICML (2009) and CEAS (2005). I have served as a guest editor of the IEEE TPAMI Special Issue on Learning Deep Architectures, as well as an area chair or senior program committee member for ICML, NIPS, ICCV, AAAI, IJCAI, and ICLR. I received the Google Faculty Research Award (2011) and the NSF CAREER Award (2015), and was selected by IEEE Intelligent Systems as one of AI's 10 to Watch (2013).

11:50

COFFEE

12:10

WELCOME

SPEECH RECOGNITION

12:50

CONVERSATION & DRINKS

13:10

REGISTRATION

DEEP LEARNING FRAMEWORKS & EXPERIENCES

ACCELERATORS FOR DEEP LEARNING

14:10

Urs Köster

Urs Köster, Nervana Systems

Deep Learning at Scale

Deep learning has had a major impact in the last three years. Interactions with machines that were once unreliable, such as speech, natural language, or image processing, have been made robust by deep learning, and deep learning holds promise for finding usable structure in large datasets. However, the training process is lengthy and has proven difficult to scale due to constraints of existing compute architectures, and there is a need for standardized tools for building and scaling deep learning solutions. I will outline some of these challenges and show how fundamental changes to the organization of computation and communication can lead to large advances in capabilities.

Urs has over 9 years of research experience in machine learning, spanning areas from computer vision and image processing to large scale neural data analysis. His data science experience ranges from working with national laboratories in applying deep learning to understand climate change, to helping customers solve challenging computer vision problems in medical imaging. Urs works on making the fastest implementations of convolutional and recurrent networks. For his postdoc at UC Berkeley, he used unsupervised machine learning algorithms such as Restricted Boltzmann Machines to understand the visual system.

Yu-Hsin Chen

Yu-Hsin Chen, MIT

Building Energy-Efficient Accelerators for Deep Learning

As deep learning becomes more ubiquitous in our lives, we need better hardware infrastructure to support the foreseeable growth in computation. In particular, the high energy/power consumption of current CPU and GPU systems prevents the deployment of deep learning at a larger scale, and dedicated deep learning accelerators will be key to solving this problem. In this talk, I will give an overview of our work to build an energy-efficient accelerator, called Eyeriss, for deep convolutional neural networks (CNNs), which are currently the cornerstone of many deep learning algorithms. Eyeriss is reconfigurable to support state-of-the-art deep CNNs. By focusing on minimizing data movement between the accelerator and main memory, as well as within the computation fabric of the accelerator, we are able to achieve 10 times higher energy efficiency than modern mobile GPUs.

Yu-Hsin Chen is currently a PhD candidate at MIT working on architecture design for deep learning accelerators. Co-advised by Prof. Vivienne Sze and Prof. Joel Emer, his research interests include energy-efficient VLSI system design, computer vision and digital signal processing. He received the B.S. and M.S. degrees, both in EECS, from National Taiwan University and MIT, respectively. He was also a recipient of the 2015 NVIDIA Graduate Fellowship and the 2015 ADI Outstanding Student Designer Award.

14:50

Spyros Matsoukas

Spyros Matsoukas, Amazon

Deep Learning for Amazon Echo

Deep Learning in Amazon Alexa

We will present a set of deep learning techniques that our team has developed in order to address challenges we are facing as part of our continued efforts to improve Alexa’s spoken language understanding capabilities.

Spyros Matsoukas is a Sr. Principal Scientist in the Alexa Machine Learning organization at Amazon.com, developing spoken language understanding technology for voice-enabled products such as Amazon Echo. From 1998 to 2013 he worked at BBN Technologies, Cambridge MA, conducting research in acoustic modeling for ASR, speaker diarization, statistical machine translation, speaker identification, and language identification. He has over 60 publications in peer-reviewed conferences and journals, with 3 best paper awards.

15:10

Adam Lerer

Adam Lerer, Facebook

Learning Physical Intuition by Example

Babies are known to acquire visual "common sense" concepts, such as object permanence, gravity, and intuitive physics, at a young age. For example, infants play with toy blocks, allowing them to gain intuition about the physical behavior of the world at a young age. While deep neural networks have exhibited state-of-the-art performance on many computer vision tasks, more complex reasoning (e.g. 'what will happen next in this scene?') requires an understanding of how the physical world behaves. We explore the ability of deep feedforward models to learn such intuitive physics. Using a 3D game engine, we create small towers of wooden blocks, and train large convolutional network models to accurately predict their stability, as well as estimating block trajectories. The models are able to generalize to new physical scenarios and to images of real blocks.
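
A hedged sketch of what such a stability predictor might look like: a small convolutional network that maps a rendered image of a block tower to a probability that it will fall. Layer sizes and names below are illustrative assumptions, not the models described in the talk.

    import torch
    import torch.nn as nn

    class StabilityNet(nn.Module):
        """Binary classifier: will this tower of blocks fall? Sizes are illustrative."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1))
            self.classifier = nn.Linear(32, 1)

        def forward(self, images):                     # images: (batch, 3, H, W)
            h = self.features(images).flatten(1)
            return torch.sigmoid(self.classifier(h))   # probability the tower falls

    print(StabilityNet()(torch.randn(2, 3, 128, 128)).shape)   # torch.Size([2, 1])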

Adam is a research engineer at Facebook AI Research, where he has worked on distributed neural network training, computer vision, visual common sense, and graph embeddings. Prior to joining Facebook, Adam worked at D. E. Shaw Research, where he developed software and algorithms for Anton, a special-purpose supercomputer for molecular dynamics simulation. Adam holds a B.Sc. in computer science and physics and M.Eng. in computer science from MIT.

15:30

 Nathan Wilson

Nathan Wilson, Nara Logics

Biological Foundations for Deep Learning: Towards Decision Networks

The basic principles of intelligence have been pursued by two parallel research communities – computer scientists developing artificial intelligence, and neuroscientists exploring the brain. Recent advances, particularly in deep learning, present a key opportunity for new homologies and cross-pollination. In this talk we will discuss some of the latest learning rules discovered by each community and their surprising convergence. We will then describe how these rules can be coordinated at scale to take learning networks from perception to decisions, to help solve mature enterprise problems that are ripe for AI applications.

Nathan Wilson is a scientist and entrepreneur focused on actualizing powerful new models of brain-based computation. After many years at MIT working on the mathematical logic of neural circuits, Nathan co-founded Nara Logics, a Cambridge, MA artificial intelligence company developing “synaptic intelligence” that automatically finds and refines connections across data for recommendations and decisions within enterprises. Nathan holds 14 patents in AI and his research has been featured in Nature, Science, PNAS, and the MIT Press. An enthusiastic writer and teacher, he has won departmental and university teaching awards and been highlighted in outlets ranging from TechCrunch and WIRED to Forbes, HuffPo, WSJ, Fast Company and National Geographic.

NATURAL LANGUAGE UNDERSTANDING

16:10

Andrew McCallum

Andrew McCallum, University of Massachusetts Amherst

Deep Learning for Representation and Reasoning from Natural Language

In this talk I will describe advances in deep learning for extracting entity-relations from natural language as well as for representing and reasoning about the resulting knowledge base. I will introduce "universal schema," our approach that embeds many database schema and natural language expressions into a common semantic space. Then I will describe recent research in Gaussian embeddings that capture uncertainty and asymmetries, collaborative filtering with text, and logical implicature of new relations through multi-hop relation paths compositionally modeled by recursive neural tensor networks.
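
As a rough sketch of how a universal-schema style model scores facts (a simplified illustration, not the exact formulation from the talk): entity pairs and relation expressions, whether database relations or natural-language patterns, share one embedding space, and a fact's score is a dot product between the two embeddings.

    import numpy as np

    rng = np.random.default_rng(0)
    dim = 50

    # Toy vocabulary; in practice these embeddings are learned from observed facts.
    entity_pairs = {("Honolulu", "Hawaii"): rng.standard_normal(dim)}
    relations = {"located_in": rng.standard_normal(dim),              # database schema relation
                 "X is the capital of Y": rng.standard_normal(dim)}   # natural-language pattern

    def score(pair, relation):
        """Higher score = the model believes the relation holds for the entity pair."""
        return float(entity_pairs[pair] @ relations[relation])

    for rel in relations:
        print(rel, round(score(("Honolulu", "Hawaii"), rel), 3))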

Andrew McCallum is a Professor and Director of the Center for Data Science at the University of Massachusetts Amherst. He has published over 250 papers in many areas of AI, including natural language processing, machine learning, data mining and reinforcement learning, and his work has received over 45,000 citations. He obtained his PhD from the University of Rochester in 1995 with Dana Ballard and held a postdoctoral fellowship at CMU with Tom Mitchell and Sebastian Thrun. In the early 2000s he was Vice President of Research and Development at WhizBang Labs, a 170-person start-up company that used machine learning for information extraction from the Web. He is an AAAI Fellow, the recipient of the UMass Chancellor's Award for Research and Creative Activity, the UMass NSM Distinguished Research Award, the UMass Lilly Teaching Fellowship, and research awards from Google, IBM, Yahoo and Microsoft. He was the General Chair for the International Conference on Machine Learning (ICML) 2012, and is the current president of the International Machine Learning Society, as well as a member of the editorial board of the Journal of Machine Learning Research. For the past twenty years, McCallum has been active in research on statistical machine learning applied to text, especially information extraction, entity resolution, semi-supervised learning, topic models, and social network analysis. His work on open peer review can be found at http://openreview.net. McCallum's web page is http://www.cs.umass.edu/~mccallum.

Tejas Kulkarni

Tejas Kulkarni, MIT

Panelist

Tejas Kulkarni is a PhD candidate at MIT working on Deep Learning, Reinforcement learning and Probabilistic Modeling. He is interested in building intelligent agents that learn to solve a variety of goals by interacting with their environment. In particular, his research focuses on spatio-temporal abstractions of data that enable data-efficient learning. His work has received best paper (honorable mention) awards at the Computer Vision and Pattern Recognition (CVPR) conference in 2015 and at the Conference on Empirical Methods on Natural Language Processing (EMNLP) in 2015. He has also been awarded the Henry Singleton award and the Leventhal Fellowship for his graduate work.

Olexandr Isayev

Olexandr Isayev, University of North Carolina

Panelist

Olexandr Isayev is a Research Scientist at the UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill. In 2008, Olexandr received his Ph.D. in computational chemistry. He was a Postdoctoral Research Fellow at Case Western Reserve University and a scientist at a government research lab before joining UNC in 2013. Olexandr received the "Emerging Technology Award" from the American Chemical Society (ACS) and the GPU computing award from NVIDIA in 2014. His research interests focus on making sense of chemical data with molecular modeling and machine learning.

Nanette Byrnes

Nanette Byrnes, MIT Technology Review

Moderator

Nanette Byrnes is the editor of MIT Technology Review’s business reports, stories and data-driven content that examine how businesses strategically use technology, and how that use is evolving. Before joining MIT Technology Review she was a reporter, writer, and editor at BusinessWeek, Reuters, and SmartMoney, covering a variety of business topics including corporate strategy, governance, and finance. Her work has been recognized by the Gerald Loeb Awards, National Press Club, and New York Press Club, among others.

Aditya Khosla

Aditya Khosla, MIT

Panelist

Aditya Khosla is a Research Assistant at MIT working on deep learning for computer vision and human cognition. He is interested in developing machine learning techniques that go beyond simply identifying what an image or video contains, and instead predict the impact visual media has on people, e.g., predicting whether someone would like an image or not, and whether they would remember it. He is also interested in applying computational techniques to predictably modify these properties of visual media automatically. He is a recipient of the Facebook Fellowship, and his work on predicting image popularity and modifying face memorability has been widely featured in popular media like The New York Times, BBC, and TechCrunch. For more information, visit his website: http://mit.edu/khosla

Jana Eggers

Jana Eggers, Nara Logics

Welcome

Jana’s a math and computer nerd who took the business path for a career. Today she’s CEO of Nara Logics, a neuroscience-inspired artificial intelligence company, providing a smart platform for recommendations and decision support. Along with start-ups like Nara Logics, her career has taken her from 3-person business beginnings to 50,000-person enterprises. She’s opened the European logistics software offices as part of American Airlines, started in the internet in ’96 at Lycos, founded Intuit’s corporate Innovation Lab, and researched conducting polymers at Los Alamos National Laboratory. Her passions are working with teams to define and deliver products customers love, algorithms and their intelligence, and inspiring teams to do more than they thought possible.

08:30

Nashlie Sephus

Nashlie Sephus, Partpic

An Industrial-Strength Pipeline for Recognizing Replacement Parts

Image classification and computer vision for search are rapidly emerging in today's technology and consumer markets. Partpic focuses on image search for replacement parts, and we present our industrial pipeline for this task, with applications to fasteners. We discuss how we have aimed to overcome issues such as acquiring enough training data, training and classification across many different types of parts, identification of customized specifications of parts (such as finish type, dimensions, etc.), establishing constraints for the user to take a "good-enough" image, and scalability of the many pieces of data associated with thousands of parts.

Dr. Nashlie H. Sephus specializes in writing deep learning and visual recognition algorithms as CTO at Partpic (Atlanta, Georgia). In 2014, she graduated with a PhD in Electrical and Computer Engineering from the Georgia Institute of Technology, where her thesis topic was data mining/machine learning in digital signals using modulation spectral features. Nashlie then worked in New York City as an Associate at Exponent, an engineering consulting firm. She has prior experience working with companies such as GE, Delphi, and IBM. Nashlie completed her undergraduate degree at Mississippi State University and hails from Jackson, Mississippi.

08:50

WELCOME

STARTUP SESSION

APPLICATIONS IN DEEP LEARNING

09:50

Jianxiong Xiao

Jianxiong Xiao, Princeton University

3D Deep Learning for Robot Perception

Deep learning has made unprecedented progress in artificial intelligence tasks from speech recognition to image recognition. In both, we ask our algorithms to reason about features in the most appropriate dimension: for natural language, we feed one-dimensional one-hot vectors of words as input to a recurrent neural network, whereas in image processing, we use two-dimensional filters over pixels in a convolutional network. However, as we are physically living in a three-dimensional world, for robot perception, it is more natural and often more useful to use three-dimensional representations and algorithms to reason about the 3D scene around us.

In this talk, I will share our recent experiences on 3D deep learning at three different levels for robot perception: local part, whole object, and global scene. At the local part level, we have developed an algorithm to learn 3D geometric descriptors to match local 3D keypoints, which is a critical step in robot mapping. At the object level, we have developed an object detector to slide a window in 3D using 3D convolutional neural networks. At the global scene level, we propose a novel approach to feed the whole 3D scene into a deep learning network, and let the network automatically learn the 3D object-to-object context relationship for joint inference with all the objects in a scene. To support 3D deep learning research, I will introduce "Marvin", a deep learning software framework to work with three-dimensional deep neural networks.
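
A minimal sketch of the "slide a window in 3D" idea using a 3D convolution over a voxel occupancy grid. This is generic PyTorch for illustration, not the Marvin framework mentioned in the talk, and the layer sizes are assumptions.

    import torch
    import torch.nn as nn

    # Toy 3D detector: produce one "objectness" score per location of a voxel grid.
    detector = nn.Sequential(
        nn.Conv3d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
        nn.Conv3d(16, 1, kernel_size=3, padding=1))

    grid = torch.zeros(1, 1, 32, 32, 32)    # batch, channel, depth, height, width
    grid[0, 0, 10:20, 10:20, 10:20] = 1.0   # a cube of occupied voxels
    scores = detector(grid)
    print(scores.shape)                      # torch.Size([1, 1, 32, 32, 32])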

Jianxiong Xiao is an Assistant Professor in the Department of Computer Science at Princeton University and the director of the Princeton Vision Group. He received his Ph.D. from the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT). His research focuses on bridging the gap between computer vision and robotics by building extremely robust and dependable computer vision systems for robot perception. In particular, he is interested in 3D Deep Learning, RGB-D Recognition and Reconstruction, Place-centric 3D Context Modeling, Synthesis for Analysis, Deep Learning for Autonomous Driving, Large-scale Crowd-sourcing, and Petascale Big Data. His work received the Best Student Paper Award at the European Conference on Computer Vision (ECCV) in 2012 and a Google Research Best Papers Award for 2012, and has appeared in the popular press in the United States. Jianxiong was awarded the Google U.S./Canada Fellowship in Computer Vision in 2012, the MIT CSW Best Research Award in 2011, and two Google Research Awards in 2014 and 2015. More information can be found at: http://vision.princeton.edu.

10:10

LUNCH

10:30

Mark Hammond

Mark Hammond, Bonsai

Doing for Artificial Intelligence what Databases did for Data

Building deep learning systems at present is part science, part art, and a whole lot of arcana. Rather than focusing on the concepts we want the system to learn and how those can be taught, one often finds oneself dealing with low-level details like network topology and hyperparameters. It is easy to lose the forest for the trees.

Databases solved this problem for data by allowing users to program at a higher level of abstraction. With a database, one eschews low-level implementation details and instead builds a model of the information (the schema) using a high-level declarative programming language (e.g. SQL). The database server is then used to actualize this model, and manage its usage with real data. Similarly, for artificial intelligence, one can build a model for conceptual understanding (the mental model) using a high-level declarative programming language (Inkling). An intelligence server can then be used to actualize this model, and manage its usage with real data.

In this talk, Mark will explore the underpinnings of this technique, detail the Inkling programming language, and demonstrate how one can build, debug, and iteratively refine models. To make things concrete and fun, Mark will detail creating a system that plays the video game Breakout using deep learning, requiring one to codify only the high-level concepts relevant for intelligent play and a curriculum for how to teach them.

Mark Hammond is the founder and CEO of Bonsai, a VC backed startup whose platform makes AI technology accessible to every software developer, regardless of machine learning expertise. Mark has a deep passion for understanding how the mind works, combining that with an understanding of our own human nature, and turning that knowledge into beneficial applied technology. A Caltech alumnus focused on computation and neural systems, he has worked extensively for the industry giant Microsoft, as well as numerous startups and in academia including Numenta and the neuroscience department at Yale.

10:50

COFFEE

11:10

REGISTRATION

11:30

PANEL: The Practical Application of AI in Enterprise

11:50

Joseph Durham

Joseph Durham, Amazon Robotics

Assembling Orders in Amazon’s Robotic Warehouses

Amazon Robotics builds the world’s largest mobile robotic fleet where many thousands of robots deliver inventory shelves to pick operators in e-commerce warehouses. Each Amazon warehouse holds millions of items of inventory, most customer orders represent a unique combination of several items, and many orders need to be shipped within a couple hours of being placed to meet delivery promises. This talk will describe how mobile robots and human operators collaborate to solve this challenging problem and enable Amazon to ship millions of orders every day. I will also discuss the results of the recent Amazon Picking Challenge and the next big frontier for robotics in warehousing.

Joseph Durham is Manager of Research and Advanced Development at Amazon Robotics. His team focuses on resource allocation algorithms, machine learning, and path planning for robotic warehouses. Joey joined Kiva Systems after completing his Ph.D. at the University of California at Santa Barbara in distributed coordination for teams of robots. He has been with the company through its acquisition and growth into Amazon Robotics. Previously he worked on path planning for autonomous vehicles at Stanford University for the DARPA Grand Challenge.

12:10

Byron Galbraith

Byron Galbraith, Talla

Beyond the Keyword Search: Finding Job Candidates with CV2Vec

A major challenge for HR teams is finding, interviewing, and onboarding job candidates. At Talla, we are building intelligent assistants that employ deep learning to help offload some of the tedious and time-consuming parts of this workload. This talk focuses on CV2Vec, a set of experiments we’ve done on the candidate sourcing side of this process. By training neural models to map CV and resume documents into a dense vector representation, we are able to perform candidate searches on more than just keywords. We can find candidates that are most similar to a reference person or the job ad itself, cluster people together and visualize how CVs align with each other, and even make a prediction as to what someone’s next job will be.
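
A hedged sketch of the search step only (the embedding model itself is omitted, and all names below are illustrative): once CVs are mapped to dense vectors, finding candidates similar to a reference person reduces to nearest-neighbour search under cosine similarity.

    import numpy as np

    def most_similar(query_vec, cv_vectors, names, top_k=3):
        """cv_vectors: (n_candidates, dim) matrix of CV embeddings (assumed given)."""
        q = query_vec / np.linalg.norm(query_vec)
        m = cv_vectors / np.linalg.norm(cv_vectors, axis=1, keepdims=True)
        sims = m @ q                                  # cosine similarity to the query
        order = np.argsort(-sims)[:top_k]
        return [(names[i], float(sims[i])) for i in order]

    # Toy data standing in for real CV embeddings.
    rng = np.random.default_rng(1)
    vectors = rng.standard_normal((5, 100))
    names = ["cv_%d" % i for i in range(5)]
    print(most_similar(vectors[0] + 0.1 * rng.standard_normal(100), vectors, names))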

Byron Galbraith is the Chief Data Scientist and Co-Founder of Talla, a startup leveraging the latest advancements in AI to build intelligent assistants for business teams. Byron has a PhD in Cognitive and Neural Systems from Boston University and an MS in Bioinformatics from Marquette University. His research expertise includes brain-computer interfaces, neuromorphic robotics, spiking neural networks, high-performance computing, and natural language processing. Byron has held several software engineering roles including back-end system engineer, full stack web developer, office automation consultant, and game engine developer at companies ranging in size from a two-person startup to a multi-national enterprise.

12:30

END OF SUMMIT

12:50

Alejandro Jaimes

Alejandro Jaimes, Acesio

Artificial Intelligence in Improving Health Outcomes and De-Risking Clinical Trials

Alejandro (Alex) Jaimes is CTO & Chief Scientist at Acesio. Acesio focuses on Big Data for predictive analytics in healthcare to tackle disease at worldwide scale, impacting individuals and entire populations. We use Artificial Intelligence to collect and analyze vast quantities of data to track and predict disease in ways that have never been done before, leveraging environmental variables, population movements, sensor data, and the web. Prior to joining Acesio, Alex was CTO at AiCure, and prior to that he was Director of Research/Video Product at Yahoo, where he led research and contributions to Yahoo's video products, managing teams of scientists and engineers in New York City, Sunnyvale, Bangalore, and Barcelona. His work focuses on Machine Learning, mixing qualitative and quantitative methods to gain insights on user behavior for product innovation. He has published widely in top-tier conferences (KDD, WWW, RecSys, CVPR, ACM Multimedia, etc.), has been a visiting professor (KAIST), and is a frequent speaker at international academic and industry events. He is a scientist and innovator with 15+ years of international experience in research leading to product impact (Yahoo, KAIST, Telefonica, IDIAP-EPFL, Fuji Xerox, IBM, Siemens, and AT&T Bell Labs). He has worked in the USA, Japan, Chile, Switzerland, Spain, and South Korea, and holds a Ph.D. from Columbia University.

13:10

Adham Ghazali

Adham Ghazali, Imagry

Large Scale Visual Understanding For Enterprise

Adham Ghazali is the CEO of Imagry. He spent the last 10 years working on various machine learning problems including large-scale computer vision, brain-computer interfacing and bio-inspired facial recognition. He is interested in the intersection between biology and computer science. At his current post, he is responsible for strategic R&D and business development. The company plans to provide state-of-the-art deep neural networks for mobile devices.

13:30

Cambron Carter

Cambron Carter, GumGum

How Deep Learning and Image Recognition are Changing the Advertising Experience

In 2015, Google reported $68 billion in advertising revenue, which was roughly 90% of their total revenue for the year. Yet despite being so vital to the financial fitness of many tech companies, online advertising remains an uneven experience, and deep learning is changing that. As a computer vision firm applying our technology to advertising, I will discuss how GumGum is using deep learning for a multitude of purposes including content safety, reduction of redundant processing, and general image understanding. Further, I will share some highly specific, and occasionally peculiar, image recognition use cases to which we have applied deep learning techniques to afford the user a more organic experience, such as serving ads for lipstick only on pages that have images of people with "bold" lips. I will describe the problems we have attacked with deep learning, both supervised and unsupervised, our battle with statistics at scale, and how we see deep learning dramatically benefiting both consumers and marketers in the long term.

Cambron Carter works in research and development at GumGum. He is responsible for designing computer vision and machine learning solutions for a wide variety of applications related to images and video. Cambron previously conducted research in medical image analysis, where he worked on the early detection of malignant pulmonary nodules from chest CT. He holds B.S. degrees in physics and electrical engineering and an M.Eng. in electrical engineering from the University of Louisville.

13:50

Parsa Ghaffari

Parsa Ghaffari, AYLIEN

Byte2vec & its Application to Natural Language Processing Problems

In this talk, we present byte2vec: a flexible embedding model constructed from bytes, and its application to downstream NLP tasks such as Sentiment Analysis. Byte2vec is an embedding model that is constructed directly from the rawest forms of input: bytes, and is: i. truly language-independent; ii. particularly apt for synthetic languages through the use of morphological information; iii. intrinsically able to deal with unknown words; and iv. directly pluggable into state-of-the-art NN architectures. Pre-trained embeddings generated with byte2vec can be fed into state-of-the-art models; byte2vec can also be directly integrated and fine-tuned as a general-purpose feature extractor, similar to VGGNet's current role for computer vision.

Motivation: In today's fragmented, globalized world, supporting multiple languages in NLU and NLP applications is more important than ever. The inherent language dependence in classical Machine Learning and rule-based NLP systems has traditionally been a barrier to scaling said systems to new languages. This dependence typically manifests itself in feature extraction, as well as in pre-processing steps. In this talk, we present byte2vec as an extension to the well-known word2vec embedding model to facilitate dealing with multiple languages and unknown words.
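
To make the byte-level input concrete, here is a toy sketch (not the byte2vec model itself; the embedding table here is random): text in any language becomes a sequence of byte IDs, each byte gets an embedding, and mean-pooling gives a fixed-size vector that never hits an unknown-word problem.

    import numpy as np

    rng = np.random.default_rng(0)
    byte_embeddings = rng.standard_normal((256, 64))    # one vector per possible byte value

    def embed(text):
        """Fixed-size representation of arbitrary text via its UTF-8 bytes."""
        byte_ids = list(text.encode("utf-8"))            # every value is in 0..255
        return byte_embeddings[byte_ids].mean(axis=0)    # mean-pool over the byte sequence

    # Works the same for any language and for words never seen during training.
    for s in ["sentiment", "análisis de sentimiento", "感情分析"]:
        print(s, embed(s).shape)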

Parsa Ghaffari is an engineer and entrepreneur working in the field of Artificial Intelligence and Machine Learning. He currently runs AYLIEN, a leading NLP API provider focused on building and offering easy to use technologies for analyzing and understanding textual content at scale.

14:10

Andrew Tulloch

Andrew Tulloch, Facebook

Deep Learning in Production at Facebook

Facebook is powered by machine learning and AI. From advertising relevance, news feed and search ranking to computer vision, face recognition, and speech recognition, we run ML models at massive scale, computing trillions of predictions every day. I'll talk about some of the tools and tricks we use for scaling both the training and deployment of some of our deep learning models at Facebook. I'll also cover some useful libraries that we've open-sourced for production-oriented deep learning applications.

I'm a research engineer at Facebook, working on the Facebook AI Research and Applied Machine Learning teams to drive the wide range of AI applications at Facebook. At Facebook, I've worked on the large-scale event prediction models powering ads and News Feed ranking, the computer vision models powering image understanding, and many other machine learning projects. I'm a contributor to several deep learning frameworks, including Torch and Caffe. Before Facebook, I obtained a master's in mathematics from the University of Cambridge, and a bachelor's in mathematics from the University of Sydney.

14:30

 David J. Klein

David J. Klein, Conservation Metrics

Deep Learning for Biodiversity Conservation

Recent advances in sensor network technology, machine learning, and Big Data analytics can provide rigorous and cost-effective tools for monitoring biodiversity at scale. Conservation Metrics leverages these tools to monitor endangered species and ecosystems around the globe, and provides clients with the information needed for a data-driven approach to conservation. Matthew and David will discuss their technical approach and present several working case studies that show how deep learning can empower biologists to analyze petabytes of sensor data from microphones and cameras in remote corners of the world.

David J. Klein is the lead AI developer and advisor for Conservation Metrics. His scientific and entrepreneurial career has been devoted to developing neural-inspired learning algorithms for challenging sensor analysis applications, primarily in the auditory and visual modalities. A multiple startup veteran and advisor, David was the Algorithm Architect and Machine Learning Manager at Audience, developing brain-inspired speech analysis chips used in the iPhone and Galaxy; CTO and co-founder of BlackSwan Technologies, developing a neural-network based video CODEC; and CTO of Ersatz Labs, developing the first cloud-GPU deep learning platform. Deeply inspired by the natural world, his academic research explored the representation of complex sounds in the brain, and he developed the auditory system for a large AI entity at the Swiss Expo.02.

Yuri Ivanov

Yuri Ivanov, Rethink Robotics

Panelist

Yuri Ivanov is an Innovation Scientist at Rethink Robotics. He has worked on many problems in Robotics and Artificial Intelligence and its application to ubiquitous sensing. Yuri's professional interests include Human-Robot Interaction, Computer Vision, Machine Intelligence, Big Data and Sensor Networks. He has published over 40 papers in scientific journals and conferences and holds a number of patents in the field of Computer Science and Robotics. Yuri holds a PhD and MS degrees from MIT and an MS degree from St. Petersburg State University of Aerospace Instrumentation.

Kathryn Hume

Kathryn Hume, Fast Forward Labs

Moderator

Kathryn Hume leads marketing for Fast Forward Labs, a machine intelligence research company, and teaches courses on law and technology at the University of Calgary. Prior to joining Fast Forward Labs, she advised global law firms on data privacy and security, and managed Intapp's Risk Roundtable, a seminar series about legal risk management. She holds a doctorate in comparative literature from Stanford, speaks multiple languages, and excels at helping organizations innovate by mapping new technologies to vertical business problems.

Leonard D'Avolio

Leonard D'Avolio, Cyft

Panelist

Leonard D’Avolio, Ph.D. is the co-founder of Cyft and Assistant Professor at Harvard Medical School and Brigham and Women’s Hospital. He is an advisor to Ariadne Labs, the Helmsley Charitable Trust Foundation, and the Youth Development Organization. At the Department of Veteran Affairs he led informatics for the nation’s largest genomic science initiative and embedded the first clinical trial within an electronic medical record system. He founded Ariadne Labs’ informatics team, led strategic partnerships, and developed a system that has been used to improve 70,000+ childbirths in Uttar Pradesh, India. He is an invited speaker and writer for venues such as TEDMED, InformationWeek, and Scientific American. His work has been funded by the Department of Defense, AHRQ, NCI, Helmsley Charitable Trust Foundation and the Bill and Melinda Gates Foundation.

Alejandro Jaimes

Alejandro Jaimes, Acesio

Welcome

Alejandro (Alex) Jaimes is CTO & Chief Scientist at Acesio. Acesio focuses on Big Data for predictive analytics in healthcare to tackle disease at worldwide scale, impacting individuals and entire populations. We use Artificial Intelligence to collect and analyze vast quantities of data to track and predict disease in ways that have never been done before, leveraging environmental variables, population movements, sensor data, and the web. Prior to joining Acesio, Alex was CTO at AiCure, and prior to that he was Director of Research/Video Product at Yahoo, where he led research and contributions to Yahoo's video products, managing teams of scientists and engineers in New York City, Sunnyvale, Bangalore, and Barcelona. His work focuses on Machine Learning, mixing qualitative and quantitative methods to gain insights on user behavior for product innovation. He has published widely in top-tier conferences (KDD, WWW, RecSys, CVPR, ACM Multimedia, etc.), has been a visiting professor (KAIST), and is a frequent speaker at international academic and industry events. He is a scientist and innovator with 15+ years of international experience in research leading to product impact (Yahoo, KAIST, Telefonica, IDIAP-EPFL, Fuji Xerox, IBM, Siemens, and AT&T Bell Labs). He has worked in the USA, Japan, Chile, Switzerland, Spain, and South Korea, and holds a Ph.D. from Columbia University.
