“Do I need to take a coat today?”, “Play my workout playlist in the living room”, “Order me a takeaway.” These are demands that your family might find pretty rude or tiresome, but they are perfectly acceptable to ask of your voice and chat assistant. But we all know how frustrating these AI Assistants have previously been: you ask your phone to “Call Mom” and are answered with “Searching Google for Mom”. Argh, no! We shouldn’t be too surprised, however, that the first generation of these assistants didn’t behave in the way we wanted them to. We were expecting our smartphones and household devices to understand our accents, colloquialisms, and emotions in the same way a human brain processes this kind of information, when that is something the machine simply cannot comprehend.
Assistants get confused when the user goes ‘off script’ and run into all sorts of trouble trying to identify the request that’s been made. The idea of a fully functioning assistant is that it can hold a conversation in which the user asks a question and then goes on to ask follow-up questions, with the assistant keeping track of the answers and context. Many companies trying to do this are employing machine learning and natural language processing (NLP) to improve the UX and create a more meaningful connection with the user. Many of them, however, are falling short of providing the results we have been hoping for.
‘The next generation of truly useful voice and chat assistants requires Deep-Domain Conversational AI’, says MindMeld, which collects and manages millions of training examples to satisfy the unique requirements of each application. Recently acquired by Cisco, the AI platform is powering a new generation of intelligent conversational interfaces. Vijay Ramakrishnan, ML researcher at MindMeld, explains that the machines not only need to be able to learn from the questions they have been asked, but must also draw on the ‘current landscape of academic research in NLP, and combine these academic models with proprietary methods developed within Mindmeld.’
Additionally, Vijay is currently working with infrastructure tools to optimise and support their machine learning architecture to help improve the model and allow it to learn faster over time. At the AI Assistants Summit in London next month, Vijay will be sharing his work in further detail and discussing how he and his team are ‘beating state-of-the-art Named Entity Recognition models for noisy datasets using LSTMs and domain specific lookup tables’.
With the Summit less than a month away, we had the chance to ask Vijay 3 questions we were keen to learn more about:
AI Assistants will be an integral part of the service sector in the next 5 years. Assistants will have deeper links to the enterprise’s domain knowledge and will be able to semantically parse a wider range of human language expressions. This will improve the quality of service of these domain-specific assistants to the point where they can deliver end-to-end solutions to customer issues, something that typical methods such as interactive voice response currently fail to do.
Second, we will see assistants being used as API-like interfaces in business-to-business communication. Instead of someone calling up a supplier to notify them that a shipment has been received, the notification will be communicated automatically by a virtual assistant messaging the supplier’s virtual assistant. Such assistant-to-assistant communication will replace the rigid APIs we see today.
Acquiring large amounts of high-quality labeled data is important for training deep learning models. It can be expensive to acquire and curate such data, especially in the NLP domain. Second, interpretability of deep learning models is an issue that we face when debugging these models and explaining them to stakeholders.
AI and ML have played a crucial role in applications driven by natural language. AI and ML are more effective at generalizing to arbitrary natural language constructs than conventional models based only on gazetteers. They learn latent language structures that approximate semantic meaning, as seen in developments in word embeddings (word2vec, GloVe) and character embeddings, and these structures can be agnostic to language representation, as in machine translation use cases. Such fundamental advancements in natural language understanding drive virtual assistants to become more useful to humans as conversational interfaces.
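To make the contrast between gazetteers and embeddings concrete, here is a minimal Python sketch. It is not MindMeld's method: the lookup table, the three-dimensional vectors, the function names and the 0.95 threshold are all invented for illustration, and real embeddings such as word2vec or GloVe are learned from large corpora and have hundreds of dimensions. The point is only that an exact-match lookup misses any word it has not been given, while a similarity check over vectors can still recognise it.

```python
import math

# Hypothetical gazetteer: a fixed lookup table of known city names.
GAZETTEER = {"london", "paris", "berlin"}

# Toy 3-dimensional "embeddings", invented for this illustration only.
TOY_EMBEDDINGS = {
    "london": (0.90, 0.10, 0.00),
    "paris": (0.80, 0.20, 0.10),
    "berlin": (0.85, 0.15, 0.05),
    "madrid": (0.82, 0.18, 0.07),  # not in the gazetteer
    "pizza": (0.10, 0.90, 0.20),
}


def gazetteer_is_city(token):
    """Exact-match lookup: only words already in the table are recognised."""
    return token.lower() in GAZETTEER


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embedding_is_city(token, threshold=0.95):
    """Treat a word as a city if its vector is close to any known city."""
    vec = TOY_EMBEDDINGS.get(token.lower())
    if vec is None:
        return False
    return any(
        cosine(vec, TOY_EMBEDDINGS[city]) >= threshold for city in GAZETTEER
    )


if __name__ == "__main__":
    for word in ("London", "Madrid", "pizza"):
        print(word, gazetteer_is_city(word), embedding_is_city(word))
```

With these made-up vectors, “Madrid” is rejected by the lookup table but accepted by the similarity check, while “pizza” is rejected by both. A production system would more likely feed both signals into a trained model than rely on a fixed threshold.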