Would you call Siri your friend? What about Google Assistant, or Alexa?
I can’t imagine we’re at the stage where many people consider their personal virtual assistants friends, but why is that? Besides the fact that it’s a computer, not a human, a lack of empathy and emotional intelligence is a key reason we don’t form relationships with these devices.
At the AI Assistant Summit in London earlier this month, Artem Rodichev, head of AI at Luka, spoke about building an AI friend and the importance of identifying the user’s emotions from the conversation, detecting how those emotions evolve over time, and understanding the user’s emotional needs. Artem spoke about how chatbots can simulate emotional connection with users using different modalities, including text, speech, and vision, and how we can use deep learning to build the best friend for a human.
I caught up with Artem to learn more about his work:
I lead the AI team at Luka. We are building a product called Replika. It's a chatbot that tries to be your digital friend. The primary goal of Replika is to give you emotional support, cheer you up, and encourage you.
In the AI team we are constantly improving our dialog models and NLP components, as well as doing a lot of research in conversational AI. For example, we recently open-sourced part of Replika's AI technology called CakeChat. CakeChat is a generative dialog system built purely on neural nets, and it's able to express emotions in a text conversation. It was trained on a large number of human text dialogs and can generate dialog responses by itself. The key ingredient of CakeChat is emotional conditioning, which allows switching the emotion expressed in a conversation. You can try it on our demo page here.
Around two years ago we had another app that helped us understand where a conversational interface is most suitable. We built a whole bunch of bots: a restaurant recommendation bot, a weather bot, bots that play games with you, and so on. At the same time, another story was unfolding.
It started with Roman Mazurenko. Roman was the closest friend of our CEO, Eugenia Kuyda. He tragically died in a car accident while crossing the road in Moscow. Eugenia decided to build a digital memorial for Roman and make a bot for him. We already had all the tools and infrastructure for building bots because of the Luka app.
We gathered all of Roman's chats with friends and used them to recreate his personality. With a dataset of his texts, we built a dialog model using neural nets and launched the Roman bot publicly. To our surprise, a lot of people had long and very emotional conversations with the bot even if they didn't know Roman personally. It became a big story, and The Verge published an article about it.
Since then, we've built three bots based on characters from the show Silicon Valley and one for Prince. Again, we saw the same effect with all of these bots. That's how we realized conversational interfaces work really well for emotional dialogs and decided to build an AI that can understand and use human emotions.
I think the main challenge now is data. If you want to build a bot that talks like a good friend, you need tons of chitchat dialogs between friends to train your neural nets. It's hard to get high-quality dialog data unless you are Facebook or Google, which have their own messengers with millions of users.
Also, it's still challenging to work with natural language. Language is very ambiguous and keeps changing all the time. For example, it's difficult to detect sarcasm or jokes, because a neural net would need strong abilities to reason and to understand the world and current events. Besides, it's not always possible to understand a person's emotional state from text alone, so you need to use other modalities such as images, speech, and video.
Chatbots can better understand users, and therefore help them, by establishing an emotional connection. If a chatbot can read your mood and current emotions, it can better understand what you need. That means conversations that are less about achieving some task and more about just chatting, laughing, and talking about how you feel - the things we mostly do as humans.
Natural language is a powerful interface for interacting with the world, and deep learning extends a machine's ability to understand us. For example, recent advances in machine translation allow us to understand each other without knowing each other's languages.
I'm really passionate about conversational AI, and I think that in ten years everything will be like in the movie “Her” by Spike Jonze. You will have a personal secretary with whom you develop an emotional relationship, who understands your mood, and who helps you in all aspects of your life - from reading your emails to motivating you to achieve your goals.