A San Diego, California, family plays rock, paper, scissors with Amazon’s Alexa.

Tribune Content Agency LLC/Alamy Stock Photo

When will Alexa, Google Assistant, and other ‘chatbots’ finally talk to us like real people?

Nearly every tech company worth its silicon now has a virtual assistant: Apple’s Siri, Amazon’s Alexa, Microsoft’s Cortana, and Google Assistant, among others. What do these digital helpers—known as chatbots—have in store for us? Science talked with Alexander Rudnicky, a computer scientist at Carnegie Mellon University in Pittsburgh, Pennsylvania, who studies human-machine conversations, about what chatbots can and can’t do, how they learn from us, and whether we can prevent them from adopting our worst behaviors. This interview has been edited for brevity and clarity.

Q: Let’s start with a pretty basic question. What is a chatbot?

A: Originally, they were dialogue systems that could have some sort of purposeful interaction with a human through text or speech. In the research community, the term “chat” has come to refer to non–goal-directed interaction, the way two people might talk to each other at a party.

Q: How can chatbots learn from humans?

A: The computer first needs to figure out that it doesn’t know something. Then it needs to figure out the right questions to ask. Think of it as an active form of learning—interactive, in fact. It’s modeling how humans learn from each other.

Q: Are there other ways chatbots can actively learn from us?

A: They could also use experimentation. If I’m rambling on about something and your attention wanders, I have to change what I’m saying to get you back in tune. You might give an automatic system the ability to note engagement, but then it has to find good strategies to get your attention back. It might learn to yell at you. Maybe that’s not such a good idea. But there are a lot of things it could try.

Q: How do Siri and Alexa fit into all of this?

A: I don’t consider Siri to be a chat system in the strict sense. I would call it merely an information access system. It allows you to call someone in your contacts, figure out what the weather is, or learn how to get somewhere. The programmers also did some clever stuff, like putting in answers to questions like “Will you marry me?” If you have a few hundred of those, people start thinking, “Wow, she’s really real.” Alexa has more skills, but it’s fundamentally the same thing.

Q: What are the biggest challenges in programming chatbots?

A: Historically, a developer would have to enumerate all the possible ways someone might say something. That was a big stumbling block for a very long time. More recent systems use what’s called “intent recognition” to get at the underlying meaning of what someone says. They use word associations, then find the closest known expression, and respond to that.
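
As a rough illustration of the idea Rudnicky describes, the sketch below matches an utterance to the closest known expression using nothing more than word overlap. The intent names, example phrases, similarity measure, and threshold are all invented for this example, not drawn from any real assistant.

```python
# Minimal sketch of intent recognition by "closest known expression".
# Every name here (intents, phrases, threshold) is illustrative only.

KNOWN_EXPRESSIONS = {
    "get_weather": ["what is the weather today", "will it rain tomorrow"],
    "set_alarm": ["wake me up at seven", "set an alarm for 7 am"],
    "call_contact": ["call my mother", "phone alice"],
}

def word_overlap(a: str, b: str) -> float:
    """Crude word-association score: Jaccard overlap of the two word sets."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def recognize_intent(utterance: str, threshold: float = 0.2):
    """Return the intent whose known expression is closest to the utterance,
    or None if nothing is close enough (the "I don't understand" case)."""
    best_intent, best_score = None, 0.0
    for intent, examples in KNOWN_EXPRESSIONS.items():
        for example in examples:
            score = word_overlap(utterance, example)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None

print(recognize_intent("what will the weather be tomorrow"))  # get_weather
print(recognize_intent("book me a flight to tokyo"))          # None -> ask for clarification
```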

Another challenge is being able to make use of context and world knowledge. So if I ask a system, “Hey, I’d like to go out to dinner with my friends yesterday,” what I said doesn’t make any common sense. There has to be a part of the system that says, “I should tell this person that I don’t understand that.”

Q: How do we avoid offensive chatbots like Microsoft’s Tay, which began echoing Twitter users’ racist and anti-Semitic sentiments?

A: That was a great case study of “Let’s build a bot that learns from people.” People started jerking it around. You can imagine that one thing missing there was a better model of what it ought to be learning about. On the other hand, people are really inventive when they want to cause problems. I don’t know if you can control it.

Q: What problems are you trying to solve in your own research?

A: I’ll describe one project. People do all sorts of complicated things with their smartphones. They might be using several apps at the same time. If you ask people what they’re doing, they might tell you things like, “I was planning an evening out with my friends, and I wanted to check up on restaurants and shows, message back and forth, look at a map, and so forth.” Would it not be interesting, or at least useful, if a chatbot could notice that you’re doing something purposeful across different apps, ask what it is, and then gradually start helping you out? The most minimal thing it could do is, the next time you say, “I want to organize a dinner,” it would know the apps to display. A more sophisticated task would be passing information from one app to another. It might put the restaurant information into a message to your friends, for example.
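
The snippet below is an illustrative sketch of that minimal form of help, not Rudnicky's actual system: once a named task has been observed, remember which apps it involved and suggest them the next time the task comes up. The class name, task name, and app names are all made up.

```python
# Toy sketch of noticing a task that spans apps and suggesting those apps later.
from collections import defaultdict

class TaskMemory:
    def __init__(self):
        # task name -> apps the user touched while doing that task
        self.task_apps = defaultdict(set)

    def observe(self, task: str, apps_used: list[str]) -> None:
        """Record which apps were used while the user worked on this task."""
        self.task_apps[task].update(apps_used)

    def suggest(self, task: str) -> list[str]:
        """The minimal form of help: show the apps relevant to the task."""
        return sorted(self.task_apps.get(task, set()))

memory = TaskMemory()
memory.observe("organize a dinner", ["Maps", "Messages", "RestaurantFinder"])
memory.observe("organize a dinner", ["Calendar"])

# Next time the user says "I want to organize a dinner":
print(memory.suggest("organize a dinner"))
# ['Calendar', 'Maps', 'Messages', 'RestaurantFinder']
```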

Q: Is it possible for different chatbots to combine their knowledge?

A: On some level, those are issues of standards. They all need to agree on the same representation of knowledge, called an ontology. If they do, then in principle, yes, you can do that kind of sharing. But there are serious privacy issues with sharing knowledge. If I consistently use one of these agents, it’s going to know a lot about me, about my friends, about what kind of food I like. Maybe I don’t care, but maybe there’s other stuff that I really wouldn’t want others to know about me.
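
To make the standards point concrete, here is a toy sketch of why a shared ontology matters: two assistants can only merge what they know about a user if both use the same predicates. The predicate names, the merge function, and the example facts are all invented for illustration.

```python
# Toy sketch: knowledge sharing only works for facts both agents can interpret,
# i.e., facts expressed in the agreed-upon ontology. All names are made up.

SHARED_ONTOLOGY = {"likes_cuisine", "home_city", "friend_of"}

def merge_knowledge(agent_a: dict, agent_b: dict) -> dict:
    """Combine two agents' facts, keeping only predicates defined in the
    shared ontology; anything outside it has no agreed-upon meaning."""
    merged = {}
    for facts in (agent_a, agent_b):
        for predicate, values in facts.items():
            if predicate in SHARED_ONTOLOGY:
                merged.setdefault(predicate, set()).update(values)
            # Facts outside the shared ontology are dropped.
    return merged

agent_a = {"likes_cuisine": {"thai"}, "shopping_history": {"headphones"}}
agent_b = {"likes_cuisine": {"ramen"}, "home_city": {"Pittsburgh"}}

print(merge_knowledge(agent_a, agent_b))
# {'likes_cuisine': {'thai', 'ramen'}, 'home_city': {'Pittsburgh'}}
```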

Q: Do you worry about people mistrusting assistants that seem human, but aren’t quite human enough—what robotics researchers refer to as the “uncanny valley”?

A: We can run into people and there’s just something really weird about them. They’re not quite following the conventions or the rules of interaction. They’re operating outside of our expectations, which is kind of disturbing. The uncanny valley thing is basically what happens when your models aren’t good enough, but just good enough to sort of pass. There’s nothing intrinsically interesting about it other than it’s a symptom of a shortcoming.

Q: So when will Alexa finally seem real?

A: What would seem real to me is a system that has much better context awareness, for example knowing not to offer things at the wrong times or talk about the obvious, and one that has better introspection: knowing when it doesn’t know something, admitting it, and asking for advice.