If you’ve been following the development of AI and automated home technology, you probably weren’t surprised a couple of years back when Google posted an ad for comedy writers. They felt, it seems, that their Google Home virtual assistant needed a little more polish and sparkle. So they hired talent from Pixar and the Onion to help make dialogue livelier. To quote Kevin J. Ryan in his Inc. article, “Personality can be hard to nail down in AI.”
Like Google, other developers are engaged in serious conversational development. Amazon, for one, has invested tremendous amounts of money and manpower in the same pursuit.
They have dedicated 5,000 staff to extracting awkward loops, among other things, from AI chatter, and they’ve gone a step further by sponsoring a competition called the Alexa Prize. In it, they invited 15 university and college teams from around the world, made up of some of the best computer science graduate students anywhere, to solve one tricky problem: build a socialbot that can chat coherently with humans on popular topics for 20 minutes.
Coherent – adjective (Of a person) able to speak clearly and logically. United as forming a whole.
The New Oxford American Dictionary
In the world of If/Then chatbot scripts, coherence is sublime.
What do we mean when we say coherent? We mean the conversation doesn’t sound like a patchwork quilt of thoughts. You want it to sound like you’re actually talking to a person, not hearing a series of best responses off a script. To pull that off you have to harness just the right amounts of machine learning and big data.
The Alexa Prize purse for the best-performing socialbot is $500K, with an additional $1 million if the winning team’s socialbot can converse engagingly and coherently with humans for 20 minutes. The top team, University of Washington, averaged 10 minutes and 22 seconds. Amazon opened its enrollment for the 2018 competition on December 4th.
Alexa currently has more than 25,000 skills and can be integrated into third-party devices. The competition has definitely moved Amazon’s assistant ahead of the pack, at least in terms of problem-solving transparency.
But how does the industry stack up as a whole?
The virtual assistant industry
Google and Amazon are not alone in their quest, not by a long shot. Others are frantically trying to differentiate, break free from the pack, and grab highly coveted market share:
- Apple Siri
- Microsoft Cortana (Harman Kardon INVOKE)
How much money is there in this very specific gadgetry?
In a recent study, Grand View Research found that the entire “virtual chat” market may reach $12.28 B by 2024. These numbers capture more than just the Alexa-type unit. They also include the emerging trend of assistant-as-an-app services, which will have a hefty stake in the mobile applications sector.
That really shouldn’t come as a surprise to smartphone users.
Even if you have resisted the practice of using Siri or Cortana, you probably realize the potential convenience of mobile access.
Here are the top 4 in terms of overall revenue last year (according to Statista):
- Apple did roughly $229.23 B in 2017.
- Amazon did roughly $166 B in 2017.
- Google did roughly $109.65 B in 2017.
- Microsoft did roughly $90 B in 2017.
I don’t think any of these companies would mind tacking on another $2-3 Billion a year by fiscal 2024.
“The goal,” says Jared Newman at Fast Company, “is not just to win consumer market share–that’ll come later. It’s actually to establish AI ecosystems with widespread device and developer support.”
Right now, though, the key to grabbing the greatest share of the virtual assistant market is simple—solve the conversational-coherency problem.
I mean, all you have to do is make an IF/THEN chatbot algorithm sound like you’re talking to a person. Easy, right? To get an idea of the complexity of this problem, let’s look at basic chatbot script theory.
Why it’s such a challenging problem
If you read the WIRED article, then you know that UW won because they found the right balance between:
- the use of data
- machine learning.
Here’s a simple customer-service script developed by Maruti Techlabs:
Chatbot: Hi Robert, I’m a customer service bot. How may I help you today?
Robert: I received a bill in error. I just paid my bill 2 days ago and those payments aren’t reflected in the bill.
Chatbot: Sorry for the inconvenience. May I have your account number?
Chatbot: Thank you for the information. Our system shows you did pay the bill 2 days ago. The bill was generated a week before. You can simply disregard it. Will that be all for you today?
We encounter these all the time when interfacing with phone providers and such. These types of conversations are pretty straightforward.
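To make the If/Then idea concrete, here is a minimal Python sketch of how a script like the one above might be wired up. It is only an illustration, not Maruti Techlabs’ actual implementation: the function name, the states, and the account number are all made up for the example.

```python
# Hypothetical account data: the account number is invented for illustration.
# The day counts mirror the script (paid 2 days ago, billed a week ago).
ACCOUNTS = {"12345": {"paid_days_ago": 2, "billed_days_ago": 7}}


def scripted_reply(state, user_text):
    """Return (next_state, bot_reply) for one turn of a fixed If/Then script."""
    text = user_text.lower()
    if state == "start":
        # IF the user mentions a bill, THEN apologize and ask for the account.
        if "bill" in text:
            return "await_account", (
                "Sorry for the inconvenience. May I have your account number?"
            )
        return "start", "Sorry, I can only help with billing questions."
    if state == "await_account":
        account = ACCOUNTS.get(user_text.strip())
        if account is None:
            return "await_account", "I could not find that account. Please try again."
        # IF the payment is more recent than the bill, THEN the bill is stale.
        if account["paid_days_ago"] < account["billed_days_ago"]:
            return "done", (
                "Our system shows you did pay the bill. You can simply disregard it."
            )
        return "done", "Our system shows no recent payment on this account."
    return "done", "Will that be all for you today?"


state, reply = scripted_reply("start", "I received a bill in error.")
state, reply = scripted_reply(state, "12345")
print(reply)
```

Every branch has to be written by hand, which is exactly why a script like this falls apart the moment the customer says something the author did not anticipate.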
But organizations like Maruti are working to improve this process. Their approach to the problem is this:
Language and reasoning frameworks are going to blend with big data and machine learning to give way to conversational user interfaces that better understand customer needs and wants, and better understand the customer and his surroundings.
How machine learning factors in
Machines can actually learn how to initiate a conversation and guide the user. For in-home virtual assistants, there needs to be greater flexibility. And Amazon, for one, would like to introduce normal conversation into the mix.
Lorna Jane, a blogging developer, writer and speaker out of the UK who likes solving problems like the Alexa socialbot, wrote the following about training the Alexa she keeps in her office:
The basic idea is that when creating the “intent”, i.e. the action that you want Alexa to do, you also define “slots”. The slots are the variables;
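As a rough illustration of the intent-and-slot idea, here is a small Python sketch. It is not the Alexa Skills Kit format or Lorna Jane’s code: the Intent class, the PayBill intent, and the sample utterance are hypothetical names chosen for the example, and the sketch assumes each slot name appears only once per sample.

```python
import re
from dataclasses import dataclass


@dataclass
class Intent:
    name: str
    samples: list  # sample utterances; "{slot}" marks a variable part

    def match(self, utterance):
        """Return a dict of slot values if the utterance fits a sample, else None."""
        for sample in self.samples:
            # Split "pay {amount} on ..." into literal text and slot names;
            # odd-indexed parts are the slot names captured by the split.
            parts = re.split(r"\{(\w+)\}", sample)
            pattern = "".join(
                f"(?P<{part}>.+)" if i % 2 else re.escape(part)
                for i, part in enumerate(parts)
            )
            found = re.fullmatch(pattern, utterance, flags=re.IGNORECASE)
            if found:
                return found.groupdict()
        return None


# Hypothetical intent: the action is "pay a bill", the slots are the variables.
pay_bill = Intent("PayBill", ["pay {amount} on account {account}"])
print(pay_bill.match("Pay 40 dollars on account 12345"))
# {'amount': '40 dollars', 'account': '12345'}
```

The intent names the action, and the slots carry whatever changes from one request to the next.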
In the chatbot conversation above, Robert’s replies to How may I help you today might be:
- Payment is late because the check is in the mail
- Cat ate my check
- Got lost in the laundry
- I don’t have service with your company
- Canceling service
- Wrong department need technical support
Sorry for the inconvenience might cover each one of these replies, but it might also come across as wooden in the moment, more like an answering machine than an associate who can solve the problem.
Lorna Jane is not crazy about using automatic wording—i.e., scripted responses like Sorry for the inconvenience.
In her world, the ideal socialbot would need to rely more heavily on machine learning.
Interestingly, the second-place team, from the Czech Republic, focused on a painstaking list of handwritten conversation-guiding rules. Their reasoning was that dialogue generated by machine learning is neither beneficial nor funny. The only problem was, they didn’t have the time or resources to imagineer the vast scope of responses. Machine learning can get there, given enough time and data to learn from.
In the Czech team’s defense, it’s kind of tough to criticize the runner up in a project that massive. They did, after all, come in second. They must know what they’re talking about.
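The trade-off between handwritten rules and learning from data can be sketched in a few lines of Python. This is a toy, and none of it comes from either team’s system: the keyword rules, the labels, and the training utterances are invented, and the “learned” half is just a nearest-example lookup standing in for real machine learning.

```python
from collections import Counter

# Handwritten rules: every phrasing must be anticipated by the author.
RULES = {"check": "late_payment", "cancel": "cancel_service"}


def rule_based(utterance):
    """Return the label of the first keyword found, or "unknown"."""
    for keyword, label in RULES.items():
        if keyword in utterance.lower():
            return label
    return "unknown"


# Learned: labels come from example utterances (invented for illustration).
EXAMPLES = [
    ("the check is in the mail", "late_payment"),
    ("my payment got lost in the laundry", "late_payment"),
    ("i want to cancel my service", "cancel_service"),
    ("please close my account", "cancel_service"),
]


def learned(utterance):
    """Pick the label of the example that shares the most words."""
    words = Counter(utterance.lower().split())

    def overlap(example):
        # Counter & Counter keeps the minimum count of each shared word.
        return sum((words & Counter(example[0].split())).values())

    best = max(EXAMPLES, key=overlap)
    return best[1] if overlap(best) > 0 else "unknown"


print(rule_based("Please close out my account today"))  # unknown
print(learned("Please close out my account today"))     # cancel_service
```

The rule-based version only improves when someone writes another rule. The example-driven version improves when it is fed more examples, which is the bet the machine learning camp is making.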
Back to Amazon and their competitors: though machine learning and big data play a part, there is still the scripting piece. And that’s what brings us to that surprising job we mentioned in the headline.
AI script writers
If you’re interested, you can find job postings for AI script writers on sites like Glassdoor or LinkedIn. Here are some of the details from a job post for Apple Siri International for a Writer / Editor position.
Writers are to help evolve Siri as a distinct, recognizable character.
- Develop and write dialogues in your native language
- Excellent writing skills
- Fluent in English and native in one of the following: French, German, British English, Indian English
- Strong interest in current news and pop culture
- Great vocabulary, literacy, and cross-cultural knowledge
- Demonstrated experience writing character-driven dialogue
Who knew Smart Home product development would get so … artsy?
Virtual assistant socialbots take an enormous amount of pre-work, scripting, machine learning, and sweat equity to pull off with any kind of aplomb. We’ve got a ways to go, as this technology is still in its infancy.
Lifelike virtual assistants will arrive, maybe sooner than we think. In the meantime, developers like Google will continue looking for ways to get there, such as hiring comedy writers.