A day ahead of its annual fall hardware event, Amazon is making a big partnership announcement: it has created the Voice Interoperability Initiative, a kind of declaration of intent from more than 30 different companies that aim to ensure that devices work simultaneously with multiple digital assistants. For example, you could talk to Alexa or Cortana on the same smart speaker simply by saying the right wake word.
"So many people want the headline that there will be one voice assistant that will rule them all. We disagree," says Amazon's SVP of devices and services, Dave Limp. "This is not a sporting event. There will not be one winner." Limp argues that if there are always going to be multiple voice assistants, they should work together better.
A wide range of companies that build both software and hardware for voice assistants have signed up to the initiative. I'm just going to quote Amazon's press release directly to give you some of the companies on the list, because it's clear that Amazon is going for some shock and awe here, especially since the list contains a few big players. I have bolded some standouts:
More than 30 companies support the effort, including global brands such as Amazon, Baidu, BMW, Bose, Cerence, Ecobee, Harman, Logitech, Microsoft, Salesforce, Sonos, Sound United, Sony Audio Group, Spotify and Tencent; telecommunication operators such as Free, Orange, SFR and Verizon; suppliers of hardware solutions such as Amlogic, InnoMedia, Intel, MediaTek, NXP Semiconductors, Qualcomm Technologies, Inc., SGW Global and Tonly; and system integrators such as CommScope, DiscVision, Libre, Linkplay, MyBox, Sagemcom, StreamUnlimited and Sugr.
It is a very long list and three very prominent companies are missing: Google, Apple and Samsung.
The companies that are on board seem pretty enthusiastic, if the quotes they provided for the Amazon press release are any indication. Intel said its 10th-generation chips this year will work with "multiple assistants," and Qualcomm said its chipsets can already handle multiple wake words.
If you read between the lines of this statement by Andrew Shuman, CVP of Cortana at Microsoft, you will find the softest possible nod to how Google and Apple have made their platforms unfriendly to external assistants: "We expect the initiative to help us expand this vision to even more companies and promote a balanced ecosystem that enables companies to create and make their assistants available on all platforms." (Emphasis mine.)
More intriguingly, other companies seem eager to get their voice assistants on Echo devices. Salesforce CEO Marc Benioff writes: "We look forward to working with Amazon and other market leaders to make Einstein Voice, the world's leading CRM assistant, accessible on any device." Meanwhile, Spotify's R&D officer says the company is joining the Voice Interoperability Initiative, "which offers our listeners a more seamless experience for every voice assistant they choose, including the possibility to ask for Spotify directly." (Emphasis mine.)
Baidu's participation is also notable. The Chinese company's voice assistant has over 400 million users, which is more than Alexa but fewer than Google Assistant. Baidu trails only Amazon as the second-largest maker of smart speakers, according to research firm Canalys, having recently overtaken Google, even though it serves only the Chinese market.
The idea, these companies hope, is that there will be two types of assistants. One type will be broad in knowledge and capabilities (think of Alexa, Siri and Google Assistant), while others will be narrow and deep, specific to their own knowledge domain. The goal is to make it possible to talk directly to any one of them on a smart speaker without going through an intermediary skill.
It is a strategy that is already playing out on PCs. Amazon's voice assistant is becoming more closely integrated into Windows 10, allowing locked PCs to respond to general queries when someone calls out "Alexa" from across the room. Microsoft's Cortana is being refocused on interactions with the company's software and services.
Limp compares his vision for voice assistants to web browsers: you can use any browser you want to visit any website you want, so why shouldn't you be able to talk to any assistant you want on any speaker you want? "We are a web 1.0 company," says Limp, "and the reason this building exists where I am now is a function of web interoperability."
It is a very high-minded ideal, but it may also be strategically smart. Amazon already has a strong position in the home with Alexa, so it doesn't seem to be a big deal for it to let other assistants work on its Echo speakers. To be clear, Amazon is committed to making that happen. The company previously announced that Orange customers in France can purchase Echo speakers that support both Alexa and Orange's Djingo assistant.
Alexa has not, however, been equally successful on phones, despite several attempts at collaboration with Android manufacturers and headset makers. An industry-wide initiative involving everyone except the three most influential companies in smartphones seems tailor-made to put pressure on those companies. (It can also help Amazon claim that it is not monopolistic, because it is so willing to play well with others and to open up its voice platform to competitors.)
Whether you consider it altruism or strategic 4D chess, the initiative can at least put some pressure on Google. Google has been more reluctant to let Google Assistant work alongside other software, although perhaps for reasons related to privacy rather than market dynamics.
When asked specifically about Google, Apple and Samsung, Limp says that "those three companies, we would like to have them be a part of this initiative." That sounds very much like they declined, but Limp would not go into that.
He says that although he has been talking about this idea with other companies for a while, it has only coalesced into something more formal over the past "six weeks." Knowing how fast (or slow, as the case may be) companies like Google and Samsung move, six weeks doesn't seem like a lot of time. As for Apple, it is not known for being a joiner.
Google provided a statement to us, noting that it only heard about this initiative this weekend:
We have just heard about this initiative and should review the details, but in general we are always interested in participating in efforts that have broad support for the ecosystem and maintain strong privacy and security practices.
We have contacted Samsung and Apple for comment.
To be clear, Limp says he does not believe that this initiative will put pressure on those companies: "If they do not want to do it, this will not change their minds."
From a technical perspective, there are a thousand questions about implementation, software, privacy and more that we do not yet have answers to. The Voice Interoperability Initiative is not intended to be a standards body, nor does it seem to be prescriptive about how its members should tackle the complex issues of creating a single speaker that supports multiple assistants at the same time.
Amazon is giving away its "wake word engine" for free, so that other companies that want to build their own assistants can use Amazon's research to get started. But companies in the consortium are free to use any technology they want.
To date, there have not been many devices that support "multiple simultaneous wake words." Consider the Facebook Portal, some cars and a few Android phones. More prominent devices, such as the Sonos One, make users choose between Alexa and Google Assistant on a per-speaker basis.
But there is not really a technical limitation there. Antoine Leblond, vice president of software at Sonos, demoed a Sonos One speaker for me over a video conference yesterday with both the "Alexa" and the "Hey Google" wake words active. It worked great, including Sonos's "continuity" feature, which lets you start music with one assistant and then control it with the other.
I pressed Leblond on why that is not the way the Sonos One works, as I have done several times in recent years. In particular, since Amazon has repeatedly said that it is happy for Alexa to coexist with another assistant, is Google the one that does not allow this? Leblond demurred, but he pointed out that many things can go wrong with two active assistants on a single speaker. For example: if you set an alarm with one assistant and you are not around when it goes off, how does your family know which assistant to tell to shut up?
Figuring out how to implement multiple assistants from a technical perspective is not even the most difficult problem. If the past year has taught us anything, it is that few people realized the extent to which voice assistants were collecting our data. Ongoing scandals have hit Amazon, Google, and Apple over their practice of having human reviewers check the quality of transcripts. All three have changed course considerably, increasing transparency and making it easier to opt out, delete your data or both.
A consortium of more than 30 companies that wants to make it easy for multiple assistants to live together does not sound like a great recipe for privacy. But Limp stresses that he wants to be thoughtful about how these systems are designed.
For example, he believes there should be strict rules under which one assistant should never "listen" to a conversation with another assistant. That seems simple, but there are more difficult problems. Should most of the work involved in listening for different wake words be handled by hardware or software? When Limp says that "voice assistants [could someday] be able to collaborate privately on behalf of clients in a way that preserves context and continuity," how exactly is that privacy guaranteed?
And it becomes even more difficult: a recurring issue over the past year was the realization that these assistants were accidentally activating without hearing their wake word. So in a world where a speaker can have two or a dozen different assistants at the ready, what happens to those accidental recordings?
Six weeks after discussions about forming the initiative became serious, there are no clear answers to these questions, only a commitment to sort them out. I asked Sonos if there are meetings or contracts or even a membership fee, and the answers were no, no and no. It is all very early.
Amazon, especially with Alexa, is known for rapidly expanding its ecosystem, sometimes at the expense of clarity or software quality. Just think of the early days (and some more recent ones) of using skills with Alexa, which often required awkward, specific commands. At least this time Amazon doesn't seem to be rushing.
"We have been at this for five years," says Limp. Looking at the technical and privacy issues here, he believes that "it is a manageable problem, but not a trivial problem. It will take many, many years to resolve."