Google defends letting human workers listen to Assistant voice conversations

Google is defending its practice of having human workers, most of them apparently contract employees based around the world, listen to audio recordings of conversations between users and its Google Assistant software. The response comes after the Belgian public broadcaster VRT NWS revealed that contract workers in the country sometimes listen to sensitive audio that the Google Assistant recorded by accident.


In a blog post published today, Google says it takes precautions to protect the identity of users and that it has "a number of protections in place to prevent" so-called "false accepts," meaning cases where the Google Assistant activates on a device such as a Google Home speaker without a user intentionally saying the wake word.

The company also says employees review these conversations to help Google's software work in multiple languages. "This is a critical part of the process of building speech technology, and is necessary to creating products like the Google Assistant," writes David Monsees, a product manager on the Google Search team who authored the blog post.

"We just learned that one of these language reviewers has violated our data security policies by leaking confidential Dutch audio data," Monsees adds, referring to the excerpts of audio the Belgian contract worker shared with VRT NWS. "Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again."

In addition, Google claims that only 0.2 percent of all audio clips end up reviewed by language experts. "Audio snippets are not associated with user accounts as part of the review process, and reviewers are directed not to transcribe background conversations or other noises, and only to transcribe snippets that are directed to Google," Monsees adds.

Google goes on to say that it offers users a wide range of tools for reviewing the audio stored by Google Assistant devices, including the ability to manually delete those audio clips and to set timers for automatic deletion. "We're always working to improve how we explain our settings and privacy practices to people, and will be reviewing opportunities to further clarify how data is used to improve speech technology," Monsees concludes.

What the blog post does not address is the volume of everyday requests that workers around the world review for general natural language improvements, rather than simply to verify that translations are accurate.


It is widely understood by people in the artificial intelligence industry that human annotators are needed to help make sense of raw AI training data, and that these workers are employed by companies such as Amazon and Google, where they can access both audio recordings and text transcripts of some conversations between users and smart home devices. That way, humans can review the exchanges, properly annotate the data, and note any errors, so that software platforms like Google Assistant and Amazon Alexa can improve over time.

But neither Amazon nor Google has ever been fully transparent about this, and that has led to a number of controversies over the years that have only intensified in recent months. Ever since Bloomberg reported in April on Amazon's extensive use of human contract workers to train Alexa, big tech companies in the smart home sector have been forced to reckon with questions about how these products and AI platforms are developed, maintained, and improved over time.

Often, the answer to those questions is small armies of human workers listening to recorded conversations and reading transcripts as they feed data into the underlying machine learning algorithms. Yet there is no mention of this on the Google Home privacy policy page. There are also GDPR implications for European users when this level of data collection happens without proper end-user communication and consent.

If you want this data deleted, you have to jump through a fair number of hoops. And in the case of Amazon and Alexa, some of that data is retained indefinitely even after a user chooses to delete the audio, as the company admitted just last week. Google's privacy settings appear more robust than Amazon's: with Google, you can disable the storage of audio data entirely. But both companies are now grappling with a broader public that is waking up to how AI software is beta-tested and refined in real time, using devices placed in our bedrooms, kitchens, and living rooms.

In this case, we have a Belgian news organization reporting that, of more than 1,000 excerpts provided by a contract worker, no fewer than 150 Google Assistant recordings were captured accidentally, without the wake word ever being spoken. That the worker in question was able to obtain this data so easily, in violation of users' privacy and Google's stated safeguards, is disturbing. Even more troubling is the worker's account of overhearing sensitive moments inside users' homes, including a potential threat of physical violence captured by a false accept, when the worker heard a female voice that sounded like she was in distress.

Obviously, owning a Google Home or similar assistant device and letting it listen in on your sensitive daily conversations and spoken internet queries involves at least some degree of privacy tradeoff. Using any Google product does, because the company makes money by collecting that data, storing it, and selling targeted advertising against it. But these findings contradict Google's claims that it does everything it can to protect its users' privacy, and that its software does not listen in unless the wake word is spoken. Clearly, someone somewhere else in the world may in fact be listening. And sometimes they shouldn't be.