Zuckerberg’s creepy new AI is ‘too dangerous’ to disclose: Meta is making the most advanced AI speech model ever that can mimic the voice of anyone, including deceased relatives – but the company fears scammers will abuse it
Meta boasted on Friday that it has produced “the most versatile AI for speech generation” in existence.
But it added that the company would not disclose their AI model due to serious concerns about the advanced technology’s “potential risks of abuse”.
In recent months, scammers have become adept at using AI-generated speech to commit grisly and shocking crimes, including an April attempt to fake the kidnapping of a teenage girl in Arizona, terrorizing the girl’s distraught mother with realistic AI-generated pleas.
But Meta suggested some more optimistic use cases in its press release, stating that Voicebox could be used to help the blind and visually impaired hear messages from their friends and loved ones, or let non-native speakers play translations of their own words, in their own voice, but in a foreign language.
Meta called their new Voicebox generative AI model “the most versatile AI for speech generation” in existence. But the company added that it would not make the AI public, due to the company’s own grave concerns about the “potential risks of abuse” of the advanced technology.
The announcement comes just over a month after Zuckerberg (pictured) was turned down by the White House, which explicitly told reporters that Meta representatives weren’t invited to a West Wing summit exclusive to companies at the forefront of AI innovation.
At the moment, the company said its AI model can speak six languages: English, French, Spanish, German, Polish and Portuguese.
Meta also offered some more business-oriented use cases for the technology, including deploying Voicebox as a means for audio creators to more easily edit out unwanted background noise or errors from their audio or video tracks.
It also suggested that Voicebox could be used to create more reassuring, naturalistic voices for virtual assistants and more realistic sounding characters in video games.
But all of these brave new capabilities will not yet be made available to developers hoping to play in Meta’s Voicebox sandbox, the company said in a press release.
A promotional video for Voicebox released Friday showed off the AI’s ability to convert text to speech in a wide variety of voices.
“There are many exciting use cases for generative speech models,” the company said in a research post, “but due to the potential risks of abuse, we are not disclosing the Voicebox model or code at this time.”
“While we believe it is important to be open with the AI community and share our research to advance the state of the art of AI,” the company added, “it is also necessary to strike the right balance between openness and responsibility.”
Meta’s deep learning AI researchers noted in their post introducing Voicebox that their system uses a method called Flow Matching, which outperforms the diffusion models used by today’s cutting-edge systems, such as VALL-E, at zero-shot text-to-speech.
Voicebox, they said, produced artificial audio that was more intelligible and scored a lower word error rate of 1.9 percent compared to their competitor’s 5.9 percent.
It also scored higher on audio similarity (0.681 versus 0.580), while being nearly 20 times faster, according to Meta.
When translating into different languages, Voicebox outperformed a highly regarded multilingual text-to-speech AI, YourTTS, reducing the average word error rate from 10.9 percent to 5.2 percent, and increasing the audio similarity score from 0.335 to 0.481.
The announcement comes just over a month after Zuckerberg was rejected by the Biden White House – which explicitly told reporters that Meta representatives were not invited to a West Wing summit that was exclusive to companies at the forefront of AI innovation.