Our voices are as unique as our fingerprints. So how would you feel if your voice was cloned?
In recent months, a new type of deepfake known as voice cloning has emerged, in which hackers use artificial intelligence (AI) to simulate your voice.
Famous faces such as Stephen Fry, Sadiq Khan and Joe Biden have already fallen victim to voice cloning, while an anonymous CEO was even tricked into transferring $243,000 to a scammer after receiving a fake phone call.
But how does it work and how convincing is it?
To find out, I let a professional hacker clone my voice, with terrifying results.
Voice cloning is an artificial intelligence technique that allows hackers to take an audio recording of someone, train an AI tool on their voice, and recreate it.
Speaking to MailOnline, Dane Sherrets, Solutions Architect at HackerOne, explained: “This was originally used to create audiobooks and to help people who had lost their voice for medical reasons.
“But today it is increasingly being used by Hollywood and, unfortunately, by scammers.”
When the technology first emerged in the late 1990s, its use was limited to experts with deep knowledge of AI.
However, over the years the technology has become more accessible and affordable, to the point where almost anyone can use it, according to Sherrets.
“Someone with very limited experience can clone a voice,” he said.
“It takes maybe less than five minutes with some of the tools out there that are free and open source.”
To clone my voice, all Mr Sherrets needed was a five-minute clip of me speaking.
I opted to record myself reading a Daily Mail story, although Sherrets says most hackers could simply extract the audio from a quick phone call or even a video posted on social media.
‘It’s possible to do this from a call, from something shared on social media, or even from someone speaking on a podcast,’ he said. ‘It’s really stuff that we upload or record every day.’
Once I had sent the clip to Mr Sherrets, he simply loaded it into a tool (which he chose not to name) so that it could be ‘trained’ on my voice.
“Once that was done, I could type, or even speak directly into the tool, and have it output whatever message I wanted in your voice,” he said.
“The really crazy thing about the tools that exist now is that I can add inflections, pauses or other details that make the speech sound more natural, which makes it much more convincing in a scam scenario.”
Even without any added pauses or inflections, the first clip of my voice clone that Mr Sherrets created was surprisingly convincing.
The robotic voice nailed my hybrid American-Scottish accent perfectly as it said: ‘Hi mom, it’s Shivali. I’ve lost my bank card and I need to transfer some money. Can you send some to the account that’s just texted you?’
However, the creepiness increased in the next clip, in which Sherrets added pauses.
“Towards the end you hear a long pause and then a sigh, which makes it sound much more natural,” explained the professional hacker.
Although my experience of voice cloning was thankfully only a demonstration, Mr Sherrets highlights some of the serious dangers of the technology.
“Some people have received fake kidnapping calls, where their ‘child’ has called them, saying, ‘I’ve been kidnapped, I need millions of dollars or I won’t be released,’ and the child sounds very distraught,” he said.
‘What we are increasingly seeing today is people trying to carry out more targeted social engineering attempts against companies and organizations.
‘I used the same technology to clone my CEO’s voice.
‘CEOs often appear in public, so it’s very easy to get high-quality audio of their voice and clone it.
‘Having a CEO’s voice makes it much easier to quickly get a password or access to a system. Companies and organizations need to be aware of this risk.’
Fortunately, Sherrets says there are several key signs that a voice is a clone.
“There are key signs,” he told MailOnline.
‘There are the pauses, the places where it doesn’t sound quite natural, and there can be what we call “artifacts” in the background.
‘For example, if a voice was cloned in a room full of people chatting, then when that cloned voice is used you will hear some of that junk in the background.’
However, as technology continues to evolve, these signs will become more difficult to detect.
“People need to be aware of this technology and constantly be suspicious of anything that asks them to act urgently – that’s often a red flag,” he explained.
“They should be quick to ask questions that perhaps only the real person would know, and not be afraid to try to verify things before taking any action.”
Sherrets recommends having a “safe word” with your family and friends.
“If you’re really in an urgent situation, you can say that safe word and they’ll know instantly that it’s really you,” he said.
Finally, the expert advises being aware of your digital footprint and being mindful of how much you upload online.
“Every time I upload content, it expands my audio attack surface and could later be used to train an AI,” he added.
“There are trade-offs that everyone will have to make, but it’s something to keep in mind: audio of yourself floating around can be used against you.”