You, presumably a human, are a crucial part of detecting whether a photo or video was generated by artificial intelligence.
There are detection tools, developed both commercially and in research labs, that can help. To use these deepfake detectors, you upload or link to media you suspect might be fake, and the detector gives you a percentage likelihood that it was generated by AI.
But your own senses, and an understanding of a few key clues, provide a lot of information when you analyze media to see whether it is a deepfake.
While regulations for deepfakes, particularly in elections, lag behind rapid advances in AI, we have to find ways to determine whether an image, audio or video is actually real.
Siwei Lyu built one of those detection tools, the DeepFake-o-meter, at the University at Buffalo. His tool is free and open source, and it compiles more than a dozen algorithms from other research labs in one place. Users can upload a piece of media and run it through these labs' tools to get a sense of whether it could have been generated with AI.
The DeepFake-o-meter shows both the benefits and the limitations of AI detection tools. When we ran some known deepfakes through its various algorithms, the detectors' ratings for the same video, photo or audio recording ranged from a 0% to a 100% probability of having been generated by AI.
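To make that spread concrete, here is a minimal, hypothetical sketch of what comparing several detectors' scores for a single file might look like. The detector names and numbers are invented for illustration; this is not the DeepFake-o-meter's actual code or API.

```python
from statistics import mean, pstdev

# Hypothetical scores (0-100 = probability the media is AI-generated)
# returned by different detection algorithms for the SAME clip.
# These values are invented to illustrate the disagreement described above.
detector_scores = {
    "lab_a_detector": 2.0,
    "lab_b_detector": 46.8,
    "lab_c_detector": 99.5,
    "lab_d_detector": 0.2,
}

scores = list(detector_scores.values())
print(f"mean probability: {mean(scores):.1f}%")
print(f"spread (std dev): {pstdev(scores):.1f}%")

# A large spread means the detectors disagree, so no single score
# should be read as a verdict -- which is the point Lyu makes below.
if pstdev(scores) > 25:
    print("Detectors disagree strongly; treat any single score with caution.")
```

The takeaway of such a comparison is not a definitive answer but a picture of how much the tools disagree, which is exactly the variability the DeepFake-o-meter exposes.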
AI, and the algorithms used to detect it, can be biased by the way they are trained. At least in the case of the DeepFake-o-meter, the tool is transparent about that variability in results, whereas with a commercial detector bought from an app store, it is less clear what its limitations are, Lyu said.
“I think a false sense of reliability is worse than low reliability, because if you rely on a system that is fundamentally unreliable to function, it can cause problems in the future,” Lyu said.
The system is still bare-bones for users and was only publicly launched in January of this year. But its goal is for journalists, researchers and everyday users to be able to upload media and see whether it is real. His team is working on ways to rank the different algorithms it uses for detection, so it can tell users which detector would work best in their situation. Users can choose to share the media they upload with Lyu's research team to help them better understand deepfake detection and improve the website.
Lyu often serves as an expert source for journalists trying to assess whether something might be a deepfake, so he walked us through some well-known cases of deepfakery from recent memory to show the ways we can tell they're not real. Some of the obvious signs have changed over time as AI has improved, and they will change again.
“You need a human operator to do the analysis,” he said. “I think it is crucial to have collaboration between humans and algorithms. Deepfakes are a socio-technical problem. It will not be solved purely by technology. It has to have an interface with humans.”
Audio
A robocall that circulated in New Hampshire using an AI-generated voice of President Joe Biden encouraged voters there not to vote in the Democratic primary, one of the first major cases of a deepfake in this year's U.S. elections.
When Lyu’s team ran a short clip of the robocall through five algorithms on the DeepFake-o-meter, only one of the detectors returned more than a 50% probability of AI; that one put it at 100%. The other four ranged between 0.2% and 46.8%. A longer version of the call led three of the five detectors to return probabilities greater than 90%.
This matches our experience creating audio deepfakes: they are harder to detect because you rely solely on your hearing, and easier to generate because there are tons of examples of public figures' voices that AI can use to make a person's voice say whatever the creator wants.
But there are some clues in robocalls, and audio deepfakes in general, that we should pay attention to.
AI-generated audio often has a flatter overall tone and is less conversational than the way we normally speak, Lyu said. You don't hear much emotion. There may be no natural breathing sounds, such as a breath taken before speaking.
Also pay attention to background noises. Sometimes there are no background noises when there should be. Or, as in the robocall, a lot of noise is mixed into the background, almost to lend an air of realism, which actually ends up sounding unnatural.
Photos
With photographs, it helps to zoom in and closely examine any “inconsistencies with the physical world or human physiology,” such as buildings with crooked lines or hands with six fingers, Lyu said. Small details like hair, mouths and shadows can offer clues as to whether something is real.
Hands were once a clearer giveaway for AI-generated images because they often ended up with extra appendages, although the technology has improved and that is becoming less common, Lyu said.
We ran photos of Trump with Black voters, which a BBC investigation found to have been generated by AI, through the DeepFake-o-meter. Five of the seven image detectors returned a 0% probability that the fake image was fake, while one put the probability at 51%. The remaining detector said no face had been detected.
Lyu’s team noticed unnatural areas around Trump’s neck and chin, people’s teeth facing outward, and webbing around some fingers.
Beyond these visual oddities, AI-generated images appear overly bright in many cases.
“It’s very difficult to express it in quantitative terms, but there is an overall view and appearance that the image looks too plastic or like a painting,” Lyu said.
Videos
Videos, especially those of people, are harder to fake than photos or audio. In some AI-generated videos without people, it can be more difficult to determine whether the images are real, although they are not “deepfakes” in the sense that the term generally refers to falsified or doctored images of people.
To test video, we submitted a deepfake of Ukrainian President Volodymyr Zelenskiy that shows him telling his military to surrender to Russia, which did not happen.
Visual cues in the video include unnatural flickering that produces pixel artifacts, Lyu's team said. The edges of Zelenskiy's head are not quite right; they are jagged and pixelated, a sign of digital manipulation.
Some of the detection algorithms look specifically at the lips, because current AI video tools primarily change the lips to make a person appear to say things they didn't say. The lips are where most of the inconsistencies show up. One example would be a letter sound that requires the lips to close, such as a B or P, while the deepfake's mouth never fully closes, Lyu said. When the mouth is open, the teeth and tongue appear dull, he said.
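As a rough illustration of that lip-closure idea, here is a minimal sketch of how one might flag frames where the mouth stays open during a sound that requires closed lips. It assumes you already have, from other tools, a phoneme-level transcript with timestamps and per-frame mouth-openness values; the data structures, helper name and threshold below are hypothetical, not any specific detector's method.

```python
# Sketch of the phoneme/lip-closure check described above.
# Hypothetical inputs (would come from a forced aligner and a face-landmark tool):
#   phoneme_times: list of (phoneme, start_sec, end_sec)
#   mouth_openness: dict mapping timestamp (sec) -> 0.0-1.0 openness value

BILABIALS = {"B", "P", "M"}     # sounds that require the lips to close
CLOSED_THRESHOLD = 0.15         # openness below this counts as "closed" (illustrative)

def suspicious_segments(phoneme_times, mouth_openness):
    """Return segments where the lips never close during a bilabial sound."""
    flagged = []
    for phoneme, start, end in phoneme_times:
        if phoneme not in BILABIALS:
            continue
        # Openness samples that fall inside this phoneme's time window.
        window = [v for t, v in mouth_openness.items() if start <= t <= end]
        if window and min(window) > CLOSED_THRESHOLD:
            flagged.append((phoneme, start, end, min(window)))
    return flagged

# Example with made-up numbers: the "B" from 1.2-1.3 s never closes the mouth.
phonemes = [("B", 1.2, 1.3), ("A", 1.3, 1.5)]
openness = {1.20: 0.4, 1.25: 0.35, 1.30: 0.3, 1.40: 0.6}
print(suspicious_segments(phonemes, openness))
```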
To us, the video is more clearly fake than the audio or photo examples we submitted to Lyu's team. But of the six detection algorithms that evaluated the clip, only three returned very high probabilities of AI generation (over 90%). The other three returned very low probabilities, ranging from 0.5% to 18.7%.