Shortly after rumors of former President Donald Trump’s imminent indictment leaked, images emerged online purporting to show his arrest. These images looked like news photos, but they were fake. They were created by a generative artificial intelligence system.
Generative AI, in the form of image generators such as DALL-E, Midjourney and Stable Diffusion, and text generators such as Bard, ChatGPT, Chinchilla and LLaMA, has exploded into the public sphere. By combining clever machine-learning algorithms with billions of pieces of human-generated content, these systems can do everything from creating an eerily realistic image from a caption, to synthesizing speech in President Joe Biden’s voice, to replacing one person’s likeness with another’s in a video, to writing a coherent 800-word opinion piece from a title prompt.
Even in these early days, generative AI is capable of creating highly realistic content. My colleague Sophie Nightingale and I found that the average person cannot reliably distinguish an image of a real person from an AI-generated one. Although audio and video have not yet fully passed through the uncanny valley — images or models of people that are unsettling because they are close to, but not quite, realistic — they probably will soon. When this happens, and it is all but guaranteed to happen, it will become easier and easier to distort reality.
In this new world, it will be trivial to produce a video of a CEO saying her company’s profits are down 20%, which could lead to billions in lost market value, or a video of a world leader threatening military action, which could trigger a geopolitical crisis, or to insert someone’s likeness into a sexually explicit video.
Advances in generative AI will soon mean that fake but visually convincing content will proliferate online, leading to an even messier information ecosystem. A secondary consequence is that detractors will be able to easily dismiss as fake genuine video evidence of everything from police violence and human rights violations to a world leader burning top-secret documents.
As society confronts what is almost certainly just the beginning of these advances in generative AI, there are reasonable and technologically feasible interventions that can be used to mitigate these abuses. As a computer scientist who specializes in image forensics, I believe that a key method is watermarking.
There is a long history of marking documents and other items to prove their authenticity, indicate ownership and counter counterfeiting. Today, Getty Images, a massive image archive, adds a visible watermark to all digital images in its catalog. This allows customers to freely browse images while protecting Getty’s assets.
Imperceptible digital watermarks are also used for digital rights management. A watermark can be added to a digital image by, for example, tweaking every 10th image pixel so that its color (typically a number in the range 0 to 255) is even-valued. Because this pixel tweak is so minor, the watermark is imperceptible. And because this periodic pattern is unlikely to occur naturally, and can easily be verified, it can be used to verify an image’s provenance.
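A minimal sketch of this idea — the specific step size, even-value rule, and helper names are illustrative, not any particular vendor’s scheme:

```python
import numpy as np

def embed_watermark(pixels: np.ndarray, step: int = 10) -> np.ndarray:
    """Force every `step`-th pixel value to be even by clearing its lowest bit.

    Clearing the low bit changes a pixel value by at most 1 out of 255,
    which is imperceptible to the eye.
    """
    marked = pixels.copy()
    marked.reshape(-1)[::step] &= 0xFE
    return marked

def has_watermark(pixels: np.ndarray, step: int = 10) -> bool:
    """Declare the image watermarked if every sampled pixel is even."""
    return bool(np.all(pixels.reshape(-1)[::step] % 2 == 0))

# Example: a random 8-bit grayscale "image"
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
marked = embed_watermark(image)
print(has_watermark(marked))  # True
print(has_watermark(image))   # almost certainly False for an unmarked image
```

In a natural image, roughly half of the sampled pixels would be odd by chance, so the probability of a false positive over hundreds of sampled pixels is vanishingly small.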
Even medium-resolution images contain millions of pixels, which means that additional information can be embedded in the watermark, including a unique identifier that encodes the generating software and a unique user ID. This same type of imperceptible watermark can be applied to audio and video.
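To illustrate how extra information fits into a watermark, here is a toy sketch that hides an integer identifier in the low bits of the first few pixels. The function names, bit layout, and identifier are all hypothetical:

```python
import numpy as np

def embed_id(pixels: np.ndarray, ident: int, nbits: int = 32) -> np.ndarray:
    """Hide an `nbits`-bit identifier in the low bits of the first `nbits` pixels."""
    flat = pixels.copy().reshape(-1)
    for i in range(nbits):
        bit = (ident >> i) & 1
        flat[i] = (int(flat[i]) & 0xFE) | bit  # overwrite only the lowest bit
    return flat.reshape(pixels.shape)

def extract_id(pixels: np.ndarray, nbits: int = 32) -> int:
    """Read the identifier back out of the low bits."""
    flat = pixels.reshape(-1)
    return sum((int(flat[i]) & 1) << i for i in range(nbits))

blank = np.zeros((16, 16), dtype=np.uint8)
stamped = embed_id(blank, 0xCAFE)  # 0xCAFE stands in for a software/user ID
print(hex(extract_id(stamped)))    # 0xcafe
```

With millions of pixels available, a few dozen identifier bits occupy a negligible fraction of the image.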
The ideal watermark is one that is imperceptible and also resistant to simple manipulations such as cropping, resizing, color adjustment and converting digital formats. While the pixel color watermark example is not resilient because the color values can be changed, many watermarking strategies have been proposed that are robust – though not impervious – to attempts to remove them.
Watermarks and AI
These watermarks can be baked into generative AI systems by watermarking all of the training data, after which the generated content will contain the same watermark. A baked-in watermark is attractive because it means that generative AI tools can be open-sourced — as the image generator Stable Diffusion is — without concerns that a watermarking step could be stripped from the image generator’s software. Stable Diffusion has a watermarking feature, but because it’s open source, anyone can simply remove that part of the code.
OpenAI is experimenting with a system to watermark ChatGPT’s creations. Of course, the characters in a paragraph cannot be tweaked like a pixel value, so text watermarking takes a different form.
Text-based generative AI works by producing the most reasonable next word in a sentence. For example, starting with the sentence fragment “an AI system can…”, ChatGPT will predict that the next word should be “learn”, “predict” or “understand”. Associated with each of these words is a probability corresponding to the likelihood of each word appearing next in the sentence. ChatGPT learned these probabilities from the large body of text it was trained on.
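A toy version of this next-word sampling looks like the following. The candidate words and their probabilities are illustrative, not ChatGPT’s actual values:

```python
import random

# Hypothetical next-word distribution for the fragment "an AI system can…"
NEXT_WORD_PROBS = {"learn": 0.5, "predict": 0.3, "understand": 0.2}

def sample_next_word(rng=None):
    """Draw the next word in proportion to its probability."""
    rng = rng or random.Random()
    words = list(NEXT_WORD_PROBS)
    weights = list(NEXT_WORD_PROBS.values())
    return rng.choices(words, weights=weights, k=1)[0]

print(sample_next_word())  # e.g. "learn", about half the time
```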
Generated text can be watermarked by secretly tagging a subset of words and then biasing the selection of a word to be a synonym of a tagged word. For example, the tagged word “comprehend” can be used instead of “understand”. By periodically biasing word selection in this way, a body of text is watermarked based on a particular distribution of tagged words. This approach won’t work for short tweets but is generally effective with text of 800 or more words, depending on the specific watermark details.
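A rough sketch of this technique, assuming a hash-based rule for secretly tagging words and a toy synonym table (real systems bias the model’s word probabilities directly rather than post-editing text):

```python
import hashlib
import random

def is_tagged(word: str) -> bool:
    """Secretly tag roughly half of all words via a hash of the word."""
    digest = hashlib.sha256(word.lower().encode()).digest()
    return digest[0] % 2 == 0

# Toy synonym table; a real system would use a large lexical resource.
SYNONYMS = {"understand": "comprehend", "big": "large"}

def watermark_text(words, bias=0.9, rng=None):
    """Swap untagged words for tagged synonyms with probability `bias`."""
    rng = rng or random.Random(0)
    out = []
    for w in words:
        alt = SYNONYMS.get(w)
        if not is_tagged(w) and alt and is_tagged(alt) and rng.random() < bias:
            out.append(alt)
        else:
            out.append(w)
    return out

def tagged_fraction(words) -> float:
    """Detector: watermarked text skews well above the ~50% chance level."""
    return sum(is_tagged(w) for w in words) / len(words)
```

Detection then reduces to a statistical test: in unwatermarked text about half the words land in the tagged set by chance, while watermarked text shows a measurable excess — which is why the method needs hundreds of words to be reliable.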
Generative AI systems can, and in my opinion should, watermark all of their content, allowing for downstream identification and easier intervention if necessary. If the industry does not voluntarily do so, legislators can pass regulations to enforce this rule. Unscrupulous people, of course, will not adhere to these standards. But if the major online gatekeepers — Apple and Google app stores, Amazon, Google, Microsoft cloud services, and GitHub — enforce these rules by banning non-compliant software, the damage will be greatly reduced.
Sign authentic content
To tackle the problem from the other end, a similar approach could be adopted to authenticate original audiovisual recordings at the point of capture. A specialized camera app could cryptographically sign content as it is recorded. There is no way to tamper with this signature without leaving evidence of the attempt. The signature is then stored on a centralized list of trusted signatures.
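The tamper-evidence idea can be sketched as follows. Real systems use public-key signatures (so anyone can verify without the secret); this sketch substitutes an HMAC from Python’s standard library, and the key and clip bytes are placeholders:

```python
import hashlib
import hmac

SECRET_KEY = b"device-private-key"  # stand-in for the camera's signing key

def sign_recording(data: bytes) -> str:
    """Sign the recorded bytes; a real system would use e.g. Ed25519."""
    return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()

def verify_recording(data: bytes, signature: str) -> bool:
    """Any change to the bytes invalidates the signature."""
    return hmac.compare_digest(sign_recording(data), signature)

clip = b"...raw audiovisual bytes..."
sig = sign_recording(clip)
print(verify_recording(clip, sig))         # True
print(verify_recording(clip + b"x", sig))  # False: one edited byte breaks it
```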
While this does not apply to text, audiovisual content can then be verified as human-generated. The Coalition for Content Provenance and Authenticity (C2PA), a collaborative effort to create a standard for authenticating media, recently released an open specification supporting this approach. With major institutions including Adobe, Microsoft, Intel, the BBC and many others joining this effort, the C2PA is well positioned to produce effective and widely deployed authentication technology.
The combined signing and watermarking of human-generated and AI-generated content won’t prevent all abuses, but it will provide some level of protection. All protections will require constant tweaking and refinement as adversaries find new ways to weaponize the latest technologies.
In the same way that society has fought a decades-long battle against other cyber threats such as spam, malware and phishing, we should prepare for an equally protracted battle to defend against various forms of abuse perpetrated using generative AI.