Software company Digimarc will now allow copyright owners to add more information to their work, which the company says will improve how AI models deal with copyrights in training data.
In a statement, Digimarc said that its new Digimarc Validate service allows users to include intellectual property information in the metadata of their work. The company said this means that when copyrighted material becomes part of a generative AI training data set, owners can point to the digital watermark carrying that information.
For example, an image protected with Digimarc Validate carries a machine-readable © symbol that includes information about who owns the copyright. The company said Digimarc Validate works with its digital watermark detection software, called SAFE (secure, accurate, fair and efficient), which AI companies must adopt if they want to keep material marked with the Digimarc Validate symbol out of their training data sets.
“Generative AI has changed the rules and once digital assets are distributed or published, the ability to protect those valuable assets disappears,” said Digimarc President and CEO Riley McCormack.
Digimarc said that much of the content in the data sets collected for AI training “is copyrighted; it is simply not digitally identified as such.” Marking that content digitally would allow generative AI models to identify which data is protected before ingesting it for training.
Signaling copyright ownership in content metadata sounds great on paper, but it will only work if AI developers actively avoid copyrighted material when training models. So far, AI companies have not promised to stay away from copyrighted material in training data sets. However, a digital paper trail of copyright could allow creators to flag cases where AI developers knowingly infringe their protections.
Some AI companies, such as Adobe, have said they only use licensed data for training. OpenAI has announced that websites can block its web crawler from collecting their data for training.
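As a rough illustration of how that kind of opt-out works, the sketch below uses Python's standard `urllib.robotparser` module to check a sample robots.txt file. It assumes the `GPTBot` user-agent token that OpenAI has published for its crawler; the `example.com` URL and the `SomeOtherBot` name are hypothetical, and the check only reflects what a site requests, not whether a given crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt: ask OpenAI's crawler (user-agent token "GPTBot")
# to stay out of the whole site, while leaving other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL used only for illustration.
IMAGE_URL = "https://example.com/gallery/photo.jpg"

print("GPTBot allowed:", can_fetch("GPTBot", IMAGE_URL))
print("SomeOtherBot allowed:", can_fetch("SomeOtherBot", IMAGE_URL))
```

A robots.txt rule of this kind is a request rather than an enforcement mechanism, which is the same limitation noted above for copyright metadata: it only helps if the crawler's operator chooses to respect it.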
Meanwhile, Microsoft has said it will take on the legal risk if commercial customers using its Copilot products are sued for copyright infringement.
Some of the first lawsuits against developers of generative AI models address the thorny issue of copyright infringement. Comedian Sarah Silverman and authors Christopher Golden and Richard Kadrey sued OpenAI and Meta for allegedly using their books to train GPT-4 and Llama 2. Three artists filed a lawsuit against Stability AI (the maker of Stable Diffusion), Midjourney, and art website DeviantArt for allegedly infringing their copyright.
To help figure out how to best address copyright and AI issues, the US Copyright Office opened a public comment period on August 30 to understand people’s concerns.
While the White House has secured commitments from AI companies to develop watermarks, the goal of those watermarks is to identify AI-generated content rather than to mark copyrighted works.
“The risk content owners face by not adding an identifier to digital assets before distribution or publication goes beyond simple misuse and theft,” said Ken Sickles, chief product officer at Digimarc. “In the future, your digital assets will make e-commerce transactions more reliable, email more secure, and social media a safer place.”
Digimarc Validate is available for commercial use starting at $399 per month. Enterprise customers can work with the company on pricing options.