According to Dark Visitors founder Gavin King, most major AI agents still honor the robots.txt file. “That’s been pretty consistent,” he says. But not all website owners have the time or know-how to constantly update their robots.txt files. And even when they do, some bots bypass the file’s directives. “They try to disguise traffic,” King says.
Prince says Cloudflare’s bot blocking won’t be a command that this kind of bad actor can ignore. “Robots.txt is like putting up a ‘no trespassing’ sign,” he says. Cloudflare’s blocking, by contrast, “is like having a physical wall patrolled by armed guards.” Just as it flags other types of suspicious web behavior, such as price-scraping bots used for illegal price monitoring, the company has built processes to detect even the most carefully hidden AI crawlers.
Cloudflare is also announcing a forthcoming marketplace where customers can negotiate scraping terms of use with AI companies, whether that involves payment for content usage or a barter of credits to use AI services in exchange for scraping. “We don’t really care what the transaction is, but we think there needs to be some way to return value to the original content creators,” Prince says. “Compensation doesn’t have to be in dollars. Compensation can be credit or recognition. It can be a lot of different things.”
There is no set date for the launch of that marketplace, but even if it launches this year, it will join an increasingly crowded field of projects aimed at facilitating licensing and other permissions arrangements between AI companies, publishers, platforms and other websites.
What do AI companies think about this? “We’ve talked to most of them and their reactions have ranged from ‘this makes sense and we’re open to it’ to ‘fuck you,’” says Prince (though he declined to name names).
The bot-blocking project has come together fairly quickly. Prince cites a conversation with Atlantic CEO (and former WIRED editor-in-chief) Nick Thompson as its inspiration; Thompson had commented on how many different publishers had encountered surreptitious web scrapers. “I love that he’s doing it,” Thompson says. If even large media organizations struggled to deal with the influx of scrapers, Prince reasoned, independent bloggers and website owners would struggle even more.
Cloudflare has been a leading web security company for years, providing a large portion of the infrastructure that underpins the web. It has historically remained as neutral as possible when it comes to the content of the websites it serves; on the rare occasions it has made exceptions to that rule, Prince has emphasized that he doesn’t want Cloudflare to be the arbiter of what’s allowed online.
On AI scraping, though, he believes Cloudflare is uniquely positioned to take a stand. “The path we are on is not sustainable,” says Prince. “Hopefully we can help ensure that human beings are paid for their work.”