‘Pay per crawl’: Major publishers back Cloudflare’s plan to charge AI companies

The day after OpenAI issued an economic blueprint for Australia that called for tax funding to boost AI, internet backend company Cloudflare has declared war on AI bot scrapers, with a new default bot blocking regime backed by major global publishers.

Cloudflare provides networking infrastructure that reportedly covers around 20% of all internet traffic. From today, it says it will block AI crawlers by default. The system allows publishers to charge AI companies for accessing their content.

Cloudflare CEO Matthew Prince declared July 1 ‘Content Independence Day’, and claimed to have the support of “a majority of the world’s leading publishers and AI companies” in implementing this scheme.

Dozens of publishers and publications, including Condé Nast, The Associated Press, The Atlantic, Dotdash Meredith, Buzzfeed, and Time, have already shown support for Cloudflare’s plans. In addition, major tech platforms like Reddit, Quora, and Pinterest, as well as Universal Music Group — the largest record company in the world — have publicly backed the move.

Roger Lynch, CEO of Condé Nast, said it “sets a new standard for how content is respected online” and will allow “sustainable innovation built on permission and partnership”.

He said: “This is a critical step toward creating a fair value exchange on the Internet that protects creators, supports quality journalism and holds AI companies accountable.”

Steve Huffman, co-founder and CEO of Reddit, said the ecosystem of creators, platforms, web users, and crawlers “will be better when crawling is more transparent and controlled”, and Cloudflare’s efforts are “a step in the right direction for everyone”.

Renn Turiano, chief consumer and product officer of Gannett Media, which publishes USA Today, said he is “optimistic the Cloudflare technology will help combat the theft of valuable IP”.

Cloudflare’s “permission-based model” means that AI companies will be required to obtain permission from a site before scraping its contents. Each Cloudflare domain user will be able to control their own settings upon sign-up, eliminating the need for webpage owners to manually opt out, or install third-party systems.

In addition, Cloudflare is testing a system called ‘pay per crawl’, in which creators are financially compensated by the AI crawlers who access their content.

Cloudflare said in a blog post that publishers feel like they have a binary choice when it comes to “AI theft”: wall their content off with systems introduced themselves, or leave the site open to scraping.

For publishers not averse to having their content used to train AI systems, adopting their own similar pay-to-play model “requires knowing the right individual and striking a one-off deal, which is an insurmountable challenge if you don’t have scale and leverage,” the post read.

Cloudflare says more than 1 million of its customers currently enable a ‘block AI’ feature.

 

“AI crawlers have been scraping content without limits,” Prince said in a media release. “Our goal is to put the power back in the hands of creators, while still helping AI companies innovate. This is about safeguarding the future of a free and vibrant Internet with a new model that works for everyone.”

Not everyone thinks the technology is a home run. Paul Hewett, CEO of In Marketing We Trust, told Mumbrella the move is “a solid step in the right direction”, but he isn’t completely sold.

“Centralised infra-level blocking is progress,” he said. “But, this feels like a quick tactical fix to buy time for a better, open protocol-based solution that can scale across the web. Both will likely need to coexist, in my opinion.”

Under the ‘pay per crawl’ model, each time an AI crawler requests content, it is required to ‘present payment intent’ through Cloudflare, which then aggregates the content scraped, charges the crawler’s operator (which has previously provided payment details), and distributes the earnings to the publisher.
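The request, charge and payout steps described above can be sketched as a small in-memory simulation. This is an illustration only, not Cloudflare’s implementation: the `Broker` class, the `max_price_cents` payment-intent field, the crawler and domain names, and the prices are all hypothetical, chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical intermediary standing in for Cloudflare's role."""
    # publisher domain -> flat price per request, in cents (assumed unit)
    prices: dict
    # crawlers that have payment details on file
    registered: set
    # (crawler, publisher) -> cents owed, aggregated across requests
    ledger: dict = field(default_factory=lambda: defaultdict(int))

    def request(self, crawler, publisher, max_price_cents=None):
        """Return True if the crawl is served, False if it is refused."""
        price = self.prices[publisher]
        if crawler not in self.registered:
            return False  # no payment details on file
        if max_price_cents is None or max_price_cents < price:
            return False  # no payment intent, or intent below the price
        self.ledger[(crawler, publisher)] += price
        return True

    def settle(self):
        """Total what each crawler is charged and each publisher earns."""
        charges = defaultdict(int)
        payouts = defaultdict(int)
        for (crawler, publisher), cents in self.ledger.items():
            charges[crawler] += cents
            payouts[publisher] += cents
        self.ledger.clear()
        return dict(charges), dict(payouts)


broker = Broker(prices={"news.example": 2}, registered={"bot-a"})
served = [
    broker.request("bot-a", "news.example", max_price_cents=5),
    broker.request("bot-a", "news.example", max_price_cents=5),
    broker.request("bot-a", "news.example"),  # no payment intent presented
    broker.request("bot-b", "news.example", max_price_cents=5),  # unregistered
]
charges, payouts = broker.settle()
print(served, charges, payouts)
```

In this sketch only the two requests that carry a sufficient payment intent from a registered crawler are served, and the amounts are settled in one batch rather than per request.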

Pay per crawl grants domain owners “full control over their monetisation strategy,” the company blog explains. “They can define a flat, per-request price across their entire site.” Different crawlers can be granted different levels of access, and publishers can waive fees if they choose.
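One way to picture these controls is a per-crawler rule table sitting on top of a single flat price. The sketch below is an assumption about how such rules might be modelled, not Cloudflare’s actual configuration: the `Access` options, the `decide` function and the crawler names are hypothetical, and the block-by-default fallback simply mirrors the default blocking described earlier.

```python
from enum import Enum


class Access(Enum):
    ALLOW = "allow"    # crawl for free (fee waived)
    CHARGE = "charge"  # crawl at the site's flat per-request price
    BLOCK = "block"    # no access


def decide(crawler, flat_price_cents, rules, default=Access.BLOCK):
    """Return (access, price_cents) for one request from `crawler`."""
    access = rules.get(crawler, default)
    if access is Access.CHARGE:
        return access, flat_price_cents
    # ALLOW and BLOCK both cost nothing: one is free, the other is refused.
    return access, 0


# Hypothetical publisher rules: one crawler waived, one charged.
rules = {"research-bot": Access.ALLOW, "search-bot": Access.CHARGE}
for bot in ("research-bot", "search-bot", "unknown-bot"):
    print(bot, decide(bot, flat_price_cents=2, rules=rules))
```

Keeping the price site-wide and varying only the access level per crawler matches the “flat, per-request price” the blog describes, while still letting a publisher waive the fee for a chosen crawler.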

Prince said the next step is a marketplace where “content creators and AI companies, large and small, can come together”, with the goal of working out a new metric when it comes to measuring the value of internet traffic.

“Imagine an AI engine like a block of Swiss cheese,” Prince wrote on the Cloudflare blog. “New, original content that fills one of the holes in the AI engine’s block of cheese is more valuable than repetitive, low-value content that unfortunately dominates much of the web today.

“We believe that if we can begin to score and value content not on how much traffic it generates, but on how much it furthers knowledge — measured by how much it fills the current holes in AI engines ‘Swiss cheese’ — we not only will help AI engines get better faster, but also potentially facilitate a new golden age of high-value content creation.”

Prince admitted “we don’t know all the answers yet” but is working with leading economists and computer scientists “to figure them out.”

 

The interface of Cloudflare’s new AI blocking tool
