
Drudge Retort: The Other Side of the News
Friday, July 25, 2025

Cloudflare, one of the internet's most important infrastructure companies, now blocks AI crawlers by default for all new customers.


The age of the AI scraping free-for-all may be coming to an end--at least if Cloudflare gets its way. The IT infrastructure firm has switched to blocking AI crawlers by default for its customers and is moving forward with a Pay Per Crawl program.


-- WIRED (@wired.com) Jul 6, 2025 at 7:56 PM

Comments

Admin's note: Participants in this discussion must follow the site's moderation policy. Profanity will be filtered. Abusive conduct is not allowed.

More: Alongside that change, Cloudflare has launched PayPerCrawl, a new marketplace that allows website owners to charge AI companies per page crawled. If you're running a blog, a digital magazine, a startup product page, or even a knowledge base, you now have the option to set a price for access. AI bots must identify themselves, send payment, and only then can they index your content.

The core issue behind this shift is how AI models are trained. Large language models like OpenAI's GPT or Anthropic's Claude rely on huge amounts of data from the open web. They scrape everything, including articles, FAQs, social posts, documentation, even Reddit threads, to get smarter. But while they benefit, the content creators see none of that upside.

Unlike traditional search engines that drive traffic back to the sites they crawl, generative AI tends to provide full answers directly to users, cutting creators out of the loop.

According to Cloudflare, the data is telling: OpenAI's crawl-to-referral ratio is around 1,700 to 1. Anthropic's is 73,000 to 1. Compare that to Google, which averages about 14 crawls per referral, and the imbalance becomes clear.
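To make the imbalance concrete, here is a rough sketch that turns the ratios quoted above into referrals per million crawls. The figures are the ones cited above; the function name and the per-million framing are just illustrative assumptions.

```python
# Crawl-to-referral ratios as quoted above (crawls per one referral).
CRAWLS_PER_REFERRAL = {
    "OpenAI": 1_700,
    "Anthropic": 73_000,
    "Google": 14,
}


def referrals_per_million_crawls(crawls_per_referral: float) -> float:
    """Hypothetical helper: referrals a site could expect per 1,000,000 crawls."""
    return 1_000_000 / crawls_per_referral


for name, ratio in CRAWLS_PER_REFERRAL.items():
    per_million = referrals_per_million_crawls(ratio)
    print(f"{name}: about {per_million:,.0f} referrals per million crawls")
```

At these ratios, a million Google crawls would send tens of thousands of visitors back, while a million crawls from the AI bots would send a few hundred or fewer.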

#1 | Posted by qcp at 2025-07-24 08:50 AM | Reply

Part of the problem, as I see it on my web servers, is the load that the AI Bots put on the server, and also that most of the AI bots do not respect the robots.txt file.

I do note, however, that on my servers, the OpenAI bot is very well behaved, both in the minimal load it places on the server, and that it respects the robots.txt file.
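For anyone wondering what "respecting robots.txt" looks like from the bot's side, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the example.com URL, and the "SomeOtherBot" name are made up for illustration and are not taken from any real server; GPTBot is the user-agent token OpenAI documents for its crawler.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt (an assumption, not a real site's file):
# block OpenAI's crawler everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/some-post"
    for agent in ("GPTBot", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, url) else "blocked"
        print(f"{agent}: {verdict}")
```

A well-behaved crawler runs a check like this before every fetch. Nothing enforces it, though: robots.txt is a request, so a bot that skips the check can still hit the server.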

#2 | Posted by LampLighter at 2025-07-25 02:24 PM | Reply

@#1 ... Unlike traditional search engines that drive traffic back to the sites they crawl, generative AI tends to provide full answers directly to users, cutting creators out of the loop. ...

I've seen concerns that the AI answers Google puts at the top of most search results are disliked by the crawled websites because there is no link back to the site that supplied the answer. Google seems to be keeping the ad revenue all for itself.

#3 | Posted by LampLighter at 2025-07-25 02:26 PM | Reply

Something like this was/is inevitable.

#4 | Posted by TaoWarrior at 2025-07-25 02:33 PM | Reply

@#3 ... I've seen concerns that the AI answers Google puts at the top of most search results are disliked by the crawled websites because there is no link back to the site that supplied the answer. ...

One example ...

Google users are less likely to click on links when an AI summary appears in the results
www.pewresearch.org

... Last year, Google introduced "AI Overviews," a feature that displays an artificial intelligence-generated result summary at the top of many Google search pages. This feature is available to millions of U.S. Google users. Online publishers recently have attributed declining web traffic to these summaries replacing traditional search results, claiming that many users are relying on the summaries instead of following links to the publishers' websites.

A Pew Research Center report published this spring analyzed data from 900 U.S. adults who agreed to share their online browsing activity. About six-in-ten respondents (58%) conducted at least one Google search in March 2025 that produced an AI-generated summary. Additional analysis found that Google users were less likely to click on result links when visiting search pages with an AI summary compared with those without one. For searches that resulted in an AI-generated summary, users very rarely clicked on the sources cited. ...


#5 | Posted by LampLighter at 2025-07-25 02:48 PM | Reply
