Cloudflare blocks 416 billion AI scraping attempts, warns of threat to online publishing

Cloudflare reports blocking hundreds of billions of AI bot requests in five months and pushes for paid licensing to safeguard website content amid concerns over the impact of AI-driven data extraction on online revenue.

Cloudflare says it has blocked 416 billion attempts by AI bots to scrape website data over the past five months, a figure its co‑founder and chief executive Matthew Prince disclosed in public remarks this week. According to the original report, the company rolled out a one‑click tool in July to let site owners block AI crawlers by default, a move it describes as restoring control to publishers. ^[1]^[2]^[4]

Prince warned that unchecked scraping threatens the economics of online publishing, arguing that AI services which repurpose site content can siphon traffic and advertising revenue away from creators. “The business model of the internet has always been to generate content that drives traffic and then sell either things, subscriptions, or ads,” he said. ^[1]^[5]

Cloudflare frames its shift as part of a broader “Content Independence Day” effort launched on 1 July, making the protection available even to free‑tier customers so that the roughly 20% of the world’s websites that Cloudflare protects can opt out of unwanted data collection. Industry reporting says the default block addresses crawlers that ignore traditional web standards such as robots.txt. ^[2]^[3]^[4]

The company reports it has identified and stopped requests from numerous AI agents, naming firms including OpenAI and Anthropic among those whose crawlers were blocked. Cloudflare says the scale of the blocked volume, hundreds of billions of requests, illustrates how voracious AI training pipelines have become. ^[1]^[5]

Prince singled out Alphabet’s Google for criticism, accusing it of bundling search indexing with AI data collection in a way that pressures websites to permit scraping or risk falling in search rankings. He was quoted as saying “Google has become the villain in this story,” and urged that if Google wants to train AI on web content it should pay for it like other parties. ^[1]

Beyond blocking, Cloudflare is pursuing a licensing approach it describes as “Pay Per Crawl,” aiming to create a marketplace where publishers can negotiate compensated access for AI training. The company says early adopters have reported lower server loads and clearer negotiation pathways with AI vendors. ^[1]

Experts and reporters note trade‑offs: default blocking can protect creators and reduce unwanted load, but it may also fragment datasets used for research and services that rely on open crawls. Posts on X and commentary in the trade press reflect a mix of support for creator rights and concern about splintering the open web. ^[1]^[3]^[7]

Technical challenges remain: sophisticated scrapers can masquerade as human traffic, and detection is an arms race. Cloudflare says it uses machine learning to identify bad actors, but industry analysts warn the cat‑and‑mouse dynamic will continue as AI developers and infrastructure providers adapt. ^[1]^[5]

Cloudflare’s intervention has broader regulatory and market implications. Industry coverage suggests the move could accelerate calls for clearer rules around AI data use, and possibly antitrust scrutiny over blended search and AI crawling practices; some commentators argue separation or paid licensing may be necessary to level the playing field. ^[1]^[6]

📌 Reference Map:

^[1] (WebProNews) – Paragraph 1, Paragraph 2, Paragraph 4, Paragraph 5, Paragraph 6, Paragraph 8, Paragraph 9
^[2] (WIRED) – Paragraph 1, Paragraph 3
^[3] (WIRED) – Paragraph 3, Paragraph 7
^[4] (CNBC) – Paragraph 1, Paragraph 3
^[5] (Tom’s Hardware) – Paragraph 2, Paragraph 4, Paragraph 8
^[6] (WebProNews duplicate) – Paragraph 9
^[7] (WIRED duplicate) – Paragraph 7

Source: Noah Wire Services

Noah Fact Check Pro

The draft above was created using the information available at the time the story first emerged. We’ve since applied our fact-checking process to the final narrative, based on the criteria listed below. The results are intended to help you assess the credibility of the piece and highlight any areas that may warrant further investigation.

Freshness check

Score:
10

Notes:
The narrative is current, with the latest publication date being December 5, 2025. The earliest known publication date of substantially similar content is July 1, 2025, when Cloudflare announced its initiative to block AI crawlers by default. ([wired.com](https://www.wired.com/story/big-interview-event-matthew-prince-cloudflare/?utm_source=openai)) The report is based on a press release from Cloudflare, which typically warrants a high freshness score. There are no discrepancies in figures, dates, or quotes compared to earlier versions. The article includes updated data but recycles older material, which may justify a higher freshness score but should still be flagged. ([cloudflare.com](https://www.cloudflare.com/ru-ru/press/press-releases/2025/cloudflare-just-changed-how-ai-crawlers-scrape-the-internet-at-large/?utm_source=openai))

Quotes check

Score:
10

Notes:
The direct quotes from Cloudflare CEO Matthew Prince, such as “Google has become the villain in this story,” are consistent across multiple reputable sources, including WIRED and Tom’s Hardware. ([wired.com](https://www.wired.com/story/big-interview-event-matthew-prince-cloudflare/?utm_source=openai)) There are no variations in wording or discrepancies in the quotes.

Source reliability

Score:
8

Notes:
The narrative originates from WebProNews, a reputable organisation. However, it is a single-outlet narrative, which introduces some uncertainty. The report mentions Cloudflare’s CEO Matthew Prince, whose public presence and records are verifiable online.

Plausibility check

Score:
9

Notes:
The claim that Cloudflare has blocked 416 billion AI bot requests since July 1, 2025, is plausible and aligns with reports from other reputable outlets, including WIRED and Tom’s Hardware. ([wired.com](https://www.wired.com/story/big-interview-event-matthew-prince-cloudflare/?utm_source=openai)) The narrative lacks supporting detail from other reputable outlets, which is a concern. The language and tone are consistent with the region and topic. There is no excessive or off-topic detail unrelated to the claim. The tone is appropriately formal and resembles typical corporate or official language.

Overall assessment

Verdict (FAIL, OPEN, PASS): PASS

Confidence (LOW, MEDIUM, HIGH): HIGH

Summary:
The narrative is current and based on a press release from Cloudflare, which typically warrants a high freshness score. The quotes from Cloudflare’s CEO are consistent across multiple reputable sources. The source is reputable, though being a single-outlet narrative introduces some uncertainty. The claim is plausible and aligns with reports from other reputable outlets. The language and tone are appropriate, and there is no excessive or off-topic detail. However, the lack of supporting detail from other reputable outlets is a concern.