Bot Traffic · June 15, 2026

The AI Crawler Leaderboard Changed Twice in 60 Days: Reading the Churn

GPTBot led, then ClaudeBot overtook it in April, then Bytespider nearly tripled by May. Monthly swings of this magnitude make single-bot optimization strategies obsolete before you ship them.

Three AI crawlers traded positions at the top of the verified bot traffic rankings twice in 60 days. That is not sampling noise — at edge-network scale, single percentage-point shifts represent hundreds of millions of additional requests per month, and the pattern signals that the AI crawler market has not yet settled into a stable competitive hierarchy.

Method

The crawler-share figures below come from network-level HTTP request monitoring across a major CDN edge network, aggregated monthly and cross-referenced with ASN ownership to verify bot identity beyond user-agent (UA) strings. Blocking-rule figures come from a cross-network sample of robots.txt files collected and analyzed in Q1 2026. Per-crawler share percentages are shares of verified AI crawler HTTP requests, not of total web traffic; the Googlebot figure and the totals in Finding 3 are shares of all verified bot traffic.
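To illustrate why verification has to go beyond the UA string, the sketch below accepts a request as a verified crawler only when the claimed bot name and the source address agree. It substitutes a static CIDR allowlist for the ASN cross-reference described above, so it is a simplification rather than the method used for this data. The ranges are reserved documentation blocks, not real crawler addresses, and PUBLISHED_RANGES and is_verified_crawler are hypothetical names.

```python
import ipaddress

# ASSUMPTION: placeholder ranges taken from the RFC 5737 documentation blocks.
# A real deployment would load each operator's published ranges or ASN data.
PUBLISHED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "ClaudeBot": ["198.51.100.0/24"],
}


def is_verified_crawler(user_agent: str, remote_ip: str) -> bool:
    """Accept only if the UA names a known bot and the IP is in that bot's ranges."""
    ip = ipaddress.ip_address(remote_ip)
    ua = user_agent.lower()
    for bot, cidrs in PUBLISHED_RANGES.items():
        if bot.lower() in ua:
            return any(ip in ipaddress.ip_network(cidr) for cidr in cidrs)
    return False  # unknown UA: treat as unverified


# Same UA string, two different source addresses.
print(is_verified_crawler("Mozilla/5.0 (compatible; GPTBot/1.0)", "192.0.2.10"))
print(is_verified_crawler("Mozilla/5.0 (compatible; GPTBot/1.0)", "203.0.113.5"))
```

A request that merely claims to be GPTBot from an address outside the listed range is rejected, which is the case a UA-only check would miss.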

Finding 1: ClaudeBot Led for One Month, Then Fell Back

Top AI Crawler Share: April vs May 2026
Share of verified AI crawler HTTP requests; Googlebot held 48% of all verified bot traffic throughout

April 2026 was the first month ClaudeBot finished ahead of GPTBot in raw share of AI crawler HTTP requests: 11.69% versus 9.84%. By May, GPTBot had reclaimed the lead at 11.48%, with ClaudeBot at 9.73%. Neither shift correlates with a published policy change from either platform. Both operate continuous crawl cycles with multi-week return periods per URL, and ranking differences at this timescale more likely reflect queue depth decisions and freshness-weighting strategies than any deliberate change in crawl coverage.

The implication for site owners is that priority-ordered bot rules — configurations where the first matching rule wins — can produce misaligned behavior when the bot mix shifts. A pre-rendering or content-enrichment rule optimized for the current leader may be deprioritizing the bot that will lead next month. Bot-agnostic enrichment strategies, where any verified AI crawler receives semantically complete content, degrade less quickly as rankings rotate.
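The difference between the two approaches can be sketched in a few lines. The rule shape, the action names ("prerender", "passthrough"), and both function names are assumptions made for illustration; they do not correspond to any particular proxy's configuration format.

```python
# ASSUMPTION: illustrative rule shape and action names, not a real proxy config.
PRIORITY_RULES = [
    ("GPTBot", "prerender"),   # tuned for last month's leader
    ("*", "passthrough"),      # everything else falls through to the default
]

AI_CRAWLERS = {"GPTBot", "ClaudeBot", "Bytespider"}


def first_match_action(bot_name: str) -> str:
    """Return the action of the first rule whose pattern matches the bot."""
    for pattern, action in PRIORITY_RULES:
        if pattern in ("*", bot_name):
            return action
    return "passthrough"


def bot_agnostic_action(bot_name: str, verified: bool) -> str:
    """Serve enriched content to any verified AI crawler, whatever its rank."""
    return "prerender" if verified and bot_name in AI_CRAWLERS else "passthrough"


for bot in ("GPTBot", "ClaudeBot", "Bytespider"):
    print(bot, first_match_action(bot), bot_agnostic_action(bot, verified=True))
```

Under the first-match rules only GPTBot gets the enriched response, so a month in which ClaudeBot or Bytespider leads is served by the default path. The bot-agnostic version gives the same result for all three regardless of ranking.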

Finding 2: Bytespider Nearly Tripled in Two Months

Bytespider Crawl Share Growth (Mar-May 2026)
Monthly share of verified AI crawler HTTP requests for Bytespider

Bytespider, ByteDance's web crawler, grew from approximately 3.55% of AI crawler share in March 2026 to 5.73% in April and 10.25% in May, a roughly 2.9x increase in two months. That trajectory makes it the fastest-growing crawler on the list during this period, and it moved from a distant fourth into the top three: its May share sits between GPTBot's 11.48% and ClaudeBot's 9.73%. Research on its request patterns puts its per-domain crawl rate at roughly 25 times GPTBot's rate, meaning it requests more pages, more often, from each site it crawls.
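The "nearly tripled" figure follows directly from the three monthly shares quoted above. A small sketch of the arithmetic, with hypothetical helper names:

```python
# Monthly Bytespider share of verified AI crawler requests, from the text above.
BYTESPIDER_SHARE = {"2026-03": 3.55, "2026-04": 5.73, "2026-05": 10.25}


def month_over_month(series: dict) -> dict:
    """Ratio of each month's share to the previous month's share."""
    months = sorted(series)
    return {cur: series[cur] / series[prev] for prev, cur in zip(months, months[1:])}


def overall_multiple(series: dict) -> float:
    """Ratio of the last month's share to the first month's share."""
    months = sorted(series)
    return series[months[-1]] / series[months[0]]


for month, ratio in month_over_month(BYTESPIDER_SHARE).items():
    print(f"{month}: {ratio:.2f}x the previous month")
print(f"March to May: {overall_multiple(BYTESPIDER_SHARE):.2f}x")
```

The March-to-May multiple comes out at about 2.9x, and the April-to-May step is larger than the March-to-April step, so growth accelerated over the period rather than flattening.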

Unlike real-time retrieval agents that fetch pages live during user queries, Bytespider is a training-data crawler: it accumulates content for future model training rather than surfacing it in immediate AI search results. The growth rate suggests a deliberate corpus-expansion effort. Sites with substantial international readership in ByteDance's primary markets are likely already seeing elevated Bytespider traffic in their access logs.

Finding 3: AI Bots Represent Over a Quarter of All Verified Bot Traffic

In May 2026, AI crawlers accounted for 20.3% of verified bot HTTP requests on edge networks. Adding AI-search bots (the real-time retrieval agents that query pages during live user sessions) brings that figure to approximately 26.7% of verified bot traffic, which implies roughly 6.4 percentage points from AI-search bots alone. Googlebot retained its dominant 48% share, but that comparison flattens the trajectory: AI crawler share has grown substantially year-over-year while Googlebot's share has been roughly stable.

The infrastructure implication is that AI crawlers now represent a significant enough fraction of bot traffic to affect capacity planning, rate-limiting configurations, and cache-warming jobs calibrated only to traditional search crawler behavior. AI crawler request patterns also differ structurally: they tend to arrive in bursts rather than the more regular cadence of search engine crawlers, and they often need fully rendered HTML to extract meaningful semantic content, adding load per request compared to lightweight HTML-only fetches.
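One common way to accommodate bursty arrivals without letting them exhaust capacity is a token bucket, which permits a short burst and then throttles to a steady rate. The sketch below is a generic illustration of that technique, not a configuration drawn from the data in this article; the class name, the parameters, and the example values are assumptions.

```python
class TokenBucket:
    """Allow short bursts up to `burst` requests, refilling at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last_seen = None  # timestamp of the previous request, if any

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        if self.last_seen is not None:
            elapsed = max(0.0, now - self.last_seen)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_seen = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# ASSUMPTION: illustrative limits; five requests arrive in the same instant.
bucket = TokenBucket(rate_per_sec=1.0, burst=3)
burst_results = [bucket.allow(now=0.0) for _ in range(5)]
later_result = bucket.allow(now=2.0)
print(burst_results, later_result)
```

Timestamps are passed in explicitly so the behavior is deterministic; a production limiter would read a monotonic clock and keep one bucket per verified crawler.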

Finding 4: The Most-Blocked Crawler Is Not the Fastest-Growing

AI Crawler DISALLOW Rate in robots.txt (Q1 2026)
Percentage of sampled robots.txt files with a DISALLOW rule for each bot

As of Q1 2026, 5.52% of sampled robots.txt files contained a DISALLOW rule for GPTBot, making it the most-blocked AI crawler. CCBot followed at 5.08%, ClaudeBot at 4.88%, Google-Extended at 4.44%, and Bytespider at 4.23%. The ordering tracks public awareness and news coverage more closely than crawl volume or commercial risk.

Bytespider has the lowest blocking rate of the five despite showing the fastest crawl growth. This gap between a low block rate and rapidly growing crawl volume means most sites are currently giving Bytespider full access by default, without having made a deliberate decision about whether that access is desirable.

What This Means for Site Owners

The monthly ranking churn is an argument for reviewing bot rules on a 30-day cadence rather than setting them once after a major deployment. The bots generating the most requests today may not be the same ones doing so in 60 days, and configurations that depend on an assumed stable ranking order — particularly those with first-match-wins priority logic — will produce increasingly misaligned results over time.
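A monthly review can start from something as simple as recomputing the AI crawler mix from your own access logs. The sketch below is a minimal version of that check, assuming you have already extracted the User-Agent field from each log line. It matches on UA tokens only, so it counts unverified traffic too, and the function name and sample data are hypothetical.

```python
from collections import Counter

# Bot tokens named in this article; extend the list as the mix changes.
AI_BOT_TOKENS = [
    "GPTBot", "ClaudeBot", "Bytespider",
    "ChatGPT-User", "OAI-SearchBot", "PerplexityBot",
]


def bot_share(user_agents) -> dict:
    """Percentage of AI-bot requests per bot, ordered from largest to smallest."""
    counts = Counter()
    for ua in user_agents:
        lowered = ua.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # count each request once
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bot: 100.0 * n / total for bot, n in counts.most_common()}


# ASSUMPTION: made-up sample; real input would come from a month of logs.
sample = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (compatible; ClaudeBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]
for bot, pct in bot_share(sample).items():
    print(f"{bot}: {pct:.1f}%")
```

Comparing this output month to month shows whether the bot your rules were tuned for is still the one generating the most requests.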

The most practically useful distinction right now is between training crawlers and retrieval bots. Retrieval bots — such as ChatGPT-User, OAI-SearchBot, and PerplexityBot — fetch pages live during user queries, and the content they retrieve directly influences whether your site appears in AI assistant answers. Blocking them removes your content from live AI search results. Training crawlers — including GPTBot, ClaudeBot, and Bytespider — build datasets for future model versions. Allowing them does not produce referral traffic in return, and the downstream benefit is indirect and delayed.

Most site owners who have reviewed their AI crawler policies deliberately are allowing retrieval bots while applying separate access controls to training crawlers, particularly Bytespider. That is a legitimate and increasingly common configuration. The data above suggests many sites have not yet made this distinction explicit in their robots.txt or in proxy-layer access rules. Given that Bytespider is now among the top three AI crawlers by volume and carries the lowest block rate among the five most-discussed bots, the gap between policy intent and actual configuration is likely widening.
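One way to make that distinction explicit is shown below: a robots.txt that disallows the three training crawlers named above and leaves everything else, including the retrieval bots, allowed. This is a sketch of one possible policy, not a recommendation, and the choice of which bots to block is an assumption made for the example. The policy is checked with Python's standard urllib.robotparser; example.com is a placeholder URL. Note that robots.txt is advisory, so enforcement against non-compliant clients still needs proxy-layer rules.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: example policy only. Training crawlers blocked, all others allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/article"
for bot in ("GPTBot", "ClaudeBot", "Bytespider",
            "ChatGPT-User", "OAI-SearchBot", "PerplexityBot"):
    print(f"{bot}: {'allowed' if parser.can_fetch(bot, url) else 'blocked'}")
```

Running the check after every robots.txt change is a cheap way to confirm that the file expresses the policy you intended.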

Sources

  1. TechnologyChecker: AI Bot Traffic Share Data May 2026
  2. robots.txt AI Crawler Blocking Analysis Q1 2026
  3. ChatGPT Now Crawls 3.6x More Than Googlebot: What 24M Requests Reveal
  4. AI Crawlers Explained: GPTBot, ClaudeBot, PerplexityBot 2026