Zero Out of 500 Million: How AI Crawlers Actually Handle JavaScript
Vercel and MERJ tracked 500 million GPTBot requests and found zero JavaScript executions. Here are the three rendering failure modes making sites invisible to AI crawlers.
Vercel and MERJ logged more than 500 million GPTBot requests and found zero evidence of JavaScript execution. GPTBot downloads .js files in 11.5% of its requests and never runs them. ClaudeBot fetches JavaScript in 23.84% of its requests and executes it in 0%. The pattern holds across the major AI crawlers in the dataset. The exception is Googlebot, which renders pages with headless Chrome in a second phase after the initial fetch. With AI crawlers now accounting for roughly 28% of total crawl traffic to large-scale web properties, that rendering gap has direct consequences: if your content depends on JavaScript to appear in the DOM, AI crawlers do not see it.
Method
The primary data source is the Vercel and MERJ joint analysis of more than 500 million bot requests logged across Vercel's CDN. Resource-type headers in edge logs revealed which crawlers fetched JavaScript files and at what rate. Crawler volumes—GPTBot at 569 million requests per month, ClaudeBot at 370 million—come from the same dataset. The 620 KB median JavaScript payload figure is from the HTTP Archive Web Almanac 2024 JavaScript chapter. Failure mode taxonomy draws on documented bot behavior from 2025-2026 technical analyses.
Finding 1: Pure Client-Side Rendering — The Blank Page
React, Vue, and Angular applications without server-side rendering return an HTML shell on the first HTTP fetch. The shell contains a mount point and script tags referencing the application bundle. The actual page content appears only after JavaScript runs and renders it into the DOM. AI crawlers receive that shell, note the JavaScript references, download them in roughly 11% to 24% of requests depending on the crawler, never execute any of them, and index an empty document.
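One way to check for this failure mode is to measure how much readable text the raw HTML contains before any script runs. The sketch below uses Python's standard html.parser for that. The class name, the function name, and the 20-word threshold are illustrative assumptions, not values from the Vercel and MERJ analysis.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects the text a non-rendering fetcher could read from raw HTML."""

    # Assumption: text inside these tags is not counted as page content.
    SKIPPED_TAGS = {"script", "style", "noscript", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def looks_like_empty_shell(raw_html, min_words=20):
    """Return True when the static HTML carries fewer than min_words of text.

    min_words is an arbitrary threshold chosen for illustration.
    """
    parser = VisibleTextExtractor()
    parser.feed(raw_html)
    parser.close()
    word_count = sum(len(chunk.split()) for chunk in parser.chunks)
    return word_count < min_words


if __name__ == "__main__":
    shell = (
        '<!doctype html><html><head><title>Shop</title>'
        '<script src="/static/app.js"></script></head>'
        '<body><div id="root"></div></body></html>'
    )
    print(looks_like_empty_shell(shell))
```

A page that passes this check is not necessarily complete; it only means the first response is more than a mount point and script tags.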
The economics suggest why rendering is unlikely to be added soon. GPTBot makes 569 million requests per month. Adding headless browser execution at that scale multiplies compute costs by an estimated 10-20x. Beyond cost, rendering introduces unpredictable latency: a page that needs 3 seconds to render in a browser is too slow for a crawler working through millions of URLs per day. Crawlers impose one-to-five-second timeouts per page. The 620 KB median JavaScript payload (Web Almanac 2024) would consume most of that budget before the first component mounted.
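The scale argument can be made concrete with back-of-envelope arithmetic. The sketch below converts the monthly GPTBot volume into a per-second rate and estimates how many renders would be in flight at once if every page took the 3 seconds used as the example above. The 30-day month and the uniform request rate are simplifying assumptions, and the function names are illustrative.

```python
REQUESTS_PER_MONTH = 569_000_000        # GPTBot volume cited in this article
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # assumption: a 30-day month
RENDER_SECONDS_PER_PAGE = 3             # the article's 3-second example


def requests_per_second(requests_per_month=REQUESTS_PER_MONTH):
    """Average request rate, assuming traffic is spread evenly over the month."""
    return requests_per_month / SECONDS_PER_MONTH


def average_concurrent_renders(render_seconds=RENDER_SECONDS_PER_PAGE):
    """Arrival rate times time per page gives the average number in flight."""
    return requests_per_second() * render_seconds


if __name__ == "__main__":
    print(f"{requests_per_second():.0f} requests per second")
    print(f"{average_concurrent_renders():.0f} renders in flight on average")
```

Under those assumptions the rate works out to roughly 220 requests per second, with about 660 page renders open at any moment. Real crawl traffic is bursty, so peak figures would be higher.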
The fix is clear in principle: move content into the initial HTTP response via server-side rendering (SSR) or static site generation (SSG). The organizational challenge is harder—many SPAs were built before AI crawlers became a meaningful traffic source, and migrating rendering architecture is non-trivial for production applications at scale.
Finding 2: Scroll-Triggered Content — The Lazy-Load Trap
Lazy loading improves browser performance by deferring resource requests until a user scrolls near the content. For images, the loading="lazy" attribute is generally safe because the image URLs are already in the HTML. The problem is textual and structured content that loads only from scroll-position triggers or IntersectionObserver callbacks.
E-commerce product pages commonly load reviews, Q&A sections, and related products via API calls triggered by scroll events. Category pages often server-render the first grid of items but lazy-load subsequent pages behind infinite scroll. A site can implement SSR correctly for its hero section and still hide 60-80% of its semantic content below scroll triggers that AI crawlers will never fire.
AI crawlers issue one HTTP GET per URL and do not scroll. That single response is the complete picture. This failure mode is distinct from the CSR problem: a page can use SSR for its main content and still suffer from lazy-load content gaps when the engineering team optimized for browser performance without considering static-fetch behavior.
Finding 3: Interaction-Gated Content — The Tab and Modal Problem
Tabbed interfaces, accordions, and modal dialogs frequently hold a page's most structured content: product specifications, FAQs, plan comparisons, full terms. Some implementations include that content in the initial HTML and hide it with CSS (display: none), which AI crawlers can still parse. Others fetch the content from an API on tab click or accordion expand, so it does not exist in the DOM until the interaction fires.
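The difference between the two implementations shows up directly in the raw HTML. The sketch below extracts the text inside one element by its id, with no CSS or JavaScript involved, so a hidden panel still yields its text while an API-backed placeholder yields nothing. The id "specs", the data-endpoint attribute, and the sample markup are hypothetical, and the depth tracking assumes well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PanelText(HTMLParser):
    """Collects text nested inside the element whose id matches panel_id."""

    def __init__(self, panel_id):
        super().__init__()
        self.panel_id = panel_id
        self.depth = 0  # 0 means the parser is outside the panel
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.panel_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def panel_text(raw_html, panel_id):
    """Return the panel's text as a static parser sees it, ignoring CSS."""
    parser = PanelText(panel_id)
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    inline = ('<div id="specs" style="display:none">'
              '<ul><li>Weight: 240 g</li><li>Drop: 6 mm</li></ul></div>')
    deferred = '<div id="specs" data-endpoint="/api/specs"></div>'
    print(repr(panel_text(inline, "specs")))
    print(repr(panel_text(deferred, "specs")))
```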
Common patterns where content is gated by interaction: SaaS pricing pages where plan comparison tables appear on tab switch; documentation sites where code examples load on language selector change; support pages where FAQ answers are fetched only on accordion expand. In all these cases, the HTML AI crawlers receive contains none of the detailed information visible in a browser.
A critical distinction: real-time retrieval agents—AI assistants browsing on behalf of a user during an active session—behave more like a browser and may trigger some interactions. Training-data crawlers like GPTBot and ClaudeBot operate at batch scale and make one request per URL. A developer who tests their site by asking an AI assistant to summarize a page may see full content while GPTBot has never indexed any of it.
What This Means for Site Owners
All three failure modes share a common root: content delivery architecture optimized for browser users, with no consideration for static fetchers. The fix is the same in each case—move the content that matters into the initial HTTP response at the main URL, without requiring JavaScript execution, scroll position, or user interaction.
SSR solves failure mode 1. It does not automatically solve failure modes 2 and 3. A page that uses SSR for the hero section but loads reviews via scroll-triggered API calls still has the lazy-load gap. SSR describes the rendering environment; whether the actual content appears in the HTTP response depends on what the data-fetching layer does before the first byte is sent.
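To illustrate the point about the data-fetching layer, the sketch below renders the same product page two ways on the server. In one, the reviews are fetched before the response is built and written into the HTML. In the other, the server emits a placeholder for the browser to fill in later. Both count as server-side rendering, but only the first puts the reviews in front of a static fetcher. The fetch_reviews helper, the sample review, and the data-src attribute are hypothetical stand-ins for a real data layer.

```python
from html import escape


def fetch_reviews(product_id):
    """Hypothetical stand-in for a database query or internal API call."""
    return [{"author": "Dana", "body": "Great grip on wet rock."}]


def render_product_page(name, product_id, inline_reviews=True):
    """Build the HTML for a product page on the server.

    inline_reviews=True fetches reviews before the first byte is sent.
    inline_reviews=False leaves an empty container for client-side loading.
    """
    if inline_reviews:
        items = "".join(
            f"<li><strong>{escape(r['author'])}</strong>: {escape(r['body'])}</li>"
            for r in fetch_reviews(product_id)
        )
        reviews_html = f'<ul id="reviews">{items}</ul>'
    else:
        source = escape(f"/api/reviews/{product_id}")
        reviews_html = f'<div id="reviews" data-src="{source}"></div>'
    return f"<main><h1>{escape(name)}</h1>{reviews_html}</main>"


if __name__ == "__main__":
    print(render_product_page("Trail Shoes", 42))
    print(render_product_page("Trail Shoes", 42, inline_reviews=False))
```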
The fastest diagnostic is also the simplest: fetch your own URLs with curl, sending no cookies and executing no JavaScript, and compare the response to what a browser shows. The difference between those two views is what AI crawlers do not see. For most sites, that difference includes the highest-value structured content (reviews, FAQs, specifications, pricing details), which is exactly what AI search systems reference when forming recommendations.
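The same comparison can be scripted. The sketch below fetches a URL with Python's standard urllib, which sends no cookies unless a cookie handler is installed, and then lists the lines of browser-visible text that are absent from the static response. The function names and the User-Agent string are illustrative assumptions, and the browser text is assumed to be saved separately, for example by copying the rendered page text. The matching is a crude substring test, so a line that spans several inline elements may be reported as missing even when its words are present.

```python
import urllib.request
from html import unescape


def fetch_static_html(url, timeout=5.0):
    """Fetch a URL once, without cookies and without executing scripts."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "static-fetch-check/0.1"}  # placeholder UA
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def lines_missing_from_static(static_html, browser_text):
    """Return browser-visible lines that do not occur in the static HTML."""
    normalized = " ".join(unescape(static_html).split())
    missing = []
    for line in browser_text.splitlines():
        line = " ".join(line.split())
        if line and line not in normalized:
            missing.append(line)
    return missing


if __name__ == "__main__":
    # Offline demonstration with inline sample data instead of a live fetch.
    static = "<main><h1>Trail Shoes</h1><p>Lightweight trail runner.</p></main>"
    browser = "Trail Shoes\nLightweight trail runner.\nGreat grip on wet rock.\n"
    for line in lines_missing_from_static(static, browser):
        print("missing:", line)
```

In practice, fetch_static_html would supply the first argument for each URL under test.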
Prioritize moving content with high semantic density into the static response. Schema markup in JSON-LD, FAQ structured data, and product specifications are particularly high-value because they map directly to the kinds of questions AI search answers. Content that cannot be moved into the initial response should be documented as invisible to training-data crawlers, and any AI citation goals should be scoped accordingly.
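As one example of putting structured data into the static response, the sketch below builds a schema.org FAQPage block in JSON-LD, ready to be embedded in server-rendered HTML. The function name and the sample question and answer are placeholders. Escaping "</" inside the payload is a defensive choice so that answer text cannot close the script tag early.

```python
import json


def faq_jsonld(pairs):
    """Return a JSON-LD script tag for a list of (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # "\/" is a valid JSON escape for "/", so the payload still parses.
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


if __name__ == "__main__":
    # Placeholder content for illustration only.
    print(faq_jsonld([("Do you ship internationally?", "Yes, to 40 countries.")]))
```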