
Why AI crawlers don't run JavaScript (and what to do about it)

AI search crawlers fetch raw HTML and don't render JavaScript. Here's why, how to tell whether your site is invisible to them, and the fastest fixes for React, Vue, and other SPA sites.

If your site is a JavaScript-rendered SPA, you may be invisible to ChatGPT, Claude, and Perplexity — even if Google indexes you fine.

This is the most common, and most expensive, AEO failure we see. Here’s why it happens and how to fix it.

The short version

When ChatGPT or Claude wants to cite a page, the underlying crawler does this:

GET https://yoursite.com/page
User-Agent: ChatGPT-User/1.0

It reads the response body. If the body is essentially <div id="root"></div> and a script tag, the crawler gives up. It does not start a Chromium instance, wait for hydration, run your bundle, or evaluate useEffect.
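
For concreteness, the fetch above amounts to roughly this (a sketch; the real crawlers aren't open source, so treat it as illustrative):

const res = await fetch('https://yoursite.com/page', {
  headers: { 'User-Agent': 'ChatGPT-User/1.0' },
})
const html = await res.text()
// `html` is everything the model will ever see for this URL.
// No browser, no script execution, no hydration, no DOM.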

That’s the whole story.

Why they don’t render

Three reasons, in order of importance:

1. Cost

A modern Chromium-based renderer (the kind Google uses for Googlebot) takes 0.5–3 seconds of compute per page and uses 200–500 MB of RAM. At Google scale this is justifiable because the index is reused trillions of times. At AI-search scale — where the model fetches pages live, per query — it is not. Doing a Chromium render on every fetch would multiply latency by 10–20× and infrastructure cost by a similar factor.

2. Determinism

LLMs need stable, deterministic input. JavaScript can return different content on every render (timestamps, A/B tests, geo-detected content, ads, animations), so the same URL fetched at different times can produce a wildly different DOM. AI crawlers prefer the raw HTML because it's a single fixed string they can hash, cache, and re-use.

3. Latency budget

When ChatGPT browses to answer a question, the user is waiting. The model has a budget of a few seconds total to fetch 1–5 sources, summarize them, and respond. A 2-second render budget per page is unworkable.

How to tell if you’re affected

Three quick tests:

Test 1: View source

Right-click → View Page Source on your most important pages. If the text content of the page (the actual paragraphs, not the navigation) is not in the source, you have a problem.

If the source is mostly:

<body>
  <div id="root"></div>
  <script src="/static/js/main.js"></script>
</body>

…that’s a 100% AI-invisible page.

Test 2: curl it

curl -s -A "ChatGPT-User/1.0" https://yoursite.com/page | grep -i "your headline text"

If grep returns nothing, the headline isn’t in the raw HTML.
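
To test several AI user-agents in one pass, a small Node script (Node 18+ ships fetch) does the same check; the user-agent strings below are representative stand-ins, not the bots' exact strings:

// Run as an ES module (e.g. check.mjs) so top-level await works.
const url = 'https://yoursite.com/page'
const needle = 'your headline text'
const agents = ['ChatGPT-User/1.0', 'OAI-SearchBot/1.0', 'Claude-User/1.0', 'PerplexityBot/1.0']

for (const ua of agents) {
  const res = await fetch(url, { headers: { 'User-Agent': ua } })
  const html = await res.text()
  console.log(ua, res.status, html.includes(needle) ? 'headline found' : 'headline MISSING')
}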

Test 3: Run our auditor

The AEO Site Checker runs a check called ssr_content that takes the raw HTML response and extracts the readable text from it, the same way an LLM pipeline would. If the page passes Readability extraction with more than 200 words of clean content, you get the full 8 points. If the body is mostly empty, you fail.
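
A rough approximation of that check, using jsdom and @mozilla/readability (illustrative only; this is not the checker's actual code):

import { JSDOM } from 'jsdom'
import { Readability } from '@mozilla/readability'

const pageUrl = 'https://yoursite.com/page'
const res = await fetch(pageUrl, { headers: { 'User-Agent': 'ChatGPT-User/1.0' } })
const html = await res.text()

// jsdom parses the HTML but does not execute scripts by default,
// so this sees the page the way a non-rendering crawler does.
const dom = new JSDOM(html, { url: pageUrl })
const article = new Readability(dom.window.document).parse()
const words = article?.textContent.trim().split(/\s+/).length ?? 0

console.log(words > 200 ? `pass (${words} words)` : `fail (${words} words)`)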

The fixes, ranked by effort

Easiest: switch your hosting to SSR

If you’re on Vercel, Netlify, or Cloudflare Pages and using Next.js, Nuxt, Remix, Astro, or SvelteKit — flip the rendering mode. In Next.js 15:

// app/blog/[slug]/page.tsx
export const dynamic = 'force-static'  // SSG, fully pre-rendered

// or
export const dynamic = 'force-dynamic' // SSR per request

In Astro:

// astro.config.mjs
import { defineConfig } from 'astro/config';
import node from '@astrojs/node';

export default defineConfig({
  output: 'server', // or 'static'
  adapter: node({ mode: 'standalone' }),
});

In Nuxt 3:

// nuxt.config.ts
export default defineNuxtConfig({
  ssr: true, // default; just confirm it's not false
});

This is usually a 15-minute change. After it’s deployed, your raw HTML will contain the actual content.

Medium: add static prerendering

If you can’t run a Node server in production but you can build static HTML, use SSG (static-site generation). All the modern frameworks support a “build once, serve flat HTML” mode. The output is HTML files on disk that are served by any static host.
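
One concrete example, for Next.js (every modern framework has an equivalent switch): setting output: 'export' makes the build write flat HTML files to an out/ directory that any static host can serve.

// next.config.js
module.exports = {
  output: 'export', // emit static HTML at build time instead of running a server
}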

Pros: the AI crawler sees real HTML. Cons: you have to rebuild on content changes.

For a marketing site or blog, SSG is the right answer. For a dashboard or app, SSR.

Harder: prerender with a service

If you can't change your stack, services like Prerender.io (or a self-hosted renderer such as Rendertron) sit in front of your origin and serve pre-rendered HTML to bots based on User-Agent. You give them your URLs, they hit your site with a headless browser, cache the rendered HTML, and serve it to crawlers.
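
The routing usually happens at the edge. Here is a sketch of the pattern, written as a Cloudflare-Worker-style function (the prerender endpoint URL is hypothetical; real services document their own integration):

// Route known AI-bot user-agents to a pre-rendered copy; pass everyone else through.
const BOT_UA = /GPTBot|OAI-SearchBot|ChatGPT-User|ClaudeBot|Claude-User|PerplexityBot/i

export default {
  async fetch(request: Request): Promise<Response> {
    const ua = request.headers.get('user-agent') ?? ''
    if (BOT_UA.test(ua)) {
      const target = 'https://prerender.example.com/render?url=' + encodeURIComponent(request.url)
      return fetch(target) // cached, pre-rendered HTML for the bot
    }
    return fetch(request) // normal SPA response from the origin for everyone else
  },
}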

Tradeoffs:

  • An extra hop and an extra service to monitor.
  • The cached HTML can go stale.
  • You need to update the User-Agent allowlist as new AI bots emerge.

But this works without touching the app code — useful as a stopgap.

Hardest: full migration

If your app is tightly coupled to client-side rendering (think: legacy AngularJS or a Backbone SPA), the right long-term answer is to migrate to a framework that supports SSR. This is a project, not a fix. Start by listing the top 20 URLs you actually want AI engines to cite, and migrate just those routes.

The hybrid pattern that works

Most modern sites end up using a hybrid rendering approach:

  • Marketing pages (/, /blog/*, /pricing, /about) — server-rendered or static. AI engines see them.
  • App pages (/dashboard, /settings) — client-rendered. AI engines don’t need to see them, and you don’t want them indexed anyway.
  • Robots.txt blocks the app routes:
User-agent: *
Disallow: /dashboard/
Disallow: /settings/
Disallow: /api/

This is the right architecture. The marketing surface is AEO-optimized; the app is JS-heavy and explicitly excluded.
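
If the marketing surface is on Next.js, one way to emit exactly that robots.txt is an app/robots.ts route (a sketch, assuming the App Router; adjust the paths to your own app routes):

// app/robots.ts
import type { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: [
      {
        userAgent: '*',
        disallow: ['/dashboard/', '/settings/', '/api/'],
      },
    ],
  }
}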

Common counter-arguments

“But Google renders JS now.”

True. Googlebot uses a headless Chromium and will render most pages eventually. But:

  1. Even Google has a multi-pass model — it fetches the raw HTML first, queues it for rendering, and may not get to rendering for hours or days. AI search has no such luxury.
  2. AI engines are not Google. The crawlers we audit against (OAI-SearchBot, Claude-User, PerplexityBot) all fetch raw HTML by default.

“My SPA hydrates in 200ms, surely the crawler can wait.”

It cannot, because it doesn’t try. The crawler isn’t a browser. It’s an HTTP client that reads the response body. There’s no event loop, no JS engine, no DOM.

“I’ll just make a separate API for crawlers.”

In part, this is what llms.txt is for: a curated, plain-text index of your content that LLMs can fetch directly. Combined with SSR or SSG for your top pages, that covers it.

How this affects your AEO score

In our rubric, ssr_content is worth 8 points out of 100. If your raw HTML doesn’t have at least 200 words of clean text content (per Mozilla Readability extraction), you fail this check.

Combined with fetch_direct (18 points) and robots_ai_bots (10 points), the Fetchability category is 43 points — nearly half the total. A pure-SPA site can score above 50 on AEO only by being unusually good elsewhere.



Ready to score your site? Run an audit →