AI Crawler Access Checker

AI Crawler Access Checker for Ecommerce Product Pages

Check whether Googlebot, GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, and other crawlers can discover, fetch, index, and trust your product pages across robots.txt, noindex, sitemaps, canonicals, and rendered output.

Googlebot and AI bot rules

Meta robots and X-Robots-Tag

Sitemap, canonical, and hreflang

Rendered product facts for crawlers

No login required · Free scan · Instant online report

Audit Coverage

What This Tool Checks

Robots.txt Rules

Review whether important product URLs or crawler groups are blocked by robots.txt, including Googlebot, Google-Extended, GPTBot, OAI-SearchBot, ClaudeBot, Claude-SearchBot, and PerplexityBot patterns.

Indexing Directives

Check meta robots and X-Robots-Tag directives that can prevent indexing, link following, or preview generation even when robots.txt allows the fetch.

Discovery Signals

Look at sitemap, canonical, hreflang, and internal-link signals so crawlers can find the preferred product URL instead of a collection, market, or obsolete variant URL.

Rendered Access

Identify pages where JavaScript, redirects, password gates, bot protection, geofencing, or app failures hide product facts from crawler-visible rendered output.

Crawler Access Map

Access signals this checker separates

Crawler access is not one switch. A page can be allowed in robots.txt, blocked by noindex, missing from sitemaps, canonicalized away, or readable to browsers but incomplete for crawlers. This checker separates each layer so you can see which gate is creating risk.

robots.txt and AI bot rules

robots.txt controls whether specific user agents are allowed to fetch a URL. It is useful for crawl control, but it is not the same as indexing, ranking, or AI visibility.

Example signals

User-agent: OAI-SearchBot / GPTBot / ClaudeBot / Claude-SearchBot / PerplexityBot / Google-Extended; Disallow: /products/

What to verify

Product URLs are not accidentally blocked by broad Disallow rules.
Search, shopping, and AI-specific user agents are handled intentionally rather than through copy-pasted rules.
The sitemap location is exposed when robots.txt is used as a discovery hint.

Meta robots and X-Robots-Tag

A crawler can fetch a page and still be told not to index it. Meta robots and HTTP X-Robots-Tag headers often explain why a product URL is discoverable but absent from search.

Example signals

meta robots: noindex, nofollow; X-Robots-Tag: noindex

What to verify

Product pages do not inherit noindex from staging, filters, or app templates.
Headers and HTML directives do not conflict with each other.
Indexing directives match the page's canonical and sitemap status.

Sitemap, canonical, and market discovery

Sitemaps and canonicals tell crawlers which product URL should represent the item. Drift here can make AI and search systems collect the wrong variant, market, or collection context.

Example signals

sitemap.xml URL + canonical product URL + hreflang market equivalents

What to verify

The preferred product URL appears in sitemap.xml.
Canonical points to the product page, not a filtered collection or obsolete variant.
Localized market URLs use consistent canonical and hreflang relationships.

Rendered product facts

AI and search crawlers need the final product facts, not just an empty shell. Rendering failures can make schema, price, availability, or product attributes invisible.

Example signals

Rendered HTML: Product schema, price, availability, attributes, reviews, return and shipping context

What to verify

Core facts appear without requiring login, cart state, or user interaction.
JavaScript and app widgets do not delay critical schema or offer data beyond crawler collection.
Bot protection, redirects, and geofencing do not serve a thin or blocked page to crawlers.

Common Blockers

Issues Worth Fixing First

Blocked AI crawler groups

Rules may disallow GPTBot, OAI-SearchBot, ClaudeBot, Claude-SearchBot, and PerplexityBot, or opt out of Google's AI use through the Google-Extended token. Broad bot patterns can also block product pages by accident.

Noindex on product URLs

A product page can be fetchable but still excluded by meta robots or X-Robots-Tag noindex directives.

Sitemap or canonical drift

Crawlers may discover one URL while canonical, hreflang, or sitemap entries point to a different market, variant, or collection URL.

Rendered content hidden from bots

Delayed JavaScript, app widgets, redirects, or bot protection can keep price, availability, schema, or attributes out of the crawler-visible page.

Workflow

From URL to Fix Plan

Paste a live product URL

Use the public product page you want search engines and AI systems to discover, not a preview or admin URL.

Separate each access layer

ShopGox checks robots.txt, page directives, sitemap and canonical signals, and the rendered product output together.

Fix the blocking gate first

Use the report to decide whether the fix belongs in robots.txt, theme templates, headers, sitemap settings, app rules, or platform configuration.
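
To make the "blocking gate first" idea concrete, here is a small sketch that walks the access layers in order and returns the first one that fails, along with where the fix most likely lives. The `AccessSignals` fields, the gate order, and the fix locations are assumptions made for illustration; they are not the report format ShopGox produces.

```python
from dataclasses import dataclass


@dataclass
class AccessSignals:
    """Hypothetical per-URL results, one boolean per access layer."""
    robots_allowed: bool
    noindex: bool
    in_sitemap: bool
    self_canonical: bool
    facts_rendered: bool


# (gate name, check that must pass, where the fix usually belongs)
# The order is an assumption: a fetch block hides every later signal,
# so it is checked first.
GATES = [
    ("robots.txt", lambda s: s.robots_allowed, "robots.txt"),
    ("indexing directives", lambda s: not s.noindex, "theme templates or headers"),
    ("canonical", lambda s: s.self_canonical, "theme templates or app rules"),
    ("sitemap", lambda s: s.in_sitemap, "sitemap settings"),
    ("rendered output", lambda s: s.facts_rendered, "app rules or platform configuration"),
]


def first_blocking_gate(signals):
    """Return (gate, fix_location) for the first failing layer, or None."""
    for name, passes, fix_location in GATES:
        if not passes(signals):
            return name, fix_location
    return None


signals = AccessSignals(
    robots_allowed=True,
    noindex=True,
    in_sitemap=False,
    self_canonical=True,
    facts_rendered=True,
)
print(first_blocking_gate(signals))
```

In this sample both the noindex directive and the missing sitemap entry are problems, but the function surfaces the noindex first because a sitemap entry cannot help a page that is told not to be indexed.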

FAQ

Questions Before You Scan

Does allowing an AI crawler guarantee AI visibility?

No. Access only means a crawler is not blocked from fetching the page. AI visibility still depends on product data quality, authority, freshness, merchant data sources, and whether the AI system chooses to use the page.

Is robots.txt the same as noindex?

No. robots.txt controls crawling, while noindex controls whether a fetched page can be indexed. A page can be allowed by robots.txt but excluded by meta robots or X-Robots-Tag.

Should I block GPTBot, ClaudeBot, or PerplexityBot?

That is a business decision. Blocking may reduce certain AI training or retrieval access, but it can also reduce the chance that AI systems collect your product facts. The checker helps you see what your current rules are doing.

Why does sitemap access matter for AI search?

Sitemaps help crawlers discover canonical product URLs and recrawl important pages. If product pages are missing from sitemap.xml, AI and search systems may rely on weaker internal links or stale URLs.

Can Shopify or ecommerce apps accidentally block crawlers?

Yes. Theme templates, robots.txt.liquid edits, password gates, region rules, review apps, SEO apps, bot protection, and WAF tools can change what crawlers see compared with shoppers.

What should I check in Google Search Console after fixing access issues?

Use URL Inspection on the exact canonical product URL. Check live test status, indexing eligibility, user-declared canonical, Google-selected canonical, crawl allowed status, noindex status, and whether the page appears in the submitted sitemap.

More Tools

Keep Checking Product Visibility

AI Shopping Readiness Scanner

AI Shopping Readiness Scanner for Ecommerce Stores

Scan ecommerce product pages for AI search readiness, Product schema, crawler access signals, product attributes, semantic clarity, and structured data consistency.

Product Schema Checker

Free Product Schema Checker for Ecommerce Pages

Free Product schema checker for ecommerce pages. Review Product and Offer JSON-LD, price, availability, variants, reviews, breadcrumbs, and AI search readiness.

Shopify Schema Checker

Free Shopify Schema Checker for Product Pages

Scan Shopify product pages for Product schema, variant Offers, review app markup, theme conflicts, Shopify Markets signals, and AI search readiness.
