Robots.txt Rules
Check whether search and AI crawlers can discover, fetch, index, and trust your product pages by reviewing robots.txt, meta robots, X-Robots-Tag, sitemap, canonical, and rendered-page signals.
No login required · Free scan · Instant online report
Audit Coverage
Review whether important product URLs or crawler groups are blocked by robots.txt, including rules aimed at search and AI crawler user agents.
Check meta robots and X-Robots-Tag directives that can prevent indexing or following links even when robots.txt allows the fetch.
Look at sitemap, canonical, hreflang, and internal-link signals so crawlers can find the preferred product URL.
Identify pages where JavaScript, redirects, password gates, geofencing, or app failures hide product facts from crawlers.
Crawler Access Map
Crawler access is not one switch. A page can be allowed in robots.txt, blocked by noindex, missing from sitemaps, canonicalized away, or readable to browsers but incomplete for crawlers. This checker separates each layer so you can see which gate is creating risk.
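As a rough illustration of treating each layer as its own gate, the sketch below records one flag per layer and lists the gates that look risky. The `AccessSignals` class, its field names, and `risky_gates` are hypothetical names chosen for this example. They are not part of the checker.

```python
from dataclasses import dataclass


@dataclass
class AccessSignals:
    """One flag per layer (hypothetical field names, for illustration)."""
    robots_allowed: bool          # robots.txt lets the crawler fetch the URL
    noindex: bool                 # meta robots or X-Robots-Tag says noindex
    in_sitemap: bool              # the URL is listed in a sitemap
    canonical_is_self: bool       # the canonical points at this URL
    facts_in_rendered_html: bool  # price, availability, schema are visible


def risky_gates(signals: AccessSignals) -> list[str]:
    """Return the name of every layer that could hide the page."""
    gates = []
    if not signals.robots_allowed:
        gates.append("robots.txt")
    if signals.noindex:
        gates.append("indexing directives")
    if not signals.in_sitemap:
        gates.append("sitemap")
    if not signals.canonical_is_self:
        gates.append("canonical")
    if not signals.facts_in_rendered_html:
        gates.append("rendering")
    return gates


# A page that robots.txt allows but that is still excluded by noindex.
page = AccessSignals(
    robots_allowed=True,
    noindex=True,
    in_sitemap=True,
    canonical_is_self=True,
    facts_in_rendered_html=True,
)
gates = risky_gates(page)
```

Keeping the flags separate means a single "allowed" result in one layer cannot mask a problem in another.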
robots.txt controls whether specific user agents are allowed to fetch a URL. It is useful for crawl control, but it is not the same as indexing, ranking, or AI visibility.
Example signals
User-agent: OAI-SearchBot / GPTBot / ClaudeBot / Claude-SearchBot / PerplexityBot / Google-Extended; Disallow: /products/
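A minimal sketch of a robots.txt check using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URL, and the `blocked_agents` helper are assumptions made up for illustration. Note that the standard-library parser matches user agents by substring, which may differ from how a given crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /products/

User-agent: *
Disallow: /admin/
"""

# User agents named in the example signals above.
AI_AGENTS = [
    "OAI-SearchBot",
    "GPTBot",
    "ClaudeBot",
    "Claude-SearchBot",
    "PerplexityBot",
    "Google-Extended",
]


def blocked_agents(robots_text: str, url: str, agents=AI_AGENTS) -> list[str]:
    """Return the agents that this robots.txt does not allow to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_text.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


blocked = blocked_agents(SAMPLE_ROBOTS, "https://example.com/products/blue-shirt")
print(blocked)
```

With this sample file, only the two agents named in the first group are blocked from the product URL. The other agents fall through to the `*` group, which only disallows `/admin/`.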
A crawler can fetch a page and still be told not to index it. Meta robots and HTTP X-Robots-Tag headers often explain why a product URL is discoverable but absent from search.
Example signals
meta robots: noindex, nofollow; X-Robots-Tag: noindex
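The sketch below shows one way to collect indexing directives from both places, using only the Python standard library. The `indexing_directives` helper is a hypothetical name. It assumes you already have the page HTML and the response headers as a dict. It reads only the generic `robots` meta tag and does not handle crawler-specific meta names or agent-scoped header values such as `googlebot: noindex`.

```python
from html.parser import HTMLParser


class _MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def indexing_directives(html: str, headers: dict) -> set:
    """Merge directives from meta robots and the X-Robots-Tag header."""
    parser = _MetaRobotsParser()
    parser.feed(html)
    found = set(parser.directives)
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            for part in value.split(","):
                if part.strip():
                    found.add(part.strip().lower())
    return found


sample_html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
directives = indexing_directives(sample_html, {"X-Robots-Tag": "noindex"})
is_excluded = "noindex" in directives
```

A page counts as excluded here if either source carries `noindex`, even when robots.txt allows the fetch.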
Sitemaps and canonicals tell crawlers which product URL should represent the item. Drift here can make AI and search systems collect the wrong variant, market, or collection context.
Example signals
sitemap.xml URL + canonical product URL + hreflang market equivalents
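One possible way to spot canonical and sitemap drift is to compare the page URL, its canonical link, and the sitemap entries. The sketch below uses only the standard library. The `canonical_report` helper, the sample HTML, and the `example.com` URLs are illustrative assumptions, and hreflang is left out to keep the example short.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set:
    """Return every <loc> value from a urlset sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


class _CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical">."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_report(page_url: str, html: str, sitemap_xml: str) -> dict:
    """Compare the page URL, its canonical, and the sitemap entries."""
    parser = _CanonicalParser()
    parser.feed(html)
    urls = sitemap_urls(sitemap_xml)
    return {
        "canonical": parser.canonical,
        "self_canonical": parser.canonical == page_url,
        "canonical_in_sitemap": parser.canonical in urls,
    }


# Hypothetical sample data: a variant URL that canonicalizes to the base URL.
sample_sitemap = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/products/blue-shirt</loc></url>
</urlset>
"""
sample_page = '<html><head><link rel="canonical" href="https://example.com/products/blue-shirt"></head></html>'
report = canonical_report(
    "https://example.com/products/blue-shirt?variant=123", sample_page, sample_sitemap
)
```

In this sample the variant URL is not self-canonical, but its canonical is listed in the sitemap, which is usually the intended setup for variants.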
AI and search crawlers need the final product facts, not just an empty shell. Rendering failures can make schema, price, availability, or product attributes invisible.
Example signals
Rendered HTML: Product schema, price, availability, attributes, reviews, return and shipping context
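To illustrate the "empty shell" problem, the sketch below looks for a JSON-LD Product block in an HTML string and reports which basic facts are missing. It assumes the HTML passed in is what the crawler actually receives. It does not execute JavaScript itself, and `missing_product_facts` is a hypothetical helper that handles only a top-level `Product` object or list, not `@graph` structures.

```python
import json
from html.parser import HTMLParser


class _JsonLdParser(HTMLParser):
    """Collects the raw text of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def missing_product_facts(html: str) -> list:
    """Return the basic product facts that are absent from the HTML."""
    parser = _JsonLdParser()
    parser.feed(html)
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks
        items = data if isinstance(data, list) else [data]
        for item in items:
            if not isinstance(item, dict) or item.get("@type") != "Product":
                continue
            offers = item.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            if not isinstance(offers, dict):
                offers = {}
            missing = []
            if not item.get("name"):
                missing.append("name")
            if "price" not in offers:
                missing.append("price")
            if "availability" not in offers:
                missing.append("availability")
            return missing
    return ["Product schema"]


# Hypothetical app shell with no product data in the delivered HTML.
shell_html = '<html><body><div id="app"></div></body></html>'
shell_result = missing_product_facts(shell_html)
```

Running the same check on the raw server response and on a rendered snapshot is one way to see whether product facts only appear after JavaScript runs.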
Common Blockers
Rules may disallow GPTBot, OAI-SearchBot, ClaudeBot, Claude-SearchBot, and PerplexityBot, or opt out through Google-Extended. Broad bot patterns can also block product pages by accident.
A product page can be fetchable but still excluded by meta robots or X-Robots-Tag noindex directives.
Crawlers may discover one URL while canonical, hreflang, or sitemap entries point to a different market, variant, or collection URL.
Delayed JavaScript, app widgets, redirects, or bot protection can keep price, availability, schema, or attributes out of the crawler-visible page.
Workflow
01. Use the public product page you want search engines and AI systems to discover, not a preview or admin URL.
02. ShopGox checks robots.txt, page directives, sitemap and canonical signals, and the rendered product output together.
03. Use the report to decide whether the fix belongs in robots.txt, theme templates, headers, sitemap settings, app rules, or platform configuration.