Shopify robots.txt for AI Crawlers

A practical guide to Shopify robots.txt, AI crawler access, private paths, Googlebot risk, and monitoring for agentic shopping visibility.

ShopGox Editorial, 5/23/2026
Shopify robots.txt for AI Crawlers: What to Allow, Block, and Monitor

AI shopping systems cannot recommend what they cannot crawl, read, or trust. For Shopify stores, robots.txt is one of the first gates between your product catalog and search engines, AI assistants, shopping agents, and commercial crawlers.

The goal is not to allow everything. The goal is to keep important product, collection, image, and structured data paths accessible while protecting cart, checkout, account, search, filter, and internal utility paths that do not help discovery.

Robots.txt controls crawling, not content quality

Robots.txt tells compliant crawlers which paths they should not request. It does not add product schema, fix duplicate variants, improve thin descriptions, or guarantee indexing. It is also advisory: nothing forces a crawler to obey it. Treat it as crawl guidance for well-behaved bots, not as a security boundary or a substitute for technical SEO.

What Shopify usually handles well by default

Keep these defaults unless you have a clear reason

  • Product and collection pages should remain crawlable.
  • Static assets needed to render product content should remain crawlable.
  • Cart, checkout, account, and internal search paths usually do not need crawler access.
  • Sitemap paths should remain discoverable.
  • Googlebot should not be blocked unless you intentionally want to remove Google Search access.

A sane AI crawler policy for Shopify

For a growth-stage ecommerce site, a balanced policy is usually better than a blanket block. Allow crawlers that can create discovery value, protect private and low-value paths, and monitor logs or CDN analytics for abnormal traffic.

Path type | Recommended policy | Why it matters
Product pages | Allow | They contain the commercial facts AI shopping systems need.
Collection pages | Allow | They help crawlers understand categories, inventory groupings, and internal links.
Product images | Allow | Images support visual search, previews, and richer product understanding.
Cart and checkout | Disallow | They do not help discovery and may create crawl waste.
Customer account pages | Disallow | They are private or low-value for public discovery.
Internal search and filtered URLs | Usually disallow | They can create duplicate or infinite crawl paths.
Sitemap | Allow | It helps crawlers find canonical URLs efficiently.
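
The policy table can also be kept as a small lookup so that sample URLs can be checked against it. The following is a minimal sketch, not a definitive mapping: the path prefixes (`/products/`, `/collections/`, `/cdn/`, `/sitemap`) and the query markers (`sort_by=`, `filter.`) are assumptions about a typical store, and `recommended_policy` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Assumed path prefixes for a typical store; adjust to your own URL structure.
POLICY = [
    ("/products/", "allow"),
    ("/collections/", "allow"),
    ("/cdn/", "allow"),        # assumed location of product images and assets
    ("/sitemap", "allow"),
    ("/cart", "disallow"),
    ("/checkout", "disallow"),
    ("/account", "disallow"),
    ("/search", "disallow"),
]

# Query-string markers that usually indicate sorted or filtered duplicates.
FILTER_MARKERS = ("sort_by=", "filter.")


def recommended_policy(url: str) -> str:
    """Return 'allow', 'disallow', or 'review' for a store-relative URL."""
    parts = urlsplit(url)
    # Sorted and filtered views are checked first, even under /collections/.
    if any(marker in parts.query for marker in FILTER_MARKERS):
        return "disallow"
    for prefix, policy in POLICY:
        if parts.path.startswith(prefix):
            return policy
    # Anything the table does not cover needs a human decision.
    return "review"


for sample in ("/products/example-shirt",
               "/collections/shoes?sort_by=price-ascending",
               "/pages/about"):
    print(sample, "->", recommended_policy(sample))
```

Paths that fall outside the table come back as `review` rather than being silently allowed or blocked, which keeps the policy conservative.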

Example Shopify robots.txt rules for AI crawlers

Shopify stores can customize robots rules through the theme's robots.txt.liquid template. Keep custom rules small, documented, and easy to reverse. Do not copy a crawler blocklist blindly from another store.

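# Default group: used by any crawler that has no named group below.
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /account
Disallow: /search
Disallow: /*?*sort_by=
Disallow: /*?*filter.

# A crawler follows only the most specific group that matches it, so a named
# group must repeat every path that crawler should still avoid. A group that
# contains only "Allow: /" would reopen cart, checkout, and account paths.
User-agent: Googlebot
User-agent: GPTBot
Disallow: /cart
Disallow: /checkout
Disallow: /account
Disallow: /search
Disallow: /*?*sort_by=
Disallow: /*?*filter.

# Optional: opt out of Google-Extended. This does not block Googlebot.
User-agent: Google-Extended
Disallow: /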
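
Named groups replace the `*` group for the crawler they name; they do not add to it. The sketch below illustrates this with Python's standard `urllib.robotparser`. It uses a simplified rule set without the wildcard lines, because the standard-library parser does not implement `*` path wildcards, and the product path is a made-up example.

```python
from urllib.robotparser import RobotFileParser


def parse(rules: str) -> RobotFileParser:
    """Build a parser from robots.txt text without any network request."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


SHARED = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /account
"""

# A named group that only says "Allow: /" replaces the * group for that bot.
permissive = parse(SHARED + """
User-agent: Googlebot
Allow: /
""")

# A named group that repeats the private paths keeps them closed.
explicit = parse(SHARED + """
User-agent: Googlebot
Disallow: /cart
Disallow: /checkout
Disallow: /account
""")

print(permissive.can_fetch("Googlebot", "/cart"))  # Googlebot ignores the * group
print(permissive.can_fetch("GPTBot", "/cart"))     # GPTBot falls back to the * group
print(explicit.can_fetch("Googlebot", "/cart"))
print(explicit.can_fetch("Googlebot", "/products/example-shirt"))
```

With the permissive group, Googlebot may fetch `/cart` while a crawler without its own group may not. Repeating the disallow lines in the named group closes that gap without affecting product pages.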

Google-Extended is not the same as Googlebot

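Google-Extended is a robots.txt product token, not a separate crawler: it has no user-agent string of its own, and it controls whether content Google has already crawled may be used for certain Google AI training and product use cases. Blocking Google-Extended is different from blocking Googlebot, and it does not remove pages from Google Search. If your priority is Search visibility, never treat them as interchangeable.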

What to monitor after changing robots.txt

Post-change checks

  1. Open /robots.txt and confirm the final rendered file is what you expected.
  2. Verify product and collection URLs are not blocked for Googlebot.
  3. Check sitemap URLs are still accessible.
  4. Run a crawler access check against representative product pages.
  5. Watch server, CDN, or Shopify analytics for crawl spikes.
  6. Keep a dated note of every robots.txt change so you can roll back quickly.
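
Checks 1 to 4 can be partly automated. The following is a rough sketch under stated assumptions: `fetch_robots` and `audit` are hypothetical helper names, the store URL you pass in is your own, and the sample paths are placeholders. It relies on Python's `urllib.robotparser`, which does not evaluate `*` wildcards the way Google does, so wildcard rules still need a manual check.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


def fetch_robots(store_url: str) -> str:
    """Download the rendered robots.txt (store_url is supplied by the caller)."""
    with urlopen(store_url.rstrip("/") + "/robots.txt", timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def audit(robots_text: str, agent: str, must_allow: list, must_block: list) -> dict:
    """Report paths whose crawl status differs from what the policy expects."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {
        "wrongly_blocked": [p for p in must_allow if not parser.can_fetch(agent, p)],
        "wrongly_open": [p for p in must_block if parser.can_fetch(agent, p)],
    }


# Inline sample so the sketch runs without network access.
SAMPLE = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /account
"""

MUST_ALLOW = ["/products/example-shirt", "/collections/all", "/sitemap.xml"]
MUST_BLOCK = ["/cart", "/checkout", "/account"]

report = audit(SAMPLE, "Googlebot", MUST_ALLOW, MUST_BLOCK)
print(report)
```

To run it against a live store, replace `SAMPLE` with the text returned by `fetch_robots` for your own domain. Two empty lists mean the sampled paths behave as the policy expects for that user agent.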

Common Shopify robots.txt mistakes

  • Blocking all query parameters when important variant URLs depend on parameters.
  • Blocking image folders needed for product previews.
  • Blocking Googlebot while trying to block only AI training crawlers.
  • Assuming robots.txt can fix duplicate content by itself.
  • Forgetting that some crawlers may ignore robots.txt or use changing user-agent strings.
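
The first mistake is easy to reproduce. The sketch below tests one robots.txt path pattern against a URL, treating `*` as "any run of characters" and a trailing `$` as "end of URL". It checks a single rule only and does not model precedence between Allow and Disallow lines. The function name `rule_matches` and the sample URLs, including the `?variant=` parameter, are illustrative assumptions.

```python
import re


def rule_matches(pattern: str, url: str) -> bool:
    """Return True if one robots.txt path pattern matches a path plus query."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; every other character is literal.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    regex = re.compile(body + ("$" if anchored else ""))
    # Patterns match from the start of the path, so re.match is sufficient.
    return regex.match(url) is not None


BROAD = "/*?*"            # blocks every URL that has a query string
NARROW = "/*?*sort_by="   # blocks only sorted listing URLs

variant_url = "/products/example-shirt?variant=123"
sorted_url = "/collections/shoes?sort_by=price-ascending"

print(rule_matches(BROAD, variant_url))   # the broad rule catches variant URLs
print(rule_matches(NARROW, variant_url))  # the narrow rule leaves them alone
print(rule_matches(NARROW, sorted_url))   # and still catches sorted listings
```

Targeting the specific parameter keeps variant URLs crawlable while still excluding sorted duplicates.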

FAQ

Should Shopify stores block AI crawlers?

Not by default. If AI shopping visibility matters, allow crawler access to public product and collection pages while blocking private or low-value paths such as cart, checkout, account, search, and duplicate filters.

Can robots.txt remove Shopify product pages from Google?

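Robots.txt controls crawling, not indexing. If Google already knows a URL, blocking it in robots.txt does not reliably remove it: the URL can stay in the index without a description, and Googlebot cannot see a noindex tag on a page it is not allowed to fetch. To remove a page, leave it crawlable and use noindex instead. For Google Search, be especially careful not to block Googlebot from important product pages.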

Does Shopify allow robots.txt customization?

Yes, Shopify supports robots.txt customization through the theme's robots.txt.liquid template. Keep changes conservative and test the rendered /robots.txt file after publishing.

Which Shopify URLs should stay crawlable for AI shopping?

Product pages, collection pages, product images, canonical URLs, and sitemap URLs should usually remain crawlable because they carry product facts, category context, and discovery links.
