###############################################################################
# robots.txt for https://www.expertrating.com
# Last updated: 2026-06
#
# PURPOSE: Optimize crawl budget by blocking low-value, thin, duplicate,
# and transactional pages while preserving access to all certification
# content, rendering resources, and the sitemap.
#
# REFERENCES:
# - Google robots.txt spec: developers.google.com/search/docs/crawling-indexing/robots/robots_txt
# - RFC 9309: www.rfc-editor.org/rfc/rfc9309
# - ExpertRating Technical SEO Audit (2026)
#
# IMPORTANT: robots.txt controls CRAWLING, not INDEXING.
# Use a noindex directive (robots meta tag or X-Robots-Tag header) for pages
# that must not appear in search results. A crawler only sees noindex on URLs
# it is allowed to fetch, so noindex has no effect on paths disallowed here.
###############################################################################

#---------------------------------------------------------------------------
# SECTION 1: Default rules for all crawlers
#---------------------------------------------------------------------------

User-Agent: *
# --- Staging / Development ---
Disallow: /TalentPulseStage/
Disallow: /rac/
# --- Crawl Budget: Thin/Low-Value Content Directories ---
Disallow: /Fitness/
Disallow: /fitness/
# --- Crawl Budget: Stale Job Listings ---
Disallow: /jobs/
# --- Crawl Budget: Low-value directories ---
Disallow: /awareness-tests/
Disallow: /gatlin/
# --- Crawl Budget: Transactional/Checkout Pages ---
Disallow: /login-res.asp
Disallow: /Instructorled-adduser-BOP.asp
# --- Crawl Budget: User-Generated/Verification Pages ---
Disallow: /transcript.asp
Disallow: /reports/transcript.aspx
# --- Crawl Budget: Video/Media Directories ---
Disallow: /resource/video_files/
Disallow: /assets/content/v/
Disallow: /media/mp4/
Disallow: /upload/media/video/
Disallow: /public/video/
Disallow: /player/
# --- Crawl Budget: Error Handlers ---
Disallow: /500errorlog.asp
Disallow: /404-Not-found.asp
# --- Crawl Budget: Parameter-Based Duplicates ---
Disallow: /*?iframe=true
Disallow: /*&iframe=true
# --- Individual Pages ---
Disallow: /certifications/fitness/review-Strength-and-Conditioning-Training-Certification.asp
Disallow: /certifications/fitness/review-personal-trainer-certification.asp
# --- WordPress ---
Disallow: /wp-login.php
Disallow: /wp-admin/
# --- Rendering Resources: ALLOW ---
Allow: /assets/css/
Allow: /assets/js/
Allow: /assets/images/
Allow: /resource/css/
Allow: /resource/js/
Allow: /resource/images/
Allow: /Content/
Allow: /Scripts/
Allow: /bundles/
Allow: /images/
Allow: /images-2021/
Allow: /css/
Allow: /js/
Allow: /*.css$
Allow: /*.js$
Allow: /*.jpg$
Allow: /*.jpeg$
Allow: /*.png$
Allow: /*.gif$
Allow: /*.svg$
Allow: /*.webp$
Allow: /*.mp4$
Allow: /*.webm$
Allow: /*.woff$
Allow: /*.woff2$

#---------------------------------------------------------------------------
# SECTION 2: Google AdsBot
#---------------------------------------------------------------------------
# Uncomment if running Google Ads.
# User-Agent: AdsBot-Google
# Allow: /

#---------------------------------------------------------------------------
# SECTION 3: AI Search Crawlers - Full citation access
# Named explicitly so each crawler follows its own group. A crawler that
# matches a named group ignores the wildcard group in Section 1, regardless
# of order in the file, so the parameter rules there are not inherited.
# Only the thin-content and job-listing paths listed in each group are
# blocked. All certification content open.
#---------------------------------------------------------------------------

# OpenAI - ChatGPT web search
User-Agent: GPTBot
Allow: /certifications/
Disallow: /Fitness/
Disallow: /fitness/
Disallow: /jobs/

# OpenAI - Search product crawler (separate pipeline from GPTBot)
User-Agent: OAI-SearchBot
Allow: /certifications/
Disallow: /Fitness/
Disallow: /fitness/
Disallow: /jobs/

# OpenAI - ChatGPT live browsing agent
User-Agent: ChatGPT-User
Allow: /certifications/
Disallow: /Fitness/
Disallow: /fitness/
Disallow: /jobs/

# Perplexity AI - Direct source citation in answers
User-Agent: PerplexityBot
Allow: /certifications/
Disallow: /Fitness/
Disallow: /fitness/
Disallow: /jobs/

# Anthropic - Claude web features
# Note: ClaudeBot = citation crawler.
# anthropic-ai = training crawler (blocked in Section 4).
User-Agent: ClaudeBot
Allow: /certifications/
Disallow: /Fitness/
Disallow: /fitness/
Disallow: /jobs/

# Google - Gemini and AI Overviews
User-Agent: Google-Extended
Allow: /

# DuckDuckGo - DuckAssist AI answers
User-Agent: DuckAssistBot
Allow: /certifications/
Disallow: /Fitness/
Disallow: /fitness/
Disallow: /jobs/

# You.com AI search
User-Agent: YouBot
Allow: /certifications/
Disallow: /Fitness/
Disallow: /fitness/
Disallow: /jobs/

#---------------------------------------------------------------------------
# SECTION 4: AI Training Crawlers - Blocked (no citation value)
#---------------------------------------------------------------------------

User-Agent: CCBot
Disallow: /

User-Agent: anthropic-ai
Disallow: /

User-Agent: Bytespider
Disallow: /

User-Agent: cohere-ai
Disallow: /

#---------------------------------------------------------------------------
# SECTION 5: Sitemap Declaration
#---------------------------------------------------------------------------

Sitemap: https://www.expertrating.com/sitemap-https.xml