Audits technical SEO for crawlability, indexability, security, URLs, mobile, Core Web Vitals, structured data, JS rendering, robots.txt, and AI crawlers.
From antigravity-awesome-skills
npx claudepluginhub sickn33/antigravity-awesome-skills --plugin antigravity-awesome-skills
As of 2025-2026, AI companies actively crawl the web to train models and power AI search. Managing these crawlers via robots.txt is a critical technical SEO consideration.
Known AI crawlers:
| Crawler | Company | robots.txt token | Purpose |
|---|---|---|---|
| GPTBot | OpenAI | GPTBot | Model training |
| ChatGPT-User | OpenAI | ChatGPT-User | Real-time browsing |
| ClaudeBot | Anthropic | ClaudeBot | Model training |
| PerplexityBot | Perplexity | PerplexityBot | Search index + training |
| Bytespider | ByteDance | Bytespider | Model training |
| Google-Extended | Google | Google-Extended | Gemini training (NOT search) |
| CCBot | Common Crawl | CCBot | Open dataset |
Key distinctions:
- Blocking Google-Extended prevents Gemini training use but does NOT affect Google Search indexing or AI Overviews (those use Googlebot)
- Blocking GPTBot prevents OpenAI training but does NOT prevent ChatGPT from citing your content via browsing (ChatGPT-User)

Example, selective AI crawler blocking:
# Allow search indexing, block AI training crawlers
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
# Allow all other crawlers (including Googlebot for search)
User-agent: *
Allow: /
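
One way to sanity-check a rule set like the one above before deploying it is to parse it locally and ask which user agents may fetch a page. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `crawler_access` helper and the `https://example.com/` URL are hypothetical names chosen for illustration, and the standard-library matcher only approximates how each real crawler interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The selective-blocking rules from the example above.
ROBOTS_TXT = """\
# Allow search indexing, block AI training crawlers
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: Bytespider
Disallow: /
# Allow all other crawlers (including Googlebot for search)
User-agent: *
Allow: /
"""


def crawler_access(robots_txt, agents, url="https://example.com/"):
    """Return {agent: allowed} for each user agent (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


access = crawler_access(
    ROBOTS_TXT,
    ["GPTBot", "Google-Extended", "Bytespider",
     "Googlebot", "ChatGPT-User", "ClaudeBot"],
)
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With these rules, the three training crawlers are reported as blocked, while Googlebot, ChatGPT-User, and ClaudeBot fall through to the `*` group and remain allowed.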
Recommendation: Consider your AI visibility strategy before blocking. Being cited by AI systems drives brand awareness and referral traffic. Cross-reference the seo-geo skill for full AI visibility optimization.
Google updated its JavaScript SEO documentation in December 2025 with critical clarifications:
- If the raw HTML contains `<meta name="robots" content="noindex">` but JavaScript removes it, Google MAY still honor the noindex from raw HTML. Serve correct robots directives in the initial HTML response.

Best practice: Serve critical SEO elements (canonical, meta robots, structured data, title, meta description) in the initial server-rendered HTML rather than relying on JavaScript injection.
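
The best practice above can be checked by inspecting the unrendered HTML response for the critical elements. The sketch below uses only Python's standard `html.parser`; the `HeadSignals` class, the `audit_raw_html` function, and the sample markup are hypothetical and illustrative, not part of the skill itself.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects SEO-critical elements from raw (unrendered) HTML."""

    def __init__(self):
        super().__init__()
        self.signals = {"title": None, "robots": None, "description": None,
                        "canonical": None, "json_ld": False}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta":
            name = (a.get("name") or "").lower()
            if name in ("robots", "description"):
                self.signals[name] = a.get("content")
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.signals["canonical"] = a.get("href")
        elif tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self.signals["json_ld"] = True
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.signals["title"] = (self.signals["title"] or "") + data


def audit_raw_html(html):
    """Return a list of findings for the initial HTML response."""
    parser = HeadSignals()
    parser.feed(html)
    parser.close()
    s = parser.signals
    findings = []
    if s["robots"] and "noindex" in s["robots"].lower():
        findings.append("noindex present in raw HTML")
    for key in ("title", "canonical", "description"):
        if not s[key]:
            findings.append(f"{key} missing from raw HTML")
    if not s["json_ld"]:
        findings.append("structured data (JSON-LD) missing from raw HTML")
    return findings


# Sample page: the noindex tag is removed client-side, but it is still
# present in the raw HTML that a crawler receives first.
SAMPLE = (
    '<html><head><title>Shop</title>'
    '<meta name="robots" content="noindex, follow">'
    "<script>document.querySelector('meta[name=robots]').remove()</script>"
    '</head><body></body></html>'
)
findings = audit_raw_html(SAMPLE)
print(findings)
```

In practice the input would be the response body fetched without executing JavaScript, so the findings reflect what is present before rendering.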
| Category | Status | Score |
|---|---|---|
| Crawlability | pass/warn/fail | XX/100 |
| Indexability | pass/warn/fail | XX/100 |
| Security | pass/warn/fail | XX/100 |
| URL Structure | pass/warn/fail | XX/100 |
| Mobile | pass/warn/fail | XX/100 |
| Core Web Vitals | pass/warn/fail | XX/100 |
| Structured Data | pass/warn/fail | XX/100 |
| JS Rendering | pass/warn/fail | XX/100 |
| IndexNow | pass/warn/fail | XX/100 |
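
As an illustration of how the report table above might be filled in, the sketch below maps numeric scores to a status and renders the same three columns. The source does not define where pass, warn, and fail begin, so the 80 and 50 cut-offs are assumptions, as are the `CategoryResult`, `status_for`, and `render_scorecard` names and the sample scores.

```python
from dataclasses import dataclass


@dataclass
class CategoryResult:
    name: str
    score: int  # 0-100


def status_for(score, pass_at=80, warn_at=50):
    """Map a score to pass/warn/fail. Thresholds are assumed, not from the skill."""
    if score >= pass_at:
        return "pass"
    if score >= warn_at:
        return "warn"
    return "fail"


def render_scorecard(results):
    """Render results in the same table layout as the report template."""
    lines = ["| Category | Status | Score |", "|---|---|---|"]
    for r in results:
        lines.append(f"| {r.name} | {status_for(r.score)} | {r.score}/100 |")
    return "\n".join(lines)


# Hypothetical scores, for illustration only.
table = render_scorecard([
    CategoryResult("Crawlability", 92),
    CategoryResult("Core Web Vitals", 64),
    CategoryResult("IndexNow", 30),
])
print(table)
```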
If DataForSEO MCP tools are available, use on_page_instant_pages for real page analysis (status codes, page timing, broken links, on-page checks), on_page_lighthouse for Lighthouse audits (performance, accessibility, SEO scores), and domain_analytics_technologies_domain_technologies for technology stack detection.
| Scenario | Action |
|---|---|
| URL unreachable | Report connection error with status code. Suggest verifying URL, checking DNS resolution, and confirming the site is publicly accessible. |
| robots.txt not found | Note that no robots.txt was detected at the root domain. Recommend creating one with appropriate directives. Continue audit on remaining categories. |
| HTTPS not configured | Flag as a critical issue. Report whether HTTP is served without redirect, mixed content exists, or SSL certificate is missing/expired. |
| Core Web Vitals data unavailable | Note that CrUX data is not available (common for low-traffic sites). Suggest using Lighthouse lab data as a proxy and re-testing once the site has enough traffic for CrUX to report field data. |
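
The "URL unreachable" row can be sketched as a small fetch wrapper that separates HTTP-level failures, which carry a status code, from connection-level failures such as DNS errors, which do not. This is a minimal sketch with Python's standard `urllib`; the function names and message wording are hypothetical.

```python
import urllib.error
import urllib.request


def describe_fetch_failure(exc):
    """Turn a fetch exception into an audit message (hypothetical wording)."""
    # HTTPError subclasses URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return (f"URL unreachable: server returned HTTP {exc.code}. "
                "Verify the URL and confirm the site is publicly accessible.")
    if isinstance(exc, urllib.error.URLError):
        return (f"URL unreachable: connection error ({exc.reason}). "
                "Check DNS resolution and confirm the site is publicly accessible.")
    return f"URL unreachable: {exc!r}"


def check_reachable(url, timeout=10.0):
    """Fetch the URL once and report either the status or the failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"Reachable: HTTP {resp.status}"
    except (urllib.error.URLError, TimeoutError) as exc:
        return describe_fetch_failure(exc)
```

Calling `check_reachable("https://example.com/")` performs a live request, so the result depends on the network; `describe_fetch_failure` can be exercised on its own with constructed exceptions.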