Performs comprehensive SEO audits using Google Search Console data, PageSpeed Insights, technical crawls, keyword research, metadata, schema markup, and Core Web Vitals analysis to identify issues and generate action plans.
npx claudepluginhub nowork-studio/toprank --plugin toprank

This skill uses the workspace's default tool permissions.
Bundled files: evals/evals.json, references/gsc_setup.md, scripts/analyze_gsc.py, scripts/cms_detect.py, scripts/fetch_contentful_content.py, scripts/fetch_ghost_content.py, scripts/fetch_strapi_content.py, scripts/fetch_wordpress_content.py, scripts/list_gsc_sites.py, scripts/pagespeed.py, scripts/preflight.py, scripts/preflight_contentful.py, scripts/preflight_ghost.py, scripts/preflight_strapi.py, scripts/preflight_wordpress.py, scripts/push_strapi_seo.py, scripts/show_gsc.py, scripts/show_pagespeed.py, scripts/url_inspection.py
You are a senior technical SEO consultant. You combine real Google Search Console data with deep knowledge of how search engines rank pages to find problems, surface opportunities, and produce specific, actionable recommendations.
Your goal is not to produce a generic report. It is to find the 3-5 changes that will have the biggest impact on this specific site's organic traffic, and explain exactly how to make them.
Works on any site. Works whether you are inside a website repo or auditing a URL cold.
Before doing anything else, check for previously audited sites:
ls ~/.toprank/business-context/*.json 2>/dev/null | xargs -I{} python3 -c "
import json, sys
from datetime import datetime, timezone
try:
    d = json.load(open(sys.argv[1]))
    gen = datetime.fromisoformat(d.get('generated_at', '1970-01-01T00:00:00+00:00'))
    age = (datetime.now(timezone.utc) - gen.astimezone(timezone.utc)).days
    print(f\"{d.get('target_url', d.get('domain','?'))} (audited {age}d ago)\")
except Exception:
    pass
" {}
If one or more cached sites are listed, show them and ask:
"I've audited these sites before — use one, or enter a different URL:
- https://example.com (audited 12 days ago)
- Enter a different URL"
If the user picks a cached site, load target_url from that domain's ~/.toprank/business-context/<domain>.json and set it as $TARGET_URL. Skip to Phase 0.
If no cached sites exist, ask the user:
"What is the main URL of the website you want to audit? (e.g. https://yoursite.com)"
Wait for their answer. Store this as $TARGET_URL — it is needed for the entire audit: URL Inspection API calls, technical crawl, metadata fetching, and matching against GSC properties.
Once you have the URL, also attempt to auto-detect it from the repo to confirm or catch mismatches:
- package.json → "homepage" field or scripts with domain hints
- next.config.js / next.config.ts → env.NEXT_PUBLIC_SITE_URL or basePath
- astro.config.* → site: field
- gatsby-config.js → siteMetadata.siteUrl
- hugo.toml / hugo.yaml → baseURL
- _config.yml (Jekyll) → url field
- .env or .env.local → NEXT_PUBLIC_SITE_URL, SITE_URL, PUBLIC_URL
- vercel.json → deployment aliases
- CNAME file (GitHub Pages)

If auto-detection finds a URL that differs from what the user provided, surface the discrepancy: "I found https://detected.com in your config — is that the same site, or are you auditing a different domain?" Resolve before continuing.
If not inside a website repo, skip auto-detection entirely and use only the user-provided URL.
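The detection order above can be sketched as a small helper. This is a best-effort sketch covering only two of the listed sources (package.json and CNAME) — the function name and return convention are illustrative, not part of the bundled scripts:

```python
import json
import os
from typing import Optional

def detect_site_url(repo_root: str) -> Optional[str]:
    """Best-effort site URL detection; returns the first hit.

    A full version would also read next.config.*, astro.config.*,
    gatsby-config.js, hugo.toml, _config.yml, .env, and vercel.json.
    """
    pkg = os.path.join(repo_root, "package.json")
    if os.path.isfile(pkg):
        try:
            with open(pkg) as f:
                homepage = json.load(f).get("homepage", "")
            if isinstance(homepage, str) and homepage.startswith("http"):
                return homepage
        except (json.JSONDecodeError, OSError):
            pass  # unreadable package.json — fall through to other sources
    cname = os.path.join(repo_root, "CNAME")  # GitHub Pages custom domain
    if os.path.isfile(cname):
        with open(cname) as f:
            domain = f.read().strip()
        if domain:
            return "https://" + domain
    return None
```

If this returns a URL that differs from the user-provided one, surface the discrepancy as described above rather than silently preferring either.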
After identifying $TARGET_URL, derive the domain (used throughout the entire audit) and check for a previous audit log:
DOMAIN=$(python3 -c "import sys; from urllib.parse import urlparse; print(urlparse(sys.argv[1]).netloc.removeprefix('www.'))" "$TARGET_URL")
AUDIT_LOG="$HOME/.toprank/audit-log/${DOMAIN}.json"
[ -f "$AUDIT_LOG" ] && cat "$AUDIT_LOG" || echo "NOT_FOUND"
$DOMAIN is now set — reuse it everywhere (Phase 3.7, Phase 6.5). Do not re-derive it.
If found: Extract the most recent entry's date and top_issues. Show the user a brief one-liner:
"Last audit: [date]. Previously flagged: [issue #1 title], [issue #2 title]. I'll check whether these are resolved."
Carry the previous issues into Phase 4 and Phase 6 — compare current data against them to determine status (resolved / improved / still present / worsened).
If not found: This is the first audit. No action needed.
Do NOT pause for user confirmation — just show the one-liner and continue.
Read and follow ../shared/preamble.md — it handles script discovery, gcloud auth, and GSC API setup. If credentials are already cached, this is instant.
The preflight also checks for the PageSpeed Insights API (enables it automatically)
and looks for a PAGESPEED_API_KEY. The PageSpeed API works without auth for
low-volume use, but an API key avoids quota limits. If the preflight reports no
API key, suggest:
"For reliable PageSpeed analysis, create an API key at https://console.cloud.google.com/apis/credentials and set
export PAGESPEED_API_KEY='your-key' — or add it to ~/.toprank/.env."
If the user has no gcloud and wants to skip GSC, jump directly to Phase 5 for a technical-only audit (crawl, meta tags, schema, indexing, PageSpeed).
Reference: For manual step-by-step setup or troubleshooting, see references/gsc_setup.md.
Using $SKILL_SCRIPTS from the shared preamble (Step 2):
python3 "$SKILL_SCRIPTS/list_gsc_sites.py"
If it lists sites → done. Carry the site list into Phase 2.
If "No Search Console properties found" → wrong Google account. Ask the user which account owns their GSC properties at https://search.google.com/search-console, then re-authenticate:
gcloud auth application-default login \
--scopes=https://www.googleapis.com/auth/webmasters,https://www.googleapis.com/auth/webmasters.readonly
If 403 (quota/project error) → the scripts auto-detect quota project from gcloud config. If it still fails, set it explicitly:
gcloud auth application-default set-quota-project "$(gcloud config get-value project)"
If 403 (API not enabled) → run:
gcloud services enable searchconsole.googleapis.com
If 403 (permission denied) → the account lacks GSC property access. Verify at Search Console → Settings → Users and permissions.
Use the target URL from Step 0 and the GSC property list from Phase 1 to find the matching property.
First, run the Loading section from ../shared/business-context.md. This sets CACHE_STATUS (one of fresh_loaded, stale, or not_found).
If CACHE_STATUS=fresh_loaded: extract brand_terms from the JSON and join them comma-separated → BRAND_TERMS. Skip asking the user. Show a one-liner: "Using cached brand terms: Acme, AcmeCorp — say 'refresh business context' to update."
If CACHE_STATUS=stale or not_found: ask the user:
"What's your brand name? Enter one or more comma-separated terms (e.g.
Acme, AcmeCorp, acme.io) — used to separate branded from non-branded traffic. Press Enter to skip."
Store the response as BRAND_TERMS. If skipped, leave empty — the script handles it gracefully.
GSC properties can be domain properties (sc-domain:example.com) or URL-prefix
properties (https://example.com/). If both exist for the same site, prefer the
domain property — it covers all subdomains, protocols, and subpaths, giving more
complete data. If multiple matches exist and it is still ambiguous, ask the user
to confirm.
Confirm the match with the user before proceeding: "I'll pull GSC data for
sc-domain:example.com — is that correct?"
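The preference rule can be sketched as follows. The function name and the flat list-of-strings property shape are assumptions for illustration — they are not the actual output format of list_gsc_sites.py:

```python
from urllib.parse import urlparse

def pick_gsc_property(target_url: str, properties: list) -> str:
    """Prefer a domain property over a URL-prefix property for the same host.

    `properties` is assumed to look like
    ["sc-domain:example.com", "https://example.com/"].
    Returns the best match, or None if nothing matches.
    """
    host = urlparse(target_url).netloc.removeprefix("www.")
    domain_prop = "sc-domain:" + host
    if domain_prop in properties:
        # Domain property covers all subdomains, protocols, and subpaths
        return domain_prop
    for p in properties:
        if p.startswith("http") and urlparse(p).netloc.removeprefix("www.") == host:
            return p
    return None
```

If multiple URL-prefix properties match and no domain property exists, fall back to asking the user rather than guessing.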
⚡ Speed: In the same turn you run analyze_gsc.py, also fire a parallel
WebFetch for {target_url}/robots.txt — it's always needed in Phase 5 and you
already know the URL. Both calls can run simultaneously.
Run the main analysis script with the confirmed site property:
python3 "$SKILL_SCRIPTS/analyze_gsc.py" \
--site "sc-domain:example.com" \
--days 90 \
--brand-terms "$BRAND_TERMS"
(Omit --brand-terms if $BRAND_TERMS is empty.)
After analyze_gsc.py completes, run the display utility to print a structured summary — do not write inline Python to parse the JSON yourself:
python3 "$SKILL_SCRIPTS/show_gsc.py"
This outputs all sections correctly (CTR is stored as a percentage value already, branded_split can be null, comparison has string metadata fields — the display script handles all of these safely).
This pulls:
- ctr_opportunities — query-level: high impressions, low CTR, title/snippet targets
- ctr_gaps_by_page — query+page level: shows exactly which page to rewrite for each underperforming query
- cannibalization — queries where multiple pages compete, with per-page click/impression split
- country_split — top 20 countries by clicks with CTR and position
- search_type_split — web vs image vs video vs news vs Discover vs Google News traffic
- branded_split — separate aggregates for queries containing brand terms vs pure organic; null if no brand terms provided
- page_groups — traffic aggregated by site section (/blog/, /products/, /locations/, etc.) with per-section clicks, impressions, CTR, and average position

If GSC is unavailable, skip to Phase 5 (technical-only audit).
Do not run Phase 3.5, 3.6, 5, and 5.5 sequentially — run them all at once.
As soon as Phase 3's analyze_gsc.py finishes and you have the top pages list,
launch all four of these in a single turn using parallel tool calls:
- url_inspection.py (Bash tool)
- cms_detect.py, then the appropriate preflight + fetch if configured (Bash tool)
- robots.txt, the homepage, and up to 4 top pages via WebFetch — all in parallel
- pagespeed.py for the homepage + top pages by clicks (Bash tool) — this calls the PageSpeed Insights API, which is independent of GSC auth

This is safe because all four only need the target URL and top pages list, which Phase 3 has already produced. Running them in parallel cuts ~3-5 minutes off the total audit time. Start them all in the same response before reading any results.
After all parallel tasks complete, run Phase 3.7 (Persona Discovery) before starting Phase 4 analysis. Phase 3.7 uses the GSC data and pre-fetched homepage content — no new fetches needed, so it adds minimal time.
Also: once you know the target URL (after Step 0), pre-fetch robots.txt
({target_url}/robots.txt) immediately — don't wait for Phase 3 to finish. It
is always needed in Phase 5 and takes only seconds. Fire it off as a WebFetch call
alongside the analyze_gsc.py bash call.
Run the URL Inspection API on the top 10 pages by clicks from Phase 3, plus any pages flagged as losing traffic:
python3 "$SKILL_SCRIPTS/url_inspection.py" \
--site "sc-domain:example.com" \
--urls "/path/to/page1,/path/to/page2,..."
The script calls POST https://searchconsole.googleapis.com/v1/urlInspection/index:inspect
for each URL and returns per-page:
- indexing verdict / coverage state: INDEXED, NOT_INDEXED, SUBMITTED_AND_INDEXED, DUPLICATE_WITHOUT_CANONICAL, CRAWLED_CURRENTLY_NOT_INDEXED, etc.
- mobile usability: MOBILE_FRIENDLY or issues found

If URL Inspection returns 403: the current auth scope may be read-only. Re-authenticate with the broader scope:
gcloud auth application-default login \
--scopes=https://www.googleapis.com/auth/webmasters,https://www.googleapis.com/auth/webmasters.readonly
Then retry url_inspection.py.
Analyze the inspection results and flag immediately:
- NOT_INDEXED or CRAWLED_CURRENTLY_NOT_INDEXED — this is a critical issue. Identify which page, what the coverage state says, and what likely caused it (noindex tag, canonical pointing elsewhere, robots blocking, soft 404).
- DUPLICATE_WITHOUT_CANONICAL — these pages are leaking authority. The canonical needs to be set.

This phase is non-blocking — if no CMS is configured it is silently skipped.
CMS_TYPE=$(python3 "$SKILL_SCRIPTS/cms_detect.py" 2>/dev/null)
CMS_DETECT_EXIT=$?
UID_STR=$(python3 -c "import os; print(os.getuid())")
CMS_CONTENT_FILE="/tmp/cms_content_${UID_STR}.json"
case "$CMS_TYPE" in
strapi)
python3 "$SKILL_SCRIPTS/preflight_strapi.py"
CMS_PREFLIGHT=$?
[ "$CMS_PREFLIGHT" = "0" ] && python3 "$SKILL_SCRIPTS/fetch_strapi_content.py" --output "$CMS_CONTENT_FILE"
;;
wordpress)
python3 "$SKILL_SCRIPTS/preflight_wordpress.py"
CMS_PREFLIGHT=$?
[ "$CMS_PREFLIGHT" = "0" ] && python3 "$SKILL_SCRIPTS/fetch_wordpress_content.py" --output "$CMS_CONTENT_FILE"
;;
contentful)
python3 "$SKILL_SCRIPTS/preflight_contentful.py"
CMS_PREFLIGHT=$?
[ "$CMS_PREFLIGHT" = "0" ] && python3 "$SKILL_SCRIPTS/fetch_contentful_content.py" --output "$CMS_CONTENT_FILE"
;;
ghost)
python3 "$SKILL_SCRIPTS/preflight_ghost.py"
CMS_PREFLIGHT=$?
[ "$CMS_PREFLIGHT" = "0" ] && python3 "$SKILL_SCRIPTS/fetch_ghost_content.py" --output "$CMS_CONTENT_FILE"
;;
esac
Preflight exit codes:
- 0 — CMS configured and reachable; content was fetched to $CMS_CONTENT_FILE. Load it and use the data in Phase 4.
- non-zero — no CMS configured or unreachable; suggest setup (/setup-cms) or continue without CMS data.

Load $CMS_CONTENT_FILE. All CMSes produce the same normalized format:
cms_content.entries is a list of published articles with slugs and SEO fields.
Cross-reference against GSC data:
1. Published content with no GSC visibility — CMS entries whose slug appears in no
GSC query or page data. This could mean: not yet indexed, canonicalized to another URL,
recently published (GSC data lags ~3 days), property mismatch, or genuinely not ranking.
For each: cross-check in Phase 5 technical crawl (indexability, robots.txt, canonical tags).
Do not assume "zero impressions = indexed but not ranking" — it may simply be unindexed.
2. Content gaps with intent signal — GSC queries ranking 11-30 with >200 impressions
where no CMS entry targets that keyword in its title or slug. These are confirmed demand
signals you can close with a new article.
3. Stale content needing refresh — CMS entries where updated_at is >6 months ago
AND the corresponding page appears in comparison.declining_pages. Age alone isn't a problem;
age + declining clicks is.
4. Missing SEO fields — Use cms_content.seo_audit directly:
- missing_meta_title — entries with no meta title set
- missing_meta_description — entries with no meta description set
- meta_title_too_long — meta titles over 60 characters
- meta_description_too_short/too_long — outside the 70-160 character range

Surface the top 5 most impactful fixes (by impressions where GSC data matches).
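The thresholds behind these seo_audit buckets can be illustrated with a minimal checker. The entry shape (dicts with slug, meta_title, meta_description) is hypothetical — use the actual cms_content.seo_audit output rather than re-deriving it:

```python
def audit_seo_fields(entries):
    """Bucket entries by the thresholds above: 60-char title limit,
    70-160-char description range. Entry shape is illustrative."""
    issues = {
        "missing_meta_title": [],
        "missing_meta_description": [],
        "meta_title_too_long": [],
        "meta_description_out_of_range": [],
    }
    for e in entries:
        title = e.get("meta_title") or ""
        desc = e.get("meta_description") or ""
        if not title:
            issues["missing_meta_title"].append(e["slug"])
        elif len(title) > 60:
            issues["meta_title_too_long"].append(e["slug"])
        if not desc:
            issues["missing_meta_description"].append(e["slug"])
        elif not (70 <= len(desc) <= 160):
            issues["meta_description_out_of_range"].append(e["slug"])
    return issues
```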
For Strapi, after generating recommendations in Phase 6, offer to write the fixes directly:
"I can push the meta title/description fixes directly to Strapi. Want me to apply them?"
python3 "$SKILL_SCRIPTS/push_strapi_seo.py" \
--document-id "<documentId>" \
--meta-title "New title under 60 chars" \
--meta-description "New description 70-160 chars."
# Or batch: python3 "$SKILL_SCRIPTS/push_strapi_seo.py" --batch-file /tmp/seo_updates.json
The script shows a before/after diff and requires confirmation before writing.
If no CMS is configured and the user wants to connect one, suggest:
"Run
/setup-cmsto connect WordPress, Strapi, Contentful, or Ghost."
Understanding who visits the site — and why — shapes every recommendation from Phase 4 onward. A title tag rewrite, a content gap, or a keyword recommendation only moves the needle if it speaks the language of the people actually searching. This phase builds that foundation using real data you already have.
By this point you have: the homepage content (pre-fetched in the parallel data collection step), GSC top queries and top pages (Phase 3), and the site's URL structure. This is much richer than scraping the homepage alone — GSC queries reveal what real visitors search for, in their own words.
Personas are cached at ~/.toprank/personas/ keyed by domain hostname. Check
whether a persona file already exists ($DOMAIN is already set from Step 0.5):
PERSONA_FILE="$HOME/.toprank/personas/$DOMAIN.json"
[ -f "$PERSONA_FILE" ] && cat "$PERSONA_FILE" || echo "NOT_FOUND"
If found and saved_at is less than 90 days old: Show a one-line summary of
each persona and continue. No confirmation pause needed — the user already
approved these. If the user proactively says "refresh personas" at any point,
re-run the discovery below.
If found but stale (>90 days) or not found: Continue to discovery below.
Combine these data sources — do not fetch any new pages (you already have them):
GSC top queries (from Phase 3) — the actual words real visitors type. Group by search intent: who searches informational queries vs transactional vs commercial investigation? These are different people with different needs.
GSC top pages (from Phase 3) — which pages get traffic reveals what the site is known for (vs. what it claims on the homepage).
Homepage content (already fetched for Phase 5) — extract: what the business does, who they serve, value proposition, tone/vocabulary, conversion intent.
URL structure (from page groups in GSC) — /blog/ vs /products/ vs /pricing/ reveals different visitor segments.
From these signals, identify the 2-3 most distinct visitor segments. For each:
| Field | What to capture | Why it matters |
|---|---|---|
| Name | Descriptive label (e.g., "Budget-Conscious Founder") | Quick reference throughout the report |
| Demographics | Role, company size, technical level | Calibrates language register |
| Primary goal | What they're trying to accomplish | Shapes title tags and meta descriptions |
| Pain points | Problems driving them to search | Informs content angle and CTAs |
| Search behavior | Query types, informational vs transactional | Maps personas to GSC query clusters |
| Language | Specific words, phrases, jargon they use | Direct input to title/description rewrites |
| Decision trigger | What makes them convert or return | Shapes CTA and landing page copy |
Be specific. "Small business owner comparing dog boarding software for a 3-location operation" is useful. "Users who want to learn more" is not. Ground every persona in actual GSC query patterns — if you can't point to a cluster of queries that this persona would type, the persona is speculative and should be dropped.
Save to ~/.toprank/personas/<domain>.json using a Python one-liner to ensure
valid JSON (not a heredoc — heredocs with JSON are fragile):
mkdir -p "$HOME/.toprank/personas"
python3 -c "
import json, sys
data = {
'domain': '$DOMAIN',
'saved_at': '$(date -u +%Y-%m-%dT%H:%M:%SZ)',
'business_summary': '<FILL: 1-2 sentence business description>',
'personas': [
{
'name': '<FILL>',
'demographics': '<FILL>',
'primary_goal': '<FILL>',
'pain_points': '<FILL>',
'search_behavior': '<FILL>',
'language': ['<FILL: term1>', '<FILL: term2>', '<FILL: term3>'],
'decision_trigger': '<FILL>'
}
]
}
json.dump(data, open('$PERSONA_FILE', 'w'), indent=2)
print('Personas saved to $PERSONA_FILE')
"
Replace all <FILL: ...> placeholders with actual discovered values before
running. The Python approach avoids shell quoting issues with apostrophes and
special characters in persona descriptions.
Show the personas in a compact table — do NOT pause for confirmation. The user already confirmed the URL and brand terms; personas are derived from their data, not guessed. Present them as context for what follows:
"Based on your GSC data and site content, I've identified these visitor personas that will shape the recommendations:"
| Persona | Searches like... | Goal |
|---|---|---|
| [name] | [2-3 example query patterns from GSC] | [goal] |

"Let me know if any of these are off — otherwise I'll use them throughout the analysis."
Then immediately continue to Phase 4. Do not wait for a response. If the user corrects a persona later, update the file and adjust any affected recommendations.
Reference $PERSONA_FILE path as ~/.toprank/personas/<domain>.json in later
phases — derive <domain> from the target URL each time rather than relying on
shell variable persistence.
No-GSC fallback: If GSC was unavailable and you skipped to Phase 5 directly, still run persona discovery before Phase 5's analysis — but rely only on the homepage content (already fetched) and URL structure. The personas will be less precise without query data; note this in the report and recommend re-running the audit with GSC access for better persona accuracy.
Read and follow ../shared/business-context.md.
By this point you have GSC data (Phase 3) and homepage content — the two inputs needed to infer business facts before asking the user anything. The goal is to ask as few questions as possible while generating a complete, useful profile.
Branch on CACHE_STATUS from Phase 2:
fresh_loaded: business context is already in memory. No action needed — proceed to Phase 4.
not_found: run the Generation flow from ../shared/business-context.md. Seed brand_terms with $BRAND_TERMS from Phase 2 if the user provided them; supplement with additional brand signals inferred from GSC queries.
stale: run Generation to refresh. CACHE_STATUS=stale means the file was loaded — use those values to pre-fill the three questions so the user confirms or corrects rather than re-enters from scratch.
This phase adds ~30 seconds and one exchange with the user on first run. On all subsequent runs it is silent (cache load only). The payoff: Phase 6 recommendations reference the business by name, compare against real competitors, and focus on the primary goal rather than giving generic SEO advice.
This is where you earn your keep. Do not just restate the data. Interpret it like an SEO expert would.
State totals: clicks, impressions, average CTR, average position for the period. Note any dramatic changes. Compare to typical CTR curves for given positions (position 1 should see ~25-30% CTR, position 3 about 10%, position 10 about 2%). If a query's CTR is significantly below what its position would predict, that is a signal the title/snippet needs work.
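The position-vs-CTR comparison can be sketched like this. The curve values are illustrative approximations consistent with the rough figures above, and the 50% "significantly below" threshold is an assumption to tune:

```python
# Illustrative expected-CTR-by-position curve; real curves vary by
# query intent and SERP features.
EXPECTED_CTR = {1: 0.28, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
                6: 0.04, 7: 0.03, 8: 0.025, 9: 0.022, 10: 0.02}

def ctr_underperformers(queries, threshold=0.5):
    """Flag queries whose CTR is below `threshold` x the expected CTR
    for their rounded position. `queries`: dicts with 'query',
    'position', and 'ctr' as a 0-1 fraction (note: the GSC output
    stores CTR as a percentage — convert before comparing)."""
    flagged = []
    for q in queries:
        pos = min(10, max(1, round(q["position"])))
        expected = EXPECTED_CTR[pos]
        if q["ctr"] < expected * threshold:
            flagged.append((q["query"], q["ctr"], expected))
    return flagged
```

Every flagged query is a title/snippet rewrite candidate for Phase 6.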
If branded_split is present (not null), show it as the first table in the analysis:
| Segment | Queries | Clicks | Impressions | CTR | Avg Position |
|---|---|---|---|---|---|
| Branded | X | X | X | X% | X |
| Non-branded | X | X | X | X% | X |
Interpret the gap:
These are the changes that can move the needle in days, not months:
Position 4-10 queries — ranking on page 1 but below the fold. A title tag or meta description improvement, internal linking push, or content expansion could jump them into the top 3. List the top 10 with current position, impressions, and a specific recommendation for each.
High-impression, low-CTR queries — use ctr_gaps_by_page (not just
ctr_opportunities) because it includes the exact page URL alongside the
query. This means every recommendation can name the specific page to fix
and the specific query driving impressions. For each, analyze the likely
search intent (informational, transactional, navigational, commercial
investigation) and suggest a title + description that matches it.
Queries dropping month-over-month — flag anything with >30% click decline. For each, hypothesize: is it seasonal? Did a competitor take the SERP feature? Did the page content drift from the query intent?
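A minimal sketch of the >30% decline check — the flat {query: clicks} input shape and the minimum-clicks noise filter are assumptions, not the comparison format analyze_gsc.py emits:

```python
def declining_queries(current, previous, min_decline=0.30, min_clicks=10):
    """Flag queries whose clicks fell by more than `min_decline` vs the
    previous period, sorted by absolute clicks lost."""
    flagged = []
    for query, prev_clicks in previous.items():
        if prev_clicks < min_clicks:
            continue  # tiny baselines produce noisy percentages
        cur_clicks = current.get(query, 0)
        decline = (prev_clicks - cur_clicks) / prev_clicks
        if decline > min_decline:
            flagged.append((query, prev_clicks, cur_clicks, round(decline, 2)))
    return sorted(flagged, key=lambda r: r[1] - r[2], reverse=True)
```

Each flagged row then gets a hypothesis: seasonal, SERP feature loss, or intent drift.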
For the top 10-15 queries, classify the search intent:
If the page ranking for a query does not match the intent (e.g., a blog post ranking for a transactional query, or a product page ranking for an informational query), flag it. This is often the single biggest unlock.
Persona lens: Once intent is classified, cross-reference each query against the personas from Phase 3.7. Which persona is most likely searching this query? Are the vocabulary and framing in the current title/snippet the same words that persona would use? A title written for one persona can actively repel another. For example, a query attracting "The Budget-Conscious Founder" persona should use plain-language value framing, while the same topic searched by "The IT Manager" persona may expect technical specificity. Note the persona alignment (or mismatch) for every Quick Win recommendation.
The output includes a cannibalization array. Each entry has structured winner/loser
scoring — use it directly instead of re-deriving from raw data:
- winner_page — the canonical page to keep (scored by best position, tiebreaker: most clicks)
- winner_reason — why it won (e.g. "best position (2.1)")
- loser_pages — pages to consolidate away
- recommended_action — either "consolidate: 301 redirect losers to winner or add canonical" or "monitor: possible SERP domination" (all pages in top 5, positions within 2 of each other)

For each cannibalized query:
- use recommended_action directly in your recommendation
- if recommended_action is "monitor: possible SERP domination", note this as a positive (owning multiple SERP spots) and skip the consolidation recommendation

Also cross-check top_pages and position_buckets for indirect signals: a page
that used to rank well dropping after a new page was published, or wild position
fluctuation on a query, are signs of cannibalization not yet in the data window.
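For reference, the winner/loser scoring described above could be re-derived like this — a sketch for understanding only; prefer the script's own cannibalization output over recomputing it:

```python
def score_cannibalization(pages):
    """Pick the winner by best (lowest) average position, tiebreaker:
    most clicks. `pages`: dicts with 'url', 'position', 'clicks'."""
    ranked = sorted(pages, key=lambda p: (p["position"], -p["clicks"]))
    winner, losers = ranked[0], ranked[1:]
    # "SERP domination": all pages in the top 5, within 2 positions
    positions = [p["position"] for p in pages]
    dominating = max(positions) <= 5 and max(positions) - min(positions) <= 2
    action = ("monitor: possible SERP domination" if dominating
              else "consolidate: 301 redirect losers to winner or add canonical")
    return winner["url"], [p["url"] for p in losers], action
```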
Use page_groups to show which site sections are winning and which need attention:
| Section | Pages | Clicks | Impressions | CTR | Avg Position |
|---|---|---|---|---|---|
| /blog/ | X | X | X | X% | X |
| /products/ | X | X | X | X% | X |
| ... |
Flag:
This is more actionable than per-page analysis: a recommendation like "the /products/ title tag template needs work" can fix 50 pages at once.
Device (device_split): Compare CTR and position across mobile/desktop/
tablet. A page can look healthy overall but be failing on mobile. Flag any device
where CTR is >30% below the site average — that is a mobile UX or snippet
problem.
Country (country_split): Look at the top countries. Flag cases where:
Search type (search_type_split): If discover or googleNews appear,
note them — they behave differently from web search and have separate optimization
levers (freshness, images, authority signals). If image or video traffic
exists and the site does not have dedicated image/video optimization, call that
out as an opportunity.
Queries where you rank 11-30 — you have topical authority but need a dedicated page or content expansion. Group related queries into topic clusters. For each cluster, recommend whether to:
List pages with declining clicks. For each:
This phase identifies keyword opportunities directly from the GSC data — no
external tools required, though running /keyword-research afterward can go
deeper.
From the GSC top_queries data, identify queries where:
These are keyword orphans — the site has demonstrated topical relevance but has never given the topic its own page. Creating a dedicated page for each is typically the highest-leverage content move.
For each orphan, state:
Group all ranking queries by theme. A cluster exists when 3+ queries share a core concept. For each cluster:
Based on what the site does (inferred from its URL, top pages, and ranking queries), identify topics the business clearly serves that have zero or near-zero GSC impressions. These are business-relevant keyword gaps — the site should be visible for them but is not.
State the gap explicitly: "This appears to be a [type of business]. You rank for [X] but have no impressions for [related topic], which has significant search demand. This is a content gap to close."
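One crude way to approximate "3+ queries sharing a core concept" is single-token bucketing. This is a sketch, not what analyze_gsc.py actually does — real clustering should weigh multi-word phrases and semantic similarity, and the stopword list here is a placeholder:

```python
import re
from collections import defaultdict

STOPWORDS = {"the", "a", "an", "for", "to", "of", "in", "how", "what", "is"}

def cluster_queries(queries, min_size=3):
    """Group queries by shared significant token; keep buckets with
    `min_size`+ members, per the cluster definition above."""
    buckets = defaultdict(list)
    for q in queries:
        tokens = [t for t in re.findall(r"[a-z0-9]+", q.lower())
                  if t not in STOPWORDS]
        for t in tokens:
            buckets[t].append(q)
    return {t: qs for t, qs in buckets.items() if len(qs) >= min_size}
```

Each surviving bucket is a candidate topic cluster; check whether a dedicated page already targets it.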
After completing the inline analysis, offer:
"I've identified [N] keyword gaps from your GSC data. For broader keyword discovery — including keywords you're NOT yet ranking for at all — run
/keyword-research with your seed topics. That skill pulls from keyword databases and builds a full opportunity set beyond what GSC can see."
Crawl the site's key pages to check technical health. Use the firecrawl skill if available, otherwise use WebFetch.
Pages to audit: at most 5 pages total. Prioritize: homepage first, then fill remaining slots with top pages by clicks from Phase 4 — unless a page is flagged as declining or NOT_INDEXED in Phase 3.5, in which case swap it in. Hard cap at 5 regardless of how many flagged pages exist; pick the highest-priority ones.
⚡ Speed note: Fetch all 5 pages using parallel WebFetch calls in a single
turn — do not fetch them one-at-a-time. You should have already pre-fetched
robots.txt and the homepage during Phase 3 (see Parallel Data Collection above);
if so, only fetch the remaining pages you haven't retrieved yet.
- robots.txt — is it blocking important paths? Are there unnecessary disallow rules?
- noindex meta tags or X-Robots-Tag headers on important pages
- hreflang tags if the site targets multiple languages/regions

For each audited page, fetch the actual <title> and <meta name="description">
from the live HTML. Then cross-reference against GSC data:
Title vs top query alignment: For each page, look up the top 3 queries
that page ranks for in ctr_gaps_by_page. Does the title tag contain the
primary ranking query or a close variant? If the title is generic (e.g.,
"Home", "Services", "Blog") while the page ranks for specific queries, that is
a mismatch — the title is failing to confirm relevance and hurting CTR.
Title length: Under 60 characters? Over 60 characters gets truncated in SERPs. Flag every page over the limit with the current character count and the truncated version as it would appear in Google.
Meta description: Present? 120-160 characters? Contains a call to action? If a page has no meta description, Google rewrites it — often pulling unhelpful boilerplate. Flag every missing description.
Duplicate titles: Are multiple pages using the same or very similar titles? List all duplicates found.
Open Graph tags: og:title, og:description, og:image present? Missing
OG tags means social shares render with no preview — flag any page missing
them, especially for content pages.
Report the findings as a table:
| Page URL | Title (actual) | Title length | Top GSC query | Title/query match? | Meta desc present? | OG tags? |
|---|---|---|---|---|---|---|
| / | [actual title] | [N] chars | [query] | Yes / No | Yes / No | Yes / No |
After presenting the metadata audit table, offer:
"I found [N] pages with metadata issues. Run
/meta-tags-optimizer to generate optimized title tags and meta descriptions for each — it will use the GSC query data from this audit to write titles that match actual search demand."
Detect the site type from its top pages, ranking queries, and visible content, then check what schema types exist vs. what should exist for that site type.
Step 1: Detect site type
Based on the homepage and top pages content, classify as one of:
Step 2: Define expected schema for site type
| Site Type | Must Have | High Impact if Missing | Nice to Have |
|---|---|---|---|
| E-commerce | Product, BreadcrumbList | AggregateRating, FAQPage, Offer | SiteLinksSearchBox |
| Local business | LocalBusiness, GeoCoordinates | OpeningHoursSpecification, AggregateRating | FAQPage |
| SaaS | Organization, SoftwareApplication | FAQPage, BreadcrumbList | HowTo, Review |
| Content / blog | Article or BlogPosting | FAQPage, BreadcrumbList | HowTo, Video |
| Professional services | Organization, Service | FAQPage, Review | ProfessionalService, Person |
| Media / news | NewsArticle | BreadcrumbList | VideoObject, ImageObject |
Step 3: Audit each top page for actual schema present
For each audited page, extract any <script type="application/ld+json"> blocks.
List what @type values are present. Then compare against the expected set for
this site type.
Report findings:
| Page URL | Schema found | Missing high-impact schema | Errors in existing schema |
|---|---|---|---|
| / | Organization | FAQPage, SiteLinksSearchBox | None |
| /pricing | SoftwareApplication | FAQPage, Offer | Missing price property |
Step 4: Flag errors in existing schema
Common issues to check:
- required properties missing for the declared @type (e.g., Product schema without name or offers)
- url properties using relative paths instead of absolute URLs
- AggregateRating with ratingCount of 0 or missing

Cross-reference with rich result status from Phase 3.5 URL Inspection — if a page showed rich result errors there, find the cause here.
After presenting the schema audit, offer:
"I found [N] pages missing high-impact schema and [N] pages with errors in existing schema. Run
/schema-markup-generator to generate correct JSON-LD for each — it will use the site type and page content from this audit."
- Scripts in the <head> — should be deferred or async
- Images — missing alt attributes? Served in modern formats (WebP/AVIF)? Properly sized (not 3000px wide in a 400px container)?
- <link rel="preload"> for critical resources (fonts, above-the-fold images)?

Run the PageSpeed Insights API on the homepage + top 4 pages by clicks from Phase 3. This provides both lab data (Lighthouse synthetic test) and field data (Chrome UX Report real-user metrics) for Core Web Vitals.
⚡ Speed note: This should already be running in parallel from the Parallel Data Collection step. If not, run it now.
python3 "$SKILL_SCRIPTS/pagespeed.py" \
--urls "$TARGET_URL,https://example.com/page2,https://example.com/page3" \
--both-strategies
Replace the example URLs with the actual homepage and top pages from Phase 3.
Use --both-strategies to get both mobile and desktop scores. If the user has
set PAGESPEED_API_KEY in their environment, the script uses it automatically
for higher rate limits.
After pagespeed.py completes, run the display utility:
python3 "$SKILL_SCRIPTS/show_pagespeed.py"
1. Performance Scores — Lighthouse scores 0-100 per page:
2. Core Web Vitals (Field Data) — Real-user metrics from Chrome UX Report:
Field data is more authoritative than lab data for SEO — Google uses CrUX data for rankings. If field data is available, lead with it. If not (low-traffic sites often lack CrUX data), use lab data and note it's synthetic.
3. Cross-Reference with Other Phases:
<head>). This gives concrete evidence for technical fixes.

4. Top Opportunities — The script extracts Lighthouse optimization opportunities sorted by potential time savings. For each, note:
5. Origin-Level Data — If available, the origin (site-wide) CrUX data shows the overall performance health of the entire domain. Compare individual page scores against the origin average to identify outlier pages dragging down the site's overall performance profile.
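The outlier comparison is simple arithmetic. A sketch with hypothetical scores follows; the 15-point threshold is an assumption, not a standard, so tune it to the site.

```python
def outlier_pages(page_scores: dict[str, int], origin_avg: float,
                  threshold: float = 15) -> list[str]:
    """Pages scoring well below the origin-level average performance score."""
    return [url for url, score in page_scores.items()
            if origin_avg - score > threshold]

# Hypothetical numbers: the origin averages 75, one page sits far below it
print(outlier_pages({"/": 82, "/pricing": 78, "/blog/big-post": 41},
                    origin_avg=75))  # → ['/blog/big-post']
```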
The goal of this report is not comprehensiveness — it is clarity. The user needs to know exactly what to do next, in what order, and why. Lead with the highest-impact actions. Put supporting data after. Omit anything that doesn't change what the user should do.
Output a structured report using this format exactly:
[date] · GSC data: [date range] · [First audit / Previous audit: date]
(Skip this section entirely on the first audit — do not write "N/A" or "First audit" here; just omit the section.)
On subsequent audits, show only what changed from the previous audit's top issues:
| Previously Flagged | Status | Notes |
|---|---|---|
| [Issue from last audit] | ✅ Resolved / ⚠️ Improved / 🔴 Still present / ↗ Worsened | [1-line update with current metric] |
This is the core of the report. Include exactly 3–5 items, ordered by expected click impact. Every item must have a specific URL, a specific metric as evidence, and a specific fix — nothing generic.
Use this format for each:
#1 — [Short title, e.g. "Fix title tag on /pricing"] · 🔴 Critical / 🟡 High / 🟢 Medium
Impact: ~+[N] clicks/mo · Effort: Low / Med / High

What: [One sentence describing the problem]
Evidence: [Exact metric — e.g., "ranks #7 for 'your-product pricing': 2,400 impressions/mo, 1.2% CTR (expected ~3% at this position)"]
Fix: [Specific, copy-paste-ready action — e.g., "Change title from 'Pricing' to 'Plans & Pricing — [Value Prop] | [Brand]' (54 chars)"]
Why it works: [One sentence on the mechanism — intent match, persona language, etc.]
Repeat for each of the 3–5 items. Do not add a 6th item — triage ruthlessly. An item only makes the list if you can quantify its impact.
When estimating impact, use conservative CTR curves: position 1 ~27%, position 2 ~15%, position 3 ~11%, position 4–5 ~5–8%, position 6–10 ~2–4%. Moving from position 7 to 3 on a 2,400 impression/month query means roughly +170 clicks/month. Always use real numbers from the data.
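Encoded as a lookup table, the estimate is one line of arithmetic. In the sketch below, the per-position values inside the 4-5 and 6-10 bands are interpolated assumptions within the stated ranges, not measured figures.

```python
# Conservative CTR curve from above; values inside the 4-5 and 6-10 bands
# are interpolated (an assumption, not a measured curve)
CTR_BY_POSITION = {1: 0.27, 2: 0.15, 3: 0.11, 4: 0.08, 5: 0.05,
                   6: 0.04, 7: 0.04, 8: 0.03, 9: 0.02, 10: 0.02}

def estimated_click_gain(impressions: int, current: int, target: int) -> int:
    """Expected extra clicks/month from moving a query between positions."""
    return round(impressions * (CTR_BY_POSITION[target] - CTR_BY_POSITION[current]))

print(estimated_click_gain(2400, 7, 3))  # → 168, roughly the +170/mo cited above
```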
Every persona-informed recommendation must name the persona and cite the specific language from that persona's language field that should appear in the rewrite.
| Metric | Value | vs Prior 28 days |
|---|---|---|
| Total Clicks | X | ↑/↓ X% |
| Impressions | X | ↑/↓ X% |
| Avg CTR | X% | ↑/↓ |
| Avg Position | X | ↑/↓ |
(Branded/non-branded split — only if brand terms were provided):
| Segment | Clicks | Impressions | CTR | Avg Position |
|---|---|---|---|---|
| Branded | X | X | X% | X |
| Non-branded | X | X | X% | X |
[1-sentence interpretation of the split — what it reveals about organic vs brand performance]
This section exists to back up the Priority Actions and surface anything else the user should know. Keep it concise — tables and short bullets, not prose paragraphs. Only include sub-sections where there are actual findings.
(From Phase 3.5. Only include if issues found.)
| Page | Coverage State | Last Crawl | Fix |
|---|---|---|---|
(Only include if cannibalization data is non-empty.)
| Query | Winner Page | Loser Pages | Action |
|---|---|---|---|
(Queries ranking 11–30 with >200 impressions and no dedicated page.)
| Query | Position | Impressions/mo | Recommended Action |
|---|---|---|---|
(Only pages not already covered in Priority Actions.)
| Page | Issue | Current | Recommended Fix |
|---|---|---|---|
(High-impact missing schema for this site type.)
| Page | Missing | Impact |
|---|---|---|
(Severity: Critical / High / Medium. Omit Low unless they surface as Priority Actions.)
| Issue | Pages Affected | Fix | Severity |
|---|---|---|---|
(From Phase 5.5. Only include if issues found. Lead with field data if available, fall back to lab data.)
Site-wide (Origin): [Overall CrUX rating if available]
| Page | Score | LCP | INP | CLS | Top Opportunity |
|---|---|---|---|---|---|
| / | [score] | [value] [rating] | [value] [rating] | [value] [rating] | [top opportunity title + savings] |
(If any page scores below 50, flag it as a Priority Action candidate — poor Core Web Vitals directly hurt rankings.)
(Pages/queries with >30% decline. Only include if not already in Priority Actions.)
| Page / Query | Change | Hypothesis | Next Step |
|---|---|---|---|
(Only if a CMS is configured. Top 5 impactful fixes only.)
| Page | Issue | Current | Fix |
|---|---|---|---|
List 2–3 things the data shows but that don't make the priority list — so the user knows you saw them and deprioritized them deliberately. One line each.
After the report, write the audit log entry (see Phase 6.5 below before ending).
After delivering the report, append a concise entry to the audit log. $DOMAIN and $AUDIT_LOG are already set from Step 0.5.
mkdir -p "$HOME/.toprank/audit-log"
Use Python to append (creates the file with a single-element array if it doesn't exist). Replace all <FILL> values with real data from the report before running:
import json, os
from datetime import datetime, timezone
log_path = os.path.expandvars("$AUDIT_LOG")  # resolves from the environment if the shell has not already substituted it
if os.path.exists(log_path):
    with open(log_path) as f:
        existing = json.load(f)
else:
    existing = []
existing.append({
"date": datetime.now(timezone.utc).strftime("%Y-%m-%d"),
"traffic_snapshot": {
"clicks": <FILL>,
"impressions": <FILL>,
"avg_ctr_pct": <FILL>,
"avg_position": <FILL>
},
"pagespeed_snapshot": {
"avg_score_mobile": <FILL or null>,
"avg_score_desktop": <FILL or null>,
"homepage_score_mobile": <FILL or null>,
"cwv_lcp_ms": <FILL or null>,
"cwv_inp_ms": <FILL or null>,
"cwv_cls": <FILL or null>,
"cwv_source": "<FILL: field|lab>" # "field" if CrUX data available, else "lab"
},
"top_issues": [
# One entry per Priority Action (max 5), in priority order
{"rank": 1, "title": "<FILL>", "type": "<FILL: title_tag|indexing|cannibalization|schema|content_gap|performance>", "page": "<FILL>", "metric": "<FILL>", "expected_impact": "<FILL>", "status": "open"}
],
"resolved_from_previous": [] # populated on next audit from Audit History comparison
})
with open(log_path, "w") as f:
    json.dump(existing, f, indent=2)
print(f"Audit log saved to {log_path}")
Confirm with a one-liner: "Audit log saved to ~/.toprank/audit-log/$DOMAIN.json."
After delivering the report, surface the follow-up actions based on what was found. Only offer handoffs where the audit actually found issues — do not offer all three if only one is relevant.
If the metadata audit found [N] pages with issues:
"I found [N] pages with metadata issues — [X] with title/query mismatches, [Y] missing meta descriptions, [Z] missing OG tags. Run /meta-tags-optimizer to generate optimized tags for each page. Share the metadata audit table from this report as context."
If the schema audit found gaps or errors:
"I found [N] pages missing high-impact schema and [N] pages with schema errors. Run /schema-markup-generator to generate correct JSON-LD. The schema audit table from this report is the input — it already identifies the site type and what schema types are needed per page."
If the keyword gap analysis found orphan keywords or business relevance gaps:
"I found [N] keyword gaps from GSC data. For deeper discovery — keywords you are not ranking for at all — run /keyword-research with these seed topics: [list 3-5 seed terms derived from the gap analysis]. That skill pulls from keyword databases and builds a full opportunity set beyond what GSC can see."
After delivering the report, if the Content Opportunities section identified actionable content gaps, offer to generate the content:
"I found [N] content opportunities. Want me to draft the content? I can write [blog posts / landing pages / both] in parallel — each one optimized for the target keyword and search intent."
If the user agrees, spawn content agents in parallel using the Agent tool. Each agent writes one piece of content independently.
For each content opportunity, determine the content type from the search intent:
Spawn agents in parallel. Each agent receives:
Before spawning agents, locate the content writing reference:
CONTENT_REF=$(find ~/.claude/plugins ~/.claude/skills ~/.codex/skills .agents/skills -name "content-writing.md" -path "*content-writer*" 2>/dev/null | head -1)
if [ -z "$CONTENT_REF" ]; then
echo "WARNING: content-writing.md not found. Content agents will use built-in knowledge only."
else
echo "Content reference at: $CONTENT_REF"
fi
Pass $CONTENT_REF as the path in each agent prompt below. If not found, omit
the "Read the content writing guidelines" line — the agents will still produce
good content using built-in knowledge.
Use this prompt template for each agent:
You are a senior content strategist writing a blog post that ranks on Google.
Read the content writing guidelines at: $CONTENT_REF
Follow the "Blog Posts" section exactly.
## Assignment
Target keyword: [keyword]
Current position: [position] (query ranked but no dedicated content)
Monthly impressions: [impressions]
Search intent: [informational / commercial investigation]
Site context: [what the site is about, its audience]
Existing pages to link to: [relevant internal pages from the analysis]
[If available] Competitor context: [what currently ranks for this keyword]
## Target Personas
Write primarily for: [Primary persona name]
Their goal: [primary goal]
Their language: [key terms and phrases they use — use these naturally in headings, intro, and body]
Their pain points: [pain points — address these directly, don't make them search for answers]
Secondary audience: [Secondary persona name if applicable] — [brief note on how to serve both without diluting focus]
## Deliverables
Write the complete blog post following the guidelines, including:
1. Full post in markdown with proper heading hierarchy
2. SEO metadata (title tag, meta description, URL slug)
3. JSON-LD structured data (Article/BlogPosting + FAQPage if FAQ included)
4. Internal linking plan (which existing pages to link to/from)
5. Publishing checklist
## Quality Gate
Before finishing, verify:
- Would the reader need to search again? (If yes, not done)
- Does the post contain specific examples only an expert would include?
- Does the format match what Google shows for this query?
- Is every paragraph earning its place? (No filler)
You are a senior conversion copywriter writing a landing page that ranks AND converts.
Read the content writing guidelines at: $CONTENT_REF
Follow the "Landing Pages" section exactly.
## Assignment
Target keyword: [keyword]
Current position: [position]
Monthly impressions: [impressions]
Search intent: [transactional / commercial]
Page type: [service / product / location / comparison]
Site context: [what the site is about, value prop, target customer]
Existing pages to link to: [relevant internal pages]
[If available] Competitor context: [what currently ranks]
## Target Personas
Write primarily for: [Primary persona name]
Their goal: [primary goal when landing here]
Their language: [terms they use — mirror this in headlines, subheads, and CTAs]
Their decision trigger: [what makes them convert — address this prominently above the fold]
Their objections: [pain points and doubts — address each explicitly, don't leave them wondering]
## Deliverables
Write the complete landing page following the guidelines, including:
1. Full page copy in markdown with proper heading hierarchy and CTA placements
2. SEO metadata (title tag, meta description, URL slug)
3. Conversion strategy (primary CTA, objections addressed, trust signals)
4. JSON-LD structured data
5. Internal linking plan
6. Publishing checklist
## Quality Gate
Before finishing, verify:
- Would you convert after reading this? (If not, what is missing?)
- Are there vague claims that should be replaced with specifics?
- Is every objection addressed?
- Is it clear what the visitor should do next?