Skill

scrape-codegen-analyze

Analyzes an HTML page against a schema and expected values to produce detailed field extraction instructions, covering CSS selectors, JSON-LD paths, microdata, OpenGraph tags, and script data, for code generation.

automation

npx claudepluginhub zytedata/claude-skills --plugin zyte-web-data

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/zyte-web-data:scrape-codegen-analyze [page-html-path] [work-path] [spec-path] [values-path]

User invocable

Model invocable

Inline context

Default effort

Argument hint: [page-html-path] [work-path] [spec-path] [values-path]

Tool Access

This skill is limited to the following tools:

Skill, Bash, Read, Write

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

You are analyzing a detail page to produce extraction instructions for a code generation system. Given an HTML page, a schema, and expected values, you determine WHERE and HOW each field can be extracted from the page.

SKILL.md

108 lines · ~1.3k tokens

Stats

Language: Python

Stars: 6

Forks: 2

Maintenance: Excellent

Last Commit: Jun 2, 2026

mkdir -p {work_path}/codegen-analyze
uv run ${CLAUDE_SKILL_DIR}/../scrape-analyze-page/scripts/clean_html.py PAGE.html -l0 -o {work_path}/codegen-analyze/{page_id}.cleaned.html
uv run ${CLAUDE_SKILL_DIR}/../scrape-analyze-page/scripts/extract_metadata.py PAGE.html -u PAGE_URL -o {work_path}/codegen-analyze/{page_id}.metadata.json
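The preparation commands above can also be assembled programmatically. This is a minimal Python sketch, assuming the script locations and flags shown above (`-l0`, `-o`, `-u`); the helper name `build_prepare_commands` and its signature are hypothetical and not part of the skill. It only builds the argument lists and does not execute anything.

```python
import os
from pathlib import Path


def build_prepare_commands(page_html, page_url, work_path, page_id, skill_dir=None):
    """Return (out_dir, commands) mirroring the shell commands above.

    Hypothetical helper: the name and signature are illustrative only.
    Nothing is executed here.
    """
    # CLAUDE_SKILL_DIR is read from the environment unless passed explicitly.
    skill_dir = Path(skill_dir or os.environ.get("CLAUDE_SKILL_DIR", "."))
    scripts = skill_dir / ".." / "scrape-analyze-page" / "scripts"
    out_dir = Path(work_path) / "codegen-analyze"

    clean_cmd = [
        "uv", "run", str(scripts / "clean_html.py"), str(page_html),
        "-l0", "-o", str(out_dir / f"{page_id}.cleaned.html"),
    ]
    metadata_cmd = [
        "uv", "run", str(scripts / "extract_metadata.py"), str(page_html),
        "-u", page_url, "-o", str(out_dir / f"{page_id}.metadata.json"),
    ]
    return out_dir, [clean_cmd, metadata_cmd]
```

To actually run the steps, one could create the directory with `out_dir.mkdir(parents=True, exist_ok=True)` and pass each list to `subprocess.run(cmd, check=True)`.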

{
  "url": "https://example.com/product/widget-x",
  "page_id": "detail-1",
  "fields": {
    "name": {
      "target_value": "Widget X",
      "analysis": "The product name appears in two places:\n\n1. **HTML element** `<h1 class=\"product-title\">Widget X</h1>`\n - Selector: `h1.product-title::text`\n - Clean text, no post-processing needed\n - Reliable: unique h1 on the page\n\n2. **JSON-LD** in `<script type=\"application/ld+json\">`:\n ```json\n {\"@type\": \"Product\", \"name\": \"Widget X\", ...}\n ```\n - Path: `name` on the Product object\n - Also reliable\n\nRecommended: CSS selector `h1.product-title::text` — simplest, most direct."
    },
    "price": {
      "target_value": "$29.99",
      "analysis": "..."
    }
  }
}
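The JSON-LD source named in the example analysis can be checked mechanically against the target value. This is a minimal sketch using only the Python standard library; the names `JsonLdCollector` and `find_jsonld_value` are hypothetical, and it assumes the value sits under a top-level key of the matching object. Pages that nest objects under `@graph` or deeper paths would need more handling.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect parsed objects from <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.objects = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed blocks
            # A block may hold a single object or a list of objects.
            self.objects.extend(parsed if isinstance(parsed, list) else [parsed])


def find_jsonld_value(html, schema_type, key):
    """Return obj[key] from the first JSON-LD object of the given @type, else None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for obj in collector.objects:
        if isinstance(obj, dict) and obj.get("@type") == schema_type and key in obj:
            return obj[key]
    return None


# Sample markup modeled on the example analysis above.
sample = (
    '<h1 class="product-title">Widget X</h1>'
    '<script type="application/ld+json">'
    '{"@type": "Product", "name": "Widget X"}'
    "</script>"
)
print(find_jsonld_value(sample, "Product", "name"))  # Widget X
```

Comparing the returned value with `target_value` gives a quick signal on whether the JSON-LD path is a usable source for the field.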

detail-1 (https://...):
  name: "Widget X" — h1.product-title, also in JSON-LD
  price: "$29.99" — span.price::text, JSON-LD offers.price
  description: "A premium widget..." (2340 chars) — div.description
  rating: null — not found in HTML
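A summary in the shape shown above could be produced along these lines. This is a rough sketch, not the skill's own code: the function names, the `(value, note)` input shape, and the 40-character truncation threshold are all assumptions, since the source does not state how long values are shortened.

```python
SEP = " \u2014 "  # same dash separator as in the example summary above
MAX_LEN = 40  # assumed truncation threshold; the source does not state one


def format_value(value):
    """Render a target value: null if missing, truncated with a length note if long."""
    if value is None:
        return "null"
    text = str(value)
    if len(text) > MAX_LEN:
        return f'"{text[:MAX_LEN].rstrip()}..." ({len(text)} chars)'
    return f'"{text}"'


def format_summary(page_id, url, fields):
    """Build the summary text; fields maps field name -> (target_value, location note)."""
    lines = [f"{page_id} ({url}):"]
    for name, (value, note) in fields.items():
        lines.append(f"  {name}: {format_value(value)}{SEP}{note}")
    return "\n".join(lines)


summary = format_summary(
    "detail-1",
    "https://example.com/product/widget-x",
    {
        "name": ("Widget X", "h1.product-title, also in JSON-LD"),
        "price": ("$29.99", "span.price::text, JSON-LD offers.price"),
        "rating": (None, "not found in HTML"),
    },
)
print(summary)
```

Keeping the location note separate from the value keeps each line short while the full reasoning stays in the saved analysis file.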

Input

Process

1. Read inputs and prepare HTML

2. Analyze each field

3. Determine target values

4. Save analysis

5. Return summary
