From web-intel
Extract structured content from a URL — Twitter/X, GitHub, YouTube, Reddit, or any webpage. Triggers: "scrape" | "fetch url" | "extract content" | "scrape https://" | "get content from" | "extract this page" | "pull data from" | "fetch this page".
npx claudepluginhub roxabi/roxabi-plugins --plugin web-intelThis skill is limited to using the following tools:
Extract structured content from a URL → JSON with content, metadata, and platform-specific fields.
Extracts clean Markdown from web pages by stripping navigation, ads, sidebars, footers, and boilerplate using Defuddle. Use for reading docs, articles, blog posts, research papers, or release notes.
Reads public URLs into clean Markdown with platform-aware fallback strategies for WeChat, Zhihu, Bilibili, X/Twitter, and generic sites using Jina Reader, WebFetch, Playwright.
Ingests URLs to extract structured content from YouTube videos (via Gemini Video API), web articles, PDFs, and audio files, providing transcripts, summaries, and metadata for LLM pipelines.
Share bugs, ideas, or general feedback.
Extract structured content from a URL → JSON with content, metadata, and platform-specific fields.
/scrape https://example.com
¬U → → DP(B)to get one.
PLUGIN_ROOT=$(find ~/projects -maxdepth 4 -path "*/web-intel/pyproject.toml" -print -quit 2>/dev/null | xargs dirname)
if [ -z "$PLUGIN_ROOT" ]; then
echo "ERROR: web-intel plugin not found. Install: claude plugin install web-intel"
exit 1
fi
First invocation in session only:
cd "$PLUGIN_ROOT" && uv run python scripts/doctor.py
cd "$PLUGIN_ROOT" && SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt uv run python scripts/scraper.py "$URL"
Parse JSON → clean summary: content type (twitter|github|youtube|reddit|webpage), title, author (if ∃), content preview (first 500 chars), platform-specific metadata (stars, score, view_count, like_count, upload_date, tags, chapters, transcript, etc.). Show full JSON in <details> block.
success: false → show error, suggest alternatives (WebFetch, manual input)cd $PLUGIN_ROOT && uv sync --extra all$ARGUMENTS