Gathers content from URLs (auto-detects Google/Slack/Notion/GitHub), web searches (Tavily/Exa), and local codebase into markdown artifacts for stable reasoning context.
npx claudepluginhub corca-ai/claude-plugins --plugin cwf

This skill uses the workspace's default tool permissions.
Convert scattered sources into local, reusable artifacts so later phases reason over stable context instead of transient links.
Skill contents:

- README.md
- references/TOON.md
- references/google-export.md
- references/notion-export.md
- references/query-intelligence.md
- references/search-api-reference.md
- references/slack-export.md
- scripts/code-search.sh
- scripts/csv-to-toon.sh
- scripts/extract.sh
- scripts/g-export.sh
- scripts/notion-to-md.py
- scripts/search.sh
- scripts/slack-api.mjs
- scripts/slack-to-md.sh
cwf:gather <url> URL auto-detect → download to OUTPUT_DIR
cwf:gather <url1> <url2> ... Multiple URLs
cwf:gather --search <query> Web search (Tavily)
cwf:gather --search --news <q> News search (Tavily)
cwf:gather --search --deep <q> Advanced depth search (Tavily)
cwf:gather --search code <query> Code/technical search (Exa)
cwf:gather --local <query> Local codebase exploration
cwf:gather Usage guide
cwf:gather help Usage guide
Why this exists: convert scattered sources into local, reusable artifacts so later phases reason over stable context instead of transient links.

Workflow:

1. Determine the mode from the arguments (URL(s) | `--search` | `--local` | `help`)
2. Resolve `OUTPUT_DIR` before any writes (URL and `--local` modes)
3. Optionally use `--search` for supplementary research if helpful

Before any file write, resolve the output path in this order:

1. `CWF_GATHER_OUTPUT_DIR` environment variable
2. `.cwf/projects` (when writable)
3. `gather-output` fallback directory

Then run:
mkdir -p "$OUTPUT_DIR"
If directory creation fails, stop that target with an explicit error and ask whether to provide a different output directory.
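The resolution order above can be sketched as a small bash helper (`resolve_output_dir` is an illustrative name, not one of the skill's scripts):

```shell
# Resolve OUTPUT_DIR: env var > .cwf/projects (when writable) > fallback.
resolve_output_dir() {
  if [ -n "${CWF_GATHER_OUTPUT_DIR:-}" ]; then
    printf '%s\n' "$CWF_GATHER_OUTPUT_DIR"
  elif mkdir -p .cwf/projects 2>/dev/null && [ -w .cwf/projects ]; then
    printf '%s\n' ".cwf/projects"
  else
    printf '%s\n' "gather-output"
  fi
}

OUTPUT_DIR="$(resolve_output_dir)"
# Stop the target with an explicit error if the directory cannot be created.
mkdir -p "$OUTPUT_DIR" || { echo "error: cannot create $OUTPUT_DIR" >&2; exit 1; }
```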
Scan input for all URLs. Classify each by pattern table (most specific first):
| URL Pattern | Handler | Script / Tool |
|---|---|---|
| `docs.google.com/{document,presentation,spreadsheets}/d/*` | Google Export | `scripts/g-export.sh` |
| `*.slack.com/archives/*/p*` | Slack to MD | `scripts/slack-api.mjs` + `scripts/slack-to-md.sh` |
| `*.notion.site/*`, `www.notion.so/*` | Notion to MD | `scripts/notion-to-md.py` |
| `github.com/*` | GitHub | `gh` CLI |
| Any other URL | Generic | scripts/extract.sh → WebFetch fallback |
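The pattern table can be mirrored as a bash `case` dispatch, most specific patterns first (a sketch; `classify_url` is an illustrative helper name):

```shell
# Classify a URL into a handler name, most specific patterns first.
classify_url() {
  case "$1" in
    https://docs.google.com/document/d/*|\
    https://docs.google.com/presentation/d/*|\
    https://docs.google.com/spreadsheets/d/*)         echo google ;;
    https://*.slack.com/archives/*/p*)                echo slack ;;
    https://*.notion.site/*|https://www.notion.so/*)  echo notion ;;
    https://github.com/*)                             echo github ;;
    http://*|https://*)                               echo generic ;;
    *)                                                echo unsupported ;;
  esac
}
```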
{SKILL_DIR}/scripts/g-export.sh <url> [format] [output-dir]
Prerequisites and format caveats live in references/google-export.md. TOON behavior for Sheets is defined in references/TOON.md, implemented via scripts/csv-to-toon.sh through g-export.sh.
URL format: https://{workspace}.slack.com/archives/{channel_id}/p{timestamp}
Parse thread_ts: p{digits} → {first10}.{rest} (e.g., p1234567890123456 → 1234567890.123456)
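In bash, that parse rule looks like this (timestamp taken from the example above):

```shell
# p{16 digits} -> {first 10}.{remaining 6}
p_token="p1234567890123456"
digits="${p_token#p}"                     # strip the leading "p"
thread_ts="${digits:0:10}.${digits:10}"   # insert the dot after 10 digits
echo "$thread_ts"                         # 1234567890.123456
```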
node {SKILL_DIR}/scripts/slack-api.mjs <channel_id> <thread_ts> --attachments-dir OUTPUT_DIR/attachments | \
{SKILL_DIR}/scripts/slack-to-md.sh <channel_id> <thread_ts> <workspace> OUTPUT_DIR/<output_file>.md [title]
After conversion, rename the file to a meaningful name derived from the first message (lowercase, hyphens, max 50 chars). For an existing .md file, extract the Slack URL from its > Source: line to re-fetch.
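A minimal sketch of that rename rule (lowercase, hyphens, strip special characters, cap at 50 chars; `slugify` is an illustrative helper name):

```shell
# Sanitize a title into a filename slug per the rule above.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr ' ' '-' \
    | tr -cd 'a-z0-9-' \
    | cut -c1-50
}
```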
Prerequisites, token setup, and error recovery are defined in references/slack-export.md.
python3 {SKILL_DIR}/scripts/notion-to-md.py "$URL" "$OUTPUT_PATH"
Publication requirements and known limitations are defined in references/notion-export.md.
For github.com URLs, use the gh CLI to extract content as markdown.
Prerequisite check: Verify command -v gh first. If gh is not available, fall through to Generic handler.
When gh is missing for a GitHub URL, do not silently downgrade to the Generic handler. Ask the user:
1. Install gh now (recommended) — run `bash {SKILL_DIR}/../setup/scripts/install-tooling-deps.sh --install gh`, then retry the GitHub handler once.
2. Continue with Generic handler — proceed with reduced metadata extraction.
3. Skip this URL — do not process this GitHub URL in this run.

| URL type | Command |
|---|---|
| PR (path pattern: /pull/N) | gh pr view <url> --json title,body,state,author,comments --template '...' |
| Issue (path pattern: /issues/N) | gh issue view <url> --json title,body,state,author,comments --template '...' |
| Repository (owner/repo) | gh repo view <url> --json name,description,readme |
| Other GitHub URL | Fall through to Generic handler |
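The URL-type dispatch in the table can be sketched as follows (illustrative helper; the real handler may parse differently):

```shell
# Map a github.com URL to pr | issue | repo | other.
github_url_type() {
  url="${1%/}"                             # drop a trailing slash
  case "$url" in
    */pull/[0-9]*)   echo pr; return ;;
    */issues/[0-9]*) echo issue; return ;;
  esac
  path="${url#https://github.com/}"
  case "$path" in
    */*/*) echo other ;;                   # deeper paths -> Generic handler
    */*)   echo repo ;;                    # bare owner/repo
    *)     echo other ;;
  esac
}
```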
Save output to {OUTPUT_DIR}/{type}-{owner}-{repo}-{number}.md.
Template for PR/Issue (pass to --template):
# {{.title}}
State: {{.state}} | Author: {{.author.login}}
{{.body}}
{{range .comments}}---
**{{.author.login}}** ({{.createdAt}}):
{{.body}}
{{end}}
For URLs that don't match any known service, run this deterministic routine:
Resolve paths:
- `slug`: sanitized title/url token (lowercase, spaces to hyphens, remove special characters, max 50 chars)
- `output_md`: `{OUTPUT_DIR}/{slug}.md`
- `output_meta`: `{OUTPUT_DIR}/{slug}.meta.yaml`

Mandatory URL safety precheck (before any fetch/extract):
Resolve and record `scheme`, `host`, and `resolved_ips` (A/AAAA when available). Block the URL when any `blocked_reason_code` applies:

- `non_http_scheme` — scheme is not http or https
- `localhost_target` — host is localhost or ends with .localhost
- `loopback_target` — host/IP in 127.0.0.0/8 or ::1/128
- `link_local_target` — host/IP in 169.254.0.0/16 or fe80::/10
- `private_ipv4_target` — host/IP in RFC1918 ranges (10/8, 172.16/12, 192.168/16)
- `private_ipv6_target` — host/IP in fc00::/7

When blocked, present two options:

- Override once for this URL and continue (explicit user confirmation required)
- Skip this URL (default)

Try Tavily extract first:
{SKILL_DIR}/scripts/extract.sh "<url>" > "{output_md}.tmp"
- Success: exit code `0` and the temp file has non-whitespace content.
- On success, move the temp file to `{output_md}` and set metadata `method: tavily-extract`.
- If `TAVILY_API_KEY` is missing or extraction fails, continue to Step 4.

WebFetch fallback (single fixed procedure):
Fetch this URL with WebFetch: <url>
Return markdown only (preserve headings, lists, and links).
If content cannot be retrieved, return exactly: WEBFETCH_EMPTY
If the response is anything other than `WEBFETCH_EMPTY`, save it to `{output_md}` and set metadata `method: webfetch-fallback`.

Empty-output handling:
Treat the target as failed when `{output_md}` is missing, whitespace-only, or the fallback response equals `WEBFETCH_EMPTY`.

Metadata capture (always required):
Write `{output_meta}` with at least:

- `source_url`
- `retrieved_at_utc` (ISO 8601 UTC)
- `handler: generic`
- `safety_precheck`:
  - `status` (`passed`, `blocked`, or `overridden`)
  - `blocked_reason_code` (empty when passed)
  - `host`
  - `resolved_ips`
- `method` (`tavily-extract`, `webfetch-fallback`, or `none`)
- `status` (`success` or `failed`)
- `output_file` (empty when failed)
- `failure_reason` (when failed)

When the URL safety precheck blocks a URL, record `status: failed`, `method: none`, `output_file: ""`, and `failure_reason: url_safety_blocked`.

Web and code search via external APIs.
Read references/query-intelligence.md before executing search — it contains the routing logic and parameter tables.
| Command | Backend | Script |
|---|---|---|
| `--search <query>` | Tavily | `scripts/search.sh` |
| `--search --news <query>` | Tavily (topic: news) | `scripts/search.sh --topic news` |
| `--search --deep <query>` | Tavily (advanced) | `scripts/search.sh --deep` |
| `--search code <query>` | Exa | `scripts/code-search.sh` |
Routing: a `code` prefix or auto-detected code context → `code-search.sh`; otherwise → `search.sh`.

{SKILL_DIR}/scripts/search.sh [--topic news|finance] [--time-range day|week|month|year] [--deep] "<query>"
{SKILL_DIR}/scripts/code-search.sh [--tokens NUM] "<query>"
Graceful degradation: if an API key is missing, the scripts print setup instructions to stderr. Do not stop at the error message; ask whether to configure now:

1. Configure now (recommended) — set the missing API key (`TAVILY_API_KEY` or `EXA_API_KEY`) in the shell profile or process env, run `cwf:setup --tools` if runtime dependencies are also missing, then retry the same search once. Do not rely on `cwf:setup --env` for API key provisioning.
2. Skip search for now — continue without search results.
3. Show setup commands only — print the exact export/setup commands.
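A guard along these lines can front the retry flow (`require_key` is an illustrative name; the real scripts print their own instructions):

```shell
# Fail fast with a setup hint when the named API key is absent.
require_key() {
  key_name="$1"
  eval "key_val=\${$key_name:-}"
  if [ -z "$key_val" ]; then
    echo "$key_name is not set; export it or run cwf:setup --tools" >&2
    return 1
  fi
}
```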
See references/search-api-reference.md.

Note: queries are sent to external search services. Do not include confidential code or sensitive information in search queries.
Explore the local codebase for a topic and save structured results.
Input mapping:
- `query_raw`: original `--local` argument
- `query_slug`: sanitized query token (lowercase, spaces to hyphens, remove special characters, max 50 chars)
- `output_md`: `{OUTPUT_DIR}/local-{query_slug}.md`
- `output_meta`: `{OUTPUT_DIR}/local-{query_slug}.meta.yaml`

Task prompt contract:
Explore this codebase for: <query_raw>.
Use Glob, Grep, and Read to find relevant code, patterns, and architecture.
Return a structured markdown summary with:
## Overview
## Key Files
## Code Patterns
## Notable Details
Include file paths and line references where possible.
Write your complete output to: <output_md>
The file MUST exist when you finish and must end with: <!-- AGENT_COMPLETE -->
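That completion contract can be checked mechanically; a sketch (helper name illustrative):

```shell
# File must exist, contain non-whitespace content, and end with the marker.
verify_agent_output() {
  f="$1"
  [ -s "$f" ] || return 1
  grep -q '[^[:space:]]' "$f" || return 1
  tail -n 1 "$f" | grep -qF '<!-- AGENT_COMPLETE -->'
}
```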
If query_raw indicates browser-runtime debugging (for example: console error, DOM interaction failure, CDP/DevTools reproduction, viewport/mobile regression), require the Task agent to follow Web Debug Loop Protocol in addition to normal codebase exploration.
In this branch, the Task output must also include:
- Evidence artifacts saved under `{OUTPUT_DIR}/debug/...`

Execution and failure handling:
Verify the result (`output_md` exists, has non-whitespace content, and ends with `<!-- AGENT_COMPLETE -->`).

Provenance metadata guidance:
Write `output_meta` with: `mode: local`, `query_raw`, `query_slug`, `subagent_type`, `attempts`, `status`, `output_md` (if any), `generated_at_utc`. On failure, also record `failure_reason` and preserve diagnostics from the last Task run.

| Variable | Default | Description |
|---|---|---|
| `CWF_GATHER_OUTPUT_DIR` | `.cwf/projects` | Unified default output directory |
| `CWF_GATHER_GOOGLE_OUTPUT_DIR` | (falls back to unified) | Google-specific override |
| `CWF_GATHER_NOTION_OUTPUT_DIR` | (falls back to unified) | Notion-specific override |
| `TAVILY_API_KEY` | — | Required for `--search` and generic URL extract |
| `EXA_API_KEY` | — | Required for `--search code` |
Output dir priority: CLI argument > service-specific env var > CWF_GATHER_OUTPUT_DIR > .cwf/projects (if writable) > gather-output fallback directory
When a service-specific env var is not set, pass the unified output dir as a CLI argument to the handler script.
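The fallback is a plain parameter expansion; for the Google handler it might look like this (paths illustrative):

```shell
# Service-specific override wins; otherwise pass the unified dir as the
# handler's CLI argument.
OUTPUT_DIR="${CWF_GATHER_OUTPUT_DIR:-.cwf/projects}"
google_out="${CWF_GATHER_GOOGLE_OUTPUT_DIR:-$OUTPUT_DIR}"
# e.g. scripts/g-export.sh "$url" md "$google_out"
```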
After gathering URL content, if best practices, reference documentation, or supplementary context would help the user, use the search scripts directly via Bash (not the WebSearch tool):
{SKILL_DIR}/scripts/search.sh "<query>"
Examples:
Print when no args or "help":
Gather Context — Unified Information Acquisition
Usage:
cwf:gather <url> Gather content from URL (auto-detect service)
cwf:gather --search <query> Web search (Tavily)
cwf:gather --search --news <q> News search
cwf:gather --search --deep <q> Deep search
cwf:gather --search code <query> Code/technical search (Exa)
cwf:gather --local <query> Explore local codebase
Supported URL services:
Google Docs/Slides/Sheets, Slack threads, Notion pages, GitHub PRs/issues, generic web
Environment variables:
TAVILY_API_KEY Web search and URL extraction (https://app.tavily.com)
EXA_API_KEY Code search (https://dashboard.exa.ai)
Output dir priority: CLI argument > service-specific env var > CWF_GATHER_OUTPUT_DIR > .cwf/projects (if writable) > gather-output fallback directory

Browser-runtime debugging queries in --local mode must follow the Web Debug Loop Protocol and persist evidence paths.