Help us improve
Share bugs, ideas, or general feedback.
From beagle-analysis
Structured, cited extraction of insights and context from local documents and project knowledge. Writes a scan plan, parallel-subagent findings, and a cited synthesis report to disk — never inline prose.
npx claudepluginhub existential-birds/beagle --plugin beagle-analysisHow this skill is triggered — by the user, by Claude, or both
Slash command
/beagle-analysis:artifact-analysisThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Turn a set of local paths (or a beagle project's conventional knowledge locations) into a cited, structured extraction of insights, context, decisions, and raw detail.
Based on the Recursive Language Models (RLM) research by Zhang, Kraska, and Khattab (2025), this skill provides strategies for handling tasks that exceed comfortable context limits through programmatic decomposition and recursive self-invocation. Triggers on phrases like "analyze all files", "process this large document", "aggregate information from", "search across the codebase", or tasks involving 10+ files or 50k+ tokens.
Orchestrates autonomous deep research on codebases and technical topics via map-reduce explorer architecture with sub-agents, generating structured reports.
Bulk imports knowledge from files, directories, or URLs into structured backlogs, or captures a single document with a 5-section template (claims, worth-keeping, contested, action, reaction).
Share bugs, ideas, or general feedback.
Turn a set of local paths (or a beagle project's conventional knowledge locations) into a cited, structured extraction of insights, context, decisions, and raw detail.
The deliverable is always on disk: a written scan plan the caller can audit, one findings file per slice, and a synthesized report with path-anchored citations. Nothing returns as inline prose, and no claim ships without a source path + verbatim excerpt behind it.
references/companion-contract.md).web-research.llm-judge.humanize-beagle or the file tools.beagle-core:docling is the future path.Four steps, in order. No step is skippable.
Advance to the next step only when the pass condition is true—confirm using files under output_dir (and tool output), not memory.
| After | Pass condition |
|---|---|
plan.md written | plan.md exists and includes intent, resolved paths, slices, per-slice briefs, skip patterns, budgets applied, and synthesis approach (same fields as The scan plan (plan.md)). |
| Subagent dispatch | Either the empty corpus path was taken (no subagents; plan.md documents zero readable documents) or every slice listed in plan.md has findings/<slice-slug>.md on disk. |
report.md written | report.md exists; headings match references/report-template.md (seven sections plus ## Sources). |
| Before return to caller | Every row of references/failure-modes.md → Verification checklist (orchestrator runs at end) is checked off, or any failed check is recorded under ## Gaps & Limitations in report.md as that failure-modes file prescribes. |
plan.md — resolved paths (with any auto-discovery applied), intent summary (when provided), per-slice briefs, skip patterns, and how findings will be synthesized.findings/<slice-slug>.md under output_dir.report.md — fold findings into the seven fixed sections with path-anchored citations.references/failure-modes.md (Verification checklist (orchestrator runs at end)). Any check that fails becomes an entry in Gaps & Limitations per that file—do not return a deliverable with silent checklist failures.Receive paths + optional intent ──→ Auto-discover if paths empty
↓
Write plan.md (no user-confirmation pause)
↓
Dispatch subagents (up to 3 parallel)
↓
Collect findings/<slice>.md files
↓
Synthesize report.md
↓
Return paths to caller
Unlike web-research, artifact-analysis does not pause for a plan review gate. Local scanning is cheap; plan.md is written for auditability so a reader weeks later can tell what each subagent was told. Unlike web-research, there is no fail-fast on missing tools — filesystem tooling (Read, Glob, Grep) is assumed present in the Claude Code environment.
| Field | Type | Required | Default | Purpose |
|---|---|---|---|---|
intent | string | no | — | What the caller is looking for / why. When absent, the skill extracts anything structurally important. |
paths | list of strings | no | auto-discover | Directories and/or explicit files. When absent, auto-discover (see below). |
output_dir | absolute path | no | derived | Where plan.md, findings/, and report.md land. |
refresh | bool | no | false | When true, allow overwriting a prior run in the same output_dir. |
The skill does not parse caller-specific structures. Callers pass an intent string and/or a path list.
When paths is absent or empty, scan beagle's conventional knowledge locations:
.beagle/concepts/ — concept specs and analysis folders..planning/ — roadmap, state, and phase artifacts.docs/ — project documentation.README*, BRIEF*, OVERVIEW*, CONTEXT*, CLAUDE.md, AGENTS.md.Resolved paths (including any auto-discovery) are listed verbatim in plan.md so the caller can see exactly which files were included.
intent present — extraction is targeted to what is relevant to the intent. Off-topic material goes into Raw Detail Worth Preserving only when it is a specific quote or metric worth keeping.intent absent — generic-salient-extraction mode. Subagents extract anything structurally important (insights, decisions, technical constraints, user/market context) without an interpretive filter.If the caller passes output_dir, use it verbatim. Otherwise derive the default:
.beagle/analysis/<slug>/
Slug derivation (stable so re-running the same input on the same day lands on the same folder):
intent is present, slug from the intent string: lowercase, strip punctuation, collapse whitespace to single hyphens, truncate to 60 characters on a word boundary (cut at the last hyphen before 60; if no hyphen exists before position 60, hard-cut at 60).intent is absent, slug from the first scanned path's basename using the same rules.YYYY-MM-DD-.Re-run protection. Before writing anything, check whether output_dir already contains plan.md or report.md. If it does and refresh is not true, refuse with a message naming the existing folder. When refresh: true, archive the prior contents into <output_dir>/.archive-<timestamp>/ before starting fresh. See references/failure-modes.md.
Callers embedding artifact-analysis in a concept-folder convention (e.g. prfaq-beagle) pass output_dir explicitly so the analysis sits next to its consumer: .beagle/concepts/<concept-slug>/analysis/.
plan.md)The plan is written before any subagent runs. It is the audit trail, not a review gate — the skill does not pause for user confirmation.
plan.md contains:
"generic salient extraction" when intent is absent.plan.md can predict what each subagent was told.references/skip-patterns.md).report.md.Report: Wrote plan.md and proceed to dispatch. No pause, no gate.
Up to 3 subagents run concurrently over non-overlapping slices. Each gets a mechanically-derived brief built from plan.md — no interpretation drift between the plan and the briefs. The brief template lives in references/subagent-brief.md.
Each subagent:
findings/<slice-slug>.md under output_dir.The orchestrator waits for all subagents to finish, then verifies every expected findings file exists before synthesis. A missing file is a silent failure, recorded in Gaps & Limitations — see references/failure-modes.md.
Subagents do not read everything end-to-end. They apply:
index.md plus multiple sub-files) — read index.md first, then only the sub-files the index points to as relevant.Sensitive, binary, and vendor/build paths are skipped silently without the caller re-specifying them. The default denylist lives in references/skip-patterns.md. Each subagent records the paths it skipped under the paths_skipped frontmatter field so the audit trail shows exactly what was excluded.
Every claim in a findings file and in report.md carries a citation. The shape lives in references/citation-schema.md. At a glance:
path (relative to the scanned root when possible), excerpt (a verbatim quoted string from the document).lines (line range or single line, only when the subagent naturally has them — never synthesized), heading (the nearest enclosing heading), document_type.Inline references use [^n] footnotes; the full citation sits in the numbered Sources section at the bottom of the report.
report.md)The report has a fixed seven-section layout, in this order. Every section is required, every time — even when a section is thin, the report includes a bullet saying so.
## Documents Found — each included path with a one-line note on its relevance.## Key Insights — the highest-signal observations from the corpus, grouped by theme.## User / Market Context — users, customers, competition, market data surfaced from the documents.## Technical Context — platforms, constraints, integrations, dependencies.## Ideas & Decisions — each tagged accepted, rejected, or open with rationale. Rejected ideas are preserved deliberately so future work does not re-propose them.## Raw Detail Worth Preserving — specific quotes, data points, metrics, and other detail that would be lost to summarization.## Gaps & Limitations — what the corpus could not establish; which paths were empty, skipped, or unreadable; which subagents failed.Gaps & Limitations is required even when the scan looks complete. Honest accounting of what was and was not in the corpus is part of the product. The full literal skeleton the skill copies from lives in references/report-template.md.
Gaps & Limitations, including the last-known brief and the stub-file reason.plan.md and a minimal report.md with a single "no documents found" bullet under Gaps & Limitations, does not spawn subagents, and returns cleanly to the caller. Callers decide how to proceed.findings/<slice-slug>.md with status: frontmatter (ok, empty, failed) before returning. Missing file after dispatch = silent failure, recorded in Gaps & Limitations.references/failure-modes.md.Unlike web-research, artifact-analysis does not fail-fast on missing tools. Filesystem tooling (Read, Glob, Grep) is assumed present in the Claude Code environment. If it is somehow absent, the skill will surface that as a subagent failure under partial-success rather than aborting the whole run.
Full rules and the structured error shape live in references/failure-modes.md.
Tunable knobs, not hard-coded invariants:
| Knob | Default |
|---|---|
| Parallel subagents | 1-3 |
| Slice overlap | none (enforced) |
| Skim threshold (large) | > ~2000 lines or > ~50 pages |
A caller that needs narrower or broader scope overrides by passing a more specific paths list — one path per subagent forces narrower slicing; a single folder lets the skill partition internally.
Other beagle skills invoke this one via a small, documented contract. The minimal call passes only intent; the full call adds paths, output_dir, and refresh.
Worked examples for the three known callers (prfaq-beagle, brainstorm-beagle, strategy-interview) plus the success and refusal return shapes live in references/companion-contract.md. Callers are expected to honor the contract verbatim rather than invent parallel invocation styles.
This skill is a tone-neutral primitive. It does not:
llm-judge's job).If the caller is a coaching skill (prfaq-beagle, brainstorm-beagle), the coaching happens before and after this skill runs. Inside this skill, the intent is treated as final.
llm-judge.beagle-core:docling is the future path.references/subagent-brief.md — template the orchestrator mechanically fills from plan.md when dispatching each subagent.references/citation-schema.md — required and optional citation fields, footnote convention, and a well-formed example.references/report-template.md — literal report.md skeleton with all seven fixed sections.references/failure-modes.md — partial-success, empty-corpus, silent-failure detection, and re-run protection rules.references/companion-contract.md — programmatic invocation shape with worked examples for prfaq-beagle, brainstorm-beagle, and strategy-interview.references/skip-patterns.md — default denylist (sensitive, binary/media, vendor/build) applied to every run.