From knowledge-distillery
Extracts evidence identifiers from a merged PR and posts an Evidence Bundle Manifest comment. Stage A of the distillation pipeline — lightweight, identifier-only, no content fetching. Triggered on PR merge or manual invocation. Use after a PR merge to begin knowledge tracking, or manually with a specific PR number to retroactively mark evidence.
`npx claudepluginhub ether-moon/knowledge-distillery --plugin knowledge-distillery`

This skill uses the workspace's default tool permissions.
- A PR has been merged to `main` or `master` (GitHub Actions `pull_request.closed` + `merged == true`)
- Manual invocation with a PR number: `/knowledge-distillery:mark-evidence <pr_number>`
- Manual invocation: `/knowledge-distillery:mark-evidence`
- `pull_requests`, `issues`, `labels` toolsets
- `git` with access to `refs/notes/commits` (memento notes)

Use ONLY: GitHub MCP (read + write), git, Linear MCP (read-only), Bash, Read, Glob, Grep.
Do NOT use any other tools. Do NOT write files. Do NOT access vault.db or knowledge-gate CLI.
## Input

| Field | Source | Format |
|---|---|---|
| PR number | GitHub Actions event context or manual argument | Integer |
| Repository | GitHub Actions context or derived via GitHub MCP | owner/repo |
| Merge SHA | GitHub Actions context or derived via GitHub MCP | Hex string |

## Output

| Artifact | Format | Consumer |
|---|---|---|
| PR comment | Evidence Bundle Manifest (per evidence-manifest.spec.md) | /knowledge-distillery:collect-evidence |
| PR label | knowledge:pending added | /knowledge-distillery:batch-refine (discovery) |
Follow these steps in exact order. Do not skip steps. Do not reorder.
### Step 1: Check for an Existing Manifest

Use GitHub MCP to list all issue-level comments on PR #{pr_number}. Extract each comment's body text.

Scan all comment bodies for the delimiter `<!-- EVIDENCE_BUNDLE_MANIFEST_START -->`.

If a Manifest comment exists:

1. Check whether the `knowledge:pending` label is present on the PR: use GitHub MCP to get the labels of PR #{pr_number}.
2. If the label is missing, use GitHub MCP to add the `knowledge:pending` label to PR #{pr_number}.
3. Exit with success.

If no Manifest comment exists, proceed to Step 2.
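The decision logic of this step can be sketched in a few lines of Python. This is only an illustration: the function names and the returned action strings are hypothetical, and the comment bodies and label names are assumed to have been fetched through GitHub MCP already.

```python
MANIFEST_START = "<!-- EVIDENCE_BUNDLE_MANIFEST_START -->"
PENDING_LABEL = "knowledge:pending"


def find_manifest_comment(comment_bodies):
    """Return the index of the first comment containing the delimiter, or None."""
    for index, body in enumerate(comment_bodies):
        if body and MANIFEST_START in body:
            return index
    return None


def step1_action(comment_bodies, labels):
    """Map the Step 1 rules onto one of three (hypothetical) action names."""
    if find_manifest_comment(comment_bodies) is None:
        return "proceed_to_step_2"
    if PENDING_LABEL in labels:
        return "exit_success"
    # Manifest exists but the label is missing: restore it, then exit.
    return "add_label_then_exit"
```

Keeping the check as a pure function over already-fetched data makes the idempotency rule easy to test without touching GitHub.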
### Step 2: Fetch PR Metadata

Use GitHub MCP to fetch PR #{pr_number} metadata: title, body, commits (with SHAs and messages), changed files, base branch, and merge commit SHA.

Extract from the response:

- `title`: PR title string
- `body`: PR body string
- `commits`: array of commit objects (each has `oid`, `messageHeadline`, and `messageBody`)
- `files`: array of changed file objects; extract relative file paths into `changed_files`
- `baseRefName`: target branch name (e.g., `main`)
- `mergeCommit.oid`: merge commit SHA

### Step 3: Extract Linear Issue IDs

Apply the regex pattern `/\b([A-Z]+-\d+)\b/g` to these sources, in order:

1. PR title (`source: "pr_title"`)
2. PR body (`source: "pr_body"`)
3. Commit messages (`source: "commit_message"`)

Deduplicate by ID, keeping the first source encountered for each unique ID.

**Note on regex breadth:** This pattern intentionally matches any `PREFIX-123` format (Linear, JIRA, etc.) rather than filtering by known project prefixes. False positives are handled gracefully downstream: `collect-evidence` looks up each ID in Linear and sets `retrieved: false` if not found. Overly narrow patterns risk missing valid references.
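A minimal Python sketch of the extraction and first-source deduplication described above. The function name `extract_linear_ids` and its argument shapes are assumptions made for illustration; the regex is the one given in this step, with Python's `findall` standing in for the `g` flag.

```python
import re

ISSUE_ID = re.compile(r"\b([A-Z]+-\d+)\b")


def extract_linear_ids(title, body, commit_messages):
    """Return [{"id", "source"}] in first-seen order: title, body, then commits."""
    sources = [("pr_title", title or ""), ("pr_body", body or "")]
    sources += [("commit_message", message or "") for message in commit_messages]

    first_seen = {}  # dicts preserve insertion order, so the first source wins
    for source, text in sources:
        for issue_id in ISSUE_ID.findall(text):
            first_seen.setdefault(issue_id, source)
    return [{"id": issue_id, "source": source} for issue_id, source in first_seen.items()]
```

Called with the title `"ENG-12 fix login"`, the body `"Refs ENG-12 and OPS-7"`, and one commit message `"UTF-8 handling, ENG-99"`, the sketch attributes `ENG-12` to `pr_title` and also picks up `UTF-8`, the kind of false positive that is left for `collect-evidence` to resolve.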
### Step 4: Extract Slack Links

Slack links can appear in two independent sources: PR text and Linear issues. Collect from both.

**4a. Extract Slack links from PR body and comments (independent of Linear):**

- Match URLs against the pattern `https://*.slack.com/archives/*/p*`
- Links from the PR body: `source: "pr_body"`
- Links from PR comments: `source: "pr_comment"`

**4b. Extract Slack links from Linear issues (requires Linear MCP):**

For each Linear issue ID found in Step 3:

- Match URLs in the issue against the pattern `https://*.slack.com/archives/*/p*`
- Record matches with `source: "linear_issue"`

**Graceful degradation:** If Linear MCP is unavailable (connection error, timeout, not configured), skip Step 4b only. Slack links from Step 4a are still collected. This is NOT a failure; log a warning and continue.

**4c. Deduplicate** all collected Slack URLs by URL, keeping the first source encountered.
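Steps 4a to 4c could be combined roughly as follows. The regex is one possible translation of the glob `https://*.slack.com/archives/*/p*` and is an assumption (it expects a numeric timestamp after `p` and an optional query string); the function name and the convention that `linear_issue_texts=None` means "Linear MCP unavailable" are likewise hypothetical.

```python
import re

# Assumed regex rendering of the glob https://*.slack.com/archives/*/p*
SLACK_LINK = re.compile(
    r"https://[A-Za-z0-9.-]+\.slack\.com/archives/[A-Za-z0-9]+/p\d+(?:\?[^\s)>\]]*)?"
)


def collect_slack_links(pr_body, pr_comments, linear_issue_texts=None):
    """Collect Slack permalinks, keeping the first source seen for each URL."""
    sources = [("pr_body", pr_body or "")]
    sources += [("pr_comment", text or "") for text in pr_comments]
    if linear_issue_texts is not None:
        # Step 4b runs only when Linear data could be fetched.
        sources += [("linear_issue", text or "") for text in linear_issue_texts]

    first_seen = {}
    for source, text in sources:
        for url in SLACK_LINK.findall(text):
            first_seen.setdefault(url, source)
    return [{"url": url, "source": source} for url, source in first_seen.items()]
```

Passing `None` for the Linear texts reproduces the graceful-degradation path: links from the PR body and comments are still returned.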
### Step 5: Check Memento Notes

Ensure notes refs are available first:

`git fetch origin refs/notes/commits:refs/notes/commits 2>/dev/null || true`

Then, for each commit SHA in the PR (from Step 2):

`git notes --ref=refs/notes/commits show {sha}`
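The fetch-then-check sequence above could be scripted along these lines. The helper names are hypothetical, and the `run` parameter exists only so the sketch can be exercised without a real repository; by default it is `subprocess.run`.

```python
import subprocess

NOTES_REF = "refs/notes/commits"


def fetch_notes(run=subprocess.run):
    """Fetch the notes ref; a non-zero exit status is ignored, like `|| true`."""
    run(
        ["git", "fetch", "origin", f"{NOTES_REF}:{NOTES_REF}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )


def has_notes(sha, run=subprocess.run):
    """True when `git notes show` exits with status 0 for this commit."""
    result = run(
        ["git", "notes", f"--ref={NOTES_REF}", "show", sha],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0


def commits_with_notes(shas, run=subprocess.run):
    """Filter the PR's commit SHAs down to those that carry a note."""
    return [sha for sha in shas if has_notes(sha, run)]
```

Only the exit status is inspected; the note content itself is never read, which matches the identifier-only scope of this stage.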
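For each commit where `git notes show` succeeds, record `{ "sha": "{short_sha_7chars}", "has_notes": true }` in the `memento` array. If the command fails for a commit, skip that commit's entry.

### Step 6: Detect Greptile Reviews

Use GitHub MCP to list all review comments (inline on diff) for PR #{pr_number}. Filter for comments by users whose login contains "greptile" (case-insensitive). Count the matching comments.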
Also check issue comments:
Use GitHub MCP to list all issue-level comments on PR #{pr_number}.
Look for comments from users whose login contains "greptile" (case-insensitive).
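The login filter described above might look like this in Python. The comment shape (`{"user": {"login": ...}}`) follows GitHub's REST representation and is an assumption about what GitHub MCP returns; the sketch reports review and issue comment matches separately rather than deciding how they are combined.

```python
def is_greptile(comment):
    """Case-insensitive check that the author's login contains 'greptile'."""
    login = (comment.get("user") or {}).get("login") or ""
    return "greptile" in login.lower()


def count_greptile_comments(review_comments, issue_comments):
    """Return (matching review comments, matching issue comments)."""
    review_count = sum(1 for comment in review_comments if is_greptile(comment))
    issue_count = sum(1 for comment in issue_comments if is_greptile(comment))
    return review_count, issue_count
```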
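If matching comments are found, record `{ "review_id": "greptile-pr-{pr_number}", "comment_count": {count} }` in the `greptile` array. Otherwise, the `greptile` array stays empty.

### Step 7: Extract Notion Links

Notion page URLs can appear in PR text and Linear issues. Collect from both.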
**7a. Extract Notion links from PR body and comments:**

- Match URLs against the pattern `https://(www\.)?notion\.(so|site)/\S+`
- Links from the PR body: `source: "pr_body"`
- Links from PR comments: `source: "pr_comment"`

**7b. Extract Notion links from Linear issues (requires Linear MCP):**

For each Linear issue ID found in Step 3:

- Record Notion URLs found in the issue with `source: "linear_issue"`

**Graceful degradation:** If Linear MCP was unavailable in Step 4b, skip Step 7b only. Notion links from Step 7a are still collected.

**7c. Deduplicate** all collected Notion URLs by URL, keeping the first source encountered.
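As a small illustration of the Notion pattern, the sketch below applies the regex from 7a to a list of `(source, text)` pairs. The function name and the pair-based input are assumptions; the groups are written as non-capturing so that each match yields the whole URL.

```python
import re

# Pattern from 7a, with non-capturing groups so the full URL is returned.
NOTION_LINK = re.compile(r"https://(?:www\.)?notion\.(?:so|site)/\S+")


def extract_notion_links(tagged_texts):
    """tagged_texts: iterable of (source, text) pairs in collection order."""
    first_seen = {}
    for source, text in tagged_texts:
        for match in NOTION_LINK.finditer(text or ""):
            first_seen.setdefault(match.group(0), source)
    return [{"url": url, "source": source} for url, source in first_seen.items()]
```

Because `\S+` runs to the next whitespace, trailing punctuation (for example the closing parenthesis of a Markdown link) would be captured as part of the URL; whether to trim it is not specified here.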
### Step 8: Assemble the Manifest

Build the Evidence Bundle Manifest with the following structure. All fields are required. Empty arrays are valid; never omit a field.
Human-readable summary table:
## Evidence Bundle Manifest
| Category | Count | Details |
|----------|-------|---------|
| Linear Issues | {n} | {comma-separated IDs, or "—"} |
| Slack Threads | {n} | {comma-separated channel names extracted from URLs, or "—"} |
| Git Sessions | {n} | {n} commits with memento notes |
| Greptile Reviews | {n} | {total comment_count} review comments, or "—" |
| Notion Pages | {n} | {comma-separated page titles or shortened URLs, or "—"} |
Machine-parseable JSON (inside HTML comment delimiters, using actual code fences):
<!-- EVIDENCE_BUNDLE_MANIFEST_START -->
```json
{
  "version": "1",
  "pr": {
    "number": <integer>,
    "merge_sha": "<full or 7+ char hex SHA>",
    "base_branch": "<branch name>",
    "changed_files": ["<relative path>", ...]
  },
  "identifiers": {
    "linear": [
      { "id": "<PROJECT-NNN>", "source": "<source_type>" }
    ],
    "slack": [
      { "url": "<slack permalink>", "source": "<source_type>" }
    ],
    "memento": [
      { "sha": "<7+ char hex>", "has_notes": true }
    ],
    "greptile": [
      { "review_id": "<id>", "comment_count": <integer> }
    ],
    "notion": [
      { "url": "<notion page URL>", "source": "<source_type>" }
    ]
  },
  "collected_at": "<ISO 8601 timestamp>"
}
```
**Validation before posting — verify ALL of these:**
| Rule | Check |
|------|-------|
| V1 | `version` is `"1"` |
| V2 | `pr.merge_sha` matches `/^[0-9a-f]{7,40}$/` |
| V3 | `pr.number` is a positive integer |
| V4 | `pr.changed_files` is a non-empty array of strings |
| V5 | Each `linear[].id` matches `/^[A-Z]+-\d+$/` |
| V6 | Each `slack[].url` matches `https://*.slack.com/archives/*/p*` |
| V7 | Each `memento[].sha` matches `/^[0-9a-f]{7,40}$/` |
| V8 | `collected_at` is valid ISO 8601 |
| V9 | Each `notion[].url` matches `https://(www.)?notion.(so\|site)/*` |
If any validation fails, fix the data before posting. Do not post an invalid Manifest.
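Rules V1 to V9 can be checked mechanically. The following Python sketch is one way to do it, under stated assumptions: the Slack and Notion regexes are my own renderings of the glob-style patterns in the table, and `datetime.fromisoformat` accepts only a subset of ISO 8601, so V8 is an approximation. The function name `validate_manifest` is hypothetical.

```python
import re
from datetime import datetime

SHA = re.compile(r"^[0-9a-f]{7,40}$")
LINEAR_ID = re.compile(r"^[A-Z]+-\d+$")
# Assumed regex renderings of the glob-style patterns in V6 and V9.
SLACK_URL = re.compile(r"^https://[^/\s]+\.slack\.com/archives/[^/\s]+/p\S*$")
NOTION_URL = re.compile(r"^https://(www\.)?notion\.(so|site)/\S*$")


def _is_iso8601(value):
    try:
        # Older Python versions reject a trailing "Z", so normalise it first.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except (AttributeError, ValueError):
        return False


def validate_manifest(manifest):
    """Return the names of the rules (V1..V9) that the manifest violates."""
    pr = manifest["pr"]
    ids = manifest["identifiers"]
    files = pr.get("changed_files")
    checks = {
        "V1": manifest.get("version") == "1",
        "V2": bool(SHA.match(str(pr.get("merge_sha", "")))),
        "V3": type(pr.get("number")) is int and pr["number"] > 0,
        "V4": isinstance(files, list) and len(files) > 0
        and all(isinstance(path, str) for path in files),
        "V5": all(LINEAR_ID.match(entry["id"]) for entry in ids["linear"]),
        "V6": all(SLACK_URL.match(entry["url"]) for entry in ids["slack"]),
        "V7": all(SHA.match(entry["sha"]) for entry in ids["memento"]),
        "V8": _is_iso8601(manifest.get("collected_at")),
        "V9": all(NOTION_URL.match(entry["url"]) for entry in ids["notion"]),
    }
    return [rule for rule, passed in checks.items() if not passed]
```

An empty return value means every rule passed; any returned rule names indicate what to fix before posting.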
### Step 9: Post Comment and Add Label
1. **Ensure the label exists.** Use GitHub MCP to ensure the label `knowledge:pending` exists on the repository (description: "PR awaiting knowledge distillation", color: `FBCA04`). If it already exists, continue without error.
2. **Post the Manifest as a PR comment.** Use GitHub MCP to post a comment on PR #{pr_number} with the full Manifest content as the comment body.
3. **Add the `knowledge:pending` label.** Use GitHub MCP to add the `knowledge:pending` label to PR #{pr_number}.
## Error Handling
| Failure Mode | Behavior |
|-------------|----------|
| Linear MCP unavailable | Skip Steps 4b and 7b and continue. The `linear` array retains IDs found in PR text. The `slack` and `notion` arrays contain only URLs from the PR body and comments (if any). Log a warning. |
| Linear issue ID not found in Linear | Keep the ID in the `linear` array (it was in PR text). Log a warning. |
| `git notes show` fails | Skip that commit's memento entry. Not an error. |
| No identifiers found at all | Post Manifest with all empty arrays. Add `knowledge:pending` label. This is valid. |
| PR comment posting fails | This is the critical output — report failure. |
| PR already has Manifest comment | Ensure the `knowledge:pending` label is present, then exit with success (idempotent). Do not post a second Manifest. |
## Constraints
- MUST NOT fetch actual evidence content (no reading Linear issue bodies for knowledge extraction; only for Slack and Notion link discovery)
- MUST NOT modify any files in the repository
- MUST NOT interact with vault.db or `knowledge-gate` CLI
- MUST NOT make judgments about evidence sufficiency (that is Stage B's job)
- MUST be idempotent — safe to re-run on the same PR
- MUST post exactly one Manifest comment per PR
- MUST ensure the human-readable summary table is consistent with the JSON data
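
One way to satisfy the last constraint is to derive the summary table from the JSON data instead of writing it separately. The Python sketch below does that under a few assumptions: the function name `render_summary` is hypothetical, the Slack "channel name" is taken to be the path segment after `/archives/`, and a Notion page is shortened to host plus path.

```python
from urllib.parse import urlparse

DASH = "\u2014"  # placeholder used in the summary table for empty categories


def _join(items):
    return ", ".join(items) if items else DASH


def render_summary(manifest):
    """Render the human-readable table from the manifest's identifiers."""
    ids = manifest["identifiers"]
    linear = [entry["id"] for entry in ids["linear"]]
    # Path looks like /archives/<channel>/p<timestamp>; index 2 is the channel.
    channels = [urlparse(entry["url"]).path.split("/")[2] for entry in ids["slack"]]
    pages = [
        urlparse(entry["url"]).netloc + urlparse(entry["url"]).path
        for entry in ids["notion"]
    ]
    memento_count = len(ids["memento"])
    greptile_total = sum(entry["comment_count"] for entry in ids["greptile"])

    rows = [
        ("Linear Issues", len(linear), _join(linear)),
        ("Slack Threads", len(channels), _join(channels)),
        ("Git Sessions", memento_count, f"{memento_count} commits with memento notes"),
        ("Greptile Reviews", len(ids["greptile"]),
         f"{greptile_total} review comments" if ids["greptile"] else DASH),
        ("Notion Pages", len(pages), _join(pages)),
    ]
    lines = [
        "## Evidence Bundle Manifest",
        "",
        "| Category | Count | Details |",
        "|----------|-------|---------|",
    ]
    lines += [f"| {name} | {count} | {details} |" for name, count, details in rows]
    return "\n".join(lines)
```

Because every count and detail is computed from the same dictionary that is serialized into the JSON block, the two views cannot drift apart.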