From agent-almanac
Extracts the conceptual essence of a codebase — its roles, procedures, and coordination patterns — as generalized skill, agent, and team definitions. Use it for onboarding, agentic bootstrapping, and studying a project's DNA.
npx claudepluginhub pjt222/agent-almanac
This skill uses the workspace's default tool permissions.
---
Extract the conceptual DNA of a repository — its roles, procedures, and coordination patterns — as generalized agentskills.io definitions. Like extracting noble metal from ore, the skill separates what a project IS (its essence) from what it DOES (its implementation), producing reusable skill, agent, and team definitions that capture the project's organizational genome without reproducing its codebase.
Modes: survey (prospect + assay only), extract (full procedure), or report (extraction + written report). Default: extract.
The central quality criterion for all extraction is the Ore Test:
Could this concept exist in a completely different implementation?
If YES — it is metal (essence). Extract it. If NO — it is gangue (implementation detail). Leave it behind.
Example: A weather app's concept "integrate external data source" is metal — it applies to any project fetching third-party data. But "parse OpenWeatherMap v3 JSON response" is gangue — it is specific to one API.
Extracted skills should describe the CLASS of task, not the specific instance. Extracted agents should describe the ROLE, not the person. Extracted teams should describe the COORDINATION PATTERN, not the org chart.
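To make the test concrete, here is a hypothetical sorting of the weather-app example above; the concept names are illustrative, not prescriptive.
Metal (extract):
integrate-external-data-source (skill): fetch and normalize data from a third-party provider
data-freshness-steward (agent): decide when cached data is stale and must be refreshed
Gangue (leave behind):
parse the OpenWeatherMap v3 JSON response
manage the provider-specific API key and retry settings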
Survey the repository structure without judgment. Map the terrain before mining.
Check the project manifests (package.json, DESCRIPTION, setup.py, Cargo.toml, go.mod, Makefile), the top-level documentation (README.md, CLAUDE.md, CONTRIBUTING.md, architecture docs), and the infrastructure surface (.github/workflows/, Dockerfile, deployment configs). Then summarize:
Project: [name]
Declared Purpose: [from README/manifest]
Languages: [primary, secondary]
Size: [file count, approx LOC]
Shape: [monorepo/library/app/framework/docs]
External Surface: [CLI/API/UI/library exports/none]
Expected: A factual survey — what is here, how large, what does the project claim to be. No classification or judgment yet. The report reads like a geological survey, not a review.
On failure: If the repository has no README or manifest, infer purpose from directory names, file contents, and test descriptions. If the project is too large (>1000 source files), narrow the scope to the most active directories (use git log frequency or README references).
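As an illustration, a filled-in summary for a hypothetical weather dashboard might look like this (all names and numbers are invented):
Project: skycast-dashboard
Declared Purpose: Display local weather forecasts with user-configurable alerts
Languages: TypeScript (primary), Python (secondary)
Size: ~180 source files, ~25k LOC
Shape: app
External Surface: UI + small REST API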
Read representative files to understand what the project DOES at the conceptual level.
Expected: A conceptual map of the project that reads like a domain glossary, not a code walkthrough. Someone unfamiliar with the tech stack should understand what the project does from this report.
On failure: If the codebase is opaque (heavy metaprogramming, generated code, or obfuscated), lean on tests and documentation rather than source code. If no tests exist, read commit messages for intent.
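For the same hypothetical weather dashboard, an excerpt of such a glossary-style map might read:
Forecast ingestion: the project periodically pulls predictions from an external provider and normalizes them
Alert rules: users define thresholds that are evaluated against incoming data and trigger notifications
Presentation layer: normalized data is rendered into location-based views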
Pause to clear the cognitive anchoring from reading code.
Expected: The Assay Report is now free of framework-specific language. Every finding passes the Ore Test. The concepts feel portable — they could apply to a project in any language or framework.
On failure: If bias persists (findings keep referencing specific technologies), try inverting: "If this project were rewritten in a completely different stack, which concepts would survive?" Only those are metal.
The core extraction step. Classify each essential concept into skills, agents, or teams.
Classification Criteria:
+--------+----------------------------+----------------------------+----------------------------+
| Type | What to Look For | Naming Convention | Test Question |
+--------+----------------------------+----------------------------+----------------------------+
| SKILL | Repeatable procedures, | Verb-first kebab-case: | "Could an agent follow |
| | workflows, transformations | validate-input, | this as a step-by-step |
| | with clear inputs/outputs | deploy-artifact | procedure?" |
+--------+----------------------------+----------------------------+----------------------------+
| AGENT | Persistent roles, domain | Noun/role kebab-case: | "Does this require ongoing |
| | expertise, judgment calls, | data-engineer, | context, expertise, or a |
| | communication styles | quality-reviewer | specific communication |
| | | | style?" |
+--------+----------------------------+----------------------------+----------------------------+
| TEAM | Multi-role coordination, | Group descriptor: | "Does this need more than |
| | handoffs, reviews, | pipeline-ops, | one distinct perspective |
| | parallel workstreams | review-board | to accomplish?" |
+--------+----------------------------+----------------------------+----------------------------+
For each extracted element, strip the project-specific name and keep the generalized role or procedure: an authentication module becomes identity-manager (agent); "deployToAWS()" becomes deploy-artifact (skill). Guard against common classification errors.
Expected: A classified inventory where each item has a type (skill/agent/team), a generalized name, and a one-line description. No item references the source project's specific technologies, APIs, or data structures.
On failure: If classification is ambiguous (is this a skill or an agent?), ask: "Is this about DOING something (skill) or BEING someone who does things (agent)?" A skill is a recipe; an agent is a chef. If still unclear, default to skill — skills are easier to compose later.
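Continuing the hypothetical weather-dashboard example, one classified inventory entry of each type might look like this (names are illustrative):
1. integrate-external-data-source (skill): fetch, normalize, and cache data from a third-party provider
2. alert-rule-steward (agent): owns the definition, evaluation, and lifecycle of user-defined threshold rules
3. ingest-and-notify (team): sequential handoff between ingestion, rule evaluation, and notification roles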
Assess whether the extraction is honest — neither too much nor too little.
Over-extraction check: Read each extracted definition and ask whether it is a distinct, essential concept or merely a narrow variant of another item.
Under-extraction check: Show only the extracted definitions (without the source project) and ask whether they capture everything essential about what the project is.
Generalization check: For each definition, confirm it contains no references to the source project's specific technologies, APIs, or data structures.
Balance check: Review the ratio of extracted skills to agents to teams.
Expected: Confidence that the extraction is at the right level of abstraction. Each definition is a seed that could grow in different soil, not a cutting that only survives in the original garden.
On failure: If over-extracted, raise the abstraction level — merge specific skills into broader ones, collapse similar agents into a single role. If under-extracted, return to Step 2 and sample additional files. If generalization check fails, strip technology references and rewrite descriptions.
Produce the agentskills.io-standard output documents.
# Skill: [generalized-name]
name: [generalized-name]
description: [one-line, framework-agnostic]
domain: [closest domain from the 52 existing domains, or suggest a new one]
complexity: [basic/intermediate/advanced]
# Concept-level procedure (3-5 steps, NOT full implementation):
# Step 1: [high-level action]
# Step 2: [high-level action]
# Step 3: [high-level action]
# Derived from: [source concept in original project]
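A sketch of how the skill template might be filled in for the hypothetical integrate-external-data-source concept; the domain and complexity values are assumptions, not prescribed choices:
# Skill: integrate-external-data-source
name: integrate-external-data-source
description: Fetch data from a third-party provider, normalize it, and cache it for downstream use
domain: data-integration
complexity: intermediate
# Concept-level procedure (3-5 steps, NOT full implementation):
# Step 1: Identify the external provider and its access contract
# Step 2: Fetch and normalize the data into an internal representation
# Step 3: Cache the result and expose it to downstream consumers
# Derived from: forecast ingestion in the hypothetical weather dashboard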
# Agent: [role-name]
name: [role-name]
description: [one-line purpose]
tools: [minimal tool set needed]
skills: [list of extracted skills this agent would carry]
# Derived from: [source role/module in original project]
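Similarly, a hypothetical filled-in agent definition; the tool names are placeholders:
# Agent: alert-rule-steward
name: alert-rule-steward
description: Owns user-defined threshold rules, from definition through evaluation and retirement
tools: [read, search]
skills: [integrate-external-data-source]
# Derived from: the alert-rules module in the hypothetical weather dashboard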
# Team: [group-name]
name: [group-name]
description: [one-line purpose]
lead: [lead agent from extracted agents]
members: [list of member agents]
coordination: [hub-and-spoke/sequential/parallel/adaptive]
# Derived from: [source workflow/process in original project]
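And a hypothetical filled-in team definition; the member names beyond alert-rule-steward are illustrative:
# Team: ingest-and-notify
name: ingest-and-notify
description: Coordinates data ingestion, rule evaluation, and user notification
lead: alert-rule-steward
members: [data-ingestion-engineer, alert-rule-steward, notification-dispatcher]
coordination: sequential
# Derived from: the ingest, evaluate, and notify workflow in the hypothetical weather dashboard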
Expected: A structured report containing all extracted definitions in agentskills.io format. Each definition is skeletal (concept-level, not implementation-level) and could serve as a starting point for the create-skill, create-agent, or create-team skills to flesh out.
On failure: If the output exceeds 15 items, prioritize by centrality — keep the concepts that are most unique to this project's domain. Generic concepts (like "manage-configuration") that exist in most projects should be dropped unless they have an unusual twist.
Verify the complete extraction and produce the summary.
Temper Assessment:
+-----+---------------------------+----------+------------------------------------+
| # | Name | Type | Ore Test Result |
+-----+---------------------------+----------+------------------------------------+
| 1 | [name] | skill | PASS / FAIL (reason) |
| 2 | [name] | agent | PASS / FAIL (reason) |
| ... | ... | ... | ... |
+-----+---------------------------+----------+------------------------------------+
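Filled in for the hypothetical inventory above, two assessed rows might read (reasons are illustrative):
1 | integrate-external-data-source | skill | PASS (applies to any project consuming third-party data)
2 | parse-openweathermap-json | skill | FAIL (tied to a single provider's response format)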
Expected: A validated Assay Report with a summary table, confidence assessment, and actionable next steps. The report is self-contained — someone who has never seen the source project can read it and understand the extracted concepts.
On failure: If more than 20% of items fail the final Ore Test, return to Step 4 (Smelt) and re-extract at a higher abstraction level. If coverage is below 60% of identified domains, return to Step 2 (Assay) and sample additional files.
Extracted definitions are sketches to hand off to create-skill, not finished products. A 50-step extraction is a reproduction, not an essence.
Related skills:
athanor — When metal reveals the project needs transformation, not just essence extraction
chrysopoeia — Value extraction at the code level; metal works at the conceptual level above code
transmute — Converting extracted concepts between domains or paradigms
create-skill — Flesh out extracted skill sketches into full SKILL.md implementations
create-agent — Flesh out extracted agent sketches into full agent definitions
create-team — Flesh out extracted team sketches into full team compositions
observe — Deeper observation when the prospect phase reveals an unfamiliar domain
analyze-codebase-for-mcp — Complementary: metal extracts concepts, analyze-codebase-for-mcp extracts tool surfaces
review-codebase — Complementary: metal extracts essence, review-codebase evaluates quality