Assesses team's AI collaboration literacy by scanning repo for signals like habitat docs, CI workflows, vulnerability scans, tool configs; generates report and badge.
npx claudepluginhub habitat-thinking/ai-literacy-superpowers --plugin ai-literacy-superpowers

This skill uses the workspace's default tool permissions.
Aggregates AI literacy assessments across multiple repositories into a portfolio view with level distributions, shared gaps, outliers, and prioritized improvement plans by organizational impact. Discovers repos from local paths, GitHub orgs, or topics.
Assesses codebase for AI agent readiness by detecting stacks, monorepos, git setup, and evaluating style, testing, code quality, secrets, and file sizes.
Evaluates repository readiness for autonomous AI-assisted development across five pillars (Agent Instructions, Feedback Loops, Workflows & Automation, Policy & Governance, Build & Dev Environment) covering 74 features using bash scanners and checklists.
Assess a team's AI collaboration literacy level by combining observable evidence from the repository with clarifying questions, then produce a timestamped assessment document and a README badge.
Scan the repository for signals that indicate which framework level the team is operating at.
Habitat document discovery comes first. Before scanning for any of
the level indicators below, apply the methodology in
references/habitat-discovery.md to find HARNESS.md, AGENTS.md,
and CLAUDE.md — including their alternative paths and embedded
forms. A habitat document found at a non-conventional path counts as
present for the Level 3 indicators that reference it; "not at the
default path" is not the same as "doesn't exist". Every absence claim
must come from a complete search across known alternatives, with the
discovery report citing what was matched and where.
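A minimal sketch of what that discovery pass might look like, assuming a bash environment; the excluded paths and the embedded-form heuristic below are illustrative, and the real rules live in references/habitat-discovery.md:

```bash
#!/usr/bin/env bash
# Illustrative habitat-document discovery pass; not the canonical
# methodology in references/habitat-discovery.md.
for doc in HARNESS.md AGENTS.md CLAUDE.md; do
  # Search the whole tree, not only the repo root, so a document at a
  # non-conventional path still counts as present.
  matches=$(find . -type f -name "$doc" -not -path './node_modules/*' 2>/dev/null)
  if [ -n "$matches" ]; then
    echo "FOUND    $doc -> $matches"
    continue
  fi
  # Embedded form: a heading carrying the document's name inside some
  # other Markdown file (a rough heuristic for this sketch).
  embedded=$(grep -rl --include='*.md' "^#.*${doc%.md}" . 2>/dev/null | head -n 3)
  if [ -n "$embedded" ]; then
    echo "EMBEDDED $doc -> $embedded"
  else
    echo "ABSENT   $doc (only after the full alternative-path search)"
  fi
done
```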
Each signal below maps to a specific level:
Level 0-1 indicators (awareness + prompting):

Level 2 indicators (verification):
- CI workflow files (*.yml in .github/workflows/)

Level 3 indicators (habitat engineering):
- CLAUDE.md or equivalent context engineering file (apply references/habitat-discovery.md — alternative paths and embedded forms count as present)
- HARNESS.md with declared constraints (same — alternative paths count)
- AGENTS.md compound learning memory (same)
- .cursor/rules/, .github/copilot-instructions.md, .windsurf/rules/, or custom AI tooling locations expressing harness control through whichever AI surface the team uses. Apply references/tool-config-evidence.md for the methodology. A project with rich parallel-tool configs is at L3 context engineering even without HARNESS.md/CLAUDE.md, but tool-config evidence does NOT signal architectural constraints or compound learning by itself.
- MODEL_ROUTING.md model-tier guidance
- .claude/skills/ project-local skills
- .claude/agents/ custom agent definitions
- .claude/commands/ custom commands
- Hooks (hooks.json)
- REFLECTION_LOG.md with entries
- .markdownlint.json or equivalent config

Level 4 indicators (specification architecture):
- specs/ directory with specification files
- Plan files (plan.md, plan-*.md)

Level 5 indicators (sovereign engineering):
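Before moving on to clarifying questions, here is a rough illustration of how the file-based indicators listed above can be scanned. It is a bash sketch only: it covers the default paths named in this section, not the skill's full scanner, and defers to the discovery methodology for anything it does not find.

```bash
#!/usr/bin/env bash
# Sketch of a file-presence scan for the indicators listed above.
declare -A indicators=(
  ["CLAUDE.md"]="L3" ["HARNESS.md"]="L3" ["AGENTS.md"]="L3"
  ["MODEL_ROUTING.md"]="L3" [".claude/skills"]="L3"
  [".claude/agents"]="L3" [".claude/commands"]="L3"
  ["REFLECTION_LOG.md"]="L3" ["specs"]="L4"
)
for path in "${!indicators[@]}"; do
  if [ -e "$path" ]; then
    echo "${indicators[$path]} indicator present: $path"
  else
    echo "${indicators[$path]} indicator not at default path: $path (apply the discovery methodology before claiming absence)"
  fi
done
```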
After scanning, ask questions to fill gaps that observable evidence cannot answer. Focus on:
Ask 3-5 questions maximum. Each question should disambiguate between adjacent levels.
Produce a timestamped Markdown document at
assessments/YYYY-MM-DD-assessment.md with:
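A minimal sketch of the dated naming convention (the heading written into the file is a placeholder, not a required structure):

```bash
# Create the dated assessment document: assessments/YYYY-MM-DD-assessment.md
mkdir -p assessments
out="assessments/$(date +%F)-assessment.md"
printf '# AI Literacy Assessment (%s)\n\n' "$(date +%F)" > "$out"
echo "Writing assessment to $out"
```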
After documenting the assessment, identify adjustments that can be made immediately — without changing any application code or requiring team discussion. These are habitat hygiene fixes:
Stale counts: If HARNESS.md Status section shows outdated counts, update them. If README badges show old numbers, update them.
Missing entries: If AGENTS.md GOTCHAS is empty but the assessment revealed gotchas, add them. If REFLECTION_LOG.md has no entries from this assessment, add one.
Drift detection: If HARNESS.md declares constraints that no longer match reality (tools removed, workflows renamed), update the declarations.
Mechanism map staleness: If the README mechanism map is missing components that the scan found (new agents, commands, hooks, skills), update it.
Present each adjustment to the user and apply it immediately. Record what was adjusted in the assessment document.
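As one concrete example, a stale-count check could be sketched like this; the "Skills: N" line format in the HARNESS.md Status section is an assumption made for illustration, not a defined format:

```bash
# Sketch: flag a stale skill count in HARNESS.md (assumed "Skills: N" line).
declared=$(grep -oE 'Skills: [0-9]+' HARNESS.md | grep -oE '[0-9]+' | head -n 1)
actual=$(find .claude/skills -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l)
if [ -n "$declared" ] && [ "$declared" -ne "$actual" ]; then
  echo "Stale count: HARNESS.md declares $declared skills, the repo has $actual"
fi
```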
Based on the gaps identified, recommend specific changes to how existing workflows and artifacts are operated (not built — the infrastructure exists, it just needs to be used differently):
Operating rhythm: Recommend cadences for harness audits, reflection reviews, mutation score checks, and cost monitoring. Suggest adding these to a calendar or checklist.
Habit formation: Identify which framework habits (from Part VII) are not yet automatic and suggest specific practice exercises.
Artifact activation: Identify artifacts that exist but are not actively used (e.g., AGENTS.md that isn't read at session start, MODEL_ROUTING.md that isn't consulted when dispatching agents) and recommend how to activate them.
Promotion opportunities: Identify unverified HARNESS.md constraints that could be promoted to agent or deterministic with available tooling.
Present each recommendation to the user. For accepted recommendations, apply the change (update CLAUDE.md with new cadences, promote HARNESS.md constraints, add operating notes to AGENTS.md). Record accepted and rejected recommendations in the assessment document.
After workflow recommendations, invoke the literacy-improvements
skill with the assessed level and the gaps from section 7 of the
assessment document. The skill handles target level selection, plan
generation, and interactive execution.
Phase 5 and Phase 5b are complementary:
The skill records its outcomes (accepted, skipped, deferred) in the assessment document.
Capture a reflection on the assessment itself:
Append this to REFLECTION_LOG.md as a structured entry.
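One way the appended entry could look, assuming a dated-heading format; the field names are illustrative rather than a required schema:

```bash
# Append a structured reflection entry to REFLECTION_LOG.md (illustrative format).
cat >> REFLECTION_LOG.md <<EOF

## $(date +%F) - AI literacy assessment
- What the evidence answered cleanly: ...
- Where clarifying questions were needed: ...
- What to change before the next assessment: ...
EOF
```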
Add or update a badge in the project's README showing the assessed level:
[](assessments/YYYY-MM-DD-assessment.md)
Colour coding:
| Level | Colour | Hex |
|---|---|---|
| L0 | Grey | 808080 |
| L1 | Light blue | 87CEEB |
| L2 | Blue | 4682B4 |
| L3 | Teal | 20B2AA |
| L4 | Green | 2E8B57 |
| L5 | Gold | DAA520 |
Link target: the assessment document, so anyone who clicks the badge sees the full assessment with evidence and rationale.
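For illustration, a shields.io-style badge line for an L3 result could be built as below; the badge service and label text are assumptions, while the colour hex comes from the table above and the link target is the assessment document:

```bash
# Sketch: build the README badge line for an assessed level (L3 shown).
level="L3"; colour="20B2AA"; doc="assessments/$(date +%F)-assessment.md"
badge="[![AI Literacy: ${level}](https://img.shields.io/badge/AI%20Literacy-${level}-${colour})](${doc})"
echo "$badge"   # paste (or script) this line into README.md
```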
The assessed level is the highest level where the team has substantial evidence across all three disciplines. A team with L3 context engineering but L1 verification is assessed at L1 — the weakest discipline is the ceiling.
| Level | Minimum evidence required |
|---|---|
| L0 | Repo exists, team is aware of AI tools |
| L1 | Some AI tool usage, basic prompting |
| L2 | Automated tests in CI, systematic verification of AI output |
| L3 | CLAUDE.md + at least 3 harness constraints enforced + custom agents or skills |
| L4 | Specifications before code + agent pipeline with safety gates |
| L5 | Platform-level governance + cross-team standards + observability |
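The ceiling rule is effectively a minimum across the disciplines; a trivial sketch (the discipline names and example levels are placeholders, not assessment output):

```bash
# Sketch: overall level = weakest discipline ("the ceiling").
context_engineering=3   # example per-discipline levels
verification=1
specification=2
overall=$context_engineering
for lvl in $verification $specification; do
  if [ "$lvl" -lt "$overall" ]; then overall=$lvl; fi
done
echo "Assessed level: L$overall"   # -> L1, because verification is weakest
```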
Surface counts (script count, hook count, agent count, command count)
are insufficient on their own. Apply the content-shape methodology in
references/sophistication-markers.md before assigning a level. That
reference defines simple-vs-sophisticated markers per artefact type
and the level adjustments they justify.
The principle: a project with one sophisticated state-based orchestration script is not at the same maturity as one with ten simple bash hooks. Sophistication markers raise the floor on the discipline they evidence (orchestration sophistication → guardrail design; state-based hook sophistication → architectural constraints). The weakest-discipline-is-the-ceiling rule still applies — a single sophisticated artefact does not raise the overall level unless the other disciplines also have evidence at that level.
Every sophistication marker the assessor applies must be cited explicitly in the assessment document — what was found and where — so the level determination is auditable. No silent shifts.
The adjustments are introduced conservatively in this release: prefer surfacing markers without changing previously assigned levels unless the evidence is unambiguous. As the framework accumulates assessments that use the markers, the adjustments will be tuned.