Assesses team's AI collaboration literacy by scanning repo for signals like habitat docs, CI workflows, vulnerability scans, tool configs; generates report and badge.
npx claudepluginhub habitat-thinking/ai-literacy-superpowers --plugin ai-literacy-superpowers

This skill uses the workspace's default tool permissions.
Aggregates AI literacy assessments across multiple repositories into a portfolio view with level distributions, shared gaps, outliers, and prioritized improvement plans by organizational impact. Discovers repos from local paths, GitHub orgs, or topics.
Assesses codebase for AI agent readiness by detecting stacks, monorepos, git setup, and evaluating style, testing, code quality, secrets, and file sizes.
Evaluates repository readiness for autonomous AI-assisted development across five pillars (Agent Instructions, Feedback Loops, Workflows & Automation, Policy & Governance, Build & Dev Environment) covering 74 features using bash scanners and checklists.
Assess a team's AI collaboration literacy level by combining observable evidence from the repository with clarifying questions, then produce a timestamped assessment document and a README badge.
Scan the repository for signals that indicate which framework level the team is operating at.
Habitat document discovery comes first. Before scanning for any of
the level indicators below, apply the methodology in
references/habitat-discovery.md to find HARNESS.md, AGENTS.md,
and CLAUDE.md — including their alternative paths and embedded
forms. A habitat document found at a non-conventional path counts as
present for the Level 3 indicators that reference it; "not at the
default path" is not the same as "doesn't exist". Every absence claim
must come from a complete search across known alternatives, with the
discovery report citing what was matched and where.
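A minimal sketch of what that discovery pass might look like, assuming a bash environment; the excluded paths and the embedded-form heuristic below are illustrative, and the real rules live in references/habitat-discovery.md:

```bash
#!/usr/bin/env bash
# Illustrative habitat-document discovery pass; not the canonical
# methodology in references/habitat-discovery.md.
for doc in HARNESS.md AGENTS.md CLAUDE.md; do
  # Search the whole tree, not only the repo root, so a document at a
  # non-conventional path still counts as present.
  matches=$(find . -type f -name "$doc" -not -path './node_modules/*' 2>/dev/null)
  if [ -n "$matches" ]; then
    echo "FOUND    $doc -> $matches"
    continue
  fi
  # Embedded form: a heading carrying the document's name inside some
  # other Markdown file (a rough heuristic for this sketch).
  embedded=$(grep -rl --include='*.md' "^#.*${doc%.md}" . 2>/dev/null | head -n 3)
  if [ -n "$embedded" ]; then
    echo "EMBEDDED $doc -> $embedded"
  else
    echo "ABSENT   $doc (only after the full alternative-path search)"
  fi
done
```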
Each signal below maps to a specific level:
Level 0-1 indicators (awareness + prompting):

Level 2 indicators (verification):
- CI workflow files (*.yml in .github/workflows/)

Level 3 indicators (habitat engineering):
- CLAUDE.md or equivalent context engineering file (apply references/habitat-discovery.md — alternative paths and embedded forms count as present)
- HARNESS.md with declared constraints (same — alternative paths count)
- AGENTS.md compound learning memory (same)
- .cursor/rules/, .github/copilot-instructions.md, .windsurf/rules/, or custom AI tooling locations expressing harness control through whichever AI surface the team uses. Apply references/tool-config-evidence.md for the methodology. A project with rich parallel-tool configs is at L3 context engineering even without HARNESS.md/CLAUDE.md, but tool-config evidence does NOT signal architectural constraints or compound learning by itself.
- MODEL_ROUTING.md model-tier guidance
- .claude/skills/ project-local skills
- .claude/agents/ custom agent definitions
- .claude/commands/ custom commands
- Hooks (hooks.json)
- REFLECTION_LOG.md with entries
- .markdownlint.json or equivalent config

Level 4 indicators (specification architecture):
- specs/ directory with specification files
- Plan files (plan.md, plan-*.md)

Level 5 indicators (sovereign engineering):
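Before moving on to clarifying questions, here is a rough illustration of how the file-based indicators listed above can be scanned. It is a bash sketch only: it covers the default paths named in this section, not the skill's full scanner, and defers to the discovery methodology for anything it does not find.

```bash
#!/usr/bin/env bash
# Sketch of a file-presence scan for the indicators listed above.
declare -A indicators=(
  ["CLAUDE.md"]="L3" ["HARNESS.md"]="L3" ["AGENTS.md"]="L3"
  ["MODEL_ROUTING.md"]="L3" [".claude/skills"]="L3"
  [".claude/agents"]="L3" [".claude/commands"]="L3"
  ["REFLECTION_LOG.md"]="L3" ["specs"]="L4"
)
for path in "${!indicators[@]}"; do
  if [ -e "$path" ]; then
    echo "${indicators[$path]} indicator present: $path"
  else
    echo "${indicators[$path]} indicator not at default path: $path (apply the discovery methodology before claiming absence)"
  fi
done
```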
After scanning, ask questions to fill gaps that observable evidence cannot answer. Focus on:
Ask 3-5 questions maximum. Each question should disambiguate between adjacent levels.
Produce a timestamped Markdown document at
assessments/YYYY-MM-DD-assessment.md with:
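A minimal sketch of the dated naming convention (the heading written into the file is a placeholder, not a required structure):

```bash
# Create the dated assessment document: assessments/YYYY-MM-DD-assessment.md
mkdir -p assessments
out="assessments/$(date +%F)-assessment.md"
printf '# AI Literacy Assessment (%s)\n\n' "$(date +%F)" > "$out"
echo "Writing assessment to $out"
```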
After documenting the assessment, identify adjustments that can be made immediately — without changing any application code or requiring team discussion. These are habitat hygiene fixes:
Stale counts: If HARNESS.md Status section shows outdated counts, update them. If README badges show old numbers, update them.
Missing entries: If AGENTS.md GOTCHAS is empty but the assessment revealed gotchas, add them. If REFLECTION_LOG.md has no entries from this assessment, add one.
Drift detection: If HARNESS.md declares constraints that no longer match reality (tools removed, workflows renamed), update the declarations.
Mechanism map staleness: If the README mechanism map is missing components that the scan found (new agents, commands, hooks, skills), update it.
Present each adjustment to the user and apply it immediately. Record what was adjusted in the assessment document.
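As one concrete example, a stale-count check could be sketched like this; the "Skills: N" line format in the HARNESS.md Status section is an assumption made for illustration, not a defined format:

```bash
# Sketch: flag a stale skill count in HARNESS.md (assumed "Skills: N" line).
declared=$(grep -oE 'Skills: [0-9]+' HARNESS.md | grep -oE '[0-9]+' | head -n 1)
actual=$(find .claude/skills -mindepth 1 -maxdepth 1 -type d 2>/dev/null | wc -l)
if [ -n "$declared" ] && [ "$declared" -ne "$actual" ]; then
  echo "Stale count: HARNESS.md declares $declared skills, the repo has $actual"
fi
```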
Based on the gaps identified, recommend specific changes to how existing workflows and artifacts are operated (not built — the infrastructure exists, it just needs to be used differently):
Operating rhythm: Recommend cadences for harness audits, reflection reviews, mutation score checks, and cost monitoring. Suggest adding these to a calendar or checklist.
Habit formation: Identify which framework habits (from Part VII) are not yet automatic and suggest specific practice exercises.
Artifact activation: Identify artifacts that exist but are not actively used (e.g., AGENTS.md that isn't read at session start, MODEL_ROUTING.md that isn't consulted when dispatching agents) and recommend how to activate them.
Promotion opportunities: Identify unverified HARNESS.md constraints that could be promoted to agent or deterministic with available tooling.
Present each recommendation to the user. For accepted recommendations, apply the change (update CLAUDE.md with new cadences, promote HARNESS.md constraints, add operating notes to AGENTS.md). Record accepted and rejected recommendations in the assessment document.
After workflow recommendations, invoke the literacy-improvements
skill with the assessed level and the gaps from section 7 of the
assessment document. The skill handles target level selection, plan
generation, and interactive execution.
Phase 5 and Phase 5b are complementary:
The skill records its outcomes (accepted, skipped, deferred) in the assessment document.
Capture a reflection on the assessment itself:
Append this to REFLECTION_LOG.md as a structured entry.
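One way the appended entry could look, assuming a dated-heading format; the field names are illustrative rather than a required schema:

```bash
# Append a structured reflection entry to REFLECTION_LOG.md (illustrative format).
cat >> REFLECTION_LOG.md <<EOF

## $(date +%F) - AI literacy assessment
- What the evidence answered cleanly: ...
- Where clarifying questions were needed: ...
- What to change before the next assessment: ...
EOF
```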
Add or update a badge in the project's README showing the assessed level:
[](assessments/YYYY-MM-DD-assessment.md)
Colour coding:
| Level | Colour | Hex |
|---|---|---|
| L0 | Grey | 808080 |
| L1 | Light blue | 87CEEB |
| L2 | Blue | 4682B4 |
| L3 | Teal | 20B2AA |
| L4 | Green | 2E8B57 |
| L5 | Gold | DAA520 |
Link target: the assessment document, so anyone who clicks the badge sees the full assessment with evidence and rationale.
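For illustration, a shields.io-style badge line for an L3 result could be built as below; the badge service and label text are assumptions, while the colour hex comes from the table above and the link target is the assessment document:

```bash
# Sketch: build the README badge line for an assessed level (L3 shown).
level="L3"; colour="20B2AA"; doc="assessments/$(date +%F)-assessment.md"
badge="[![AI Literacy: ${level}](https://img.shields.io/badge/AI%20Literacy-${level}-${colour})](${doc})"
echo "$badge"   # paste (or script) this line into README.md
```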
The assessed level is the highest level where the team has substantial evidence across all three disciplines. A team with L3 context engineering but L1 verification is assessed at L1 — the weakest discipline is the ceiling.
| Level | Minimum evidence required |
|---|---|
| L0 | Repo exists, team is aware of AI tools |
| L1 | Some AI tool usage, basic prompting |
| L2 | Automated tests in CI, systematic verification of AI output |
| L3 | CLAUDE.md + at least 3 harness constraints enforced + custom agents or skills |
| L4 | Specifications before code + agent pipeline with safety gates |
| L5 | Platform-level governance + cross-team standards + observability |
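The ceiling rule is effectively a minimum across the disciplines; a trivial sketch (the discipline names and example levels are placeholders, not assessment output):

```bash
# Sketch: overall level = weakest discipline ("the ceiling").
context_engineering=3   # example per-discipline levels
verification=1
specification=2
overall=$context_engineering
for lvl in $verification $specification; do
  if [ "$lvl" -lt "$overall" ]; then overall=$lvl; fi
done
echo "Assessed level: L$overall"   # -> L1, because verification is weakest
```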
Surface counts (script count, hook count, agent count, command count)
are insufficient on their own. Apply the content-shape methodology in
references/sophistication-markers.md before assigning a level. That
reference defines simple-vs-sophisticated markers per artefact type
and the level adjustments they justify.
The principle: a project with one sophisticated state-based orchestration script is not at the same maturity as one with ten simple bash hooks. Sophistication markers raise the floor on the discipline they evidence (orchestration sophistication → guardrail design; state-based hook sophistication → architectural constraints). The weakest-discipline-is-the-ceiling rule still applies — a single sophisticated artefact does not raise the overall level unless the other disciplines also have evidence at that level.
Every sophistication marker the assessor applies must be cited explicitly in the assessment document — what was found and where — so the level determination is auditable. No silent shifts.
The adjustments are introduced conservatively in this release: prefer surfacing markers without changing previously assigned levels unless the evidence is unambiguous. As the framework accumulates assessments that use the markers, the adjustments will be tuned.