From awesome-claude-notes
Measures coding agent compliance with skills/rules/agents by generating specs/scenarios at 3 strictness levels, running agents, classifying tool calls, and reporting timelines with scores.
npx claudepluginhub loulanyue/awesome-claude-notes --plugin awesome-claude-notesThis skill uses the workspace's default tool permissions.
Measures whether coding agents actually follow skills, rules, or agent definitions by:
fixtures/compliant_trace.jsonlfixtures/noncompliant_trace.jsonlfixtures/tdd_spec.yamlprompts/classifier.mdprompts/scenario_generator.mdprompts/spec_generator.mdpyproject.tomlscripts/__init__.pyscripts/classifier.pyscripts/grader.pyscripts/parser.pyscripts/report.pyscripts/run.pyscripts/runner.pyscripts/scenario_generator.pyscripts/spec_generator.pyscripts/utils.pytests/test_grader.pytests/test_parser.pyEnforces C++ Core Guidelines for writing, reviewing, and refactoring modern C++ code (C++17+), promoting RAII, immutability, type safety, and idiomatic practices.
Provides patterns for shared UI in Compose Multiplatform across Android, iOS, Desktop, and Web: state management with ViewModels/StateFlow, navigation, theming, and performance.
Implements Playwright E2E testing patterns: Page Object Model, test organization, configuration, reporters, artifacts, and CI/CD integration for stable suites.
Measures whether coding agents actually follow skills, rules, or agent definitions by:
claude -p and capturing tool call traces via stream-jsonskills/*/SKILL.md): Workflow skills like search-first, TDD guidesrules/common/*.md): Mandatory rules like testing.md, security.md, git-workflow.mdagents/*.md): Whether an agent gets invoked when expected (internal workflow verification not yet supported)/skill-comply <path># Full run
uv run python -m scripts.run ~/.claude/rules/common/testing.md
# Dry run (no cost, spec + scenarios only)
uv run python -m scripts.run --dry-run ~/.claude/skills/search-first/SKILL.md
# Custom models
uv run python -m scripts.run --gen-model haiku --model sonnet <path>
Measures whether a skill/rule is followed even when the prompt doesn't explicitly support it.
Reports are self-contained and include:
For users familiar with hooks, reports also include hook promotion recommendations for steps with low compliance. This is informational — the main value is the compliance visibility itself.