Use when writing test specs for NL artifacts, running /nlpm:test, or setting up TDD workflows for skills, agents, commands, rules, hooks, and prompts.
From nlpm: `npx claudepluginhub xiaolai/nlpm-for-claude --plugin nlpm`

This skill uses the workspace's default tool permissions.
1. Write spec (.nlpm-test/artifact-name.spec.md) — define expectations
2. /nlpm:test — RED: spec fails (artifact doesn't exist)
3. Write the artifact — create the NL artifact
4. /nlpm:test — check if it passes
5. /nlpm:score — check quality score
6. Iterate until GREEN: all specs pass + score ≥ threshold
Location: .nlpm-test/ directory in the project root (or alongside the artifact).
Filename convention: <artifact-name>.spec.md — matches the artifact filename without path.
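The convention above can be sketched as a small helper. This is illustrative only; `spec_path_for` is not part of nlpm's API:

```python
from pathlib import Path

# Hypothetical helper: derive the spec path for an artifact under the
# .nlpm-test/ convention (artifact filename without its directory path).
def spec_path_for(artifact: str, project_root: str = ".") -> Path:
    stem = Path(artifact).stem                    # "agents/my-agent.md" -> "my-agent"
    return Path(project_root, ".nlpm-test", f"{stem}.spec.md")

print(spec_path_for("agents/my-agent.md"))        # .nlpm-test/my-agent.spec.md
```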
```yaml
---
artifact: agents/my-agent.md   # path to the artifact being tested
type: agent                    # agent | skill | command | rule | hook | prompt
min_score: 85                  # minimum /nlpm:score threshold for this artifact
---
```
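As a rough sketch, the frontmatter above can be read without a YAML library (a real tool would use a proper YAML parser; this only handles flat `key: value` lines and strips inline `#` comments):

```python
# Minimal frontmatter reader for the flat spec header shown above.
def parse_spec_frontmatter(text: str) -> dict:
    lines = text.strip().splitlines()
    end = lines.index("---", 1)                   # closing delimiter
    fields = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        fields[key.strip()] = value.split("#", 1)[0].strip()
    return fields

spec = """---
artifact: agents/my-agent.md
type: agent
min_score: 85
---"""
print(parse_spec_frontmatter(spec))
# {'artifact': 'agents/my-agent.md', 'type': 'agent', 'min_score': '85'}
```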
Body sections (all optional — include what matters for this artifact):
## Triggers On
Queries that SHOULD trigger this artifact:
- "review my database migrations before deploying"
- "check if these schema changes are safe"
- "audit the migration for breaking changes"
## Does Not Trigger On
Queries that should NOT trigger this artifact:
- "write a migration for adding a users table"
- "help me with CSS styling"
- "deploy to production"
## Output Contains
Expected elements in the output:
- "## Migration Review" (heading present)
- "| Table | Change | Risk |" (table structure)
- severity classification (CRITICAL/HIGH/MEDIUM/LOW)
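A literal-substring version of the "Output Contains" check can be sketched as follows; the real tester may match more loosely (regex or semantic matching):

```python
# Check which expected elements appear verbatim in the artifact's output.
def check_output_contains(output: str, expected: list[str]) -> dict[str, bool]:
    return {needle: (needle in output) for needle in expected}

report = "## Migration Review\n| Table | Change | Risk |\nNo CRITICAL issues found."
results = check_output_contains(
    report,
    ["## Migration Review", "| Table | Change | Risk |", "LOW"],
)
print(results)  # the first two are present, "LOW" is not
```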
## Output Format
The output should be a markdown report with:
1. Summary section with counts
2. Findings table with columns: File, Issue, Severity
3. Action items list
## Handles Input
| Input | Expected Behavior |
|-------|------------------|
| (empty) | Score all artifacts in cwd |
| directory path | Score artifacts in that directory |
| nonexistent path | Error: "Directory not found: {path}" |
| file path | Score that single file |
## Follows Rules
Code that SHOULD comply:
```python
result: Result[User, AppError] = get_user(id)
```
Code that SHOULD violate:
```python
user = get_user(id).unwrap()  # should be flagged
```
### frontmatter_valid (all types)
```markdown
## Frontmatter Valid
Required fields:
- description: present and trigger-style ("Use when...")
- model: sonnet
- tools: [Read, Glob, Grep]
- skills: [nlpm:conventions, nlpm:scoring]
```
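A frontmatter_valid check for an agent could look like the sketch below (an illustration of the required-fields list above, not nlpm's implementation); it returns a list of problems, where an empty list means valid:

```python
# Validate agent frontmatter fields against the requirements listed above.
def frontmatter_valid(fields: dict) -> list[str]:
    problems = []
    desc = fields.get("description", "")
    if not desc:
        problems.append("description: missing")
    elif not desc.startswith("Use when"):
        problems.append('description: not trigger-style ("Use when...")')
    for required in ("model", "tools", "skills"):
        if required not in fields:
            problems.append(f"{required}: missing")
    return problems

print(frontmatter_valid({
    "description": "Use when reviewing migrations",
    "model": "sonnet",
    "tools": ["Read", "Glob", "Grep"],
    "skills": ["nlpm:conventions", "nlpm:scoring"],
}))  # [] -> valid
```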
```
NLPM Test Report

Spec              Artifact              Result  Details
──────────────────────────────────────────────────────────────────────────
my-agent.spec.md  agents/my-agent.md    PASS    5/5 checks
my-skill.spec.md  skills/core/SKILL.md  FAIL    3/5 checks
  ✗ Trigger: "optimize React hooks" → predicted NO trigger (expected YES)
  ✗ Score: 68/100 (min: 85)

Overall: 1 passed, 1 failed (50%)
```
RED items (fix these):
1. skills/core/SKILL.md — trigger gap: "optimize React hooks" not covered by description
2. skills/core/SKILL.md — score 68 < min 85: missing <example> blocks (-15)
- handles_input for commands
- min_score should match your project's threshold (default 85 for new artifacts, 70 for legacy)

The tester discovers specs by:
- scanning the .nlpm-test/ directory
- matching my-agent.spec.md → agents/my-agent.md (via the artifact: frontmatter field)
- if the artifact path doesn't exist, the spec is RED by default (artifact not yet created — this is the TDD "write test first" state)
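The discovery rules above can be sketched as follows. This is illustrative only (`discover_specs` is not nlpm's API); it assumes the flat frontmatter format shown earlier:

```python
from pathlib import Path

# Scan .nlpm-test/*.spec.md, read each spec's artifact: field, and mark
# the spec RED when the artifact file does not exist yet.
def discover_specs(root: str = ".") -> list[dict]:
    found = []
    for spec in sorted(Path(root, ".nlpm-test").glob("*.spec.md")):
        artifact = None
        for line in spec.read_text().splitlines():
            if line.startswith("artifact:"):
                artifact = line.split(":", 1)[1].split("#", 1)[0].strip()
                break
        red = artifact is None or not Path(root, artifact).exists()
        found.append({"spec": spec.name, "artifact": artifact, "red": red})
    return found
```

A spec whose artifact is missing comes back `red: True`, which is exactly the "write test first" starting state of the TDD loop.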