From claude-impl-tools
Logs cross-project outcomes and recalls lessons to inform new sessions, avoiding past mistakes. Analyzes skill executions for better routing. Use /memento modes: log, global recall, health, route.
Install: `npx claudepluginhub insightflo/claude-impl-tools --plugin claude-impl-tools`
> Inspired by [Memento-Skills (arXiv:2603.18743)](https://arxiv.org/abs/2603.18743) — "Let Agents Design Agents"
Logs errors, user corrections, missing features, API failures, knowledge gaps, and best practices to .learnings/ markdown files. Promotes key insights to CLAUDE.md and AGENTS.md for AI agent self-improvement.
Captures high/medium/low confidence patterns from conversations to prevent repeating mistakes and preserve successes. Invoke proactively after corrections, praise, edge cases, or skill-heavy sessions.
Core idea: Skills evolve through execution experience, not just manual editing. LLM parameters stay frozen; only SKILL.md files and routing knowledge change.
Cross-project knowledge memory with two primary purposes: logging outcomes and recalling lessons across projects. Skill intelligence is secondary, analyzing execution data to improve future routing.
Invoke with: /memento <mode>
| Mode | Purpose | When |
|---|---|---|
| `log` | Record what happened: successes, failures, decisions | After any significant outcome |
| `global recall <topic>` | Retrieve cross-project learnings on a topic | Session start, before new work |
| `global search <query>` | Full-text search across all projects | Need specific past experience |
These modes analyze skill execution data to improve future routing. They are a means to better recommendations, not the primary purpose.
| Mode | Purpose | When |
|---|---|---|
| `route <task>` | Recommend the best skill, weighted by experience | When workflow-guide's static rules are insufficient |
| `health` | Skill ecosystem dashboard | Periodic review |
| `reflect <skill>` | Failure-pattern analysis + improvement suggestions | A skill underperforms repeatedly |
| `profile <skill>` | Detailed execution history for one skill | Before modifying or deprecating |
| `harness <skill>` | Auto-generate deterministic guardrails from failures | Recurring failures that code can catch |
| Mode | Purpose |
|---|---|
| `global health` | Unified ecosystem dashboard across all projects |
| `global sync` | Sync MEMORY.md files → DuckDB |
| `global sql <query>` | Direct SQL on the unified experience store |
### `log`

Record what happened after a skill ran. This is the foundation: without experience data, routing and reflection have nothing to learn from.
Read the experience schema from references/experience-schema.md, then create an entry:
# Append one JSON entry per line to the project's experience store
STORE="${PROJECT_ROOT}/.claude/memento/experience.jsonl"
mkdir -p "$(dirname "$STORE")"
printf '%s\n' "$ENTRY_JSON" >> "$STORE"  # $ENTRY_JSON follows references/experience-schema.md
Observe these signals to judge success. Don't ask the user — infer from context:
| Signal | Interpretation | Confidence |
|---|---|---|
| User proceeds to next task | success | high |
| Explicit positive ("good", "perfect") | success | very high |
| Quality gate passed (/checkpoint, /audit) | success | very high |
| Same skill re-invoked immediately | partial | medium |
| User corrects output or says "no" | failure | high |
| Quality gate failed | partial | high |
| Session ends without feedback | unknown — exclude from stats | low |
The ideal setup: a PostToolUse hook on Skill invocations that writes experience entries automatically. See references/hook-setup.md for the hook configuration. Until automated, log manually after significant skill runs.
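Until that hook exists, here is a minimal sketch of what such a hook command could look like. It assumes the hook payload arrives as JSON on stdin with `tool_name` / `tool_input` fields and that `CLAUDE_PROJECT_DIR` points at the project root; the entry fields are placeholders, and references/hook-setup.md remains the authoritative configuration.

```python
#!/usr/bin/env python3
"""Sketch of a PostToolUse hook command that appends an experience entry.

Assumptions (verify against references/hook-setup.md): the payload arrives
as JSON on stdin with tool_name / tool_input fields, and CLAUDE_PROJECT_DIR
points at the project root. Entry fields are placeholders for the real
schema in references/experience-schema.md.
"""
import json
import os
import sys
from datetime import datetime, timezone

payload = json.load(sys.stdin)
if payload.get("tool_name") != "Skill":  # only log skill invocations
    sys.exit(0)

store = os.path.join(
    os.environ.get("CLAUDE_PROJECT_DIR", "."),
    ".claude", "memento", "experience.jsonl",
)
os.makedirs(os.path.dirname(store), exist_ok=True)

entry = {
    "ts": datetime.now(timezone.utc).isoformat(),
    "skill": payload.get("tool_input", {}).get("name", "unknown"),
    "outcome": "unknown",  # refine later using the signals table above
}
with open(store, "a", encoding="utf-8") as f:
    f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```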
### `route <task description>`

Recommend the best skill using a 3-layer scoring system that improves as experience accumulates.
Layer 1: Rule-based matching (existing workflow-guide logic)
↓ produces rule_scores: {skill: 0.0-1.0}
Layer 2: Experience-based matching
↓ find similar past tasks in experience.jsonl
↓ compute success_rate per skill, weighted by:
↓ - recency (recent experiences count more, decay=0.95)
↓ - similarity (closer task signatures count more)
↓ produces exp_scores: {skill: 0.0-1.0}
Layer 3: Blend
↓ alpha = min(0.7, 0.3 + experience_count * 0.04)
↓ → starts at 0.3 (rules dominate)
↓ → grows to 0.7 (experience dominates at 10+ data points)
↓ final_score = (1-alpha)*rule_score + alpha*exp_score
Cold start is handled gracefully: with zero experience, alpha=0.3 and exp_score=0.5 (neutral), so the existing workflow-guide rules drive routing. As experience accumulates, data gradually takes over.
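A minimal sketch of the scoring math above, assuming a pre-computed list of similar experiences (each with an outcome, an age, and a similarity weight); the 0.5 value for partial outcomes is an assumption, and the full algorithm with edge cases lives in references/smart-router.md.

```python
from math import isclose

DECAY = 0.95  # recency decay from Layer 2 above

def exp_score(similar: list[dict]) -> float:
    """Recency- and similarity-weighted success rate over similar past tasks.

    Each entry is assumed to carry 'outcome' ('success'/'partial'/'failure'),
    'age' (how many runs ago, newest = 0), and 'similarity' in [0, 1].
    """
    if not similar:
        return 0.5  # neutral prior on cold start
    value = {"success": 1.0, "partial": 0.5, "failure": 0.0}
    num = den = 0.0
    for e in similar:
        w = (DECAY ** e["age"]) * e["similarity"]
        num += w * value[e["outcome"]]
        den += w
    return num / den if den else 0.5

def blended(rule_score: float, experience_score: float, n: int) -> float:
    alpha = min(0.7, 0.3 + n * 0.04)  # rules dominate early, experience later
    return (1 - alpha) * rule_score + alpha * experience_score

# Cold start: zero experience -> alpha = 0.3, rules drive the decision.
assert isclose(blended(0.82, 0.5, 0), 0.7 * 0.82 + 0.3 * 0.5)
```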
From the user's request, extract a task signature: intent (e.g. bugfix), whether source code is involved, scale (single_file, multi_file, cross-domain), and the domains touched. Use these dimensions for similarity matching against past experiences.
Present the recommendation with confidence and evidence:
📊 Memento Route: maintenance (confidence: 0.87)
Rule match: 0.82 (source_code=yes, intent=bugfix)
Experience: 0.91 (7 similar tasks, 6 succeeded with /maintenance)
Blend α: 0.58 (experience weight, based on 7 data points)
Alternative: /agile iterate (0.64)
Recent similar:
• 2026-03-25 auth middleware fix → /maintenance → success
• 2026-03-22 payment validation → /maintenance → success
• 2026-03-20 cross-domain refactor → /maintenance → partial
### `health`

Display a skill ecosystem dashboard. Read all experience data, generate profiles, and present the results.
For each skill with experience data, auto-generate a profile. See references/skill-profile-schema.md for the schema. Profiles include usage counts, outcome breakdowns, and success rates (the same figures the skill_health view surfaces).
Store profiles in .claude/memento/profiles/<skill-name>.json.
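As a rough illustration of the aggregation behind the dashboard, a minimal sketch; it assumes the simple per-line entry shape used above (`skill`, `outcome`), while the authoritative profile fields live in references/skill-profile-schema.md.

```python
import json
from collections import Counter, defaultdict
from pathlib import Path

def skill_health(store: Path) -> dict[str, dict]:
    """Aggregate experience.jsonl into per-skill run counts and success rate."""
    by_skill: dict[str, Counter] = defaultdict(Counter)
    for line in store.read_text(encoding="utf-8").splitlines():
        if line.strip():
            entry = json.loads(line)
            by_skill[entry["skill"]][entry["outcome"]] += 1
    stats = {}
    for skill, c in by_skill.items():
        rated = c["success"] + c["partial"] + c["failure"]  # 'unknown' excluded
        stats[skill] = {
            "runs": sum(c.values()),
            "success_rate": c["success"] / rated if rated else None,
        }
    return stats

print(skill_health(Path(".claude/memento/experience.jsonl")))
```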
### `reflect <skill-name>`

Analyze why a skill underperforms and suggest concrete improvements.
When confidence is high (>0.8) and the pattern is clear: propose the fix and validate it with /autoresearch-skills using the new test case.
When confidence is medium (0.5-0.8): present the proposed fix to the user for approval before modifying, as in the example below.
When confidence is low (<0.5): don't modify the skill yet; keep logging experience until a clearer pattern emerges.
🔍 Memento Reflect: maintenance
Evidence: 12 runs analyzed (8 success, 3 partial, 1 failure)
Pattern detected: cross-domain failures
• 3/3 cross-domain tasks resulted in partial or failure
• All succeeded for single_file and multi_file scales
Hypothesis: SKILL.md lacks guidance for cross-domain impact analysis.
The 5-stage ITIL process jumps to modification without checking
cross-domain dependencies first.
Proposed fix: Add "Step 0: Run /impact for cross-domain changes"
before Stage 3 (Safe Modification).
Confidence: 0.75 (3 consistent data points)
Recommendation: Present to user for approval before modifying.
### `profile <skill-name>`

Show the complete execution history and statistics for one skill. Read references/skill-profile-schema.md for the data structure, then display the skill's run history and aggregate statistics.
### `harness <skill-name>`

Inspired by AutoHarness (arXiv:2603.03329), where LLM agents generate their own guardrail code.
Analyze failure patterns and generate validation scripts that prevent recurring failures. The key insight from AutoHarness: a small model + code guardrails beats a large model without them.
| Type | File | Purpose | LLM needed at runtime? |
|---|---|---|---|
| Pre-check | scripts/harness/pre_check.sh | Validate environment before skill runs | No |
| Action-verifier | scripts/harness/verify_action.py | Check proposed actions are valid | No |
| Post-verify | scripts/harness/post_verify.sh | Confirm skill achieved its goal | No |
1. Read experience.jsonl for target skill
2. Group failures by root cause:
- Missing prerequisites (file not found, tool not installed)
- Invalid actions (wrong file modified, forbidden operation)
- Incomplete results (partial output, missing verification)
3. For each failure pattern, generate a validation script:
- pre_check.sh: catches prerequisite failures
- verify_action.py: catches invalid action patterns
- post_verify.sh: catches incomplete results
4. Test the harness against past failure cases
5. If it would have caught the failures → install to skill's scripts/harness/
Example: if the maintenance skill fails 3/3 times on cross-domain changes because it doesn't check dependencies first:
# scripts/harness/pre_check.sh (auto-generated)
#!/bin/bash
# Harness: cross-domain dependency check
# Generated from 3 failure cases (2026-03-25, 03-22, 03-20)
CHANGED_FILES=$(git diff --name-only HEAD 2>/dev/null)
DOMAINS=$(echo "$CHANGED_FILES" | sed 's|/.*||' | sort -u | wc -l)
if [ "$DOMAINS" -gt 1 ]; then
echo "HARNESS_WARN: Cross-domain change detected ($DOMAINS domains)."
echo "HARNESS_SUGGEST: Run /impact first to map dependencies."
exit 1
fi
exit 0
skill-name/
├── SKILL.md
└── scripts/
└── harness/ ← Auto-generated by /memento harness
├── pre_check.sh ← Runs before skill, exit 1 = block
├── verify_action.py ← Validates proposed actions
└── post_verify.sh ← Runs after skill, exit 1 = warn
Harness scripts are deterministic (no LLM calls). They're the cheapest possible guardrails.
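For a flavor of what an action-verifier might look like, a minimal sketch; the action JSON shape and both rules here are invented for illustration, not taken from any generated harness.

```python
#!/usr/bin/env python3
"""Illustrative verify_action.py: deterministic, no LLM calls.

The stdin action format and the rules below are hypothetical examples of
guardrails that /memento harness might derive from failure patterns.
"""
import json
import sys

FORBIDDEN_PREFIXES = (".git/", "node_modules/")  # example rule from failures

def verify(action: dict) -> list[str]:
    problems = []
    path = action.get("file", "")
    if any(path.startswith(p) for p in FORBIDDEN_PREFIXES):
        problems.append(f"forbidden path: {path}")
    if action.get("op") == "delete" and not action.get("confirmed"):
        problems.append("unconfirmed delete")
    return problems

if __name__ == "__main__":
    issues = verify(json.load(sys.stdin))
    for msg in issues:
        print(f"HARNESS_WARN: {msg}")
    sys.exit(1 if issues else 0)
```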
See references/harness-generation.md for the full generation algorithm and templates.
### `global <subcommand>`

Unified Experience Store powered by DuckDB. Breaks project silos: unified search across 66 memory files from 22 projects in a single SQL database.
pip install duckdb                       # one-time setup
python3 ~/.claude/memento/query.py sync  # sync MEMORY.md → DB
`global search <query>`: search learnings across all projects
python3 ~/.claude/memento/query.py search "cross-domain bugfix"
# → finds related learnings across 14 projects
`global recall <topic>`: recall cross-project knowledge on a topic
python3 ~/.claude/memento/query.py recall "FastAPI authentication"
# → prioritizes feedback/project types, returns actionable knowledge
`global health`: unified dashboard
python3 ~/.claude/memento/query.py health
# → per-project learning status + skill health (when experience data exists)
`global sync`: sync MEMORY.md files
python3 ~/.claude/memento/query.py sync
# → upserts every project's memory files into the DB
`global sql <query>`: direct SQL
python3 ~/.claude/memento/query.py sql "SELECT type, COUNT(*) FROM learnings GROUP BY type"
~/.claude/memento/experience.duckdb ← global unified store
~/.claude/memento/query.py ← CLI query tool
| Name | Type | Content |
|---|---|---|
| `experience` | table | Skill execution experiences (accumulated via `memento log`) |
| `learnings` | table | MEMORY.md files (unified from 22 projects) |
| `skill_health` | view | Per-skill usage count, success rate, average tokens |
| `project_knowledge` | view | Per-project learning statistics |
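As a sketch of the kind of query query.py runs against this store — the duckdb Python API calls are real, but the column names are assumptions about the `skill_health` view above:

```python
# Sketch only: column names are assumptions about the skill_health view;
# the actual view definitions live in ~/.claude/memento/query.py.
import duckdb
from pathlib import Path

db = Path.home() / ".claude" / "memento" / "experience.duckdb"
con = duckdb.connect(str(db), read_only=True)

rows = con.execute(
    "SELECT skill, runs, success_rate FROM skill_health ORDER BY runs DESC"
).fetchall()
for skill, runs, rate in rows:
    print(f"{skill:24} {runs:4} runs  success_rate={rate}")
```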
All memento data lives in .claude/memento/ within the project:
.claude/memento/
├── experience.jsonl ← Append-only execution log
└── profiles/ ← Auto-generated skill profiles
├── maintenance.json
├── agile.json
└── ...
Experience is project-scoped because skill effectiveness varies by project type. A skill that works well for a web app may not suit a CLI tool.
For detailed schemas and technical specifications:
- references/experience-schema.md — Experience log entry format
- references/skill-profile-schema.md — Skill profile data structure
- references/smart-router.md — Full routing algorithm with edge cases
- references/hook-setup.md — Automated experience logging via hooks
- references/harness-generation.md — AutoHarness-inspired validation script generation