Captures high/medium/low confidence learnings from conversations via triggers like corrections, praise, edge cases. Improves skills by preventing mistakes and preserving successes. Invoke proactively after 'no/wrong', 'perfect', or session ends.
**Critical learning capture system** that prevents repeating mistakes and preserves successful patterns across sessions.
Analyze the current conversation and propose improvements to skill-based memories based on what worked, what didn't, and the edge cases discovered. Every correction is a learning opportunity: invoke proactively to build institutional knowledge.
| Trigger Phrase | Operation |
|---|---|
| "reflect on this session" | Extract learnings from conversation |
| "learn from this mistake" | Capture correction patterns |
| "capture what we learned" | Document session insights |
| "improve skill {name}" | Target specific skill memory |
| "what did we learn" | Review and store patterns |
Also monitor user phrasing such as "what if...", "ensure", or "don't forget". These phrases should immediately route into the MEDIUM-confidence trigger table below.
**HIGH confidence triggers** (invoke immediately):

| Trigger | Example | Why Critical |
|---|---|---|
| User correction | "no", "wrong", "not like that", "never do" | Captures mistakes to prevent repetition |
| Chesterton's Fence | "you removed that without understanding" | Documents architectural decisions |
| Immediate fixes | "debug", "root cause", "fix all" | Learns from errors in real-time |
**MEDIUM confidence triggers:**

| Trigger | Example | Why Important |
|---|---|---|
| User praise | "perfect", "exactly", "great" | Reinforces successful patterns |
| Tool preferences | "use X instead of Y", "prefer", "rather than" | Builds workflow preferences |
| Edge cases | "what if X happens?", "don't forget", "ensure" | Captures scenarios to handle |
| Questions | Short questions after output | May indicate confusion or gaps |
**LOW confidence triggers:**

| Trigger | Example | Why Useful |
|---|---|---|
| Repeated patterns | Frequent use of specific commands/tools | Identifies workflow preferences |
| Session end | After skill-heavy work | Consolidates all session learnings |
| Phrase | Action |
|---|---|
| "reflect" | Full analysis of current session |
| "improve skill" | Target specific skill for improvement |
| "learn from this" | Extract learnings from recent interaction |
| "what did we learn" | Summarize accumulated learnings |
Use this skill when:
Use retrospective instead when:
Don't wait for users to ask! Invoke reflect immediately when you detect:
Why this matters: Without proactive reflection, learnings are LOST. The Stop hook captures some patterns, but manual reflection is MORE ACCURATE because you have full conversation context.
Cost: ~30 seconds of analysis. Benefit: Prevents repeating mistakes forever.
Locate the skill-based memory to update:
- `{skill-name}-observations.md` in `.serena/memories/` (skill observations pattern)

Storage Locations:

- `.serena/memories/{skill-name}-observations.md` via `mcp__serena__write_memory`

Scan the conversation for learning signals with confidence levels:
**HIGH confidence (corrections):** User actively steered or corrected output. These are the most valuable signals.
Detection patterns:
Example:
```
User: "No, use the PowerShell skill script instead of raw gh commands"
→ [HIGH] + Add constraint: "Use PowerShell skill scripts, never raw gh commands"
```
**MED confidence (approvals):** Output was accepted or praised. Good signals, but they may be context-specific.
Detection patterns:
Example:
```
User: "Perfect, that's exactly what I needed"
→ [MED] + Add preference: "Include example usage in script headers"
```
**MED confidence (edge cases):** Scenarios the skill didn't anticipate. Opportunities for improvement.
Detection patterns:
Example:
```
User: "What if the file doesn't exist?"
→ [MED] ~ Add edge case: "Handle missing file scenario"
```
**LOW confidence (repeated patterns):** Accumulated patterns over time. These need more evidence before formalizing.
Detection patterns:
Example:
```
User consistently uses `-Force` flag
→ [LOW] ~ Note for review: "User prefers -Force flag for overwrites"
```
Only propose changes when sufficient evidence exists:
| Threshold | Action |
|---|---|
| ≥1 HIGH signal | Always propose (user explicitly corrected) |
| ≥2 MED signals | Propose (sufficient pattern) |
| ≥3 LOW signals | Propose (accumulated evidence) |
| 1-2 LOW only | Skip (insufficient evidence), note for next session |
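The threshold table above reduces to a few comparisons. A minimal sketch (an assumed helper, not part of the skill's shipped code):

```python
from collections import Counter

def should_propose(signals: list[str]) -> bool:
    """Decide whether to propose changes from a list of confidence
    labels like ["HIGH", "MED", "LOW", ...], per the threshold table."""
    counts = Counter(signals)
    return (counts["HIGH"] >= 1      # user explicitly corrected
            or counts["MED"] >= 2    # sufficient pattern
            or counts["LOW"] >= 3)   # accumulated evidence

print(should_propose(["MED", "MED"]))   # True
print(should_propose(["LOW", "LOW"]))   # False: note for next session
```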
Present findings using WCAG AA accessible colors (4.5:1 contrast ratio):
```
┌─────────────────────────────────────────────────────────────┐
│ SKILL REFLECTION: {skill-name}                              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│ [HIGH] + Add constraint: "{specific constraint}"            │
│   Source: "{quoted user correction}"                        │
│                                                             │
│ [MED] + Add preference: "{specific preference}"             │
│   Source: "{evidence from conversation}"                    │
│                                                             │
│ [MED] + Add edge case: "{scenario}"                         │
│   Source: "{question or workaround}"                        │
│                                                             │
│ [LOW] ~ Note for review: "{observation}"                    │
│   Source: "{pattern observed}"                              │
│                                                             │
├─────────────────────────────────────────────────────────────┤
│ Apply changes? [Y/n/edit]                                   │
└─────────────────────────────────────────────────────────────┘
```
Color Key (accessible):
- `[HIGH]` - Red/bold: Mandatory corrections (user explicitly said "no")
- `[MED]` - Yellow/amber: Recommended additions
- `[LOW]` - Blue/dim: Notes for later review

User Response Handling:
| Response | Action |
|---|---|
| Y (yes) | Proceed to Step 4 (update memory) |
| n (no) | Abort update, ask "What would you like to change or was this not useful?" |
| edit | Present each finding individually, allow user to modify/reject each one |
On rejection (n):
On edit:
ALWAYS show changes before applying.
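The Y/n/edit handling above can be sketched as follows. This is an illustrative helper, not the skill's actual code; the `prompt` parameter stands in for the agent's interactive question and is an assumption.

```python
def handle_response(response: str, findings: list[str], prompt=input) -> list[str]:
    """Return the subset of findings to apply; an empty list means abort."""
    choice = response.strip().lower()
    if choice in ("", "y", "yes"):
        return findings                      # apply everything
    if choice == "edit":
        # Per-finding review: keep each finding unless the user answers "n".
        return [f for f in findings
                if prompt(f"Keep '{f}'? [Y/n] ").strip().lower() != "n"]
    # "n": abort the update and ask for feedback.
    print("What would you like to change, or was this not useful?")
    return []
```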
After user approval:
- `.serena/memories/{skill-name}-observations.md`

Storage Strategy:
Serena MCP (canonical):
```
mcp__serena__write_memory(memory_file_name="{name}-observations", memory_content="...")
```
If Serena unavailable (contingency):
```bash
path=".serena/memories/{name}-observations.md"
# Append new learnings to existing file (create if missing)
echo "$newLearnings" >> "$path"
git add "$path"
git commit -m "chore(memory): update {name} skill sidecar learnings"
```
Record the manual edit in the session log so Serena MCP can replay the update when the service is available again.
Memory Format:
```markdown
# Skill Sidecar Learnings: {Skill Name}

**Last Updated**: {ISO date}
**Sessions Analyzed**: {count}

## Constraints (HIGH confidence)
- {constraint 1} (Session {N}, {date})
- {constraint 2} (Session {N}, {date})

## Preferences (MED confidence)
- {preference 1} (Session {N}, {date})
- {preference 2} (Session {N}, {date})

## Edge Cases (MED confidence)
- {edge case 1} (Session {N}, {date})
- {edge case 2} (Session {N}, {date})

## Notes for Review (LOW confidence)
- {note 1} (Session {N}, {date})
- {note 2} (Session {N}, {date})
```
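The append-only update of this format can be sketched as below. This is an assumed helper, not the skill's shipped code, and it simplifies the format by mapping MED to the Preferences section only (the real workflow distinguishes preferences from edge cases).

```python
from datetime import date
from pathlib import Path

# Maps confidence level to the section headings in the memory format above.
SECTION_FOR_LEVEL = {
    "HIGH": "## Constraints (HIGH confidence)",
    "MED": "## Preferences (MED confidence)",
    "LOW": "## Notes for Review (LOW confidence)",
}

def append_learning(path: Path, level: str, text: str, session: int) -> None:
    """Insert a dated bullet under the matching section, preserving all
    existing entries (append, never overwrite)."""
    entry = f"- {text} (Session {session}, {date.today().isoformat()})"
    lines = path.read_text().splitlines()
    idx = lines.index(SECTION_FOR_LEVEL[level])
    lines.insert(idx + 1, entry)
    path.write_text("\n".join(lines) + "\n")
```

Because the helper only ever inserts lines, prior sessions' learnings and their timestamps survive every update.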
When persisting learnings that reference specific code locations, automatically capture citations:
1. Detect code references in learning text:
   - `path/to/file.ext` line N
   - `functionName()` in `file.ext`
2. Extract citation metadata: file path, line number, and snippet (if available).
3. Add citations to memory frontmatter:

   ```
   python -m memory_enhancement add-citation <memory-id> --file <path> --line <num> --snippet <text>
   ```

4. Update the confidence score based on initial verification.
Detection Patterns:
| Pattern | Example | Extraction |
|---|---|---|
| Inline code + line | In `src/client/constants.ts` line 42 | file=src/client/constants.ts, line=42 |
| Function in file | `handleError()` in `src/utils.ts` | file=src/utils.ts (file-level) |
| Explicit citation | See: src/api.py:100 | file=src/api.py, line=100 |
Integration Point:
After user approves learnings (step 4 above), before writing to Serena:
```
python -m memory_enhancement add-citation <memory-id> --file <path> --line <num> --snippet <text>
```

Example:
Learning text: "The bug was in scripts/health.py line 45, where we forgot to handle None"
```
python -m memory_enhancement add-citation memory-observations --file scripts/health.py --line 45 --snippet "handle None"
```

```
User says "reflect" or similar?
│
├─► YES
│   │
│   ├─► Identify skill(s) used in conversation
│   │   │
│   │   └─► Skill identified?
│   │       │
│   │       ├─► YES → Analyze conversation for signals
│   │       │   │
│   │       │   └─► Meets confidence threshold?
│   │       │       │
│   │       │       ├─► YES → Present findings, await approval
│   │       │       │   │
│   │       │       │   ├─► User says Y → Update memory file
│   │       │       │   │   │
│   │       │       │   │   ├─► Serena available? → Use MCP write
│   │       │       │   │   └─► Serena unavailable? → Use Git fallback
│   │       │       │   │
│   │       │       │   ├─► User says n → Ask for feedback
│   │       │       │   │   │
│   │       │       │   │   ├─► User wants revision → Re-analyze
│   │       │       │   │   └─► User skips → End workflow
│   │       │       │   │
│   │       │       │   └─► User says edit → Interactive review
│   │       │       │       │
│   │       │       │       └─► Per-finding [keep/modify/remove]
│   │       │       │
│   │       │       └─► NO → Report "Insufficient evidence. Note for next session."
│   │       │
│   │       └─► NO → Ask user which skill to reflect on
│   │           │
│   │           ├─► User specifies skill → Continue with that skill
│   │           └─► User says "none" → End workflow
│   │
│   └─► Multiple skills?
│       │
│       └─► Analyze each, group findings by skill, present together
│
└─► NO → This skill not invoked
```
Conversation:

```
User: "Create a PR for this change"
Agent: [runs gh pr create directly]
User: "No, use the github skill script!"
```

Analysis:

```
[HIGH] + Add constraint: "Always use .claude/skills/github/ scripts for PR operations"
Source: User correction - "No, use the github skill script!"
```
Conversation:

```
User: "Add error handling"
Agent: [adds try/catch with specific error types]
User: "Perfect! That's exactly what I wanted"
```

Analysis:

```
[MED] + Add preference: "Use specific error types in catch blocks, not generic [Exception]"
Source: User approval after seeing specific error types
```
Conversation:

```
User: "Run the build"
Agent: [runs build command]
User: "Wait, what if the node_modules folder doesn't exist?"
```

Analysis:

```
[MED] + Add edge case: "Check for node_modules existence before build"
Source: User question about missing dependencies
```
Capture learnings about code review patterns:
Example memory: `.serena/memories/code-review-observations.md`

Track API design decisions:
Example memory: `.serena/memories/api-design-observations.md`

Remember testing preferences:
Example memory: `.serena/memories/testing-observations.md`

Learn documentation patterns:
Example memory: `.serena/memories/documentation-observations.md`
| Avoid | Why | Instead |
|---|---|---|
| Applying without showing | User loses visibility | Always preview changes |
| Overwriting existing learnings | Loses history | Append with timestamps |
| Generic observations | Not actionable | Be specific and contextual |
| Ignoring LOW confidence | Lose valuable patterns | Track for future validation |
| Creating memory for one-off | Noise | Wait for repeated patterns |
Run reflection at session end as part of retrospective:
```markdown
## Session End Checklist
- [ ] Complete session log
- [ ] Run skill reflection (if skills were used)
- [ ] Update Serena memory
- [ ] Commit changes
```
Skill memories integrate with the memory system:
```
# Search skill sidecar learnings
python3 .claude/skills/memory/scripts/search_memory.py --query "github-observations constraints"

# Read specific skill sidecar
Read .serena/memories/github-observations.md
```
If Serena MCP is available:
```
mcp__serena__read_memory(memory_file_name="github-observations")
mcp__serena__write_memory(memory_file_name="github-observations", memory_content="...")
```
| Action | Verification |
|---|---|
| Analysis complete | Signals categorized by confidence |
| User approved | Explicit Y or approval statement |
| Memory updated | File written to .serena/memories/ |
| Changes preserved | Existing content not lost |
| Commit ready | Changes staged, message drafted |
- `{skill-name}-observations.md`

**Decision**: Skill memories follow the ADR-007 sidecar pattern (e.g., `github-observations.md`).
Rationale:
- Follows the `{domain}-{description}` format while making "skill-sidecar" explicit
- Indexed in `memory-index.md`, preventing orphaned learnings

Migration: Rename legacy `skill-{name}.md` memories to `{skill}-observations.md` and update index references.
- Learnings persist in the skill's `{skill}-observations.md` file.
- Tag entries `skill-{name}` and reference the Serena sidecar instead of duplicating the content.

Relationship to `curating-memories`:

- `curating-memories` = general-purpose maintenance of any memory artifact (linking, pruning, marking obsolete).
- `reflect` = targeted retrospective that feeds those artifacts with new learnings.
- Use `curating-memories` for cleanup and consolidation.
- Use the `memory` skill for search/recall before proposing redundant learnings.
- Use `skill-{name}` tags for semantic recall.
- See also `session-log-fixer`.

| Skill | Relationship |
|---|---|
| `memory` | Skill memories are part of Tier 1 |
| `using-forgetful-memory` | Alternative storage for skill learnings |
| `curating-memories` | For maintaining/pruning skill memories |
| `retrospective` | Full session retrospective (this is the mini version) |
When committing skill observation updates:
```
chore(memory): update {skill-name} skill sidecar learnings (session {N})

- Added {count} constraints (HIGH confidence)
- Added {count} preferences (MED confidence)
- Added {count} edge cases (MED confidence)
- Added {count} notes (LOW confidence)

Session: {session-id}
```