From abstract
Analyzes effectiveness of past skill improvements, detects quality regressions, and refines meta-optimization. Auto-triggers on low effectiveness, recent failures, or periodic checks.
```shell
npx claudepluginhub athola/claude-night-market --plugin abstract
```

This skill uses the workspace's default tool permissions.
Analyze the effectiveness of past skill improvements and refine the improvement process itself. This is the core innovation from the Hyperagents paper: not just improving skills, but improving HOW skills are improved.
This skill should be invoked automatically when:

- **Regression detected**: The homeostatic monitor finds a skill's evaluation window ended in `pending_rollback_review` status. The improvement made things worse -- we need to understand why.
- **Low effectiveness rate**: When `ImprovementMemory.get_effective_strategies()` vs `get_failed_strategies()` shows effectiveness below 50%, the improvement process itself needs refinement.
- **Degradation despite improvements**: When `PerformanceTracker.get_improvement_trend()` returns negative for a skill that was recently improved.
- **Periodic check**: After every 10 improvement cycles (tracked via outcome count in ImprovementMemory).
The homeostatic monitor emits `"improvement_triggered": true` when a skill crosses the flag threshold. At that point, before dispatching the skill-improver, check if metacognitive analysis is warranted:
```python
from pathlib import Path

from abstract.improvement_memory import ImprovementMemory

memory = ImprovementMemory(
    Path.home() / ".claude/skills/improvement_memory.json"
)

# Check if metacognitive analysis is warranted
effective = memory.get_effective_strategies()
failed = memory.get_failed_strategies()
total = len(effective) + len(failed)

needs_metacognition = False

# Trigger 1: Low effectiveness rate
if total >= 5 and len(effective) / total < 0.5:
    needs_metacognition = True

# Trigger 2: Periodic check (every 10 outcomes)
if total > 0 and total % 10 == 0:
    needs_metacognition = True

# Trigger 3: Recent regression
if failed and failed[-1].get("outcome_type") == "failure":
    needs_metacognition = True

if needs_metacognition:
    # Run metacognitive analysis before next improvement
    pass  # Skill(abstract:metacognitive-self-mod)
```
Read improvement memory and performance tracker data:
```shell
# Check for improvement memory
MEMORY_FILE=~/.claude/skills/improvement_memory.json
TRACKER_FILE=~/.claude/skills/performance_history.json

if [ ! -f "$MEMORY_FILE" ]; then
    echo "No improvement memory found."
    echo "Run skill-improver first to generate improvement data."
    exit 0
fi
```
Load the JSON files using Python:
```python
from pathlib import Path

from abstract.improvement_memory import ImprovementMemory
from abstract.performance_tracker import PerformanceTracker

memory = ImprovementMemory(Path.home() / ".claude/skills/improvement_memory.json")
tracker = PerformanceTracker(Path.home() / ".claude/skills/performance_history.json")
```
For each improvement outcome in memory, classify:

- **Effective**: `after_score - before_score >= 0.1`
- **Neutral**: `-0.1 < improvement < 0.1`
- **Regression**: `after_score < before_score`

```python
effective = memory.get_effective_strategies()
failed = memory.get_failed_strategies()

# Calculate effectiveness rate
total = len(effective) + len(failed)
if total > 0:
    effectiveness_rate = len(effective) / total
```
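The per-outcome classification can be sketched as a small helper. This is a sketch, not the actual ImprovementMemory logic: the function name `classify_outcome` is hypothetical, and "regression" is interpreted here as a drop beyond the neutral band so the three bands don't overlap.

```python
def classify_outcome(before_score: float, after_score: float) -> str:
    """Classify an improvement outcome by its score delta.

    Thresholds follow the text: a gain of at least 0.1 is effective,
    a drop of 0.1 or more is a regression, and anything inside the
    (-0.1, 0.1) band is neutral.
    """
    delta = after_score - before_score
    if delta >= 0.1:
        return "effective"
    if delta <= -0.1:
        return "regression"
    return "neutral"

# Example classifications
print(classify_outcome(0.6, 0.75))  # +0.15 -> effective
print(classify_outcome(0.6, 0.62))  # +0.02 -> neutral
print(classify_outcome(0.6, 0.45))  # -0.15 -> regression
```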
Analyze WHAT types of improvements succeed vs fail:
Success patterns to look for:
Failure patterns to look for:
For each pattern found, record as a causal hypothesis:

```python
memory.record_insight(
    skill_ref="_meta",  # Special ref for meta-insights
    category="causal_hypothesis",
    insight="Error handling improvements have 85% success rate",
    evidence=["skill-A v1.1.0: +0.3", "skill-B v2.1.0: +0.15"],
)
```
Use PerformanceTracker to identify:

```python
for skill_ref in tracker.get_all_skill_refs():
    trend = tracker.get_improvement_trend(skill_ref)
    if trend is not None:
        if trend > 0.05:
            # Sustained improvement - what's working?
            pass
        elif trend < -0.05:
            # Degrading despite improvements - investigate
            pass
```
Based on the meta-analysis, generate recommendations for the skill-improver:

- **Priority formula adjustments**: If certain issue types have higher improvement success rates, weight them higher.
- **Approach selection**: If "add error handling" has 85% success vs "restructure workflow" at 30%, bias toward error handling.
- **Threshold adjustments**: If improvements below priority 3.0 consistently fail, raise the minimum threshold.
- **Avoidance rules**: Document anti-patterns to avoid in future improvements.
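These recommendations can be derived from per-category success rates. The sketch below assumes a simplified outcome shape with `category` and `succeeded` keys; the real ImprovementMemory schema may differ.

```python
from collections import defaultdict

def success_rate_by_category(outcomes):
    """Compute the fraction of successful improvements per category.

    Each outcome is assumed (for this sketch) to be a dict with a
    'category' string and a boolean 'succeeded' flag.
    """
    stats = defaultdict(lambda: [0, 0])  # category -> [successes, total]
    for outcome in outcomes:
        stats[outcome["category"]][1] += 1
        if outcome["succeeded"]:
            stats[outcome["category"]][0] += 1
    return {cat: successes / total for cat, (successes, total) in stats.items()}

outcomes = [
    {"category": "error_handling", "succeeded": True},
    {"category": "error_handling", "succeeded": True},
    {"category": "restructure", "succeeded": True},
    {"category": "restructure", "succeeded": False},
]
rates = success_rate_by_category(outcomes)
# High-rate categories get extra weight in the priority formula;
# low-rate categories become avoidance rules.
```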
Record all findings back into ImprovementMemory under the special `_meta` skill ref:

```python
# Record strategy recommendation
memory.record_insight(
    skill_ref="_meta",
    category="strategy_success",
    insight="Recommendation: Prioritize error handling and examples over restructuring",
    evidence=[f"Success rate: error_handling={eh_rate:.0%}, restructure={rs_rate:.0%}"],
)
```
If significant meta-insights are found, propose concrete modifications to the skill-improver agent:
Important: Propose changes, do not auto-apply. The user must approve modifications to the improvement process.
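One way to keep proposals explicit rather than auto-applied is to append them to a pending-review file. This is a sketch only: the file path, JSON shape, and function name are assumptions, not part of the abstract plugin's API.

```python
import json
from pathlib import Path

def propose_modification(description, evidence, proposals_file):
    """Append a proposed improver modification to a pending-review file
    instead of applying it directly; the user approves or rejects later."""
    proposals = []
    if proposals_file.exists():
        proposals = json.loads(proposals_file.read_text())
    proposals.append({
        "description": description,
        "evidence": evidence,
        "status": "pending_user_approval",
    })
    proposals_file.write_text(json.dumps(proposals, indent=2))

# Hypothetical usage with an example recommendation
propose_modification(
    "Weight error handling improvements 2x in priority",
    ["error_handling success rate: 83%"],
    Path("/tmp/metacognitive_proposals.json"),
)
```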
Example report:

```
Metacognitive Self-Modification Report

Improvement Data:
  Total outcomes analyzed: 15
  Effective improvements: 11 (73%)
  Regressions: 2 (13%)
  Neutral: 2 (13%)

Success Patterns:
  1. Error handling additions: 5/6 success (83%)
  2. Example additions: 3/3 success (100%)
  3. Quiet mode additions: 2/2 success (100%)

Failure Patterns:
  1. Workflow restructuring: 1/3 success (33%)
  2. Token-heavy additions: 0/1 success (0%)

Performance Trends:
  Improving: 8 skills (positive trend)
  Stable: 4 skills (no trend)
  Degrading: 1 skill (negative trend despite attempts)

Recommendations:
  1. Weight error handling improvements 2x in priority
  2. Avoid workflow restructuring below priority 8.0
  3. Cap additions at 200 tokens to prevent budget overflow
  4. Focus next improvement cycle on degrading skill X

Meta-insights stored: 5 new entries in improvement memory
```
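The report's "Improvement Data" section can be rendered directly from the classified outcomes. A minimal sketch, using the counts from the example above (`summarize` is a hypothetical helper, not part of the plugin):

```python
def summarize(classifications):
    """Render the 'Improvement Data' section of the report from a
    list of outcome labels ('effective', 'regression', 'neutral')."""
    total = len(classifications)
    counts = {k: classifications.count(k)
              for k in ("effective", "regression", "neutral")}
    return "\n".join([
        f"Total outcomes analyzed: {total}",
        f"Effective improvements: {counts['effective']} ({counts['effective'] / total:.0%})",
        f"Regressions: {counts['regression']} ({counts['regression'] / total:.0%})",
        f"Neutral: {counts['neutral']} ({counts['neutral'] / total:.0%})",
    ])

report = summarize(["effective"] * 11 + ["regression"] * 2 + ["neutral"] * 2)
print(report)
```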
Related skills:

- `abstract:skill-improver` - The agent this skill analyzes and proposes modifications for
- `abstract:skills-eval` - Evaluation framework whose criteria could be refined by meta-insights
- `abstract:aggregate-logs` - Data source for improvement metrics