Skill

self-improving-loop

Observes Langfuse traces via the API; analyzes quality regressions, cost spikes, and failure patterns using Prometheus/CloudWatch; and proposes draft PRs for prompt/skill fixes in a 4-stage loop.

Prometheus

AWS

npx claudepluginhub aws-samples/sample-oh-my-aidlcops --plugin agenticops

Configuration

Model: claude-opus-4-7

Tool Access

This skill is limited to using the following tools:

Read, Grep, Bash, mcp__cloudwatch, mcp__prometheus

Preview

Run this skill in the following situations.

SKILL.md

Stats

Parent Repo Stars: 7

Parent Repo Forks: 2

Last Commit: Apr 30, 2026

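LANGFUSE_HOST="${LANGFUSE_HOST:-http://langfuse.svc.cluster.local}"
TARGET_AGENT="$1"                     # e.g. rag-qa-agent, code-reviewer-skill
WINDOW_HOURS="${WINDOW_HOURS:-168}"   # default: 7 days

# Make sure the output directory exists before redirecting into it.
mkdir -p .omao/plans/self-improving

# Langfuse's traces endpoint filters by fromTimestamp (ISO 8601).
# `date -u -d` is GNU date syntax.
curl -s -u "$LANGFUSE_PUBLIC_KEY:$LANGFUSE_SECRET_KEY" \
  "$LANGFUSE_HOST/api/public/traces?name=$TARGET_AGENT&fromTimestamp=$(date -u -d "-${WINDOW_HOURS} hours" +%Y-%m-%dT%H:%M:%SZ)&limit=500" \
  | jq -c '.data[]' \
  > .omao/plans/self-improving/traces-raw.jsonl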
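The Stage 1 command above builds its window start with `date -u -d`, which is GNU-specific. The sketch below shows one way the timestamp could be computed portably. The function name `window_start_iso` is hypothetical and not part of the skill, and the fallback assumes BSD/macOS `date` with its `-v` flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): prints the UTC start of a
# look-back window as an ISO 8601 timestamp.
window_start_iso() {
  local hours="${1:-168}"
  # GNU date: relative offsets via -d. On success the timestamp is already printed.
  if date -u -d "-${hours} hours" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null; then
    return 0
  fi
  # Assumed fallback for BSD/macOS date: -v shifts the current time.
  date -u -v"-${hours}H" +%Y-%m-%dT%H:%M:%SZ
}

FROM_TS="$(window_start_iso "${WINDOW_HOURS:-168}")"
echo "$FROM_TS"
```

The resulting `FROM_TS` value could then be substituted for the inline `$(date ...)` expression in the `curl` URL.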

{
  "target": "rag-qa-agent",
  "window": "2026-04-14T00:00:00Z/2026-04-21T00:00:00Z",
  "regressions": [
    {
      "dimension": "faithfulness",
      "baseline": 0.87,
      "current": 0.79,
      "delta_pp": -8.0,
      "affected_trace_ids": ["tr_abc", "tr_def", "..."],
      "hypothesized_cause": "system_prompt revision on 2026-04-16 removed citation instruction"
    }
  ]
}
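The `delta_pp` field in the report above is the change from `baseline` to `current`, expressed in percentage points. A minimal sketch of that arithmetic follows. The helper names `delta_pp` and `is_regression` and the default 5 pp threshold are assumptions for illustration; the source does not state what threshold the skill uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percentage-point change between two scores in [0, 1].
delta_pp() {
  awk -v b="$1" -v c="$2" 'BEGIN { printf "%.1f\n", (c - b) * 100 }'
}

# Hypothetical check: succeeds (exit 0) when the drop from baseline to current
# is at least the threshold in percentage points. The default of 5 is assumed.
is_regression() {
  awk -v b="$1" -v c="$2" -v t="${3:-5}" 'BEGIN { exit !((b - c) * 100 >= t) }'
}

delta_pp 0.87 0.79          # prints -8.0
if is_regression 0.87 0.79 5; then
  echo "faithfulness regressed"
fi
```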

--- a/skills/rag-qa-agent/SKILL.md
+++ b/skills/rag-qa-agent/SKILL.md
@@ -42,7 +42,11 @@
 ## Instructions
 You are a RAG QA agent. Retrieve relevant context and answer the user's question.
-Always respond concisely.
+Always respond concisely and include inline citations in the form [source:doc-id]
+for every claim grounded in retrieved context. If no retrieved context supports
+a claim, state "I cannot verify this from available sources" instead of speculating.
+
+Never output PII tokens (email, SSN, phone) even if present in retrieved context.
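The diff above touches a single `SKILL.md` file, so it is a prompt-only change. A guard along the following lines could check that property before a proposal goes any further. The function name `is_prompt_only_diff` and the `skills/<name>/SKILL.md` path pattern are assumptions drawn from this one example, not a documented rule of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical guard: reads a unified diff on stdin and succeeds only if every
# "+++" file header points at a skills/<name>/SKILL.md path.
# Simplification: an added content line that itself starts with "++ " would be
# mistaken for a file header.
is_prompt_only_diff() {
  awk '
    /^\+\+\+ / {
      path = $2
      sub("^b/", "", path)            # drop the "b/" prefix that git adds
      seen = 1
      if (path !~ "^skills/[^/]+/SKILL[.]md$") bad = 1
    }
    END { exit (bad || !seen) }       # fail on a bad path or an empty diff
  '
}

printf '%s\n' \
  '--- a/skills/rag-qa-agent/SKILL.md' \
  '+++ b/skills/rag-qa-agent/SKILL.md' \
  | is_prompt_only_diff && echo "prompt-only"
```

In a repository, the same function could read from `git diff --cached` after staging the proposed change.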

BRANCH="self-improving/${TARGET_AGENT}-$(date +%Y%m%d-%H%M)"
git checkout -b "$BRANCH"
git add skills/rag-qa-agent/SKILL.md
git commit -m "improve($TARGET_AGENT): restore citation instruction + block PII leakage"
gh pr create --draft \
  --title "improve($TARGET_AGENT): address faithfulness -8.0pp regression" \
  --body "$(cat <<'EOF'
## Self-Improving Loop Proposal

**Target**: `rag-qa-agent`
**Window**: 2026-04-14 .. 2026-04-21
**Detected Regression**: faithfulness 0.87 → 0.79 (-8.0pp)

### Evidence
- Affected traces: 47 traces, avg reward 0.42 (baseline 0.78)
- Langfuse query: see `.omao/plans/self-improving/report-rag-qa-agent-20260421.json`

### Hypothesized Cause
System prompt revision on 2026-04-16 removed the "include inline citations" instruction.

### Proposed Fix
Restore citation instruction + add explicit PII blocking clause.

### Acceptance Criteria
- [ ] `continuous-eval` faithfulness ≥ 0.87 on golden dataset
- [ ] No PII leakage detected in 48h canary (5% traffic)
- [ ] user_feedback positive rate ≥ baseline (0.71)

### Rollback
`.omao/plans/self-improving/rollback/rag-qa-agent-20260421.md` holds the pre-change content.

### ADR Compliance
Per ADR Self-Improving Loop §2, this proposal does NOT trigger model retraining.
Merge of this PR results in prompt-only change; no weight update.
`continuous-eval` will verify on next scheduled run.
EOF
)"
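The PR body above states that a rollback file holds the pre-change content, but the excerpt does not show how that file is written. One possible sketch is below, assuming the snapshot is taken before the edit is applied. The function name `snapshot_for_rollback` and its argument order are hypothetical; only the directory and file-name pattern come from the PR body.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copies the current skill file into the rollback
# directory before any edit is applied, then prints the snapshot path.
#   $1 = target agent name, $2 = file to snapshot, $3 = date stamp (YYYYMMDD)
snapshot_for_rollback() {
  local target="$1" src="$2" stamp="${3:-$(date +%Y%m%d)}"
  local dir=".omao/plans/self-improving/rollback"
  local dest="$dir/${target}-${stamp}.md"
  [ -f "$src" ] || { echo "missing source: $src" >&2; return 1; }
  mkdir -p "$dir" && cp -- "$src" "$dest" && echo "$dest"
}
```

A call such as `snapshot_for_rollback "$TARGET_AGENT" skills/rag-qa-agent/SKILL.md` would need to run before `git checkout -b` and the edit, so that the snapshot reflects the unmodified prompt.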

When to Use

Prerequisites

4-Stage Loop

Stage 1: Observe (collect Langfuse traces)

Stage 2: Analyze (detect regression patterns)

Stage 3: Propose (generate prompt/skill fix proposals)

Stage 4: PR (create a draft pull request)

Example Inputs/Outputs

ADR Boundary Compliance Checklist

References

Official Documentation

Technical Blogs

Related Documents (Internal)
