Verify refactoring preserved behavior and document it
Validates that refactored code preserves behavior and documents successful refactorings.
Install:
/plugin marketplace add mike-coulbourn/claude-vibes
/plugin install claude-vibes@claude-vibes

Argument hint: Optional specific files or areas to validate
Directory: 05-REFACTOR/

You are helping a vibe coder confirm their refactoring preserved behavior and didn't break anything. This is the quality gate—refactorings that pass validation get documented in LOGS.json.
Specific area to validate: $ARGUMENTS
CRITICAL: ALWAYS use the AskUserQuestion tool for ANY question to the user. Never ask questions as plain text output. The AskUserQuestion tool ensures a guided, interactive experience with structured options. Every single user question must go through this tool.
You orchestrate the validation and manage the conversation. The validator agent handles thorough testing, while you present findings and manage the documentation.
CRITICAL: You MUST use the Task tool to launch the validation agents. Do not validate the refactoring yourself—that's what the validator and tester agents are for.
ALWAYS use the AskUserQuestion tool when interacting with the user. This ensures a guided, interactive experience.
Use AskUserQuestion to confirm validation criteria, clarify ambiguous behavior, and resolve judgment calls. Never assume validation criteria. Ask to confirm.
Always read these files for core context:
- docs/01-START/ files — Project requirements and architecture
- docs/05-REFACTOR/assessment-*.md — The assessment for this refactoring

These are stable documentation—always load them. The validator agent will parse LOGS.json and report back specific relevant entries.
Fallback if docs/01-START/ doesn't exist: If these files don't exist (common when using claude-vibes on an existing project), explore the codebase directly to understand the project's structure, patterns, and conventions.
Fallback if no assessment file exists: use git diff and git log to understand what was changed, and use AskUserQuestion to learn what the refactoring was supposed to accomplish.
Read docs/01-START/ files and the most recent assessment file to understand what was refactored and why.
If $ARGUMENTS specifies files/areas, focus there.
Otherwise, find the recent refactoring:
- git diff for uncommitted changes
- docs/05-REFACTOR/assessment-*.md for the most recent assessment

You MUST use the Task tool to launch BOTH agents simultaneously — they validate the refactoring from different angles and don't depend on each other. Use subagent_type: "claude-vibes:validator" and subagent_type: "claude-vibes:tester".
Validator Agent (confirm behavior preserved):
Ultrathink about validating this refactoring.
What was refactored: [from assessment or git diff]
Expected behavior: Same as before—no functional changes
Parse LOGS.json for context (if it exists):
- Find related past refactorings
- Identify patterns that should be verified
- Check for areas that commonly have subtle behavior changes
- If LOGS.json doesn't exist, skip this and focus on direct validation
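For illustration, a minimal sketch of what this filtering step might look like, assuming LOGS.json is a flat JSON array of entries shaped like the structure shown later in this document (the findRelatedEntries helper is hypothetical):

```typescript
// Hypothetical sketch: find past LOGS.json entries relevant to this refactoring.
// Assumes LOGS.json is a flat JSON array matching the entry schema shown below.
import { existsSync, readFileSync } from "node:fs";

interface LogEntry {
  id: string;
  area: string;
  tags: string[];
  files: string[];
  guideline: string;
}

function findRelatedEntries(changedFiles: string[], logsPath = "LOGS.json"): LogEntry[] {
  if (!existsSync(logsPath)) return []; // no LOGS.json yet: skip, per the step above
  const entries: LogEntry[] = JSON.parse(readFileSync(logsPath, "utf8"));
  // An entry is related if it touched any of the files changed by this refactoring.
  return entries.filter((entry) => entry.files.some((f) => changedFiles.includes(f)));
}
```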
Validate thoroughly:
- Run all related tests (do they still pass?)
- Check for behavior preservation (does it do the same thing?)
- Look for subtle changes (error messages, timing, edge cases)
- Verify the improvement was achieved (is it actually better?)
Use AskUserQuestion during validation:
- If you find subtle behavior changes, ask if they're acceptable or problematic
- If tests reveal unclear expected behavior, ask the user to clarify
- If the improvement is hard to quantify, ask what success looks like to them
- If you're unsure whether something is a regression, ask
Report back with specific references:
- Cite specific LOGS.json entry IDs that are relevant (e.g., "entry-042")
- Quote the relevant portions from those entries
- Test results (passed/failed counts)
- Behavior verification (same/different)
- Any subtle changes detected
- Improvement confirmation (was the goal achieved?)
- LOGS.json entry recommendation with:
- motivation, approach, improvement, guideline
- files modified, tags, area
This allows the main session to read those specific references without parsing all of LOGS.json. Be specific about validation results.
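To make "behavior preservation" concrete, here is one hedged sketch of a check the validator might run: replay representative inputs against the refactored code and diff the results against a baseline recorded before the refactoring. The validateEmail function and baseline.json fixture are hypothetical stand-ins:

```typescript
// Hypothetical sketch: diff refactored behavior against a pre-refactor baseline.
// `validateEmail` and `baseline.json` are illustrative; record the baseline
// from the pre-refactor commit before applying the change.
import { readFileSync } from "node:fs";
import { validateEmail } from "./src/api/users"; // hypothetical module under test

// Capture return values AND thrown errors: error messages are behavior too.
function observe(input: string): unknown {
  try {
    return { value: validateEmail(input) };
  } catch (err) {
    return { error: (err as Error).message };
  }
}

const baseline: Record<string, unknown> = JSON.parse(readFileSync("baseline.json", "utf8"));

let mismatches = 0;
for (const [input, expected] of Object.entries(baseline)) {
  const actual = observe(input);
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    mismatches += 1;
    console.error(`Behavior change for ${JSON.stringify(input)}:`, { expected, actual });
  }
}
console.log(mismatches === 0 ? "Behavior preserved." : `${mismatches} mismatches found.`);
```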
Tester Agent (write tests that prove behavior is preserved):
Ultrathink about testing this refactored code.
What was refactored: [from assessment or git diff]
Expected behavior: Exactly the same as before—refactoring changes structure, not behavior
Your mission:
- Write tests that verify the refactored code behaves identically to before
- Focus on behavior, not implementation details
- Run the tests yourself—don't ask the user to run them
- If tests fail, analyze whether it's a behavior change or just a test issue
- Iterate until all tests pass
- Only fall back to manual testing instructions if automation is impossible
Test coverage should include:
- All public interfaces (same inputs = same outputs)
- Edge cases that might behave differently after refactoring
- Error handling (same errors thrown in same situations)
- Side effects (same external effects: DB writes, API calls, etc.)
Use AskUserQuestion during testing:
- If you're unsure what the original behavior was, ask the user to describe it
- If tests reveal unclear requirements, ask for clarification
- If you detect a behavior change, ask if it was intentional
- If you're unsure which scenarios are important to test, ask
Report back with:
- Tests written that prove behavior is preserved
- Test results (all passing or issues found)
- Any behavior changes detected (these are bugs in refactoring!)
- Manual testing instructions (only if something couldn't be automated)
- Confidence level that behavior is truly preserved
The vibe coder should just see results—you do the work.
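As an illustration of the kind of tests the tester agent might write, here is a hedged Vitest-style sketch. The validateEmail import and the exact error messages echo the hypothetical example used later in this document; none of this is prescribed by the plugin:

```typescript
// Hypothetical behavior-preservation tests (Vitest-style; Jest is near-identical).
// `validateEmail` is an illustrative stand-in for the refactored code.
import { describe, it, expect } from "vitest";
import { validateEmail } from "../src/api/users";

describe("validateEmail: behavior preserved after refactor", () => {
  it("accepts the same valid inputs", () => {
    expect(validateEmail("a@b.com")).toBe(true);
  });

  it("throws the same error with the same message", () => {
    // Pin the exact message: callers (e.g., frontend error handling) may match on it.
    expect(() => validateEmail("not-an-email")).toThrow("Invalid email");
  });

  it("handles the empty-input edge case identically", () => {
    expect(() => validateEmail("")).toThrow("Invalid email");
  });
});
```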
After BOTH agents return, synthesize their findings into one report: behavior verification, improvement confirmation, and LOGS.json references from the Validator; tests written, results, and confidence level from the Tester.
If all validations pass:
Validation & Testing Complete — Refactoring Verified!
VALIDATION:
- Tests: All passing (same as baseline)
- Behavior: Preserved—no functional changes detected
- Improvement: Achieved—reduced from 120 lines to 45 lines
TESTING:
- Tests written: 10 (behavior preservation tests)
- Tests passing: 10/10
- Behavior changes: None detected
- Confidence: HIGH
The refactoring is solid and ready to ship.
If issues found:
Validation & Testing Found Problems
VALIDATION:
1. BEHAVIOR CHANGE: Error message format changed
File: src/api/users.ts:52
Before: "Invalid email"
After: "Email validation failed"
This might affect frontend error handling.
2. TEST FAILURE: test_edge_case now fails
File: tests/api.test.ts:142
The refactoring changed how empty inputs are handled.
TESTING:
- Tests written: 8
- Tests passing: 5/8
- 3 behavior changes detected!
These need to be addressed before shipping.
Use AskUserQuestion for judgment calls:
If issues exist, ask, for example:
"I found 1 behavior change. Should I adjust the refactoring to preserve the original behavior?"
Only when validation passes (no behavior changes, tests pass, improvement achieved):
Write to LOGS.json with this entry structure:
{
"id": "entry-<timestamp>",
"timestamp": "<ISO timestamp>",
"phase": "refactor",
"type": "refactor",
"area": "<domain area>",
"motivation": "<why refactoring was needed>",
"approach": "<how it was refactored>",
"summary": "<one-line description>",
"details": "<fuller description of the improvement>",
"improvement": "<what got better - metrics if possible>",
"guideline": "<lesson for future development>",
"patterns": ["<patterns applied>"],
"files": ["<files modified>"],
"tags": ["refactor", "<type>", "<area-tags>"],
"validatedAt": "<ISO timestamp>",
"validateNotes": "<any notes from validation>"
}
Add to the global LOGS.json at project root.
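A minimal sketch of the write itself, assuming LOGS.json is a flat JSON array at the project root (the sample field values echo the examples earlier in this document and are purely illustrative):

```typescript
// Hypothetical sketch: append a validated entry to the global LOGS.json.
// Assumes LOGS.json is a flat JSON array; creates it if missing.
import { existsSync, readFileSync, writeFileSync } from "node:fs";

function appendLogEntry(entry: object, logsPath = "LOGS.json"): void {
  const entries: object[] = existsSync(logsPath)
    ? JSON.parse(readFileSync(logsPath, "utf8"))
    : [];
  entries.push(entry);
  writeFileSync(logsPath, JSON.stringify(entries, null, 2) + "\n");
}

appendLogEntry({
  id: `entry-${Date.now()}`,
  timestamp: new Date().toISOString(),
  phase: "refactor",
  type: "refactor",
  area: "api", // values from here down are illustrative
  motivation: "Duplicated validation logic across handlers",
  approach: "Extracted a shared email validator",
  summary: "Deduplicated email validation",
  details: "Three handlers now share one validator",
  improvement: "Reduced from 120 lines to 45 lines",
  guideline: "Extract shared validation before it spreads",
  patterns: ["extract-function"],
  files: ["src/api/users.ts"],
  tags: ["refactor", "extract-function", "api"],
  validatedAt: new Date().toISOString(),
  validateNotes: "All behavior-preservation tests pass",
});
```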
Only write to LOGS.json when validation fully passes: no behavior changes, all tests passing, and the improvement achieved.
Include in the entry every field from the structure above, especially motivation, approach, improvement, and guideline.
When validation is complete, suggest next steps.

If passed:
- /03-SHIP/01-pre-commit — Run pre-commit checks
- /03-SHIP/02-commit — Just commit locally
- /03-SHIP/03-push — Commit and push
- /03-SHIP/04-pr — Commit, push, and create PR

If failed:
- Fix the issues, then re-run /03-validate-improvements

If validation revealed useful patterns or insights not already documented, store them for future sessions.
Use the memory MCP tools:
For validation patterns discovered:
Use create_entities or add_observations to store in "ValidationPatterns":
- Testing strategies that worked well (e.g., "For refactorings, compare before/after output for 10 representative inputs")
- Subtle behavior areas (e.g., "String formatting in reports is sensitive — always diff the output")
- What to watch for (e.g., "Refactoring async code often introduces subtle timing differences")
For refactoring patterns:
Use create_entities or add_observations to store in "RefactoringPatterns":
- Successful patterns (e.g., "Extracting to a strategy pattern preserved behavior cleanly")
- Gotchas discovered (e.g., "The old code handled null differently than it appeared — preserve that quirk")
Only store NEW findings — insights that will help validate future refactorings. If nothing notable was discovered, skip this step.
Example observations to store:
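Here is a hedged illustration, assuming the standard memory MCP server's add_observations schema; the observation strings repeat the examples given above:

```typescript
// Hypothetical add_observations payload for the memory MCP server.
// Store only NEW findings; skip this step if nothing notable was discovered.
const payload = {
  observations: [
    {
      entityName: "ValidationPatterns",
      contents: [
        "For refactorings, compare before/after output for 10 representative inputs",
        "String formatting in reports is sensitive — always diff the output",
      ],
    },
    {
      entityName: "RefactoringPatterns",
      contents: [
        "The old code handled null differently than it appeared — preserve that quirk",
      ],
    },
  ],
};
```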