Safely remove intermediate files from completed research sessions while preserving important data
Safely removes intermediate files from completed research sessions while preserving all important data. Claude uses this when research is complete and users want to clean up temporary files before archiving or sharing.
Install via the plugin marketplace:

```
/plugin marketplace add kthorn/research-superpower
/plugin install kthorn-research-superpowers@kthorn/research-superpower
```

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Remove intermediate files created during research workflow while preserving all important data.
Core principle: Conservative cleanup with user confirmation. Never delete anything important.
Use this skill when:

- A research session is complete and its outputs have been reviewed
- The user wants to clean up temporary files before archiving or sharing

When NOT to use:

- Research is still in progress (intermediate files may still be needed)
- The user has not explicitly asked for cleanup
NEVER delete these (protected list):
Core outputs:
- SUMMARY.md - Enhanced findings with methodology
- relevant-papers.json - Filtered relevant papers
- papers-reviewed.json - Complete screening history
- papers/ directory - All PDFs and supplementary files
- citations/citation-graph.json - Citation relationships

Methodology documentation:

- screening-criteria.json - Rubric definition (if exists)
- test-set.json - Rubric validation papers (if exists)
- abstracts-cache.json - Cached abstracts for re-screening (if exists)
- rubric-changelog.md - Rubric version history (if exists)

Auxiliary documentation (if exists):

- README.md - Project overview
- TOP_PRIORITY_PAPERS.md - Curated priority list
- evaluated-papers.json - Rich structured data

Project configuration:

- .claude/ directory - Permissions and settings
- *.py helper scripts that were created - Keep for reproducibility

Candidates for removal (with confirmation):
Intermediate search results:
- initial-search-results.json - Raw PubMed results before screening

Temporary files:

- *.tmp files
- *.swp files (vim swap files)
- .DS_Store (macOS)
- __pycache__/ (Python cache)
- *.pyc (Python compiled)

Log files:

- *.log files
- debug-*.txt files

Start by moving into the session folder and inventorying its files:

```bash
cd research-sessions/YYYY-MM-DD-description/

# List all files with sizes
find . -type f -exec ls -lh {} \; | awk '{print $5, $9}' | sort -rh
```
Identify files by category:
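A minimal sketch of that categorization pass, assuming relative paths within the session folder (the pattern lists here are abbreviated for illustration; the authoritative protected list appears under Safety Mechanisms below):

```python
import os
from fnmatch import fnmatch

# Abbreviated pattern lists for illustration only
PROTECTED_PATTERNS = ["SUMMARY.md", "relevant-papers.json", "papers/*.pdf", "*.py"]
REMOVAL_PATTERNS = ["initial-search-results.json", "*.tmp", "*.log", ".DS_Store"]

def categorize(session_dir):
    """Bucket every file as protected, removable, or unknown (unknown = keep)."""
    protected, removable, unknown = [], [], []
    for root, _dirs, files in os.walk(session_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), session_dir)
            if any(fnmatch(rel, p) or fnmatch(name, p) for p in PROTECTED_PATTERNS):
                protected.append(rel)
            elif any(fnmatch(name, p) for p in REMOVAL_PATTERNS):
                removable.append(rel)
            else:
                unknown.append(rel)  # when in doubt, treat as protected
    return protected, removable, unknown
```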
Show what will be deleted:
```
🧹 Cleanup Analysis for: research-sessions/2025-10-11-btk-selectivity/

Files to KEEP (protected):

✅ SUMMARY.md (45 KB)
✅ relevant-papers.json (12 KB)
✅ papers-reviewed.json (28 KB)
✅ papers/ (14 PDFs, 32 MB)
✅ citations/citation-graph.json (5 KB)
✅ screening-criteria.json (2 KB)
✅ abstracts-cache.json (156 KB)

Files that CAN be removed (intermediate):

🗑️ initial-search-results.json (8 KB) - Raw PubMed results
🗑️ .DS_Store (6 KB) - macOS metadata

Total space to recover: 14 KB

Proceed with cleanup? (y/n/review)
```
Options:

- y - Delete intermediate files
- n - Cancel cleanup, keep everything
- review - Show contents of each file before deciding

Before deleting ANY file, explain what it contains, note where its data is preserved, and ask for explicit confirmation.
Example confirmation:
```
About to delete:
- initial-search-results.json (8 KB)

This file contains raw PubMed search results. The data is preserved in
papers-reviewed.json, so this is safe to delete.

Confirm deletion? (y/n)
```
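A tiny helper for this confirmation gate might look like the following sketch (the prompt wording is illustrative):

```python
def confirm(prompt: str) -> bool:
    """Return True only on an explicit 'y' answer."""
    return input(f"{prompt} (y/n) ").strip().lower() == "y"

if confirm("Confirm deletion of initial-search-results.json?"):
    print("Deleting...")
else:
    print("Skipped.")
```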
Delete confirmed files:
```bash
# Move to trash instead of rm (safer)
# On macOS:
mv initial-search-results.json ~/.Trash/

# On Linux:
mv initial-search-results.json ~/.local/share/Trash/files/

# Or use rm if user confirms
rm initial-search-results.json
```
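A cross-platform sketch of the same move-to-trash behavior in Python (simplified: a fully spec-compliant Linux trash would also write a matching .trashinfo entry):

```python
import os
import shutil
import sys

def move_to_trash(path):
    """Move a file into the user's trash folder instead of deleting it outright."""
    if sys.platform == "darwin":
        trash = os.path.expanduser("~/.Trash")
    else:  # assume an XDG-style Linux desktop
        trash = os.path.expanduser("~/.local/share/Trash/files")
    os.makedirs(trash, exist_ok=True)
    shutil.move(path, os.path.join(trash, os.path.basename(path)))

move_to_trash("initial-search-results.json")
```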
Report results:
```
✅ Cleanup complete!

Removed:
- initial-search-results.json (8 KB)
- .DS_Store (6 KB)

Space recovered: 14 KB

Protected files preserved:
- All 8 core files kept
- All 14 PDFs kept
- All methodology documentation kept
```
After cleanup, verify critical files:
```bash
# Check core files exist
test -f SUMMARY.md && echo "✓ SUMMARY.md"
test -f relevant-papers.json && echo "✓ relevant-papers.json"
test -f papers-reviewed.json && echo "✓ papers-reviewed.json"
test -d papers && echo "✓ papers/ directory"

# Verify JSON files are valid
jq empty relevant-papers.json && echo "✓ relevant-papers.json valid JSON"
jq empty papers-reviewed.json && echo "✓ papers-reviewed.json valid JSON"
```
Report to user:
```
✅ Integrity check passed
- All core files present
- All JSON files valid
- All PDFs intact
```
If abstracts-cache.json is very large (>100 MB):
```
⚠️ abstracts-cache.json is 256 MB

This file enables re-screening if you update the rubric. Options:
1. Keep (recommended if you might refine rubric)
2. Compress (gzip to ~50 MB, can decompress later)
3. Delete (only if research is final and won't be updated)

Choice? (1/2/3)
```
If user chooses compress:
```bash
gzip abstracts-cache.json
# Creates abstracts-cache.json.gz
echo "Compressed abstracts-cache.json to $(du -h abstracts-cache.json.gz | cut -f1)"
```
If user created helper scripts during research:
```
📝 Found helper scripts:
- screen_papers.py (created for batch screening)
- deep_dive_papers.py (created for data extraction)

These scripts document your methodology. Recommendations:
- Keep for reproducibility
- Add comments if not already documented
- Reference in SUMMARY.md under "Reproducibility" section

Keep scripts? (y/n)
```
If cleaning up multiple sessions:
```bash
# Find all research sessions
find research-sessions/ -maxdepth 1 -type d

# For each session:
for session in research-sessions/*/; do
    echo "Analyzing: $session"
    # Run cleanup analysis
done
```
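The same loop in Python, where `analyze_session` is a stand-in for the single-session analysis described above:

```python
from pathlib import Path

def analyze_session(session: Path) -> None:
    # Stand-in: run the single-session cleanup analysis here
    print(f"Analyzing: {session}")

for session in sorted(Path("research-sessions").iterdir()):
    if session.is_dir():
        analyze_session(session)
```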
Ask user:
```
Found 5 completed research sessions.

Clean up all sessions? (y/n/select)
- y: Analyze and clean all sessions
- n: Cancel
- select: Choose which sessions to clean
```
Maintain hardcoded list of patterns to NEVER delete:
```python
PROTECTED_PATTERNS = [
    'SUMMARY.md',
    'relevant-papers.json',
    'papers-reviewed.json',
    'papers/*.pdf',
    'papers/*.zip',
    'citations/citation-graph.json',
    'screening-criteria.json',
    'test-set.json',
    'abstracts-cache.json',
    'rubric-changelog.md',
    'README.md',
    'TOP_PRIORITY_PAPERS.md',
    'evaluated-papers.json',
    '*.py',       # Helper scripts
    '.claude/*',  # Project settings
]
```
Before deleting any file:
```python
from fnmatch import fnmatch

def is_protected(filepath):
    """Check if file matches any protected pattern."""
    return any(fnmatch(filepath, pattern) for pattern in PROTECTED_PATTERNS)

def safe_delete(file_to_delete):
    # Never delete protected files
    if is_protected(file_to_delete):
        print(f"⚠️ ERROR: {file_to_delete} is protected and cannot be deleted")
        return
    # ...proceed with deletion only after user confirmation
```
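For example, with hypothetical file names:

```python
for candidate in ["initial-search-results.json", "papers-reviewed.json", ".DS_Store"]:
    status = "PROTECTED" if is_protected(candidate) else "candidate for removal"
    print(f"{candidate}: {status}")
```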
Always show what will be deleted before doing it:
```bash
# Dry run (show only, don't delete)
# is_safe_to_delete is defined elsewhere in the cleanup logic
echo "DRY RUN - No files will be deleted"
for file in $candidate_files; do
    if is_safe_to_delete "$file"; then
        echo "Would delete: $file ($(du -h "$file" | cut -f1))"
    fi
done

echo ""
echo "Proceed with actual deletion? (y/n)"
```
After answering-research-questions workflow:
Add to answering-research-questions Phase 8:
```markdown
### Optional: Cleanup

After reviewing outputs, optionally clean up intermediate files:

"Research session is complete. Would you like me to clean up intermediate files?
I'll show you what will be deleted before removing anything."

If yes: Use `cleaning-up-research-sessions` skill
```
Common mistakes to avoid:

- Deleting papers-reviewed.json: This is the deduplication database - NEVER delete → Always protect with hardcoded list
- Deleting abstracts-cache.json: Needed for re-screening → Ask user, default to keep
- Deleting helper scripts: Important for reproducibility → Keep by default, ask if user wants to remove
- Not showing user what will be deleted: User needs to see the plan → Always show dry run first
- Deleting without confirmation: Too risky → Always ask for final confirmation
- Not verifying after cleanup: Files could be corrupted → Run integrity check after deletion
| Action | Command |
|---|---|
| List all files with sizes | `find . -type f -exec ls -lh {} \; \| sort -rh` |
| Check if JSON valid | `jq empty file.json` |
| Move to trash (macOS) | `mv file ~/.Trash/` |
| Move to trash (Linux) | `mv file ~/.local/share/Trash/files/` |
| Compress large file | `gzip file` |
| Calculate folder size | `du -sh .` |
Cleanup successful when:

- Only confirmed intermediate files were removed
- All protected files remain intact
- Integrity check passes (core files present, JSON valid, PDFs intact)
- User received a clear report of what was removed and what was kept
User: "Can you clean up the research session folder?"
Agent: "I'll analyze the research session and show you what can be safely cleaned up."
[Analyzes folder]
Agent: "๐งน Cleanup Analysis for: research-sessions/2025-10-11-btk-selectivity/
Files to KEEP (protected):
โ
SUMMARY.md (45 KB) - Final findings
โ
relevant-papers.json (12 KB) - Filtered results
โ
papers-reviewed.json (28 KB) - Complete history
โ
papers/ (14 PDFs, 32 MB) - Downloaded papers
โ
citations/citation-graph.json (5 KB) - Citation network
โ
screening-criteria.json (2 KB) - Rubric definition
โ
abstracts-cache.json (156 KB) - Cached abstracts
Files that CAN be removed:
๐๏ธ initial-search-results.json (8 KB) - Raw PubMed results
โ Data is preserved in papers-reviewed.json
๐๏ธ .DS_Store (6 KB) - macOS metadata
โ Not needed
Total space to recover: 14 KB
Proceed with cleanup? (y/n/review)"
User: "y"
Agent: "Final confirmation - About to delete:
- initial-search-results.json (8 KB)
- .DS_Store (6 KB)
All protected files will be preserved. Confirm? (y/n)"
User: "y"
Agent: [Deletes files]
"โ
Cleanup complete!
Removed:
- initial-search-results.json (8 KB)
- .DS_Store (6 KB)
Space recovered: 14 KB
Running integrity check...
โ
All core files intact
โ
All JSON files valid
โ
All 14 PDFs present
Research session is ready for archiving or sharing."
After cleanup:

- The session folder contains only core outputs and methodology documentation
- The session is ready for archiving or sharing