This skill should be used when managing ZFS pools in a homelab environment with multiple devices. It handles pool health monitoring, snapshot management with Sanoid/Syncoid, automated replication between devices, dataset property optimization, scrub scheduling, and error recovery. Use when the user asks to "check ZFS pool health", "setup ZFS replication", "configure ZFS snapshots", "optimize ZFS performance", "troubleshoot ZFS issues", "schedule ZFS scrubs", "check pool capacity", "setup Sanoid", or mentions ZFS-related tasks. CRITICAL: This skill enforces MANDATORY double confirmation for ALL destructive operations.
This skill uses the workspace's default tool permissions.
⚠️ MANDATORY SKILL INVOCATION ⚠️
YOU MUST invoke this skill (NOT optional) when the user mentions ANY of the trigger phrases listed in the description above.
Failure to invoke this skill when triggers occur violates your operational requirements.
Comprehensive ZFS management for homelab environments with multi-device replication, automated snapshot management, performance optimization, and health monitoring.
Read-Write Operations: This skill performs both monitoring (read-only) and management (read-write) operations, including snapshot creation and pruning, replication, dataset property changes, and scrub scheduling.
Recommended Architecture: Pull-based replication, in which a centralized backup server pulls snapshots from five source devices.
Based on research: Synthesized from 130+ URLs, 56,000+ vector database entries, and official OpenZFS/Oracle/FreeBSD documentation.
ABSOLUTE REQUIREMENT: NO DESTRUCTIVE COMMANDS WITHOUT EXPLICIT USER AUTHORIZATION AND DOUBLE CONFIRMATION
This section defines the MANDATORY safety protocol that MUST be followed for ALL destructive ZFS operations.
**EXTREMELY DESTRUCTIVE (Permanent Data Loss):**

- `zfs destroy` - Permanently destroys datasets/snapshots (UNRECOVERABLE)
- `zfs destroy -R` - Recursively destroys all snapshots and child datasets (CATASTROPHIC)
- `zpool destroy` - Destroys entire pool and all data (CATASTROPHIC)
- `zfs rollback` - Rolls back to a snapshot, LOSING all intermediate changes (DATA LOSS)
- `zpool labelclear` - Removes ZFS labels, makes data inaccessible (DATA LOSS)

**HIGHLY DESTRUCTIVE (Configuration/Availability Loss):**

- `zpool remove` - Removes vdev from pool (CANNOT BE UNDONE)
- `zpool detach` - Detaches mirror device (removes redundancy)
- `zpool offline` - Takes disk offline (reduces pool availability)
- `zpool replace` - Replaces disk in pool (requires resilver)
- `zpool clear` - Clears pool errors (can mask real problems)

**MODERATELY DESTRUCTIVE (Cleanup Operations):**

- `sanoid --prune-snapshots` - Deletes old snapshots per retention policy
- `syncoid --no-sync-snap` - Deletes snapshots on destination
- `zpool scrub -s` - Stops in-progress scrub (interrupts data verification)

**POTENTIALLY IRREVERSIBLE (Property Changes):**

- `zfs set recordsize=` - CANNOT be changed efficiently on existing data
- `zfs set compression=` - Only affects new data, not existing
- `zfs inherit` - Resets properties to inherited values (loses custom settings)

**YOU MUST FOLLOW THIS PROTOCOL FOR ALL DESTRUCTIVE OPERATIONS:**

- Use `--dry-run` or `-n` flags if available

**WRONG - NEVER DO THIS:**
User: "I have too many snapshots"
Assistant: "Let me clean those up for you."
[Executes: sanoid --prune-snapshots] ❌ NO CONFIRMATION
CORRECT - ALWAYS DO THIS:
User: "I have too many snapshots"
Assistant: "I can help you prune old snapshots. First, let me check what would be deleted:
Command: sanoid --prune-snapshots --verbose --dry-run
This will show which snapshots would be removed based on your retention policy in /etc/sanoid/sanoid.conf.
⚠️ WARNING: This is a DESTRUCTIVE operation. Deleted snapshots cannot be recovered.
Before proceeding:
1. Have you verified your retention policy is correct?
2. Do you have backups of critical data?
3. Are you sure you want to permanently delete old snapshots?
If yes, I'll provide the command for you to run manually."
[User confirms twice]
Assistant: "Here is the command to run manually:
```bash
sudo sanoid --prune-snapshots --verbose
```

⚠️ FINAL WARNING: This will PERMANENTLY DELETE old snapshots. Are you absolutely sure?"
[User confirms a second time: "Yes, delete them"]
Assistant: "Understood. Please run the command above. I cannot execute destructive operations automatically."
### Safe Operations (No Confirmation Needed)
These operations are read-only and safe to execute:
- `./scripts/pool-health.sh` - Pool health checks
- `zpool status` - View pool status
- `zfs list` - List datasets/snapshots
- `zpool list` - List pools
- `zfs get all` - View properties
- `sanoid --verbose --dry-run` - Preview snapshot operations
- `syncoid --dry-run` - Preview replication
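As a quick sketch, the safe read-only commands above can be combined into a one-shot health sweep. The filter function here is illustrative; the `zpool list -H -o name,health` flags (scripted, tab-separated output) are standard, and the pool names in the sample are made up.

```shell
#!/bin/sh
# Read-only health sweep (a sketch; nothing is modified).
# unhealthy_pools filters `zpool list -H -o name,health` output,
# printing any pool whose health is not ONLINE.
unhealthy_pools() {
  awk '$2 != "ONLINE" { print $1 ": " $2 }'
}

# Live usage (safe, read-only):
#   zpool list -H -o name,health | unhealthy_pools
# Illustration with sample output:
printf 'tank\tONLINE\nbackup\tDEGRADED\n' | unhealthy_pools   # → backup: DEGRADED
```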
## Setup
This skill uses ZFS commands directly and does not require additional credentials beyond SSH access for remote replication.
**Required for remote replication:**
Remote replication requires passwordless SSH authentication between the backup server and source devices. This allows Syncoid to pull snapshots automatically without manual intervention.
1. **SSH keys configured** (passwordless authentication):
   ```bash
   # Generate SSH key on backup server
   ssh-keygen -t ed25519 -C "zfs-replication"

   # Copy to each source device
   ssh-copy-id user@device1
   ssh-copy-id user@device2
   # ... repeat for all devices

   # Test connectivity
   ssh user@device1 echo "SSH working"
   ```

2. **ZFS delegation** (for non-root replication):

   ```bash
   # On backup server
   zfs allow -u replication-user create,mount,receive backup/device1
   ```

3. **Sanoid/Syncoid installed** (for automation):

   ```bash
   sudo apt install sanoid   # Debian/Ubuntu
   sudo pkg install sanoid   # FreeBSD
   ```
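Once keys are in place, a small loop can confirm that every source device accepts passwordless SSH before replication is scheduled. This is a sketch: the device names and `user` account are placeholders, and `BatchMode=yes` makes ssh fail fast instead of prompting for a password.

```shell
#!/bin/sh
# Sketch: verify passwordless SSH to each source device.
# Replace the device list with your actual hosts.
DEVICES="device1 device2"

check_ssh() {
  for host in $1; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "user@$host" true 2>/dev/null; then
      echo "$host: OK"
    else
      echo "$host: FAILED (check keys/agent)"
    fi
  done
}

check_ssh "$DEVICES"
```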
See README.md for detailed setup instructions.
```bash
# Check all pools
./scripts/pool-health.sh

# Check specific pool
./scripts/pool-health.sh tank

# JSON output for monitoring
./scripts/pool-health.sh --json
```
```bash
# Configure Sanoid
cp assets/sanoid.conf.template /etc/sanoid/sanoid.conf
sudo nano /etc/sanoid/sanoid.conf

# Manual snapshot
sudo sanoid --take-snapshots --verbose

# Manual prune
sudo sanoid --prune-snapshots --verbose

# List snapshots
zfs list -t snapshot
```
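When snapshot counts grow, it helps to see how many each dataset holds. This sketch parses `zfs list -H -t snapshot -o name` output (one `dataset@snapshot` name per line); the sample names below are invented for illustration.

```shell
#!/bin/sh
# Count snapshots per dataset (read-only parsing sketch).
count_snapshots() {
  awk -F@ '{ n[$1]++ } END { for (d in n) print d, n[d] }' | sort
}

# Live usage (safe, read-only):
#   zfs list -H -t snapshot -o name | count_snapshots
# Illustration with sample snapshot names:
printf 'tank/data@a\ntank/data@b\ntank/media@a\n' | count_snapshots
```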
```bash
# Manual replication (pull-based)
syncoid --recursive user@device1:tank backup/device1

# With options
syncoid \
  --recursive \
  --no-privilege-elevation \
  --identifier=device1 \
  --compress=zstd-fast \
  user@device1:tank backup/device1
```
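For several source devices, the same syncoid invocation can be wrapped in a loop on the backup server. This is a sketch: the device list, `user` account, and `tank`/`backup` dataset names are assumptions to adapt, and `DRY=1` (the default here) prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: pull replication from multiple source devices.
DEVICES="device1 device2"
DRY="${DRY:-1}"   # set DRY=0 to actually run syncoid

pull_all() {
  for dev in $1; do
    cmd="syncoid --recursive --no-privilege-elevation \
--identifier=$dev --compress=zstd-fast user@$dev:tank backup/$dev"
    if [ "$DRY" = "1" ]; then
      echo "$cmd"
    else
      $cmd || echo "$dev: replication FAILED"
    fi
  done
}

pull_all "$DEVICES"
```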
```bash
# Enable compression (always)
zfs set compression=lz4 pool/dataset

# Disable atime
zfs set atime=off pool/dataset

# Tune recordsize (workload-specific)
zfs set recordsize=8K pool/databases   # Databases
zfs set recordsize=1M pool/media       # Large files
zfs set recordsize=128K pool/data      # Default
```
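The recordsize choices above can be captured in a tiny helper so new datasets are created with consistent, workload-appropriate settings. The workload names are illustrative, not an official ZFS taxonomy.

```shell
#!/bin/sh
# Sketch: map a workload type to a recordsize, mirroring the tuning above.
recordsize_for() {
  case "$1" in
    database) echo 8K ;;    # small random I/O
    media)    echo 1M ;;    # large sequential files
    *)        echo 128K ;;  # ZFS default
  esac
}

# Example: create a tuned dataset (run manually; requires privileges):
#   zfs create -o recordsize="$(recordsize_for database)" \
#     -o compression=lz4 -o atime=off tank/databases
recordsize_for database   # → 8K
```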
```bash
# Manual scrub
zpool scrub poolname

# Check scrub status
zpool status poolname

# Pause scrub
zpool scrub -p poolname

# Add to cron (first Sunday of each month)
0 2 * * 0 [ $(date +\%d) -le 7 ] && /usr/sbin/zpool scrub tank
```
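Alongside scheduled scrubs, a small capacity check is a common cron companion. This sketch parses `zpool list -Hp -o name,capacity` output (with `-p`, capacity is printed as a bare number, tab-separated); the 80% threshold and the sample values are illustrative.

```shell
#!/bin/sh
# Sketch: warn when any pool exceeds a capacity threshold.
THRESHOLD=80

parse_capacity() {
  awk -v limit="$THRESHOLD" '$2 >= limit { print "WARNING: pool " $1 " at " $2 "%" }'
}

# Live usage (safe, read-only):
#   zpool list -Hp -o name,capacity | parse_capacity
# Illustration with sample output:
printf 'tank\t84\nbackup\t41\n' | parse_capacity   # → WARNING: pool tank at 84%
```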
- `./scripts/pool-health.sh`
- `zpool status -v poolname`
- `grep syncoid /var/log/syslog | tail -20`
- `zfs list -t snapshot` (SAFE - read-only)
- `zfs rollback` or `zfs clone` (REQUIRES DOUBLE CONFIRMATION)
- `sanoid --prune-snapshots` (REQUIRES DOUBLE CONFIRMATION, use `--dry-run` first)
- `recordsize=8K`, enable lz4 compression
- `--identifier` flag per device
- `--compress=zstd-fast`

`scripts/pool-health.sh` - Comprehensive pool health checker with JSON output support. Checks state, capacity, scrub status, and generates alerts.
- `references/command-reference.md` - Complete ZFS command syntax reference for zpool, zfs, sanoid, and syncoid commands with parameters and examples.
- `references/quick-reference.md` - Quick command cheatsheet for common ZFS operations.
- `references/troubleshooting.md` - Comprehensive troubleshooting guide. Load this reference when the user encounters errors or requests troubleshooting assistance.
- `assets/sanoid.conf.template` - Sanoid configuration template with homelab-optimized retention policies, ready to copy to `/etc/sanoid/sanoid.conf`.
ALL DESTRUCTIVE COMMANDS REQUIRE:
NEVER execute these commands automatically:
- `zfs destroy` - Permanent data loss
- `zpool destroy` - Catastrophic data loss
- `zfs rollback` - Loses intermediate changes
- `sanoid --prune-snapshots` - Deletes snapshots permanently

See the "🚨 DESTRUCTIVE OPERATIONS - CRITICAL SAFETY PROTOCOL" section above for complete details.
For production deployments:
Official Documentation:
Automation Tools:
Additional Resources:
- `references/command-reference.md` - Complete ZFS command syntax
- `references/quick-reference.md` - Quick command cheatsheet
- `references/troubleshooting.md` - Detailed troubleshooting guide

CRITICAL: When invoking scripts from this skill via the zsh-tool, ALWAYS use `pty: true`.
Without PTY mode, command output will not be visible even though commands execute successfully.
Correct invocation pattern:
```
<invoke name="mcp__plugin_zsh-tool_zsh-tool__zsh">
<parameter name="command">./skills/zfs/scripts/pool-health.sh [args]</parameter>
<parameter name="pty">true</parameter>
</invoke>
```