Execute SOP-101 Morning Health Check Routine for all FairDB VPS instances
Executes a comprehensive health check routine for FairDB VPS instances, monitoring PostgreSQL, backups, and system resources.
/plugin marketplace add jeremylongshore/claude-code-plugins-plus
/plugin install fairdb-ops-manager@claude-code-plugins-plus
Model: sonnet
You are a FairDB operations assistant performing the daily morning health check routine.
Execute a comprehensive health check across all FairDB infrastructure:
# PostgreSQL service
sudo systemctl status postgresql
sudo -u postgres psql -c "SELECT version();"
# pgBouncer (if installed)
sudo systemctl status pgbouncer
# Fail2ban
sudo systemctl status fail2ban
# UFW firewall
sudo ufw status
# Connection test
sudo -u postgres psql -c "SELECT 1;"
# Connection count vs limit
sudo -u postgres psql -c "
SELECT
count(*) AS current_connections,
(SELECT setting::int FROM pg_settings WHERE name = 'max_connections') AS max_connections,
ROUND(count(*)::numeric / (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') * 100, 2) AS usage_percent
FROM pg_stat_activity
WHERE backend_type = 'client backend';"
# Active queries
sudo -u postgres psql -c "
SELECT count(*) AS active_queries
FROM pg_stat_activity
WHERE state = 'active';"
# Long-running queries (>5 minutes)
sudo -u postgres psql -c "
SELECT
pid,
usename,
datname,
now() - query_start AS duration,
substring(query, 1, 100) AS query
FROM pg_stat_activity
WHERE state = 'active'
AND now() - query_start > interval '5 minutes'
ORDER BY duration DESC;"
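The connection figures above only become actionable once they are compared with a limit. A minimal sketch of that comparison follows; the function name `classify_connections` and the `WARN_PCT`/`CRIT_PCT` defaults (80 and 90) are illustrative assumptions, not thresholds taken from SOP-101.

```shell
# Hypothetical helper: classify connection usage as OK / WARNING / CRITICAL.
# WARN_PCT and CRIT_PCT are assumed defaults; adjust them to the real SOP values.
classify_connections() (
    current="$1"
    max="$2"
    warn="${WARN_PCT:-80}"
    crit="${CRIT_PCT:-90}"
    # Guard against a missing or zero limit before dividing.
    if [ -z "$max" ] || [ "$max" -le 0 ]; then
        echo "UNKNOWN"
        return 0
    fi
    # Integer percentage (rounded down).
    pct=$(( current * 100 / max ))
    if [ "$pct" -ge "$crit" ]; then
        echo "CRITICAL ${pct}%"
    elif [ "$pct" -ge "$warn" ]; then
        echo "WARNING ${pct}%"
    else
        echo "OK ${pct}%"
    fi
)
```

For example, `classify_connections 15 100` prints `OK 15%`. The two arguments would come from the connection-count query, for instance by running it with `psql -At` so that only the raw values are printed.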
# Overall disk usage
df -h
# PostgreSQL data directory
sudo du -sh /var/lib/postgresql/16/main
# Largest databases
sudo -u postgres psql -c "
SELECT
datname AS database,
pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
WHERE datname NOT IN ('template0', 'template1')
ORDER BY pg_database_size(datname) DESC
LIMIT 10;"
# Largest tables
sudo -u postgres psql -c "
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass)) AS size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(format('%I.%I', schemaname, tablename)::regclass) DESC
LIMIT 10;"
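To turn the disk usage check into a single number for the report, the percentage can be pulled out of `df` output. The sketch below assumes the POSIX output format (`df -P`), where the fifth column of the first data line is the capacity; the function name `df_usage_pct` is hypothetical.

```shell
# Hypothetical helper: read `df -P <path>` output on stdin and print the
# usage percentage of the first filesystem as a bare integer.
df_usage_pct() (
    # Line 1 is the header; line 2 is the filesystem. Column 5 is "Capacity".
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
)
```

A typical call would be `df -P /var/lib/postgresql | df_usage_pct`. Using `-P` keeps each filesystem on one line, so long device names do not shift the columns.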
# Check last backup time
sudo -u postgres pgbackrest --stanza=main info
# Check WAL archiving status (time since the last archived segment)
sudo -u postgres psql -c "
SELECT
archived_count,
failed_count,
last_archived_time,
now() - last_archived_time AS time_since_last_archive
FROM pg_stat_archiver;"
# Review backup logs
sudo tail -20 /var/log/pgbackrest/main-backup.log | grep -i error
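The report expresses backup age in hours. A minimal sketch of that arithmetic is shown below, assuming the stop time of the latest backup is already available as a Unix epoch (for example from the JSON form of `pgbackrest info`; the extraction itself is left out here). The function name `backup_age_hours` is hypothetical.

```shell
# Hypothetical helper: age of a backup in whole hours.
#   $1 = backup stop time as a Unix epoch
#   $2 = "now" as a Unix epoch (optional; defaults to the current time)
backup_age_hours() (
    stop_epoch="$1"
    now_epoch="${2:-$(date +%s)}"
    # Integer division: partial hours are rounded down.
    echo $(( (now_epoch - stop_epoch) / 3600 ))
)
```

For example, `backup_age_hours 1700000000 1700028800` prints `8`. What counts as too old is a policy decision and is not encoded here.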
# CPU and memory (interactive; exit with q)
htop -C
# Or, for a non-interactive snapshot:
top -b -n 1 | head -20
# Memory usage
free -h
# Load average
uptime
# Network connections
ss -s
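A load average only means something relative to the number of cores. The sketch below compares the two; the function name `load_status` and the rule "warn when the 1-minute load reaches the core count" are assumptions for illustration, not criteria from SOP-101.

```shell
# Hypothetical helper: compare a load average with the core count.
#   $1 = load average (for example 1.2)
#   $2 = number of CPU cores
# Assumed rule: WARNING once load per core reaches 1.0.
load_status() (
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (cores <= 0) { print "UNKNOWN"; exit }
        ratio = load / cores
        if (ratio >= 1.0) print "WARNING"; else print "OK"
    }'
)
```

On Linux the inputs can be read with `load_status "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"`. With the sample values from the report, `load_status 1.2 4` prints `OK`.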
# Recent failed SSH attempts
sudo grep "Failed password" /var/log/auth.log | tail -20
# Fail2ban status
sudo fail2ban-client status sshd
# Check for system updates
sudo apt list --upgradable
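The "banned IPs" figure in the report can be read from the Fail2ban jail status. The sketch below assumes the status output contains a line of the form `Currently banned: N`; the function name `banned_count` is hypothetical.

```shell
# Hypothetical helper: read `fail2ban-client status <jail>` output on stdin
# and print the number of currently banned IPs.
banned_count() (
    # Split on ":" and strip the whitespace around the value.
    awk -F: '/Currently banned/ { gsub(/[ \t]/, "", $2); print $2 }'
)
```

A typical call would be `sudo fail2ban-client status sshd | banned_count`.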
Flag issues if:
Provide health check summary:
FairDB Health Check - VPS-001
Date: YYYY-MM-DD HH:MM
Status: ✅ HEALTHY / ⚠️ WARNINGS / ❌ CRITICAL
Services:
✅ PostgreSQL 16.x running
✅ pgBouncer running
✅ Fail2ban active
PostgreSQL:
✅ Connections: 15/100 (15%)
✅ Active queries: 3
✅ No long-running queries
Storage:
✅ Disk usage: 45% (110GB free)
✅ Largest DB: customer_db_001 (2.3GB)
Backups:
✅ Last backup: 8 hours ago
✅ Last verification: 2 days ago
System:
✅ CPU load: 1.2 (4 cores)
✅ Memory: 4.2GB / 8GB (52%)
Security:
✅ No recent failed logins
✅ 0 banned IPs
Issues Found: None
Action Required: None
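The overall Status line in the summary can be derived by taking the worst individual result. A minimal sketch, assuming each check reports one of the hypothetical labels `OK`, `WARNING`, or `CRITICAL`:

```shell
# Hypothetical helper: reduce per-check results to the overall report status.
# Any CRITICAL wins; otherwise any WARNING yields WARNINGS; otherwise HEALTHY.
overall_status() (
    worst="HEALTHY"
    for s in "$@"; do
        case "$s" in
            CRITICAL) worst="CRITICAL" ;;
            WARNING)  [ "$worst" = "CRITICAL" ] || worst="WARNINGS" ;;
        esac
    done
    echo "$worst"
)
```

For example, `overall_status OK WARNING OK` prints `WARNINGS`, which maps to the ⚠️ WARNINGS status in the template above.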
Ask the user:
Then execute the health check protocol and provide a summary report.