Redis discipline — cache (eviction policy + TTL jitter + stampede mitigation) + pub-sub backplane (multi-replica WS fanout, subscriber leak) + rate limiting (atomic token bucket in Lua) + ops (memory audit + slow log + persistence) + Sentinel HA. Cluster is deferred to a later version. Streams and distributed locks are out of scope. `database-optimizer` owns the Postgres side; this skill owns the Redis side.
npx claudepluginhub resultakak/argos --plugin argos

This skill uses the workspace's default tool permissions.
Mandates invoking relevant skills via tools before any response in coding sessions. Covers access, priorities, and adaptations for Claude Code, Copilot CLI, Gemini CLI.
`agents/shared/severity-rubric.md` and `agents/shared/escalation-matrix.md`
count as default-load (agents/coordination.md §11). This skill's output must
be in Critical / High / Medium / Low + evidence format — speculative
Critical findings are forbidden. Findings outside this skill's ownership are
delegated to the relevant agent:
- `realtime-systems-reviewer` — pub-sub backplane, WS scale
- `security-reviewer` — rate limit auth integration, OWASP A04
- `database-optimizer` — Postgres-dependent cache invalidation (write-through)
- `performance-profiler` — general memory/latency

`KEYS` is forbidden in prod; use `SCAN`.

| Topic | Command / tool |
|---|---|
| General inspect | redis-cli INFO, CLIENT LIST, CONFIG GET * |
| Big key | redis-cli --bigkeys, MEMORY USAGE <key> |
| Key scan | SCAN 0 MATCH "..." COUNT 100 |
| Slow log | SLOWLOG GET 25 |
| Latency | redis-cli --latency, --latency-history |
| Memory | INFO memory (used_memory_human, mem_fragmentation_ratio) |
| Persistence | INFO persistence, LASTSAVE, BGSAVE |
| Pub-sub | PUBSUB CHANNELS, PUBSUB NUMSUB, PUBSUB NUMPAT |
| Replication | INFO replication, ROLE |
| Sentinel | SENTINEL masters, SENTINEL sentinels mymaster |
# Version + memory
redis-cli INFO server | grep -E "redis_version|os|arch_bits"
redis-cli INFO memory | grep -E "used_memory_human|maxmemory_human|maxmemory_policy|mem_fragmentation_ratio"
redis-cli INFO stats | grep -E "instantaneous_ops|total_commands|keyspace_hits|keyspace_misses"
# Hit rate
HITS=$(redis-cli INFO stats | grep keyspace_hits | cut -d: -f2 | tr -d '\r')
MISS=$(redis-cli INFO stats | grep keyspace_misses | cut -d: -f2 | tr -d '\r')
echo "scale=4; $HITS/($HITS+$MISS)" | bc
# Cache: target > 0.85; if much lower, suspect TTL / eviction / key design
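The same ratio can be computed in Python, with a guard the `bc` one-liner lacks for a fresh instance whose counters are still zero (the function name is illustrative):

```python
# Sketch: cache hit rate from keyspace_hits / keyspace_misses,
# guarding against division by zero on an instance with no traffic yet.
def hit_rate(hits: int, misses: int) -> float:
    total = hits + misses
    return hits / total if total else 0.0
```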
# Key count + distribution
redis-cli DBSIZE
redis-cli INFO keyspace
# Eviction policy
redis-cli CONFIG GET maxmemory-policy
# noeviction = writes rejected at OOM; pick allkeys-lru / volatile-lru based on usage
# maxmemory
redis-cli CONFIG GET maxmemory
# 0 = unlimited (whole host RAM); setting this is mandatory
# Scan for keys without a TTL
redis-cli --scan --pattern '*' | head -100 | while read k; do
  ttl=$(redis-cli TTL "$k")
  [ "$ttl" = "-1" ] && echo "NO_TTL: $k"
done
# Lots of output here → cache keys have no TTL → memory leak risk
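When the audit surfaces TTL-less cache keys, a backfill pass can set expiries in place. A minimal sketch, assuming a redis-py-style client is passed in as `client` (the function and parameter names are illustrative, not part of any library):

```python
import random

def backfill_ttl(client, pattern: str, base_ttl: int = 3600,
                 jitter_s: int = 300) -> int:
    """Give every matching key that has no expiry (TTL == -1) a TTL,
    jittered so the backfilled keys do not all expire at once.
    Returns the number of keys fixed."""
    fixed = 0
    for key in client.scan_iter(match=pattern, count=100):
        if client.ttl(key) == -1:
            client.expire(key, base_ttl + random.randint(0, jitter_s))
            fixed += 1
    return fixed
```

Run it per key prefix rather than `*` so source-of-truth keys that legitimately have no TTL are never touched.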
Two options:
A. Lock + cache-aside:
def get_cached(key: str, loader):
    val = redis.get(key)
    if val:
        return val
    # Single-flight: only the lock holder rebuilds the value
    if redis.set(f"lock:{key}", "1", nx=True, ex=5):
        try:
            fresh = loader()
            redis.setex(key, 3600 + jitter(), fresh)
            return fresh
        finally:
            redis.delete(f"lock:{key}")
    else:
        time.sleep(0.05)
        return redis.get(key) or loader()  # fallback: waiter loads itself
B. XFetch (probabilistic early refresh):
def get_cached_xfetch(key: str, loader, beta=1.0):
    val_with_delta = redis.hgetall(key)
    if val_with_delta:
        ttl = redis.ttl(key)
        delta = float(val_with_delta.get("delta", "0"))
        # Stochastic early refresh: probability rises as TTL shrinks
        # (XFetch: refresh when -delta * beta * log(rand) >= remaining TTL)
        if ttl < 0 or -delta * beta * math.log(random.random()) >= ttl:
            fresh = loader()
            redis.hset(key, mapping={"val": fresh, "delta": elapsed_to_load()})
            redis.expire(key, 3600 + jitter())
            return fresh
        return val_with_delta["val"]
    fresh = loader()
    redis.hset(key, mapping={"val": fresh, "delta": elapsed_to_load()})
    redis.expire(key, 3600 + jitter())
    return fresh
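Both sketches call `jitter()` and `elapsed_to_load()` without defining them. A minimal version — these helpers are assumptions to make the sketches concrete, not part of any library:

```python
import random
import time

def jitter(spread: int = 300) -> int:
    """0..spread extra seconds, so keys written together
    do not all expire in the same instant."""
    return random.randint(0, spread)

def timed_load(loader):
    """Run loader() and report how long it took — the load time is the
    'delta' XFetch uses to decide how early a refresh must start."""
    start = time.monotonic()
    value = loader()
    return value, time.monotonic() - start
```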
# Backend publish (any service)
redis.publish("notifications", json.dumps({"user_id": 42, "msg": "..."}))
# WS replica (N copies)
class WsManager:
    def __init__(self):
        self.local_clients: dict[int, set[WebSocket]] = {}

    async def listen(self):
        pubsub = self.redis.pubsub()
        await pubsub.subscribe("notifications")
        try:
            async for msg in pubsub.listen():
                if msg["type"] != "message":
                    continue
                payload = json.loads(msg["data"])
                clients = self.local_clients.get(payload["user_id"], set())
                for ws in clients:
                    await ws.send_json(payload)
        finally:
            await pubsub.unsubscribe("notifications")
            await pubsub.close()
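As written, `listen()` exits for good on a connection error, silently killing fanout for that replica. A retry wrapper with exponential backoff is one way to keep it alive — this wrapper is a sketch under the assumption that transient failures surface as `ConnectionError`, not part of the skill:

```python
import asyncio

async def run_with_retry(factory, max_retries: int = 5,
                         base_delay: float = 0.5):
    """Re-run an async listener after transient ConnectionErrors,
    backing off exponentially; re-raise once retries are exhausted."""
    attempt = 0
    while True:
        try:
            await factory()
            return  # listener exited cleanly
        except ConnectionError:
            attempt += 1
            if attempt > max_retries:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

Start it once per replica at boot, e.g. `asyncio.create_task(run_with_retry(manager.listen))`.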
Diagnosing a subscriber leak:
redis-cli CLIENT LIST | awk '$8 ~ /sub=[1-9]/ {print}'  # clients with sub > 0
redis-cli PUBSUB CHANNELS         # active channels
redis-cli PUBSUB NUMSUB channel1  # subscriber count for a channel
If sub=N keeps growing while the replica count stays constant → leak:
subscriptions belonging to old PIDs were never closed. CLIENT KILL ID <id>
or restart.
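The `awk` one-liner keys on field position, which varies across Redis versions; parsing `CLIENT LIST` output by field name is sturdier. A sketch (the function name is illustrative):

```python
def subscribed_clients(client_list: str) -> list[tuple[str, int]]:
    """Parse CLIENT LIST text output; return (client id, sub count)
    for every client holding at least one channel subscription."""
    out = []
    for line in client_list.splitlines():
        # Each line is space-separated key=value pairs
        fields = dict(f.split("=", 1) for f in line.split() if "=" in f)
        subs = int(fields.get("sub", "0"))
        if subs > 0:
            out.append((fields.get("id", "?"), subs))
    return out
```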
RATE_LIMIT_LUA = """
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local data = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(data[1]) or capacity
local last_refill = tonumber(data[2]) or now
local elapsed = math.max(0, now - last_refill)
tokens = math.min(capacity, tokens + elapsed * refill_rate)
if tokens < 1 then
  redis.call('HSET', key, 'tokens', tokens, 'last_refill', now)
  redis.call('EXPIRE', key, 3600)
  return {0, tokens}
end
tokens = tokens - 1
redis.call('HSET', key, 'tokens', tokens, 'last_refill', now)
redis.call('EXPIRE', key, 3600)
return {1, tokens}
"""
sha = redis.script_load(RATE_LIMIT_LUA)
def check_rate_limit(user_id: int, capacity=60, refill_rate=1.0) -> bool:
    # Note: Redis truncates Lua numbers to integers in replies,
    # so `remaining` is the floor of the fractional token count.
    allowed, remaining = redis.evalsha(
        sha, 1, f"rl:{user_id}", capacity, refill_rate, time.time(),
    )
    return bool(allowed)
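The refill math in the Lua script can be unit-tested without a server through a pure-Python mirror with the same semantics (lazy refill on each call, one token per allowed request). This class is an illustration for testing limiter parameters, not the production path:

```python
class TokenBucket:
    """Pure-Python mirror of the Lua token bucket above."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate   # tokens per second
        self.tokens = float(capacity)    # bucket starts full
        self.last_refill = None

    def allow(self, now: float) -> bool:
        if self.last_refill is None:
            self.last_refill = now
        # Lazy refill: credit tokens for the elapsed time, capped
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens < 1:
            return False
        self.tokens -= 1
        return True
```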
# Big key
redis-cli --bigkeys
# Bytes for a single key
redis-cli MEMORY USAGE order:cart:42 SAMPLES 0
# Top 100 key sample (Redis 6.2+)
redis-cli --memkeys --memkeys-samples 100
# Fragmentation
redis-cli INFO memory | grep mem_fragmentation_ratio
# > 1.5 → consider defrag
redis-cli CONFIG SET activedefrag yes
redis-cli CONFIG SET active-defrag-threshold-lower 10
redis-cli CONFIG SET slowlog-log-slower-than 10000 # 10ms
redis-cli SLOWLOG GET 25 | head -50
Common offenders:
- `KEYS *` → replace with `SCAN`
- `HGETALL bigkey` → `HSCAN`
- `LRANGE 0 -1` → keyset-style paging (`LPOP`/`RPOP` batches or `ZRANGEBYSCORE`)
- `SMEMBERS huge_set` → `SSCAN`

redis-cli INFO persistence
# rdb_last_save_time, rdb_changes_since_last_save
# aof_enabled, aof_last_rewrite_time_sec
# Cache-only — turn persistence off
redis-cli CONFIG SET save ""
redis-cli CONFIG SET appendonly no
# Source-of-truth — AOF everysec
redis-cli CONFIG SET appendonly yes
redis-cli CONFIG SET appendfsync everysec
# Sentinel info
redis-cli -p 26379 SENTINEL masters
redis-cli -p 26379 SENTINEL sentinels mymaster
redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
# Manual failover drill (staging)
redis-cli -p 26379 SENTINEL failover mymaster
# Min replicas
redis-cli CONFIG SET min-replicas-to-write 1
redis-cli CONFIG SET min-replicas-max-lag 10
On the app side, a sentinel-aware client is mandatory (e.g. redis-py Sentinel,
ioredis sentinel mode). If the master IP is hardcoded, the connection is never
re-resolved after a failover.
# Redis Findings: <cluster/service>
## Critical
- [ ] `maxmemory` not set + persistence on → OOM kernel kill, risk of RDB
  corruption — `redis.conf` or `CONFIG SET maxmemory 4gb`
## High
- [ ] Eviction policy `noeviction` on a cache → writes rejected —
  `CONFIG SET maxmemory-policy allkeys-lru`
- [ ] 1.4M keys without TTL (cache) — sampled 100 via `SCAN`, 78% at TTL=-1 →
  memory leak; make `setex` mandatory, backfill TTLs onto existing keys
## Medium
- [ ] Slow log: 12 `KEYS *` queries per hour; p99 320ms — migrate to `SCAN`
- [ ] `mem_fragmentation_ratio` 1.83 — enable `activedefrag`
## Low
- [ ] Sentinel quorum 3 (3 sentinels) — best practice is 5 (split-brain is
  low-probability)
Checklist:
- `maxmemory` set and appropriate for the workload
- `maxmemory-policy`: allkeys-lru/lfu for a pure cache, volatile-lru/lfu for mixed use
- No key > 100KB (`--bigkeys`)
- No `KEYS` / `HGETALL` on big keys
- `mem_fragmentation_ratio` < 1.5
- No subscriber leak (`CLIENT LIST` | `sub=`)

Anti-patterns:
- `noeviction` on a cache — writes rejected, prod incident.
- `volatile-lru` mixed with TTL-less keys — the TTL-less ones grow unbounded.
- `KEYS *` in prod — O(N), blocking.
- `mem_fragmentation_ratio` ignored.
- `appendfsync always` on a cache — higher write latency, no gain.

Related files:
- rules/redis.md — discipline rule.
- rules/websocket.md — pub-sub backplane fanout protocol.
- rules/security.md — rate limit OWASP A04 cross-link.
- skills/websocket-realtime-systems/SKILL.md — WS scale + Redis backplane.
- skills/postgres-performance/SKILL.md — cache-aside + write-through DB invalidation pattern.
- agents/database-optimizer.md — Postgres side (Redis cache invalidation via Postgres write trigger).
- agents/realtime-systems-reviewer.md — pub-sub backplane ownership.
- commands/redis-review.md — slash command entrypoint.

Tools: redis (read-only) — INFO, CLIENT LIST, SLOWLOG, MEMORY USAGE.