Analyze, apply, and test DAPR resiliency configurations including retry policies, circuit breakers, and timeouts
Analyzes DAPR resiliency configurations, applies patterns, and tests with chaos injection.
/plugin marketplace add Sahib-Sawhney-WH/sahibs-claude-plugin-marketplace
/plugin install dapr@sahib-claude-marketplace
Manage resiliency configurations for your DAPR applications to ensure fault tolerance and graceful degradation.
When the user runs /dapr:resiliency:
Analyze the current resiliency configuration:
DAPR Resiliency Analysis
=========================
Configuration Files Found:
✓ resiliency.yaml (dapr.io/v1alpha1)
✓ components/statestore.yaml
✓ components/pubsub.yaml
Policy Analysis:
-----------------
Retry Policies (3 defined):
✓ serviceRetry
- Type: exponential
- Max Retries: 5
- Max Interval: 30s
Status: GOOD
⚠ stateStoreRetry
- Type: constant
- Duration: 500ms
- Max Retries: 10
Status: WARNING - Consider exponential backoff for production
✗ externalApiRetry
- Type: constant
- Duration: 100ms
- Max Retries: 20
Status: PROBLEM - Too aggressive, may cause rate limiting
Timeout Policies (2 defined):
✓ serviceTimeout: 30s (reasonable)
⚠ externalTimeout: 5s (may be too short for external APIs)
Circuit Breaker Policies (1 defined):
✓ serviceBreaker
- Threshold: 5 consecutive failures
- Recovery: 60s
Status: GOOD
Missing Policies:
⚠ No circuit breaker for state store operations
⚠ No timeout policy for pub/sub
Target Coverage:
-----------------
Apps: 2/3 covered (order-service, payment-service)
Components: 1/2 covered (statestore)
Uncovered:
- notification-service (no resiliency)
- pubsub (no policies)
Recommendations:
-----------------
1. Add circuit breaker for statestore to prevent cascade failures
2. Increase externalTimeout to 30s for third-party APIs
3. Change externalApiRetry to exponential backoff
4. Add resiliency policies for notification-service
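The four recommendations could be expressed roughly as the fragment below, meant to be merged into the existing resiliency.yaml rather than applied on its own. It is a sketch only: the field layout follows the Dapr Resiliency spec, the policy names come from the report above (stateStoreCircuitBreaker is an assumed name for the new breaker), and the retry limits chosen for externalApiRetry are assumptions, not values the command is known to emit.

```yaml
# Fragment to merge under the existing Resiliency resource (sketch, not generated output).
spec:
  policies:
    timeouts:
      externalTimeout: 30s          # recommendation 2: raised from 5s
    retries:
      externalApiRetry:             # recommendation 3: replaces constant 100ms x 20
        policy: exponential
        maxInterval: 30s            # assumed cap, mirrors serviceRetry
        maxRetries: 5               # assumed, mirrors serviceRetry
    circuitBreakers:
      stateStoreCircuitBreaker:     # recommendation 1 (assumed policy name)
        maxRequests: 1              # probe requests allowed while half-open
        timeout: 60s                # time spent open before moving to half-open
        trip: consecutiveFailures >= 5
  targets:
    apps:
      notification-service:         # recommendation 4: reuse the existing service policies
        timeout: serviceTimeout
        retry: serviceRetry
        circuitBreaker: serviceBreaker
    components:
      statestore:
        outbound:
          retry: stateStoreRetry
          circuitBreaker: stateStoreCircuitBreaker
```

With 20 constant retries at 100ms, a failing external API is hit again every tenth of a second for about two seconds, which is why the report flags externalApiRetry as likely to trigger rate limiting. Exponential backoff spreads the same retries over a growing interval.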
Apply a resiliency pattern to the project:
/dapr:resiliency apply --pattern microservices
Applying 'microservices' resiliency pattern...
Created: resiliency.yaml
Applied policies:
✓ serviceRetry (exponential, 5 retries, 30s max)
✓ serviceTimeout (30s)
✓ serviceCircuitBreaker (5 failures, 60s recovery)
Configured targets:
✓ All apps: serviceTimeout + serviceRetry + serviceCircuitBreaker
✓ statestore: stateStoreRetry + stateStoreCircuitBreaker
✓ pubsub: pubsubRetry
Next steps:
1. Review resiliency.yaml and customize values
2. Run 'dapr run' to test locally
3. Use '/dapr:resiliency test' to validate behavior
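For orientation, a resiliency.yaml matching the output above might look like the following. This is a hedged sketch, not the file the command actually writes: the resource name is hypothetical, the app IDs are borrowed from the analysis example, and the values for stateStoreRetry, stateStoreCircuitBreaker, and pubsubRetry are assumptions because the output does not state them.

```yaml
# Illustrative resiliency.yaml for the 'microservices' pattern.
# Values marked "assumed" do not appear in the command output.
apiVersion: dapr.io/v1alpha1
kind: Resiliency
metadata:
  name: microservices-resiliency    # hypothetical name
spec:
  policies:
    timeouts:
      serviceTimeout: 30s
    retries:
      serviceRetry:
        policy: exponential
        maxInterval: 30s
        maxRetries: 5
      stateStoreRetry:              # assumed values
        policy: exponential
        maxInterval: 10s
        maxRetries: 3
      pubsubRetry:                  # assumed values
        policy: exponential
        maxInterval: 60s
        maxRetries: 10
    circuitBreakers:
      serviceCircuitBreaker:
        maxRequests: 1              # probe requests allowed while half-open
        timeout: 60s                # recovery window before half-open
        trip: consecutiveFailures >= 5
      stateStoreCircuitBreaker:     # assumed values
        maxRequests: 1
        timeout: 60s
        trip: consecutiveFailures >= 5
  targets:
    apps:
      order-service:                # app IDs taken from the analysis example
        timeout: serviceTimeout
        retry: serviceRetry
        circuitBreaker: serviceCircuitBreaker
      payment-service:
        timeout: serviceTimeout
        retry: serviceRetry
        circuitBreaker: serviceCircuitBreaker
    components:
      statestore:
        outbound:
          retry: stateStoreRetry
          circuitBreaker: stateStoreCircuitBreaker
      pubsub:
        outbound:
          retry: pubsubRetry
```

The sketch lists each app ID explicitly under targets.apps; any service left out of that map, such as notification-service in the earlier analysis, would not pick up these named policies.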
Test resiliency policies with chaos injection:
/dapr:resiliency test --component statestore
Testing resiliency for 'statestore'...
Test 1: Retry Policy
---------------------
Injecting failures: 3 consecutive errors
Expected behavior: Retry and succeed on 4th attempt
Result: ✓ PASSED
- Attempts: 4
- Total time: 2.3s
- Backoff observed: 500ms, 1s, 2s
Test 2: Circuit Breaker
-----------------------
Injecting failures: 6 consecutive errors
Expected behavior: Circuit opens after 5 failures
Result: ✓ PASSED
- Failures before open: 5
- Circuit state: OPEN
- Rejections: 1
Test 3: Timeout
---------------
Injecting latency: 35s
Expected behavior: Timeout at 30s
Result: ✓ PASSED
- Actual timeout: 30.1s
Test 4: Recovery
----------------
Waiting for circuit recovery: 60s
Expected behavior: Circuit moves to HALF-OPEN
Result: ✓ PASSED
- Recovery time: 60s
- State: HALF-OPEN
- Test request: SUCCESS
- Final state: CLOSED
Summary:
---------
All tests passed! Resiliency configuration is working correctly.
| Argument | Description |
|---|---|
| analyze | Analyze current resiliency configuration (default) |
| apply | Apply a resiliency pattern |
| test | Test resiliency with chaos injection |
| --pattern | Pattern to apply (high-throughput, critical-service, external-api, microservices, event-driven, batch-processing) |
| --target | Target platform (local, aca, aks, kubernetes) |
| --component | Specific component to test |
| --app | Specific app to test |
| --duration | Test duration in seconds |
| --output | Output format (text, json, yaml) |
- high-throughput - Best for: APIs with high request volume, real-time systems
- critical-service - Best for: Payment processing, order management
- external-api - Best for: Third-party integrations
- microservices - Best for: General service-to-service communication
- event-driven - Best for: Pub/sub, async processing
- batch-processing - Best for: ETL, data migration
/dapr:resiliency
/dapr:resiliency apply --pattern microservices
/dapr:resiliency apply --pattern critical-service --target aca
/dapr:resiliency test --component statestore --duration 120
/dapr:resiliency test --duration 300
/dapr:resiliency analyze --output json > resiliency-report.json
When --target aca is specified, the command generates ACA-compatible resiliency configuration:
# ACA-specific resiliency (applied via az containerapp update)
properties:
  template:
    containers:
      - name: my-app
    scale:
      rules:
        - name: http-scale
          http:
            metadata:
              concurrentRequests: "100"
  configuration:
    dapr:
      enabled: true
      appId: my-app
      # Resiliency via DAPR component
| Code | Meaning |
|---|---|
| 0 | Success / All tests passed |
| 1 | Analysis found issues |
| 2 | Tests failed |
| 3 | Configuration error |
- /dapr:security - Security scanning
- /dapr:component - Component management
- /dapr:test - Integration testing
- /dapr:deploy - Deployment with resiliency validation