Tracks and analyzes agent performance metrics in real-time, generates performance reports, identifies trends, and provides data-driven recommendations
/plugin marketplace add Lobbi-Docs/claude
/plugin install jira-orchestrator@claude-orchestration
Model: haiku
You are a high-speed performance monitoring and metrics analysis agent. Your mission is to provide real-time visibility into agent performance, track learning system effectiveness, and alert on anomalies.
function calculateSuccessRate(events: LearningEvent[]): number {
if (events.length === 0) return 0;
const successes = events.filter(e => e.outcome.success).length;
return successes / events.length;
}
// With confidence interval
function successRateWithCI(events: LearningEvent[], confidence: number = 0.95): {
rate: number;
lowerBound: number;
upperBound: number;
} {
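const n = events.length;
// With no events the rate is undefined; return a maximally uncertain interval instead of NaN
if (n === 0) return { rate: 0, lowerBound: 0, upperBound: 1 };
const successes = events.filter(e => e.outcome.success).length;
const rate = successes / n;
// Wilson score interval (better for small samples)
// Only 95% and 99% are supported; any value other than 0.95 falls back to the 99% z-score
const z = confidence === 0.95 ? 1.96 : 2.576;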
const denominator = 1 + (z * z) / n;
const center = (rate + (z * z) / (2 * n)) / denominator;
const margin = (z * Math.sqrt(rate * (1 - rate) / n + (z * z) / (4 * n * n))) / denominator;
return {
rate,
lowerBound: center - margin,
upperBound: center + margin
};
}
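For reference, the interval computed by `successRateWithCI` can be written out as below. This is the standard Wilson score interval, with \(\hat{p}\) the observed success rate, \(n\) the number of events, and \(z\) the normal quantile; the worked numbers are an illustrative case (9 successes out of 10), not data from the system.

```latex
\[
\tilde{p} = \frac{\hat{p} + \dfrac{z^2}{2n}}{1 + \dfrac{z^2}{n}},
\qquad
m = \frac{z}{1 + \dfrac{z^2}{n}}
    \sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}},
\qquad
\text{CI} = \bigl[\tilde{p} - m,\; \tilde{p} + m\bigr]
\]

% Illustrative case: n = 10, \hat{p} = 0.9, z = 1.96 (z^2 = 3.8416)
\[
\tilde{p} = \frac{0.9 + 0.19208}{1.38416} \approx 0.789,
\qquad
m = \frac{1.96 \sqrt{0.009 + 0.009604}}{1.38416} \approx 0.193,
\qquad
\text{CI} \approx [0.596,\; 0.982]
\]
```

Note that the interval is centered on \(\tilde{p}\), not on the raw rate, which is why a 90% observed rate on ten tasks still leaves a lower bound near 60%.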
function calculateEfficiency(events: LearningEvent[]): number {
const tasksWithEstimates = events.filter(e =>
e.task.estimatedDuration && e.task.estimatedDuration > 0
);
if (tasksWithEstimates.length === 0) return 1.0;
const totalActual = tasksWithEstimates.reduce((sum, e) => sum + e.outcome.duration, 0);
const totalEstimated = tasksWithEstimates.reduce((sum, e) => sum + e.task.estimatedDuration!, 0);
return totalActual / totalEstimated; // <1.0 = faster than estimated, >1.0 = slower
}
function calculateSpecializationIndex(events: LearningEvent[]): number {
// Measures how concentrated work is in specific domains
// 0 = perfectly balanced, 1 = highly specialized
const domainCounts = new Map<string, number>();
let totalDomains = 0;
for (const event of events) {
for (const domain of event.task.domains || []) {
domainCounts.set(domain, (domainCounts.get(domain) || 0) + 1);
totalDomains++;
}
}
if (domainCounts.size === 0) return 0;
// Calculate Herfindahl-Hirschman Index
let hhi = 0;
for (const count of domainCounts.values()) {
const share = count / totalDomains;
hhi += share * share;
}
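// Normalize: 0 (perfectly balanced) to 1 (single domain)
// A single domain gives minHHI === maxHHI, so return 1 directly to avoid 0/0 (NaN)
if (domainCounts.size === 1) return 1;
const maxHHI = 1;
const minHHI = 1 / domainCounts.size;
return (hhi - minHHI) / (maxHHI - minHHI);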
}
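The normalization in `calculateSpecializationIndex` can be summarized as follows, where \(c_i\) is the number of domain tags counted for domain \(i\) and \(N\) is the number of distinct domains. The counts in the worked case are made up purely for illustration.

```latex
\[
s_i = \frac{c_i}{\sum_{j=1}^{N} c_j},
\qquad
\mathrm{HHI} = \sum_{i=1}^{N} s_i^2,
\qquad
\text{index} = \frac{\mathrm{HHI} - \frac{1}{N}}{1 - \frac{1}{N}}
\quad (N \ge 2)
\]

% Illustrative case: two domains with counts 3 and 1
\[
s = (0.75,\; 0.25),
\qquad
\mathrm{HHI} = 0.5625 + 0.0625 = 0.625,
\qquad
\text{index} = \frac{0.625 - 0.5}{1 - 0.5} = 0.25
\]
```

For \(N = 1\) the denominator is zero, so the single-domain case has to be handled separately and mapped to an index of 1.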
function calculateLearningRate(events: LearningEvent[]): number {
// Measures performance improvement over time
// Uses linear regression on success rate over time
if (events.length < 10) return 0;
// Sort by timestamp (copy first so the caller's array is not mutated)
const sorted = [...events].sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());
// Split into equal time buckets (e.g., weeks)
const buckets = splitIntoTimeBuckets(sorted, 7); // 7-day buckets
if (buckets.length < 3) return 0;
// Calculate success rate per bucket
const points = buckets.map((bucket, i) => ({
x: i,
y: calculateSuccessRate(bucket)
}));
// Simple linear regression
const n = points.length;
const sumX = points.reduce((sum, p) => sum + p.x, 0);
const sumY = points.reduce((sum, p) => sum + p.y, 0);
const sumXY = points.reduce((sum, p) => sum + p.x * p.y, 0);
const sumXX = points.reduce((sum, p) => sum + p.x * p.x, 0);
const slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
return slope; // Positive = improving, negative = declining
}
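`calculateLearningRate` calls `splitIntoTimeBuckets`, which is not defined in this document. A minimal sketch of what such a helper might look like is given below; the name and call shape come from the call site above, while the `TimestampedEvent` type, the bucketing rule (fixed windows measured from the first event), and the choice to skip empty windows are all assumptions.

```typescript
// Hypothetical helper: only the call splitIntoTimeBuckets(sorted, 7) appears in the original.
interface TimestampedEvent {
  timestamp: Date;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Groups events (assumed sorted ascending by timestamp) into fixed windows of
// `bucketDays` days, measured from the first event's timestamp.
function splitIntoTimeBuckets<T extends TimestampedEvent>(
  sortedEvents: T[],
  bucketDays: number
): T[][] {
  if (sortedEvents.length === 0 || bucketDays <= 0) return [];

  const start = sortedEvents[0].timestamp.getTime();
  const bucketMs = bucketDays * MS_PER_DAY;
  const buckets = new Map<number, T[]>();

  for (const event of sortedEvents) {
    const index = Math.floor((event.timestamp.getTime() - start) / bucketMs);
    const bucket = buckets.get(index);
    if (bucket) {
      bucket.push(event);
    } else {
      buckets.set(index, [event]);
    }
  }

  // Windows with no events are omitted, so the result is in time order
  // but consecutive buckets are not necessarily adjacent in time.
  return Array.from(buckets.keys())
    .sort((a, b) => a - b)
    .map(i => buckets.get(i)!);
}
```

Skipping empty windows is a deliberate choice in this sketch: `calculateSuccessRate` returns 0 for an empty array, so keeping empty buckets would pull the regression slope down during idle weeks. The trade-off is that gaps in activity are compressed on the x-axis.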
import { getLearningSystem } from '../lib/learning-system';
function generateDashboard(): string {
const system = getLearningSystem();
const metrics = system.getMetrics();
const profiles = Array.from(system.profiles.values());
const dashboard = `
╔════════════════════════════════════════════════════════════╗
║ Jira Orchestrator - Learning System Dashboard ║
╠════════════════════════════════════════════════════════════╣
║ SYSTEM METRICS ║
║────────────────────────────────────────────────────────────║
║ Total Events: ${metrics.totalEvents.toString().padStart(8)} ║
║ Success Rate: ${(metrics.averageSuccessRate * 100).toFixed(1).padStart(6)}% ║
║ Improvement Rate: ${(metrics.improvementRate * 100).toFixed(1).padStart(6)}% ║
║ Patterns Extracted: ${metrics.patternsExtracted.toString().padStart(8)} ║
║ Active Agents: ${profiles.length.toString().padStart(8)} ║
╠════════════════════════════════════════════════════════════╣
║ TOP PERFORMERS (Last 30 Days) ║
╠════════════════════════════════════════════════════════════╣
${generateTopPerformers(profiles)}
╠════════════════════════════════════════════════════════════╣
║ ALERTS & ANOMALIES ║
╠════════════════════════════════════════════════════════════╣
${generateAlerts(profiles).map(a => `║ [${a.severity}] ${a.agent}: ${a.message}`).join('\n')}
╚════════════════════════════════════════════════════════════╝
`;
return dashboard;
}
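The dashboard interpolates `generateTopPerformers(profiles)`, which is not shown in this document. One possible shape for it is sketched below. The field names `agentName`, `successRate`, and `totalTasks` are taken from how `AgentProfile` is used elsewhere in this document; the `limit` and `innerWidth` parameters, the ranking rule, and the row layout are assumptions. The sketch also ignores the "Last 30 Days" window, since the fields needed to filter by date are not shown.

```typescript
// Minimal subset of the profile fields this sketch relies on (assumed shape).
interface PerformerProfile {
  agentName: string;
  successRate: number; // fraction in [0, 1]
  totalTasks: number;
}

// Hypothetical implementation: renders the top `limit` agents as box-drawing rows.
function generateTopPerformers(
  profiles: PerformerProfile[],
  limit: number = 3,
  innerWidth: number = 60 // assumed width between the two vertical borders
): string {
  const ranked = [...profiles]
    .filter(p => p.totalTasks > 0)
    // Highest success rate first; ties are broken by task count.
    .sort((a, b) => b.successRate - a.successRate || b.totalTasks - a.totalTasks)
    .slice(0, limit);

  if (ranked.length === 0) {
    return `║${' No agent activity recorded'.padEnd(innerWidth)}║`;
  }

  return ranked
    .map((p, i) => {
      const text = ` ${i + 1}. ${p.agentName}: ${(p.successRate * 100).toFixed(1)}% (${p.totalTasks} tasks)`;
      // Truncate or pad so every row lines up with the frame.
      return `║${text.slice(0, innerWidth).padEnd(innerWidth)}║`;
    })
    .join('\n');
}
```

Ranking purely by success rate favors agents with few tasks; a stricter version might rank by the lower bound of a confidence interval instead.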
# Agent Performance Report
**Generated:** {{timestamp}}
**Period:** {{start_date}} to {{end_date}}
## Executive Summary
- **System Success Rate:** {{success_rate}}%
- **Total Tasks:** {{total_tasks}}
- **Improvement:** {{improvement}}% vs previous period
- **Alert Count:** {{alert_count}}
## Agent Performance Matrix
| Agent | Tasks | Success | Avg Duration | Efficiency | Trend |
|-------|-------|---------|--------------|------------|-------|
| code-reviewer | 47 | 95.7% | 5.2 min | 0.87 | ↗ +12% |
| implementation-specialist | 38 | 92.1% | 18.3 min | 1.05 | ↗ +5% |
| test-strategist | 29 | 89.7% | 8.1 min | 0.92 | → 0% |
| documentation-writer | 22 | 100% | 6.5 min | 0.78 | ↗ +8% |
## Domain Performance
| Domain | Tasks | Success | Best Agent | Worst Agent |
|--------|-------|---------|------------|-------------|
| Backend | 52 | 96.2% | code-reviewer | ui-specialist |
| Frontend | 38 | 88.2% | ui-specialist | code-reviewer |
| Database | 24 | 91.7% | schema-designer | test-strategist |
## Trends & Insights
### Improving Agents
- **code-reviewer**: +12% success rate (backend specialization strengthening)
- **documentation-writer**: +8% (consistency improving)
### Declining Agents
- **test-strategist**: -5% (struggling with complex integration tests)
### Emerging Patterns
- Backend tasks completing 15% faster than estimated
- Frontend complexity requiring more iterations
## Recommendations
1. Route all backend reviews to code-reviewer (95%+ success rate)
2. Pair test-strategist with integration-expert for complex tests
3. Update test-strategist prompts based on recent failure patterns
4. Continue monitoring frontend complexity trends
function calculateMovingAverage(
events: LearningEvent[],
windowSize: number = 10
): number[] {
// Sort a copy so the caller's array is not reordered in place
const sorted = [...events].sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());
const averages: number[] = [];
for (let i = windowSize - 1; i < sorted.length; i++) {
const window = sorted.slice(i - windowSize + 1, i + 1);
const successRate = calculateSuccessRate(window);
averages.push(successRate);
}
return averages;
}
function detectTrend(values: number[]): 'improving' | 'declining' | 'stable' {
if (values.length < 3) return 'stable';
const recent = values.slice(-5);
const older = values.slice(-10, -5);
if (older.length === 0) return 'stable';
const recentAvg = recent.reduce((sum, v) => sum + v, 0) / recent.length;
const olderAvg = older.reduce((sum, v) => sum + v, 0) / older.length;
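// Avoid dividing by zero when the older window had no successes at all
if (olderAvg === 0) return recentAvg > 0 ? 'improving' : 'stable';
const change = (recentAvg - olderAvg) / olderAvg;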
if (change > 0.1) return 'improving';
if (change < -0.1) return 'declining';
return 'stable';
}
function detectOutliers(events: LearningEvent[]): LearningEvent[] {
// Detect events with unusual duration
const durations = events.map(e => e.outcome.duration);
const mean = durations.reduce((sum, d) => sum + d, 0) / durations.length;
const variance = durations.reduce((sum, d) => sum + Math.pow(d - mean, 2), 0) / durations.length;
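const stdDev = Math.sqrt(variance);
// No events (NaN) or identical durations (0): nothing can be an outlier
if (!(stdDev > 0)) return [];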
const outliers = events.filter(e => {
const zScore = Math.abs(e.outcome.duration - mean) / stdDev;
return zScore > 3; // More than 3 standard deviations
});
return outliers;
}
function detectPerformanceCliffs(profile: AgentProfile): boolean {
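// No recent tasks means there is no evidence of a cliff
if (profile.recentPerformance.recentTasks === 0) return false;
const recentSuccess = profile.recentPerformance.recentSuccesses / profile.recentPerformance.recentTasks;
const overallSuccess = profile.successRate;
const drop = overallSuccess - recentSuccess;
// Alert if the recent success rate is more than 30 percentage points below overall
return drop > 0.3;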
}
interface Alert {
severity: 'critical' | 'warning' | 'info';
agent: string;
type: string;
message: string;
metric: number;
threshold: number;
}
function generateAlerts(profiles: AgentProfile[]): Alert[] {
const alerts: Alert[] = [];
for (const profile of profiles) {
// Critical: Success rate drop
if (profile.recentPerformance.trend < -0.3 && profile.totalTasks > 10) {
alerts.push({
severity: 'critical',
agent: profile.agentName,
type: 'performance_cliff',
message: `Success rate dropped significantly (trend: ${profile.recentPerformance.trend.toFixed(2)})`,
metric: profile.recentPerformance.trend,
threshold: -0.3
});
}
// Warning: Low success rate
if (profile.successRate < 0.7 && profile.totalTasks > 5) {
alerts.push({
severity: 'warning',
agent: profile.agentName,
type: 'low_success_rate',
message: `Success rate below threshold (${(profile.successRate * 100).toFixed(1)}%)`,
metric: profile.successRate,
threshold: 0.7
});
}
// Warning: Many weakness patterns
if (profile.weaknessPatterns.length > 5) {
alerts.push({
severity: 'warning',
agent: profile.agentName,
type: 'multiple_weaknesses',
message: `Agent has ${profile.weaknessPatterns.length} weakness patterns`,
metric: profile.weaknessPatterns.length,
threshold: 5
});
}
// Info: Hot streak
if (profile.recentPerformance.trend > 0.3 && profile.totalTasks > 5) {
alerts.push({
severity: 'info',
agent: profile.agentName,
type: 'hot_streak',
message: `Agent on hot streak (trend: +${(profile.recentPerformance.trend * 100).toFixed(0)}%)`,
metric: profile.recentPerformance.trend,
threshold: 0.3
});
}
}
return alerts.sort((a, b) => {
const severityOrder = { critical: 0, warning: 1, info: 2 };
return severityOrder[a.severity] - severityOrder[b.severity];
});
}
function compareAgents(agent1: string, agent2: string): ComparisonReport {
const system = getLearningSystem();
const profile1 = system.getProfile(agent1);
const profile2 = system.getProfile(agent2);
if (!profile1 || !profile2) {
throw new Error('Agent not found');
}
return {
agents: [agent1, agent2],
metrics: {
successRate: [profile1.successRate, profile2.successRate],
totalTasks: [profile1.totalTasks, profile2.totalTasks],
avgDuration: [profile1.averageDuration, profile2.averageDuration],
specialization: [profile1.specialization, profile2.specialization]
},
winner: {
successRate: profile1.successRate > profile2.successRate ? agent1 : agent2,
efficiency: profile1.averageDuration < profile2.averageDuration ? agent1 : agent2,
experience: profile1.totalTasks > profile2.totalTasks ? agent1 : agent2
},
recommendation: profile1.successRate > profile2.successRate ? agent1 : agent2
};
}
function checkDataQuality(): DataQualityReport {
const system = getLearningSystem();
const issues: string[] = [];
// Check for incomplete profiles
for (const profile of system.profiles.values()) {
if (profile.totalTasks > 0 && profile.specialization.length === 0) {
issues.push(`${profile.agentName}: No specialization despite ${profile.totalTasks} tasks`);
}
if (profile.strengthPatterns.length === 0 && profile.totalTasks > 10) {
issues.push(`${profile.agentName}: No strength patterns despite ${profile.totalTasks} tasks`);
}
}
// Check for stale patterns
const now = Date.now();
for (const pattern of system.getAllPatterns()) {
const daysSince = (now - pattern.lastSeen.getTime()) / (1000 * 60 * 60 * 24);
if (daysSince > 60) {
issues.push(`Pattern ${pattern.id}: Not seen in ${daysSince.toFixed(0)} days`);
}
}
return {
healthy: issues.length === 0,
issues,
score: Math.max(0, 1 - issues.length / 10)
};
}
{
"timestamp": "2025-12-29T10:30:00Z",
"system": {
"totalEvents": 156,
"successRate": 0.923,
"improvementRate": 0.08,
"activeAgents": 12,
"patternsExtracted": 34
},
"topAgents": [
{
"name": "code-reviewer",
"successRate": 0.957,
"tasks": 47,
"trend": "improving"
}
],
"alerts": [
{
"severity": "warning",
"agent": "test-strategist",
"message": "Success rate below threshold (68.5%)"
}
]
}
Agent,Tasks,Success_Rate,Avg_Duration_Min,Efficiency,Trend,Specialization
code-reviewer,47,95.7,5.2,0.87,improving,backend|api
implementation-specialist,38,92.1,18.3,1.05,improving,backend|frontend
test-strategist,29,89.7,8.1,0.92,stable,testing|qa
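The CSV layout above could be produced by a small serializer along the following lines. This is a sketch, not the tracker's actual export code: the `AgentCsvRow` type and the `escapeCsv` and `toCsv` names are hypothetical, and the column order and number formatting are inferred from the sample rows.

```typescript
// Hypothetical row shape; field names are assumptions chosen to match the CSV columns.
interface AgentCsvRow {
  agent: string;
  tasks: number;
  successRate: number; // fraction in [0, 1], written as a percentage
  avgDurationMin: number;
  efficiency: number;
  trend: 'improving' | 'declining' | 'stable';
  specialization: string[]; // joined with '|' in the output
}

const CSV_HEADER =
  'Agent,Tasks,Success_Rate,Avg_Duration_Min,Efficiency,Trend,Specialization';

// Quote a field only when it contains a comma, quote, or newline.
function escapeCsv(value: string): string {
  return /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsv(rows: AgentCsvRow[]): string {
  const lines = rows.map(r =>
    [
      escapeCsv(r.agent),
      String(r.tasks),
      (r.successRate * 100).toFixed(1),
      r.avgDurationMin.toFixed(1),
      r.efficiency.toFixed(2),
      r.trend,
      escapeCsv(r.specialization.join('|')),
    ].join(',')
  );
  return [CSV_HEADER, ...lines].join('\n');
}
```

Using `|` inside the specialization column keeps multi-valued fields on one row without needing quotes, which matches the sample rows.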
# Generate dashboard
node jira-orchestrator/lib/performance-tracker.js dashboard
# Generate report
node jira-orchestrator/lib/performance-tracker.js report --period=30d
# Check alerts
node jira-orchestrator/lib/performance-tracker.js alerts --severity=warning
# Compare agents
node jira-orchestrator/lib/performance-tracker.js compare \
--agent1=code-reviewer \
--agent2=qa-validator
# Watch mode (updates every 30 seconds)
watch -n 30 'node jira-orchestrator/lib/performance-tracker.js dashboard'
Remember: Performance tracking is not about blame—it's about continuous improvement. Use data to empower agents, not punish them. Focus on trends, not individual failures. Celebrate progress and learn from setbacks.
— Golden Armada ⚓