Analyze LangGraph application architecture, identify bottlenecks, and propose multiple improvement strategies
Analyzes LangGraph application architecture to identify bottlenecks and generate 3-5 diverse improvement proposals. Triggers when performance improvement is needed, latency/cost/accuracy targets aren't met, or comparing multiple architectural options.
/plugin marketplace add hiroshi75/ccplugins
/plugin install langgraph-master-plugin@hiroshi75/ccplugins

This skill inherits all available tools. When active, it can use any tool Claude has access to.
A skill for analyzing LangGraph application architecture, identifying bottlenecks, and proposing multiple improvement strategies.
This skill analyzes existing LangGraph applications and proposes graph structure improvements.
Important: This skill is analysis-only; implementation and evaluation of the proposals are handled by the arch-tune command.
Use this skill in the following situations:
When performance improvement of existing applications is needed
When considering architecture-level improvements
When you want to compare multiple improvement options
Purpose: Prepare for performance measurement
Actions:
Create the evaluation program (in .langgraph-master/evaluation/ or a specified directory)
Output: Evaluation program ready
Purpose: Establish baseline
Actions:
Output: baseline_performance.json
Purpose: Understand current architecture
Actions:
Identify graph definitions with Serena MCP find_symbol (graph.py, main.py, etc.)
Analyze node and edge structure with get_symbols_overview
Understand each node's role
Output: Graph structure documentation
Purpose: Identify performance problem areas
Actions:
Latency Bottlenecks
Cost Issues
Accuracy Issues
Output: List of issues
Purpose: Identify applicable LangGraph patterns
Actions:
Consider patterns based on problems
Reference langgraph-master skill
Output: List of applicable patterns
Purpose: Create 3-5 diverse improvement proposals (all candidates for parallel exploration)
Actions:
Create improvement proposals based on each pattern
Evaluate improvement proposals
Important: Output all improvement proposals. The arch-tune command will implement and evaluate all proposals in parallel.
Output: Improvement proposal document (including all proposals)
Purpose: Organize analysis results and proposals
Actions:
Create improvement_proposals.md (with priorities)
Important: Output all proposals to improvement_proposals.md. The arch-tune command will read these and implement/evaluate them in parallel.
Output:
analysis_report.md - Current state analysis and issues
improvement_proposals.md - All improvement proposals (Proposal 1, 2, 3, ...)

Example baseline_performance.json:

```json
{
"iterations": 5,
"test_cases": 20,
"metrics": {
"accuracy": {
"mean": 75.0,
"std": 3.2,
"min": 70.0,
"max": 80.0
},
"latency": {
"mean": 3.5,
"std": 0.4,
"min": 3.1,
"max": 4.2
},
"cost": {
"mean": 0.020,
"std": 0.002,
"min": 0.018,
"max": 0.023
}
}
}
```
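A minimal sketch of an evaluation loop that could produce a file in this shape. The `graph` object, the test-case format, and the scoring and cost helpers are placeholder assumptions; the real evaluation program depends on the application being measured.

```python
import json
import statistics
import time

def score_accuracy(answer: str, expected: str) -> float:
    # Placeholder scoring; a real program would use task-specific grading.
    return 100.0 if expected.lower() in answer.lower() else 0.0

def estimate_cost(result: dict) -> float:
    # Placeholder; a real program would derive cost from token usage.
    return 0.0

def run_baseline(graph, test_cases, iterations=5, path="baseline_performance.json"):
    accuracy, latency, cost = [], [], []
    for _ in range(iterations):
        for case in test_cases:
            start = time.perf_counter()
            result = graph.invoke({"question": case["question"]})
            latency.append(time.perf_counter() - start)
            accuracy.append(score_accuracy(result.get("answer", ""), case["expected"]))
            cost.append(estimate_cost(result))

    def summarize(values):
        return {
            "mean": round(statistics.mean(values), 3),
            "std": round(statistics.stdev(values), 3),
            "min": round(min(values), 3),
            "max": round(max(values), 3),
        }

    report = {
        "iterations": iterations,
        "test_cases": len(test_cases),
        "metrics": {
            "accuracy": summarize(accuracy),
            "latency": summarize(latency),
            "cost": summarize(cost),
        },
    }
    with open(path, "w") as f:
        json.dump(report, f, indent=2)
    return report
```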
# Architecture Analysis Report
Execution Date: 2024-11-24 10:00:00
## Current Performance
| Metric | Mean | Std Dev | Target | Gap |
|--------|------|---------|--------|-----|
| Accuracy | 75.0% | 3.2% | 90.0% | -15.0% |
| Latency | 3.5s | 0.4s | 2.0s | +1.5s |
| Cost | $0.020 | $0.002 | $0.010 | +$0.010 |
## Graph Structure
### Current Configuration
```
analyze_intent → retrieve_docs → generate_response
```
- **Node Count**: 3
- **Edge Type**: Sequential only
- **Parallel Processing**: None
- **Conditional Branching**: None
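
The wiring described above can be reproduced in a minimal sketch. The state schema and node bodies below are illustrative stubs (the real nodes call Claude 3.5 Sonnet and the vector DB); only the graph structure mirrors the current configuration.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class AppState(TypedDict, total=False):
    question: str
    intent: str
    documents: list[str]
    answer: str

# Illustrative stubs; the real nodes call the LLM and the vector DB.
def analyze_intent(state: AppState) -> dict:
    return {"intent": "complex"}

def retrieve_docs(state: AppState) -> dict:
    return {"documents": ["doc-1", "doc-2"]}

def generate_response(state: AppState) -> dict:
    return {"answer": f"Answer grounded in {len(state.get('documents', []))} documents"}

builder = StateGraph(AppState)
builder.add_node("analyze_intent", analyze_intent)
builder.add_node("retrieve_docs", retrieve_docs)
builder.add_node("generate_response", generate_response)

# Sequential edges only: no parallelism, no conditional branching.
builder.add_edge(START, "analyze_intent")
builder.add_edge("analyze_intent", "retrieve_docs")
builder.add_edge("retrieve_docs", "generate_response")
builder.add_edge("generate_response", END)

graph = builder.compile()
```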
### Node Details
#### analyze_intent
- **Role**: Classify user input intent
- **LLM**: Claude 3.5 Sonnet
- **Average Execution Time**: 0.5s
#### retrieve_docs
- **Role**: Search related documents
- **Processing**: Vector DB query + reranking
- **Average Execution Time**: 1.5s
#### generate_response
- **Role**: Generate final response
- **LLM**: Claude 3.5 Sonnet
- **Average Execution Time**: 1.5s
## Issues
### 1. Latency Bottleneck from Sequential Processing
- **Issue**: analyze_intent and retrieve_docs are sequential
- **Impact**: Total 2.0s delay (57% of total)
- **Improvement Potential**: -0.8s or more reduction possible through parallelization
### 2. All Requests Follow Same Flow
- **Issue**: Simple and complex questions go through same processing
- **Impact**: Unnecessary retrieve_docs execution (wasted Cost and Latency)
- **Improvement Potential**: -50% reduction possible for simple cases through routing
### 3. Use of Low-Relevance Documents
- **Issue**: retrieve_docs returns only top-k (no reranking)
- **Impact**: Low Accuracy (75%)
- **Improvement Potential**: +10-15% improvement possible through multi-stage RAG
## Applicable Architecture Patterns
1. **Parallelization** - Parallelize analyze_intent and retrieve_docs
2. **Routing** - Branch processing flow based on intent
3. **Subgraph** - Dedicated subgraph for RAG processing (retrieve → rerank → select)
4. **Orchestrator-Worker** - Execute multiple retrievers in parallel and integrate results
# Architecture Improvement Proposals
Proposal Date: 2024-11-24 10:30:00
## Proposal 1: Parallel Document Retrieval + Intent Analysis
### Changes
**Current**:
```
analyze_intent → retrieve_docs → generate_response
```
**After Change**:
```
START → [analyze_intent, retrieve_docs] → generate_response
              (parallel execution)
```
### Implementation Details
1. Add parallel edges to StateGraph
2. Add join node to wait for both results
3. generate_response receives both results
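
A minimal sketch of this fan-out/fan-in wiring, reusing the state schema and stub nodes from the current-architecture sketch above. Passing a list of source nodes to add_edge makes generate_response the join point, so a separate join node is one option rather than a requirement.

```python
from langgraph.graph import StateGraph, START, END

builder = StateGraph(AppState)  # AppState and node stubs as defined earlier
builder.add_node("analyze_intent", analyze_intent)
builder.add_node("retrieve_docs", retrieve_docs)
builder.add_node("generate_response", generate_response)

# Two edges out of START run both nodes in the same step (in parallel).
builder.add_edge(START, "analyze_intent")
builder.add_edge(START, "retrieve_docs")

# generate_response only runs once both parallel branches have completed.
builder.add_edge(["analyze_intent", "retrieve_docs"], "generate_response")
builder.add_edge("generate_response", END)

graph = builder.compile()
```

This works without extra plumbing because the two parallel nodes write to different state keys; if they updated the same key, that key would need a reducer to merge the concurrent updates.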
### Expected Effects
| Metric | Current | Expected | Change | Change Rate |
|--------|---------|----------|--------|-------------|
| Accuracy | 75.0% | 75.0% | ±0 | - |
| Latency | 3.5s | 2.7s | -0.8s | -23% |
| Cost | $0.020 | $0.020 | ±0 | - |
### Implementation Complexity
- **Level**: Low
- **Estimated Time**: 1-2 hours
- **Risk**: Low (no changes to existing nodes required)
### Recommendation Level
★★★★ (High) - Effective for Latency improvement with low risk
---
## Proposal 2: Intent-Based Routing
### Changes
**Current**:
```
analyze_intent → retrieve_docs → generate_response
```
**After Change**:
```
analyze_intent
 ├─ simple_intent  → simple_response (lightweight)
 └─ complex_intent → retrieve_docs → generate_response
```
### Implementation Details
1. Conditional branching based on analyze_intent output
2. Create new simple_response node (using Haiku)
3. Routing with conditional_edges
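
A sketch of this routing with add_conditional_edges, again reusing the earlier stub state and nodes. The simple_response node and the "simple"/"complex" intent labels are assumptions about how analyze_intent would be extended.

```python
from typing import Literal
from langgraph.graph import StateGraph, START, END

# Hypothetical lightweight node for simple questions (e.g. a Haiku-class model).
def simple_response(state: AppState) -> dict:
    return {"answer": "Direct answer without retrieval"}

def route_by_intent(state: AppState) -> Literal["simple_response", "retrieve_docs"]:
    # Assumes analyze_intent writes "simple" or "complex" into state["intent"].
    return "simple_response" if state.get("intent") == "simple" else "retrieve_docs"

builder = StateGraph(AppState)
builder.add_node("analyze_intent", analyze_intent)
builder.add_node("simple_response", simple_response)
builder.add_node("retrieve_docs", retrieve_docs)
builder.add_node("generate_response", generate_response)

builder.add_edge(START, "analyze_intent")
builder.add_conditional_edges("analyze_intent", route_by_intent)
builder.add_edge("retrieve_docs", "generate_response")
builder.add_edge("simple_response", END)
builder.add_edge("generate_response", END)

graph = builder.compile()
```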
### Expected Effects
| Metric | Current | Expected | Change | Change Rate |
|--------|---------|----------|--------|-------------|
| Accuracy | 75.0% | 82.0% | +7.0% | +9% |
| Latency | 3.5s | 2.8s | -0.7s | -20% |
| Cost | $0.020 | $0.014 | -$0.006 | -30% |
**Assumption**: 40% simple cases, 60% complex cases
### Implementation Complexity
- **Level**: Medium
- **Estimated Time**: 2-3 hours
- **Risk**: Medium (adding routing logic)
### Recommendation Level
★★★★★ (Highest) - Balanced improvement across all metrics
---
## Proposal 3: Multi-Stage RAG with Reranking Subgraph
### Changes
**Current**:
```
analyze_intent → retrieve_docs → generate_response
```
**After Change**:
```
analyze_intent → [RAG Subgraph] → generate_response
                       ↓
                 retrieve (k=20)
                       ↓
                 rerank (top-5)
                       ↓
                 select (best context)
```
### Implementation Details
1. Convert RAG processing to dedicated subgraph
2. Retrieve more candidates in retrieve node (k=20)
3. Evaluate relevance in rerank node (Cross-Encoder)
4. Select optimal context in select node
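
A sketch of the subgraph wiring. The compiled RAG subgraph is attached to the parent graph as a single node; it shares the question and documents keys with the parent state (AppState and the other stub nodes as defined earlier), while candidates stays internal. The rerank stub stands in for a real cross-encoder call.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class RAGState(TypedDict, total=False):
    question: str          # shared with the parent state
    candidates: list[str]  # internal to the subgraph
    documents: list[str]   # shared with the parent state

# Illustrative stubs; retrieve would query the vector DB, rerank a cross-encoder.
def retrieve(state: RAGState) -> dict:
    return {"candidates": [f"doc-{i}" for i in range(20)]}  # k=20 candidates

def rerank(state: RAGState) -> dict:
    return {"candidates": state["candidates"][:5]}          # keep top-5

def select(state: RAGState) -> dict:
    return {"documents": state["candidates"][:3]}           # best context

rag_builder = StateGraph(RAGState)
rag_builder.add_node("retrieve", retrieve)
rag_builder.add_node("rerank", rerank)
rag_builder.add_node("select", select)
rag_builder.add_edge(START, "retrieve")
rag_builder.add_edge("retrieve", "rerank")
rag_builder.add_edge("rerank", "select")
rag_builder.add_edge("select", END)
rag_subgraph = rag_builder.compile()

# Parent graph: the compiled subgraph is used as a single node.
builder = StateGraph(AppState)
builder.add_node("analyze_intent", analyze_intent)
builder.add_node("rag", rag_subgraph)
builder.add_node("generate_response", generate_response)
builder.add_edge(START, "analyze_intent")
builder.add_edge("analyze_intent", "rag")
builder.add_edge("rag", "generate_response")
builder.add_edge("generate_response", END)
graph = builder.compile()
```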
### Expected Effects
| Metric | Current | Expected | Change | Change Rate |
|--------|---------|----------|--------|-------------|
| Accuracy | 75.0% | 88.0% | +13.0% | +17% |
| Latency | 3.5s | 3.8s | +0.3s | +9% |
| Cost | $0.020 | $0.022 | +$0.002 | +10% |
### Implementation Complexity
- **Level**: Medium-High
- **Estimated Time**: 3-4 hours
- **Risk**: Medium (introducing new model, subgraph management)
### Recommendation Level
★★★ (Medium) - Effective when Accuracy is the priority; Latency will degrade
---
## Recommendations
**Note**: The following recommendations are for reference. The arch-tune command will **implement and evaluate all Proposals above in parallel** and select the best option based on actual results.
### First Recommendation: Proposal 2 (Intent-Based Routing)
**Reasons**:
- Balanced improvement across all metrics
- Implementation complexity is manageable at medium level
- High ROI (effect vs cost)
**Next Steps**:
1. Run parallel exploration with arch-tune command
2. Implement and evaluate Proposals 1, 2, 3 simultaneously
3. Select best option based on actual results
### Second Recommendation: Proposal 1 (Parallel Retrieval)
**Reasons**:
- Simple implementation with low risk
- Reliable Latency improvement
- Can be combined with Proposal 2
### Reference: Proposal 3 (Multi-Stage RAG)
**Reasons**:
- Effective when Accuracy is most important
- Only when Latency trade-off is acceptable
find_symbol: Search graph definitions
get_symbols_overview: Understand node structure
search_for_pattern: Search specific patterns

Analysis Only
Evaluation Environment
Serena MCP