**NEXUS** — *A nexus is the central point where all connections converge.*
When invoked: assesses the system type (RAG / agent / prompt / evaluation), loads the relevant pattern file, and applies AI-specific engineering discipline: hallucination guards, context budgets, injection defenses, cost tracking.
Core principle: LLM applications have unique failure modes — hallucination, prompt injection, context overflow, cost explosion. Engineer systems, not just prompts.
Announce at start: "Running NEXUS for AI application patterns."
SYSTEM TYPE ASSESSMENT:
"What are you building/debugging?"
A) RAG / knowledge retrieval system
B) Autonomous agent / tool-using agent
C) Prompt engineering / LLM integration
D) LLM evaluation / benchmarking
E) Multi-agent system
F) Debugging a hallucination / quality problem
G) Cost/latency optimization
Type → Section mapping:
After identifying the type, ask: "What model are you using and what's the context window limit?"
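Where it helps, turn that answer into a hard check. Below is a minimal sketch of a context-budget guard, assuming the tiktoken library with a cl100k_base encoding; the window size and output reserve are illustrative defaults, not values the skill prescribes.

```python
import tiktoken

def context_budget_remaining(system_prompt: str, retrieved: str, question: str,
                             window: int = 128_000, output_reserve: int = 4_000) -> int:
    """Tokens left for additional input after reserving room for the model's output."""
    enc = tiktoken.get_encoding("cl100k_base")  # assumption: pick the encoding for your model
    used = sum(len(enc.encode(part)) for part in (system_prompt, retrieved, question))
    remaining = window - output_reserve - used
    if remaining < 0:
        raise ValueError(f"Context overflow: {used} input tokens exceed the "
                         f"{window - output_reserve}-token input budget")
    return remaining
```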
RAG / KNOWLEDGE RETRIEVAL:
Load patterns: patterns/rag-architecture.md
Key decisions in order:
Rule: Test retrieval quality (precision/recall) before testing generation quality.
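A minimal sketch of those retrieval metrics over document IDs; the function name and signature are illustrative, not part of the pattern file.

```python
def retrieval_metrics(retrieved: list[str], relevant: set[str], k: int = 5) -> tuple[float, float]:
    """Precision@k and recall@k for one query, given gold relevant doc IDs."""
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / k                                # fraction of top-k that is relevant
    recall = hits / len(relevant) if relevant else 0.0  # fraction of relevant docs retrieved
    return precision, recall

# Example: 1 of the top 5 is relevant, and 1 of the 2 relevant docs was found.
p, r = retrieval_metrics(["d3", "d7", "d1", "d9", "d2"], {"d1", "d4"})
assert (p, r) == (0.2, 0.5)
```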
AGENTS:
Load patterns: patterns/agent-patterns.md
| Pattern | Best For | Iteration Limit |
|---|---|---|
| ReAct | Factual QA, tool use | 10 |
| Plan-Execute | Multi-step tasks | 5 plans |
| Reflection | Quality-critical output | 3 cycles |
| Multi-Agent Debate | High-stakes decisions | 3 rounds |
| Tool Routing | Multiple specialized tools | N/A |
Always set max iteration limits. Agents without limits will loop indefinitely on failure.
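A minimal sketch of an iteration-capped loop in the ReAct style; the `llm` callable and its action schema are hypothetical stand-ins for whatever framework is in use.

```python
def run_agent(task: str, llm, tools: dict, max_iterations: int = 10):
    """ReAct-style loop that fails loudly instead of looping forever."""
    history = [f"Task: {task}"]
    for _ in range(max_iterations):
        action = llm("\n".join(history))  # hypothetical: returns {"type", "tool", "input", "content"}
        if action["type"] == "final_answer":
            return action["content"]
        observation = tools[action["tool"]](action["input"])
        history.append(f"Action: {action['tool']}({action['input']})")
        history.append(f"Observation: {observation}")
    raise RuntimeError(f"Agent hit the {max_iterations}-iteration limit without finishing")
```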
PROMPT ENGINEERING:
Load patterns: patterns/prompt-engineering.md
Process:
Rule: Never deploy a prompt tested on fewer than 10 diverse examples.
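One way to enforce that rule mechanically, sketched under assumptions: each example carries its own pass/fail predicate, and `llm` is whatever completion callable the project uses.

```python
def evaluate_prompt(template: str, examples: list[dict], llm, min_examples: int = 10):
    """Refuse to score a prompt until the example set is large enough, then report failures."""
    if len(examples) < min_examples:
        raise ValueError(f"Need at least {min_examples} diverse examples, got {len(examples)}")
    failures = [ex["name"] for ex in examples
                if not ex["check"](llm(template.format(**ex["inputs"])))]
    return 1 - len(failures) / len(examples), failures  # (pass rate, failing example names)
```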
LLM EVALUATION:
Load patterns: patterns/llm-evaluation.md
| Metric type | Method | Use when |
|---|---|---|
| Exact match | string equality | Factual QA, code gen |
| F1 score | token overlap | Extractive QA |
| Semantic similarity | cosine >0.8 | Open-ended QA |
| Rubric-based | LLM grades | Complex tasks |
| Hallucination | fact verification + self-consistency | High-stakes output |
| Human preference | blind A/B, win rate >0.55 | Model comparison |
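For the first two rows, a minimal sketch: exact match with simple normalization, and the SQuAD-style multiset token-overlap F1. Real harnesses normalize more aggressively (punctuation, articles); lowercasing and whitespace splitting here are simplifying assumptions.

```python
from collections import Counter

def exact_match(pred: str, gold: str) -> float:
    return float(pred.strip().lower() == gold.strip().lower())

def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between prediction and gold answer (multiset overlap)."""
    pred_toks, gold_toks = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred_toks) & Counter(gold_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```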
Cost and latency budgets: p99 latency <5s, cost per 1k requests <$10.
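A minimal sketch of checking both budgets from logged per-request data, standard library only; the thresholds mirror the numbers above and everything else is illustrative.

```python
import statistics

def check_budgets(latencies_s: list[float], costs_usd: list[float],
                  p99_limit: float = 5.0, per_1k_limit: float = 10.0) -> tuple[float, float]:
    """Raise if p99 latency or cost per 1k requests breaks budget; needs >= 2 samples."""
    p99 = statistics.quantiles(latencies_s, n=100)[98]    # 99th-percentile cut point
    cost_per_1k = 1000 * sum(costs_usd) / len(costs_usd)  # mean cost scaled to 1k requests
    if p99 >= p99_limit:
        raise RuntimeError(f"p99 latency {p99:.2f}s exceeds {p99_limit}s budget")
    if cost_per_1k >= per_1k_limit:
        raise RuntimeError(f"${cost_per_1k:.2f} per 1k requests exceeds ${per_1k_limit} budget")
    return p99, cost_per_1k
```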
Never:
Always:
SKILL INTEGRATION:
| Skill | Integration |
|---|---|
forge | Write eval tests before model/prompt changes |
hunter | Debug hallucination, retrieval failures |
sentinel | Verify eval metrics before claiming success |
chronicle | Store prompt patterns that worked |
vector | Route queries to appropriate model tier |