Optimize prompts for LLMs through systematic design, evaluation, and iterative refinement. Use when doing prompt engineering, prompt design, prompt optimization, system prompt creation, LLM optimization, or improving prompt templates. NOT for building MCP servers (use mcp-server) or creating skills (use skill-creator).
npx claudepluginhub viktorbezdek/skillstack --plugin prompt-engineering

This skill uses the workspace's default tool permissions.
Transform vague AI instructions into precision-engineered prompts that reliably produce high-quality outputs. This skill combines proven techniques, systematic evaluation, and iterative refinement to create prompts for any LLM platform.
Assess the request and pick the right mode:
User request arrives
│
├─ "Make this prompt better" / has existing prompt
│ └─ OPTIMIZE MODE: Analyze → Apply techniques → Deliver improved prompt
│
├─ "Create a prompt for X" / needs new prompt from scratch
│ ├─ Simple task → AUTO DESIGN: Apply techniques → Deliver
│ └─ Complex task → INTERACTIVE DESIGN: Ask 2-3 questions → Design → Test
│
├─ "Evaluate/test this prompt" / quality assessment
│ └─ EVALUATE MODE: Score → Identify weaknesses → Suggest improvements
│
└─ "Help me understand prompt engineering" / learning
└─ EDUCATE: Teach relevant techniques with examples
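The decision tree above can be sketched as a small router. This is a toy keyword heuristic for illustration only; the function name and trigger phrases are assumptions, not part of the skill's actual implementation.

```python
def pick_mode(request: str, has_existing_prompt: bool = False,
              is_complex: bool = False) -> str:
    """Map a user request to one of the four modes described above."""
    text = request.lower()
    if "evaluate" in text or "test this prompt" in text:
        return "EVALUATE"
    if "understand" in text or "teach" in text or "learn" in text:
        return "EDUCATE"
    if has_existing_prompt or "make this prompt better" in text:
        return "OPTIMIZE"
    # New prompt from scratch: branch on task complexity
    return "INTERACTIVE DESIGN" if is_complex else "AUTO DESIGN"
```

A real implementation would let the LLM classify intent rather than match keywords, but the branching logic is the same.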
Apply this framework for every prompt optimization task.
Read the prompt (or request) and answer:
Score the prompt against these dimensions:
Select techniques based on what the diagnosis reveals. Here are the core techniques
(see references/TECHNIQUES.md for the full catalog with detailed examples):
Role Assignment — Give the LLM a specific expert identity with credentials and methodology. Use when domain expertise matters. The more specific the role, the better the output quality.
Context Layering — Provide essential background in a structured format: Background → Goal → Constraints → Output Format. Remove anything the LLM doesn't need.
Chain-of-Thought — Ask the LLM to reason step by step. Critical for complex analytical, mathematical, or multi-factor reasoning tasks. Without this, LLMs often skip to conclusions.
Few-Shot Examples — Show 2-3 input→output pairs that demonstrate the pattern you want. This is the single most powerful technique for controlling output format and style.
Task Decomposition — Break complex tasks into sequential stages where each stage feeds the next. Prevents the LLM from trying to do everything at once and dropping quality.
Constraints & Guardrails — Define what NOT to do, set length limits, specify format requirements. LLMs perform better with clear boundaries than with open-ended freedom.
Output Specification — Define the exact structure, format, and content requirements of the output. Be explicit: headers, sections, length, style, tone.
When delivering:
(see references/PLATFORMS.md)

The most common pattern. Works for 80% of prompt optimization needs.
You are a [SPECIFIC EXPERT] with expertise in [DOMAIN].
Context:
[ESSENTIAL BACKGROUND — 2-4 lines max]
Task:
[CLEAR, SPECIFIC OBJECTIVE]
Requirements:
- [CONSTRAINT 1]
- [CONSTRAINT 2]
Output format:
[EXACT STRUCTURE EXPECTED]
Use for tasks requiring consistent format AND complex reasoning.
[ROLE AND CONTEXT]
Here are examples of the expected analysis:
Example 1:
Input: [SAMPLE]
Reasoning: [STEP-BY-STEP THOUGHT PROCESS]
Output: [RESULT]
Example 2:
Input: [SAMPLE]
Reasoning: [STEP-BY-STEP THOUGHT PROCESS]
Output: [RESULT]
Now analyze the following. Think through your reasoning step by step before
providing your final output.
Input: [ACTUAL TASK]
Use for complex tasks that benefit from decomposition.
Complete this analysis in three stages:
Stage 1 — Research:
[GATHER AND ORGANIZE INFORMATION]
Present findings as: [FORMAT]
Stage 2 — Analysis:
Using the research from Stage 1, [ANALYZE SPECIFIC ASPECTS]
Present analysis as: [FORMAT]
Stage 3 — Synthesis:
Based on your analysis, [PRODUCE FINAL DELIVERABLE]
Format: [FINAL OUTPUT SPECIFICATION]
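The staged template above chains calls so each stage's output becomes the next stage's input. A minimal sketch; `call_llm` is a placeholder, not a real client API:

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError  # swap in your actual LLM client here

def staged_analysis(topic: str, llm=call_llm) -> str:
    """Run the three-stage pattern: Research -> Analysis -> Synthesis."""
    research = llm(f"Stage 1 - Research: gather key facts about {topic}. "
                   "Present findings as a bulleted list.")
    analysis = llm("Stage 2 - Analysis: using this research:\n"
                   f"{research}\nIdentify the three most important patterns.")
    return llm("Stage 3 - Synthesis: based on this analysis:\n"
               f"{analysis}\nWrite a one-paragraph executive summary.")
```

Passing `llm` as a parameter also makes each stage testable in isolation with a stub.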
When the user's prompt references specific company data, projects, documents, or internal information, enrich the prompt with real context before optimizing.
When to enrich:
How to enrich: Use available tools (web search, connected integrations like Google Drive, Asana, Jira, Confluence, Slack, etc.) to pull relevant context. Synthesize the key facts — goals, metrics, stakeholders, constraints, timelines — and inject them into the prompt's context section.
The goal is to transform a generic prompt into one grounded in the user's actual situation. Don't dump raw data — distill what the LLM actually needs to produce a useful output.
When asked to evaluate a prompt (or when testing an optimized prompt), assess across
these dimensions. See references/EVALUATION.md for the full methodology.
| Dimension | What to Check |
|---|---|
| Clarity | Could this be misinterpreted? Vague terms? Ambiguity? |
| Specificity | Are outputs constrained enough? Format defined? |
| Completeness | Role + Context + Task + Format + Examples present? |
| Efficiency | Token-efficient? No redundancy? Every line earns its place? |
| Robustness | Will it work across input variations? Edge cases handled? |
For systematic testing, use the LLM itself as an evaluator:
Evaluate the following response against these criteria.
Score each 1-5 with brief justification.
Criteria:
1. Accuracy — factual correctness
2. Relevance — addresses the actual question
3. Completeness — covers all aspects
4. Clarity — well-organized and readable
Response to evaluate:
[PASTE RESPONSE]
For each criterion: Score (1-5) | Evidence | Improvement suggestion
Overall: __/20
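The rubric above can be driven programmatically: build the judge prompt, then parse the per-criterion scores out of the reply. A sketch; the regex assumes the judge follows the requested "Score (N)" layout, which the prompt should enforce explicitly:

```python
import re

CRITERIA = ["Accuracy", "Relevance", "Completeness", "Clarity"]

def judge_prompt(response: str) -> str:
    """Build the LLM-as-judge evaluation prompt shown above."""
    lines = ["Evaluate the following response against these criteria.",
             "Score each 1-5 with brief justification.", "", "Criteria:"]
    lines += [f"{i}. {c}" for i, c in enumerate(CRITERIA, 1)]
    lines += ["", "Response to evaluate:", response, "",
              "For each criterion: Score (1-5) | Evidence | Improvement suggestion",
              "Overall: __/20"]
    return "\n".join(lines)

def parse_scores(judge_reply: str) -> int:
    """Sum all 'Score (N)' values found in the judge's reply."""
    return sum(int(s) for s in re.findall(r"Score \((\d)\)", judge_reply))
```

In practice, run the judge 2-3 times and average, since single-pass scores are noisy.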
When comparing two prompt versions:
Different LLMs respond to different prompting styles. Key differences:
See references/PLATFORMS.md for detailed platform optimization guides.
Reusable prompt templates for common use cases are in references/TEMPLATES.md:
When optimizing, watch for and fix these common problems:
- Vague instructions → Add specificity: who, what, how, format, length, audience
- No examples → Add 2-3 few-shot examples showing the desired output pattern
- Buried intent → Move the actual task to the top; context supports, doesn't obscure
- Kitchen sink → Remove requirements that don't serve the core goal
- No output spec → Define exact format, structure, and length expectations
- Assumed knowledge → Add necessary context the LLM wouldn't have
- Contradictions → Resolve conflicting requirements; flag trade-offs
- Over-engineering → Simple tasks need simple prompts; don't add complexity for its own sake
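Some of these anti-patterns can be caught automatically. A toy linter for three of them; the checks are crude string heuristics meant only to illustrate turning the checklist into an automated pass, and real detection would need an LLM or richer parsing:

```python
def lint_prompt(prompt: str) -> list[str]:
    """Flag a few anti-patterns from the checklist above."""
    issues = []
    text = prompt.lower()
    if "output format" not in text and "format:" not in text:
        issues.append("No output spec: define exact format and length")
    if "example" not in text:
        issues.append("No examples: add 2-3 few-shot input/output pairs")
    if len(prompt.split()) < 15:
        issues.append("Vague instructions: specify who, what, how, audience")
    return issues
```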
Load these as needed for deeper guidance:
| File | When to Read |
|---|---|
| references/TECHNIQUES.md | Full technique catalog with detailed examples |
| references/EVALUATION.md | Comprehensive evaluation methodologies and rubrics |
| references/TEMPLATES.md | Reusable prompt patterns for common use cases |
| references/PLATFORMS.md | Platform-specific optimization (Claude, GPT, Gemini) |