Provides universal prompt engineering techniques like XML structuring, output constraints, scope controls, and ambiguity handling to craft, optimize, or review LLM prompts.
`npx claudepluginhub codealive-ai/ai-driven-development --plugin ai-driven-development`

This skill uses the workspace's default tool permissions.
Files:
- README.md
- references/claude-family-prompting.md
- references/evaluation-redteaming.md
- references/failure-taxonomy.md
- references/gemini3-family-prompting.md
- references/gpt5-family-prompting.md
- references/gpt5-prompting-guide.md
- references/mistakes-context.md
- references/mistakes-debt.md
- references/mistakes-hallucinations.md
- references/mistakes-security.md
- references/mistakes-structure.md
- references/prompt-audit-checklist.md
- references/prompting-introduction.md
- references/prompting-risks.md
- references/prompting-techniques.md

Guides crafting effective LLM prompts using techniques like chain-of-thought, XML tags, role prompting, multishot examples, and self-verification. Use for improving output quality or debugging responses.
Crafts advanced LLM prompts with chain-of-thought, constitutional AI, meta-prompting, and optimization techniques. Use for AI features, agent performance, system prompts.
Optimizes prompts for production AI features with analysis, 6-step framework, failure detection, and research-backed techniques. Use for prompt review, system prompts, or improvement suggestions.
Universal techniques for crafting effective prompts across any LLM.
Use XML tags to create clear, parseable prompts:
<context>Background information here</context>
<instructions>
1. First step
2. Second step
</instructions>
<examples>Sample inputs/outputs</examples>
<output_format>Expected structure</output_format>
Benefits:
- Clarity: cleanly separates context, instructions, examples, and data
- Accuracy: reduces the chance of the model misreading one section as another
- Parseability: tagged output (e.g., <answer>) is easy to extract in post-processing
Best practices:
- Be consistent: use the same tag names throughout (e.g., always <instructions>, not sometimes <steps>)
- Refer to tags by name in the instructions ("Using the contract in <context> tags...")
- Nest tags for hierarchical content: <examples><example id="1">...</example></examples>
- Combine with other techniques: <thinking> for chain-of-thought, <answer> for final output

Specify explicit constraints on length, format, and structure:
<output_spec>
- Default: 3-6 sentences or ≤5 bullets
- Simple yes/no questions: ≤2 sentences
- Complex multi-step tasks:
  - 1 short overview paragraph
  - ≤5 bullets: What changed, Where, Risks, Next steps, Open questions
- Use Markdown with headers, bullets, tables when helpful
- Avoid long narrative paragraphs; prefer compact structure
</output_spec>
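Constraints like these can also be checked after generation and fed back to the model when violated. A minimal sketch, assuming the reply is plain Markdown and using the default limits above; the sentence count is a rough heuristic, not a precise parse:

```python
import re

def check_output_spec(reply: str, max_bullets: int = 5, max_sentences: int = 6) -> list[str]:
    """Flag violations of the default output_spec limits (illustrative, not exhaustive)."""
    problems = []
    lines = reply.splitlines()
    bullets = [l for l in lines if l.lstrip().startswith(("-", "*"))]
    if len(bullets) > max_bullets:
        problems.append(f"{len(bullets)} bullets (limit {max_bullets})")
    # Rough sentence count over non-bullet prose only
    prose = " ".join(l for l in lines if not l.lstrip().startswith(("-", "*")))
    sentences = [s for s in re.split(r"[.!?]+\s+", prose) if s.strip()]
    if len(sentences) > max_sentences:
        problems.append(f"{len(sentences)} sentences (limit {max_sentences})")
    return problems
```

A common pattern is to re-prompt once with any violations appended as feedback rather than silently truncating the reply.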
Explicitly constrain what the model should NOT do:
<constraints>
- Implement EXACTLY and ONLY what is requested
- No extra features, components, or embellishments
- If ambiguous, choose the simplest valid interpretation
- Do NOT invent values, make assumptions, or add unrequested elements
</constraints>
Prevent hallucinations and overconfidence:
<uncertainty_handling>
- If the question is ambiguous:
  - Ask 1-3 precise clarifying questions, OR
  - Present 2-3 plausible interpretations with labeled assumptions
- When facts may have changed: answer in general terms, state uncertainty
- Never fabricate exact figures or references when uncertain
- Prefer "Based on the provided context..." over absolute claims
</uncertainty_handling>
For inputs >10k tokens, add re-grounding instructions:
<long_context_handling>
- First, produce a short internal outline of key sections relevant to the request
- Re-state user constraints explicitly before answering
- Anchor claims to sections ("In the 'Data Retention' section...")
- Quote or paraphrase fine details (dates, thresholds, clauses)
</long_context_handling>
For agents that call tools, set explicit usage rules:
<tool_usage>
- Prefer tools over internal knowledge for:
  - Fresh or user-specific data (tickets, orders, configs)
  - Specific IDs, URLs, or document references
- Parallelize independent reads when possible
- After write operations, restate: what changed, where, any validation performed
</tool_usage>
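The "parallelize independent reads" rule maps naturally onto concurrent execution on the orchestration side. A minimal sketch using asyncio; `fetch_ticket` and `fetch_config` are hypothetical read-only tool wrappers, not part of any particular SDK:

```python
import asyncio

async def fetch_ticket(ticket_id: str) -> dict:
    # Hypothetical read-only tool wrapper; replace with your actual tool call.
    return {"id": ticket_id}

async def fetch_config(service: str) -> dict:
    # Hypothetical read-only tool wrapper; replace with your actual tool call.
    return {"service": service}

async def gather_context(ticket_id: str, service: str) -> dict:
    # Independent reads run concurrently; write operations should still be sequenced explicitly.
    ticket, config = await asyncio.gather(fetch_ticket(ticket_id), fetch_config(service))
    return {"ticket": ticket, "config": config}
```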
For long-running agent tasks, constrain progress updates:
<user_updates>
- Send brief updates (1-2 sentences) only when:
  - Starting a new major phase
  - Discovering something that changes the plan
- Avoid narrating routine operations
- Each update must include a concrete outcome ("Found X", "Updated Y")
- Do not expand scope beyond what was asked
</user_updates>
<self_check>
Before finalizing answers in sensitive contexts (legal, financial, safety):
- Re-scan for unstated assumptions
- Check for ungrounded numbers or claims
- Soften overly strong language ("always", "guaranteed")
- Explicitly state assumptions
</self_check>
For data extraction tasks, always provide a schema:
<extraction_spec>
Extract data into this exact schema (no extra fields):
{
  "field_name": "string",
  "optional_field": "string | null",
  "numeric_field": "number | null"
}
- If a field is not present in source, set to null (don't guess)
- Re-scan source for missed fields before returning
</extraction_spec>
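Pairing the schema with a post-generation check catches extra fields and guessed values before they propagate. A minimal sketch, assuming the model returns a JSON string matching the schema above:

```python
import json

# Mirrors the extraction_spec above: exact field set, null allowed where the spec says so.
SCHEMA = {
    "field_name": (str,),
    "optional_field": (str, type(None)),
    "numeric_field": (int, float, type(None)),
}

def validate_extraction(raw: str) -> dict:
    """Parse model output and reject extra fields, missing fields, or wrong types."""
    data = json.loads(raw)
    extra, missing = set(data) - set(SCHEMA), set(SCHEMA) - set(data)
    if extra or missing:
        raise ValueError(f"extra fields: {sorted(extra)}, missing fields: {sorted(missing)}")
    for key, allowed in SCHEMA.items():
        if not isinstance(data[key], allowed):
            raise ValueError(f"{key}: unexpected type {type(data[key]).__name__}")
    return data
```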
For agents that can browse the web:
<research_guidelines>
- Browse the web for: time-sensitive topics, recommendations, navigational queries, ambiguous terms
- Include citations after paragraphs with web-derived claims
- Use multiple sources for key claims; prioritize primary sources
- Research until additional searching won't materially change the answer
- Structure output with Markdown: headers, bullets, tables for comparisons
</research_guidelines>
Without structure:
You're a financial analyst. Generate a Q2 report for investors. Include Revenue, Margins, Cash Flow. Use this data: {{DATA}}. Make it professional and concise.
With structure:
You're a financial analyst at AcmeCorp generating a Q2 report for investors.
<context>
AcmeCorp is a B2B SaaS company. Investors value transparency and actionable insights.
</context>
<data>
{{DATA}}
</data>
<instructions>
1. Include sections: Revenue Growth, Profit Margins, Cash Flow
2. Highlight strengths and areas for improvement
3. Use concise, professional tone
</instructions>
<output_format>
- Use bullet points with metrics and YoY changes
- Include "Action:" items for areas needing improvement
- End with 2-3 bullet Outlook section
</output_format>
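One benefit of tag-delimited sections is that the prompt can be assembled from independent parts at run time. A minimal sketch of building the structured version above; the section texts are placeholders taken from the example, and `quarterly_data` stands in for whatever feeds {{DATA}}:

```python
def build_report_prompt(quarterly_data: str) -> str:
    """Assemble the tag-delimited Q2 report prompt from independent sections."""
    sections = {
        "context": (
            "AcmeCorp is a B2B SaaS company. Investors value transparency "
            "and actionable insights."
        ),
        "data": quarterly_data,  # stands in for the {{DATA}} placeholder
        "instructions": (
            "1. Include sections: Revenue Growth, Profit Margins, Cash Flow\n"
            "2. Highlight strengths and areas for improvement\n"
            "3. Use concise, professional tone"
        ),
        "output_format": (
            "- Use bullet points with metrics and YoY changes\n"
            '- Include "Action:" items for areas needing improvement\n'
            "- End with 2-3 bullet Outlook section"
        ),
    }
    body = "\n".join(f"<{tag}>\n{text}\n</{tag}>" for tag, text in sections.items())
    return "You're a financial analyst at AcmeCorp generating a Q2 report for investors.\n\n" + body
```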
When adapting prompts across models or versions, start from these universal patterns, then layer in the model-specific guidance below:
| Technique | Tag Pattern | Use Case |
|---|---|---|
| Separate sections | <context>, <instructions>, <data> | Any complex prompt |
| Control length | <output_spec> with word/bullet limits | Prevent verbosity |
| Prevent drift | <constraints> with explicit "do NOT" | Feature creep |
| Handle uncertainty | <uncertainty_handling> | Factual queries |
| Chain of thought | <thinking>, <answer> | Reasoning tasks |
| Extraction | <extraction_spec> with JSON schema | Data parsing |
| Research | <research_guidelines> | Web-enabled agents |
| Self-check | <self_check> | High-risk domains |
| Tool usage | <tool_usage> | Agentic systems |
| Eagerness control | <persistence>, <context_gathering> | Agent autonomy |
| Persona | <role> + behavioral constraints | Tone & style |
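When a prompt asks for <thinking> plus <answer>, downstream code should consume only the answer tag. A minimal sketch; it assumes the model reliably closes the tag, which is worth verifying in practice:

```python
import re

def extract_answer(completion: str) -> str | None:
    """Return the contents of the first <answer>...</answer> block, or None if absent."""
    match = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return match.group(1).strip() if match else None
```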
Comprehensive catalog of prompting techniques. Full details, examples, and academic references in references/prompting-techniques.md.
| Technique | Use Case |
|---|---|
| Zero-Shot Prompting | Direct task execution without examples; classification, translation, summarization |
| Few-Shot Prompting | In-context learning via exemplars; format control, label calibration, style matching |
| Chain-of-Thought (CoT) | Step-by-step reasoning; arithmetic, logic, commonsense reasoning tasks |
| Meta Prompting | LLM as orchestrator delegating to specialized expert prompts; complex multi-domain tasks |
| Self-Consistency | Sample multiple CoT paths, pick majority answer; boost accuracy on math & reasoning |
| Generated Knowledge | Generate relevant knowledge first, then answer; commonsense & factual QA |
| Prompt Chaining | Break complex tasks into sequential subtasks; document analysis, multi-step workflows |
| Tree of Thoughts (ToT) | Explore multiple reasoning branches with lookahead/backtracking; planning, puzzles |
| RAG | Retrieve external documents before generating; knowledge-intensive tasks, fresh data |
| ART (Automatic Reasoning and Tool-use) | Auto-select and orchestrate tools with CoT; tasks requiring calculation, search, APIs |
| APE (Automatic Prompt Engineer) | LLM generates and scores candidate prompts; prompt optimization at scale |
| Active-Prompt | Identify uncertain examples, annotate selectively for CoT; adaptive few-shot |
| Directional Stimulus | Add a hint/keyword to guide generation direction; summarization, dialogue |
| PAL (Program-Aided LM) | Generate code instead of text for reasoning; math, data manipulation, symbolic tasks |
| ReAct | Interleave reasoning traces with tool actions; search, QA, decision-making agents |
| Reflexion | Agent self-reflects on failures with verbal feedback; iterative improvement, debugging |
| Multimodal CoT | Two-stage: rationale generation then answer with text+image; visual reasoning tasks |
| Graph Prompting | Structured graph-based prompts; node classification, relation extraction, graph tasks |
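Of these, self-consistency is especially simple to wire up around any completion API: sample several chain-of-thought completions at non-zero temperature and keep the majority answer. A minimal sketch; `generate` is a placeholder for your model call, and answers are assumed to be extractable as short strings (e.g., via the <answer> tag pattern above):

```python
from collections import Counter
from typing import Callable, Optional

def self_consistency(prompt: str,
                     generate: Callable[[str], str],
                     extract_answer: Callable[[str], Optional[str]],
                     n_samples: int = 5) -> str:
    """Sample several reasoning paths and keep the most common final answer."""
    answers = []
    for _ in range(n_samples):
        completion = generate(prompt)        # placeholder for your model call; sample with temperature > 0
        answer = extract_answer(completion)  # e.g., pull the <answer> tag contents as shown earlier
        if answer:
            answers.append(answer.strip())
    if not answers:
        raise ValueError("no parsable answers returned")
    return Counter(answers).most_common(1)[0][0]
```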
LLM settings, prompt elements, formatting, and practical examples — see references/prompting-introduction.md.
Adversarial attacks, factuality issues, and bias mitigation — see references/prompting-risks.md.
When asked to audit, review, or improve a prompt, follow this workflow. Full checklist with per-check references: prompt-audit-checklist.md.
| # | Dimension | What to Check |
|---|---|---|
| 1 | Clarity & Specificity | Task definition, success criteria, audience, output format, conflicting constraints |
| 2 | Structure & Formatting | Section separation (XML tags), prompt smells (monolithic, mixed layers, negative bias) |
| 3 | Safety & Security | Control/data separation, secrets in prompt, injection resilience, tool permissions |
| 4 | Hallucination & Factuality | Role framing, grounding, citation-without-sources, uncertainty handling |
| 5 | Context Management | Info placement (not buried in middle), context size, RAG doc count, re-grounding |
| 6 | Maintainability & Debt | Hardcoded values, regenerated logic, model pinning, testability |
| 7 | Model-Specific Fit | Model-specific params and gotchas (see Model-Specific Guides below) |
| 8 | Evaluation Readiness | Eval criteria, adversarial test cases, schema enforcement, monitoring |
Three complementary layers — use the one matching your need:
Deep-dives by category — root causes, mechanisms, prevention checklists (from "The Architecture of Instruction", 2026):
| Mistake Category | Key Issues | Reference |
|---|---|---|
| Hallucinations & Logic | Ambiguity-induced confabulation, automation bias, overloaded prompts, logical failures in verification tasks, no role framing | mistakes-hallucinations.md |
| Structural Fragility | Formatting sensitivity (up to 76pp variance), reproducibility crisis, prompt smells catalog (6 anti-patterns), deliberation ladder | mistakes-structure.md |
| Context Rot | "Lost in the middle" U-shaped attention, RAG over-retrieval, naive data loading, context engineering shift | mistakes-context.md |
| Prompt Debt | Token tax of regenerative code, debt taxonomy (prompt/hyperparameter/framework/cost), multi-agent solutions, automated repair | mistakes-debt.md |
| Security | Direct/indirect injection, jailbreaking, system prompt leakage (OWASP LLM07:2025), RAG poisoning, multimodal injection, adversarial suffixes | mistakes-security.md |
Quick reference — 18-category taxonomy with MRPs, risk scores, case studies, action items: failure-taxonomy.md. Start here for an overview or to prioritize which categories to address first. Covers: control-plane vs data-plane model, heuristic risk scoring, real-world incidents (EchoLeak CVE-2025-32711, Mata v. Avianca, Samsung shadow AI).
How to measure & test — eval metrics, CI gating, red-teaming, tooling: evaluation-redteaming.md. Covers: TruthfulQA, FActScore, SelfCheckGPT, PromptBench, AILuminate, LLM-as-judge pitfalls, guardrail libraries, open research questions.
Each model family has unique parameters, gotchas, and patterns. Consult the reference for your target model:
- claude-family-prompting.md: effort with new xhigh on 4.7, task_budget agentic-loop ceiling, legacy thinking.budget_tokens 400-error on 4.7, new tokenizer (~1.35× text, ~3× images), tool under-triggering on 4.7 (vs 4.6 over-triggering), more literal instruction-following, server-side compaction beta, Managed Agents memory beta, Cyber Verification gate, prefill deprecation, Structured Outputs, prompt caching, citations, context engineering, vision crop tool, migration paths 4.5 → 4.6 → 4.7
- gpt5-family-prompting.md: reasoning_effort (last-mile knob in 5.4/5.5), text.verbosity, named tools (apply_patch), agentic eagerness templates, completeness/verification contracts, compaction API, phase field, outcome-first prompts, personality vs collaboration style, retrieval budgets, mini/nano guidance, migration paths
- gemini3-family-prompting.md: thinking_budget vs thinking_level, constraint placement (end of prompt), persona priority, function calling, structured output, multimodal, image generation