Master prompt design, optimization techniques, and effective LLM interaction patterns
Designs effective LLM prompts using advanced techniques like few-shot and chain-of-thought prompting.
```
/plugin marketplace add pluginagentmarketplace/custom-plugin-ai-engineer
/plugin install pluginagentmarketplace-ai-engineer-plugin@pluginagentmarketplace/custom-plugin-ai-engineer
```

Master the art and science of designing effective prompts for LLMs to achieve optimal results.
```yaml
input:
  type: object
  required: [task_description]
  properties:
    task_description:
      type: string
      description: What the prompt should accomplish
    target_model:
      type: string
      enum: [gpt-4, gpt-3.5-turbo, claude-3, llama, mistral]
      default: gpt-4
    examples:
      type: array
      items:
        type: object
        properties:
          input: string
          output: string
    constraints:
      type: object
      properties:
        max_tokens: integer
        output_format: string
        tone: string

output:
  type: object
  properties:
    system_prompt:
      type: string
      description: System message for the model
    user_template:
      type: string
      description: Template with placeholders
    examples:
      type: array
      description: Few-shot examples if applicable
    token_estimate:
      type: integer
    optimization_notes:
      type: string
```
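For illustration only, here is a request and response shaped like the schema above (all values are invented):

```python
# Illustrative only: an input conforming to the schema and the shape of the output.
request = {
    "task_description": "Summarize customer support tickets",
    "target_model": "gpt-4",
    "constraints": {"max_tokens": 300, "output_format": "json", "tone": "neutral"},
}

response = {
    "system_prompt": "You are a support analyst. Summarize tickets factually and concisely.",
    "user_template": "Summarize the following ticket in JSON:\n\n{ticket_text}",
    "examples": [],
    "token_estimate": 180,
    "optimization_notes": "Few-shot examples omitted; the task is simple enough for zero-shot.",
}
```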
```yaml
error_patterns:
  - error: "Output format not followed"
    cause: Weak formatting instructions
    solution: |
      1. Add explicit format examples
      2. Use XML/JSON structure
      3. Add "Return ONLY valid JSON" constraint
      4. Use structured output mode
    fallback: Post-process with regex
  - error: "Hallucinated information"
    cause: Prompt lacks grounding
    solution: |
      1. Add context/source material
      2. Include "Only use provided information"
      3. Request citations
      4. Lower temperature
    fallback: Add verification step
  - error: "Response too verbose"
    cause: No length constraints
    solution: |
      1. Add word/sentence limits
      2. Request bullet points
      3. Add "Be concise" instruction
    fallback: Summarize output
  - error: "Inconsistent responses"
    cause: High temperature or vague prompt
    solution: |
      1. Lower temperature to 0.1-0.3
      2. Add more specific constraints
      3. Use self-consistency sampling
    fallback: Majority voting on N samples
```
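For the first pattern, the regex fallback might look like the sketch below; the function name is illustrative, and the strict "Return ONLY valid JSON" instruction still belongs in the prompt itself:

```python
import json
import re

def parse_json_response(raw: str):
    """Try strict JSON first; fall back to extracting the first {...} block."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fallback: pull the first JSON-looking object out of a chatty response
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise ValueError("No JSON object found in model output")
```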
```yaml
fallback_chain:
  strategy: progressive_simplification
  steps:
    - name: complex_prompt
      max_attempts: 2
      on_failure: simplify
    - name: simplified_prompt
      action: Remove advanced techniques
      max_attempts: 2
      on_failure: decompose
    - name: decomposed_tasks
      action: Break into subtasks
      max_attempts: 3
      on_failure: basic_template
    - name: basic_template
      action: Use minimal proven template
      always_succeeds: true
```
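A minimal sketch of the chain in code, assuming the caller supplies a `model` with a `generate()` method, an ordered list of prompt builders (complex first, basic template last), and a validation function; the decomposition step is omitted for brevity:

```python
def run_with_fallbacks(task, model, prompt_builders, is_valid, max_attempts_per_step=2):
    """Walk a progressive-simplification chain: try each builder in order,
    a few times each, until the output passes validation.

    prompt_builders: ordered list of (name, build_fn) pairs, e.g.
        [("complex_prompt", build_complex),
         ("simplified_prompt", build_simple),
         ("basic_template", lambda t: f"Task: {t}\nAnswer briefly.")]
    """
    response = None
    for name, build in prompt_builders:
        for _ in range(max_attempts_per_step):
            response = model.generate(build(task))
            if is_valid(response):
                return response
    # The last builder plays the role of basic_template: accept its output anyway.
    return response
```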
```yaml
optimization:
  prompt_compression:
    techniques:
      - Remove redundant words
      - Use abbreviations
      - Compress examples
      - Remove filler phrases
    examples:
      - before: "Please provide a detailed explanation of"
        after: "Explain:"
        savings: 6 tokens
      - before: "Based on the information provided above"
        after: "[context]"
        savings: 5 tokens
  caching_strategy:
    system_prompt: cache_indefinitely
    few_shot_examples: cache_per_task
    user_input: no_cache
  model_selection:
    simple_tasks: gpt-3.5-turbo   # $0.002/1K
    complex_reasoning: gpt-4      # $0.06/1K
    creative_writing: claude-3    # Variable
  batch_processing:
    enabled: true
    max_batch_size: 20
    concurrent_requests: 5
```
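As a sketch of the compression step, the substitutions from the table above can be applied mechanically and the savings measured with tiktoken (the rule table here is illustrative, not exhaustive):

```python
import tiktoken

# Illustrative substitutions mirroring the compression examples above
COMPRESSION_RULES = {
    "Please provide a detailed explanation of": "Explain:",
    "Based on the information provided above": "[context]",
}

def compress_prompt(prompt: str, model: str = "gpt-4"):
    """Apply phrase substitutions and report how many tokens were saved."""
    enc = tiktoken.encoding_for_model(model)
    before = len(enc.encode(prompt))
    for verbose, short in COMPRESSION_RULES.items():
        prompt = prompt.replace(verbose, short)
    saved = before - len(enc.encode(prompt))
    return prompt, saved
```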
```yaml
logging:
  prompt_versioning:
    enabled: true
    track_changes: true
    compare_performance: true
    metrics:
      - prompt_id
      - version
      - tokens_used
      - response_quality_score
      - format_compliance_rate
      - latency_ms
  a_b_testing:
    enabled: true
    traffic_split: 50/50
    min_samples: 100
    significance_level: 0.05
```
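A minimal sketch of how the A/B gate above might be evaluated, assuming each variant logs a binary success signal (for example, format compliance); the sample-size and significance thresholds mirror the config:

```python
from math import sqrt, erf

def ab_test_significant(successes_a, n_a, successes_b, n_b,
                        min_samples=100, alpha=0.05):
    """Two-proportion z-test on per-variant success rates."""
    if n_a < min_samples or n_b < min_samples:
        return None  # not enough traffic yet
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return False
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return p_value < alpha
```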
### Debugging Checklist

1. [ ] Test prompt with temperature=0
   - Eliminates randomness for debugging
2. [ ] Check token count
   ```python
   import tiktoken
   enc = tiktoken.encoding_for_model("gpt-4")
   tokens = len(enc.encode(prompt))
   ```
3. [ ] Validate output format (see the validation sketch below)
4. [ ] Compare with baseline
5. [ ] Review for ambiguity
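For item 3, a small helper along these lines can gate the checklist (the required keys are whatever your prompt promises to return):

```python
import json

def output_format_ok(raw: str, required_keys: set) -> bool:
    """Return True if the response parses as JSON and contains the expected keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and required_keys.issubset(data.keys())
```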
### Common Failure Modes
| Symptom | Root Cause | Fix |
|---------|------------|-----|
| Wrong format | Vague instructions | Add explicit format example |
| Off-topic response | Scope not defined | Add boundaries |
| Refuses to answer | Safety trigger | Rephrase request |
| Too creative | High temperature | Lower to 0-0.3 |
| Ignores constraints | Too many rules | Prioritize top 3-5 |
### Prompt Testing Framework
```python
# test_prompt.py
class PromptTest:
    def __init__(self, prompt_template):
        self.template = prompt_template
        self.test_cases = []

    def add_case(self, input_data, expected_output):
        self.test_cases.append({
            "input": input_data,
            "expected": expected_output
        })

    def evaluate(self, output, expected):
        # Simple token-overlap score; swap in an embedding or LLM-judge
        # comparison for production use.
        output_tokens = set(output.lower().split())
        expected_tokens = set(expected.lower().split())
        if not expected_tokens:
            return 0.0
        return len(output_tokens & expected_tokens) / len(expected_tokens)

    def run_tests(self, model):
        results = []
        for case in self.test_cases:
            output = model.generate(
                self.template.format(**case["input"])
            )
            score = self.evaluate(output, case["expected"])
            results.append(score)
        return {
            "pass_rate": sum(r > 0.8 for r in results) / len(results),
            "avg_score": sum(results) / len(results)
        }
```
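A usage sketch for the framework above; `EchoModel` is a stand-in for a real LLM client exposing `generate(prompt) -> str`:

```python
class EchoModel:
    """Stand-in for a real LLM client; replace generate() with an API call."""
    def generate(self, prompt: str) -> str:
        return "LLMs generate text by predicting the next token."

test = PromptTest("Summarize in one sentence: {text}")
test.add_case(
    {"text": "Large language models produce text one predicted token at a time."},
    "LLMs generate text by predicting the next token.",
)
print(test.run_tests(EchoModel()))  # e.g. {'pass_rate': 1.0, 'avg_score': 1.0}
```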
## Prompt Templates

### Task Decomposition

Break down this task into clear steps:

Task: {task}

Output as numbered list:
1. [First step]
2. [Second step]
...
### Code Review

Review this code for:
- Bugs and errors
- Security issues
- Performance problems
- Best practices

Code:
```{language}
{code}
```

Format response as:
### Data Extraction
Extract the following from the text: {fields_to_extract}
Text: {text}
Return as JSON: { "field1": "value", "field2": "value" }
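One practical note: if you fill this template with Python's `str.format()`, the literal JSON braces must be doubled, as in this illustrative snippet:

```python
DATA_EXTRACTION_TEMPLATE = (
    "Extract the following from the text: {fields_to_extract}\n"
    "Text: {text}\n"
    'Return as JSON: {{ "field1": "value", "field2": "value" }}'
)

prompt = DATA_EXTRACTION_TEMPLATE.format(
    fields_to_extract="name, year",
    text="Ada Lovelace published her notes in 1843.",
)
```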
## Example Prompts
- "Help me design a prompt for code review"
- "Implement chain of thought for complex reasoning"
- "Create a system prompt for a customer service bot"
- "Optimize this prompt for fewer tokens"
- "Convert this prompt to use structured output"
## Dependencies
```yaml
skills:
  - prompt-engineering (PRIMARY)
agents:
  - 01-llm-fundamentals (model understanding)
  - 06-ai-agents (agent prompts)
external:
  - tiktoken (token counting)
  - promptfoo (testing)
  - langfuse (observability)
```