From pm-advanced
Structures AI/ML product planning with canvas for user problems, model/task selection, data needs, evaluation metrics, and responsible AI checks. For LLM integrations and AI features.
npx claudepluginhub mohitagw15856/pm-claude-skills --plugin pm-advancedThis skill uses the workspace's default tool permissions.
Define AI products with the same rigour as any product decision — but with additional layers for data, model, evaluation, and responsible AI. This canvas prevents the most common AI product failure: building a technically impressive feature that doesn't solve a real problem.
Guides AI-native product development addressing agency-control tradeoffs, calibration loops, CCCD framework, and eval strategies for AI agents and LLM features.
Audits pre-launch AI features across 6 dimensions—model selection, data quality, cost, monitoring, failure UX, optimization—grading readiness and blocking shipment of broken products.
Architects AI wrapper products around OpenAI/Anthropic APIs into focused, monetizable tools. Covers prompt engineering, cost optimization, rate limiting, UX, and business strategy. Activates on AI wrapper/GPT product mentions.
Share bugs, ideas, or general feedback.
Define AI products with the same rigour as any product decision — but with additional layers for data, model, evaluation, and responsible AI. This canvas prevents the most common AI product failure: building a technically impressive feature that doesn't solve a real problem.
Before building, flag if any of these apply:
PM Owner: [Name] ML/AI Lead: [Name] Status: Discovery / Design / Build / Evaluation / Live
User problem being solved:
[What specific situation is the user in? What job are they trying to get done?]
Why AI?
[What makes this problem require AI vs a deterministic solution? If the answer is "because we can," stop here.]
Success for the user looks like:
[What outcome does the user experience when the AI feature is working well?]
Task type:
Model approach:
Rationale for chosen approach: [Why this, not alternatives]
| Data Type | Source | Volume | Quality Status | Bias Risk |
|---|---|---|---|---|
| [Training data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |
| [Evaluation data] | [Where it comes from] | [Volume] | [Audit status] | H/M/L |
Data gaps: [What's missing and plan to get it] Privacy considerations: [Any PII in training or inference data] Data ownership: [Do we own this data? Can we use it for training?]
Primary metric: [The number that defines success — accuracy, F1, BLEU, user rating, task completion rate] Minimum acceptable threshold: [Below X, the feature does not ship] Human evaluation plan: [How will humans review model outputs? Sampling rate? Review panel?]
| Evaluation Type | Method | Cadence | Owner |
|---|---|---|---|
| Offline (pre-launch) | [Test set, benchmark] | Pre-launch | ML Lead |
| Online (post-launch) | [A/B test, user feedback] | Weekly | PM + ML |
| Adversarial | [Red-team, edge cases] | Pre-launch | Safety reviewer |
How is AI output presented?
Confidence and uncertainty handling:
Fallback plan:
Rollout: [% of users, with staged expansion criteria] Monitoring metrics:
Model refresh cadence: [How often is the model retrained or updated?] Drift detection: [How will you know when model performance degrades in production?]