From magic-powers
Use when building defensibility into an AI product — designing data collection strategies that compound over time, domain-specific dataset building, proprietary data as competitive moat vs base models, and when data beats prompt engineering.
npx claudepluginhub kienbui1995/magic-powers --plugin magic-powers

This skill uses the workspace's default tool permissions.
- "What stops ChatGPT from just doing what we do?" — need a real answer
The commoditization timeline:
Year 1 (2024): "We built X using GPT-4" — works, early movers win
Year 2 (2025): GPT-4 gets cheaper, Gemini/Claude match quality, copycats launch
Year 3 (2026): Base models add X as a native feature, wrapper products die
The survivors: products that built something base models CAN'T easily replicate
The durable moats for AI products, by segment:
For B2B tools (highest moat potential):
# Every user interaction is training data: collect systematically
def log_ai_interaction(user_id, input, ai_output, user_action):
    training_db.insert({
        "user_id": user_id,
        "input": input,
        "ai_output": ai_output,
        "user_accepted": user_action == "accept",
        "user_edited": user_action == "edit",
        "final_output": get_final_output(),  # what the user actually used
        "timestamp": now(),
        "metadata": get_user_context()  # role, company, industry
    })
# This becomes:
# 1. Fine-tuning dataset (accepted outputs = positive examples)
# 2. Preference data for RLHF (edited outputs show what good looks like)
# 3. Domain-specific benchmarks (what quality means in YOUR domain)
For consumer tools (network effect potential):
Collective intelligence strategy:
- Aggregate anonymized patterns across all users
- "Users who did X also found Y useful"
- "This type of content gets 3x more engagement on [platform]"
- Individual AI learns from collective knowledge without exposing individual data
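The collective-learning side can be sketched as a simple aggregation step. This is a minimal illustration, not a production pipeline: the `events` input shape and the `min_users` threshold (a k-anonymity-style cutoff so no pattern is surfaced unless enough distinct users exhibit it) are assumptions for the example.

```python
from collections import defaultdict

def aggregate_patterns(events, min_users=50):
    """Aggregate anonymized usage patterns across users.

    `events` is an iterable of (user_id, action) pairs. A pattern is
    only exposed when at least `min_users` DISTINCT users exhibit it,
    so no individual's behavior can be read out of the aggregate.
    """
    users_per_action = defaultdict(set)
    for user_id, action in events:
        users_per_action[action].add(user_id)
    return {
        action: len(users)
        for action, users in users_per_action.items()
        if len(users) >= min_users
    }
```

The distinct-user threshold is the important design choice: counting raw events would let one power user masquerade as a "pattern," while counting distinct users keeps the aggregate collective.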
Decision matrix:
Prompt engineering (default — try first):
Use when: You have <1000 examples, behavior can be described in text
Cost: $0 (just tokens)
Maintenance: Update prompt when behavior drifts
Time to implement: Hours
RAG (for knowledge problems):
Use when: Need to inject domain-specific knowledge at runtime
Use when: Knowledge changes frequently (can't retrain)
Cost: Embedding API + vector DB (~$50-200/month)
Best for: Document Q&A, knowledge bases, up-to-date data
Time to implement: 1-2 days
Fine-tuning (for style/behavior problems):
Use when: You have >1000 high-quality examples
Use when: Style/format consistency matters more than knowledge
Use when: Need faster/cheaper inference (smaller model, same quality)
Cost: $500-2000 for training run, cheaper per-token after
Best for: Consistent output format, domain-specific tone, specialized tasks
Time to implement: 1-2 weeks (data prep is 80% of the work)
Hybrid (RAG + fine-tuned model):
Use when: Need both knowledge and behavior consistency
The most powerful option, and the most complex
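The decision matrix above can be condensed into a first-pass heuristic. This is a rough triage function, not a rule; the input flags and thresholds mirror the matrix (>1000 examples, volatile knowledge, style consistency) and are the example's only assumptions.

```python
def choose_approach(n_examples, knowledge_changes_often, needs_style_consistency):
    """First-pass triage for the prompt / RAG / fine-tune / hybrid decision.

    Mirrors the matrix: prompt engineering by default, RAG for volatile
    knowledge, fine-tuning once you have >1000 good examples, hybrid
    when you need both knowledge and behavior consistency.
    """
    if n_examples > 1000 and knowledge_changes_often:
        return "hybrid (RAG + fine-tuned model)"
    if knowledge_changes_often:
        return "RAG"
    if n_examples > 1000 and needs_style_consistency:
        return "fine-tuning"
    return "prompt engineering"
```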
Systematic data collection for competitive advantage:
class DomainDatasetBuilder:
    """Build a proprietary dataset through product usage."""

    def __init__(self, domain: str):
        self.domain = domain  # "legal", "finance", "marketing", etc.
        self.dataset = []

    def capture_positive_example(self, input, output, quality_score):
        """Capture when the user accepts/keeps AI output."""
        if quality_score >= 4:  # user rated highly
            self.dataset.append({
                "type": "positive",
                "input": input,
                "output": output,  # what the user actually used
                "domain": self.domain,
                "quality": quality_score
            })

    def capture_preference_pair(self, input, preferred_output, rejected_output):
        """When the user edits AI output, you have a preference pair."""
        self.dataset.append({
            "type": "preference",
            "input": input,
            "preferred": preferred_output,  # user's version
            "rejected": rejected_output,    # AI's original version
            "domain": self.domain
        })

    def dataset_readiness(self):
        positives = len([d for d in self.dataset if d["type"] == "positive"])
        preferences = len([d for d in self.dataset if d["type"] == "preference"])
        print(f"Fine-tuning ready: {positives >= 1000} ({positives}/1000 positives)")
        print(f"RLHF ready: {preferences >= 500} ({preferences}/500 preferences)")
The most powerful moat — more users = better product:
Data network effect types:
Individual learning:
Each user's data improves their own experience
"The more you use it, the smarter it gets for you"
Example: Gmail Smart Compose learns your writing style
Collective learning:
All users' data improves everyone's experience
"More users = better for everyone"
Example: Waze (more drivers = better traffic prediction)
For AI products, aim for BOTH:
Individual: Remember my preferences, my context, my past work
Collective: Anonymized patterns make recommendations better for all
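The individual-learning half can be sketched as a per-user memory that feeds back into the prompt. This is a hypothetical in-memory stand-in for a real store; the class and method names are the example's own.

```python
class UserMemory:
    """Remember each user's preferences so their own usage improves
    their own experience (the individual side of the network effect)."""

    def __init__(self):
        self._store = {}  # user_id -> {preference_key: value}

    def remember(self, user_id, key, value):
        self._store.setdefault(user_id, {})[key] = value

    def build_context(self, user_id):
        """Render remembered preferences as lines to prepend to a prompt."""
        prefs = self._store.get(user_id, {})
        return "\n".join(f"{k}: {v}" for k, v in prefs.items())
```

The point is the feedback loop: every correction the user makes becomes a `remember` call, so the product compounds for that user even before any collective learning kicks in.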
Related skills:
- rag-architecture: building the knowledge layer on top of proprietary data
- ai-product-retention: data personalization as a retention mechanism
- ai-product-positioning: the data moat as the core of the differentiation story
- @solo-ai-builder: assesses the data moat before committing to a product direction