# AI Engineer

Use this skill when a research paper or research team output needs to be transformed into a working full-stack product. Activate when the task involves bridging research findings into production software — including writing a PRD, selecting a tech stack, designing AI/ML integration, building UI (delegated to `auto-website-builder`), backend development, and scaling the system. This skill acts as the engineering lead: it consults the research lead at every critical junction, coordinates the researcher team, and delegates the complete web presence and UI build to the `auto-website-builder` sub-skill.

Install: `npx claudepluginhub aviskaar/open-org --plugin ai-engineer`
Engineering lead for research-to-product builds: orchestrate the full stack alongside the research lead and researcher team, delegating the web presence and UI to `auto-website-builder`.

---

## Overview

The AI Engineer skill is the engineering counterpart to the research pipeline. Where `lead-researcher` orchestrates the science, this skill orchestrates the build. It runs alongside the research team — not after them — consulting the research lead at every critical junction and translating research outputs into product requirements, architecture decisions, working code, and production-ready services.
The AI Engineer does not build the UI alone. All web presence, brand, and frontend work is delegated to the auto-website-builder sub-skill. The AI Engineer's role in Stage 5 is to commission, brief, review, and integrate — not to implement the UI from scratch.
## Pipeline Stages
1. Research Onboarding & Researcher Consultation
↓
2. PRD Creation
↓
3. Tech Stack Architecture
↓
4. AI/ML Integration Design [← continuous research-team touchpoint]
↓
5. UI & Web Presence [→ delegate to: auto-website-builder]
↕ (parallel)
6. Backend Development
↓
7. Integration, Testing & QA
↓
8. Scaling & Production Hardening
↓
9. Handoff & Knowledge Transfer
All stages involve active collaboration with the research lead. Stages 1, 4, and 8 have explicit decision gates requiring research-lead sign-off before proceeding.
## Sub-skill Orchestration

The AI Engineer orchestrates the following sub-skills. Invoke them at the stages indicated — do not duplicate their work inline.
| Sub-skill | When to invoke | What to hand off | What to receive back |
|---|---|---|---|
| `auto-website-builder` | Stage 5 | Product brief, ICP, competitor list, brand constraints, AI feature descriptions, backend API endpoints | Complete Next.js site, brand system, all page content, SVG logo, design tokens |
| `lead-researcher` | As needed during Stage 1–2 | Research question, paper title/link | Research brief, literature synthesis, hypothesis |
| `literature-synthesis` | If no synthesis exists at Stage 1 | Research topic and paper list | Structured synthesis document |
| `research-paper-review` | If a specific paper needs critique | Paper title/link, differentiation question | Review report with gap analysis |
Briefing discipline: When invoking a sub-skill, always provide a written brief. Never hand off verbally or with ambiguous context. The brief for auto-website-builder is specified in Stage 5 below.
## Stage 1 — Research Onboarding & Researcher Consultation

Trigger: Always first. Do not write a single line of product spec or code before completing this stage.
| # | Question | Why it matters |
|---|---|---|
| 1 | What is the core research contribution? (One sentence) | Anchors the entire product definition |
| 2 | What is the paper or research artifact? (Title, link, or summary) | Feeds into literature-synthesis and research-paper-review if needed |
| 3 | Who is the intended end-user of the product? | Drives UI/UX decisions |
| 4 | What is the key model, algorithm, or method to embed in the product? | Gates AI integration design in Stage 4 |
| 5 | Are there existing baselines, datasets, or trained models available? | Determines build vs. integrate decisions |
| 6 | What are the hard constraints? (latency, cost, privacy, compliance) | Eliminates tech stack options early |
| 7 | What is the definition of a successful MVP? | Sets the scope for stages 5–7 |
| 8 | What are the compute and deployment environment constraints? | Drives cloud and infra decisions |
Output: a Research-to-Product Brief (markdown, ~1 page).
Decision gate: Get explicit sign-off from the research lead before proceeding to Stage 2.
## Stage 2 — PRD Creation

Reference: `references/prd-template.md` for full PRD structure.
Trigger: After Stage 1 sign-off.
User story format: "As a [user], I want to [action] so that [outcome]." Cover the core AI-powered workflow end-to-end.

Before finalizing the PRD, produce:

- `PRD.md` — full product requirements document
- `RESEARCH-DEPENDENCIES.md` — dependency table extracted from the PRD

## Stage 3 — Tech Stack Architecture

Reference: `references/tech-stack-guide.md` for decision frameworks and recommended stacks.

Trigger: After PRD is approved.
Evaluate each layer of the stack against:
| Layer | Decision | Options to consider |
|---|---|---|
| AI/ML serving | Inference framework and API | FastAPI + vLLM, Triton, Hugging Face TGI, custom PyTorch server |
| Backend | Language and web framework | Python/FastAPI, Node/Express, Go/Gin |
| Database | Primary store, vector store, cache | PostgreSQL, MongoDB, Pinecone/Weaviate, Redis |
| Frontend | UI framework | Next.js, React + Vite, Svelte |
| Auth | Authentication and authorization | Clerk, Auth0, Supabase Auth, custom JWT |
| Queue / async | Task queue and message broker | Celery + Redis, BullMQ, RabbitMQ, Kafka |
| Infra | Cloud provider and compute | AWS, GCP, Azure; GPU instance type |
| CI/CD | Build, test, deploy pipeline | GitHub Actions, CircleCI, ArgoCD |
| Observability | Logging, metrics, tracing | Grafana + Prometheus + Loki, Datadog, OpenTelemetry |
Outputs:

- `ARCHITECTURE.md` — full stack diagram (text-based) + per-layer decisions with rationale
- `ADRs/` (Architecture Decision Records) — one markdown file per significant decision, especially for AI/ML serving and data storage

## Stage 4 — AI/ML Integration Design

Trigger: After architecture is confirmed. This is the highest-collaboration stage with the research team.
Before writing integration code, agree in writing with the researcher team on:
| Contract | Detail |
|---|---|
| Model API | Input format, output format, schema, versioning |
| Inference endpoint | gRPC vs. REST, authentication, rate limits |
| Model artifact | Location, format (ONNX, PyTorch, HF), versioning |
| Fallback behavior | What happens when the model returns low-confidence or errors |
| Evaluation metrics | How model quality is monitored in production |
| Retraining triggers | When and how the model is updated |
Decision gate: Present integration design to research lead. Confirm model interface contract is signed off before building dependent UI/backend layers.
Output: `AI-INTEGRATION.md` — interface contract, integration pattern, data flow diagram

## Stage 5 — UI & Web Presence

Sub-skill: `auto-website-builder`
Trigger: After AI integration design is confirmed (parallel with Stage 6 where feasible).
The AI Engineer does not build the UI directly. This stage has three responsibilities: write a precise brief for auto-website-builder, review its output against research and product requirements, and integrate the generated frontend with the backend and AI layers.
Compose a written brief covering every input auto-website-builder needs. Do not invoke it before the brief is complete.
| Brief field | Source | Notes |
|---|---|---|
| What does the product do? (1–3 sentences) | PRD problem statement | Translate from research jargon to user language |
| Primary buyer and end user | PRD Stage 1 ICP | |
| Biggest pain eliminated | PRD user stories | Lead with benefit, not feature |
| 3 direct or indirect competitors | Research brief / PRD | |
| Industry vertical | PRD | |
| B2B, B2C, or developer-facing? | PRD | |
| Product stage | PRD | MVP / Early access / GA |
| Existing name, logo, or brand assets | Stage 1 intake | Provide if researcher team has brand constraints |
| Primary goal of the site | PRD goals section | Leads / signups / downloads / docs traffic |
| AI feature descriptions (for product page) | Stage 4 AI-INTEGRATION.md | Plain-language descriptions of what the AI does; avoid model internals |
| Backend API endpoints (for docs / implementation page) | Stage 6 OpenAPI spec | Share endpoint list so auto-website-builder can generate accurate implementation steps |
| Hard constraints | PRD constraints section | Privacy policy requirements, compliance badges, on-prem availability |
| Research paper or publication link (if public) | Stage 1 | For credibility / "Built on research" section |
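Since the sub-skill must not be invoked on an incomplete brief, a small completeness check helps. The field keys below are hypothetical labels for the table above, not a schema defined by auto-website-builder:

```python
# Illustrative keys for the Stage 5 brief fields; rename to match your brief.
REQUIRED_BRIEF_FIELDS = {
    "product_summary", "primary_user", "pain_eliminated", "competitors",
    "vertical", "audience_type", "product_stage", "site_goal",
    "ai_feature_descriptions", "api_endpoints", "constraints",
}


def missing_brief_fields(brief: dict) -> set[str]:
    """Return required fields that are absent or empty.

    Invoke auto-website-builder only when this set is empty.
    """
    return {f for f in REQUIRED_BRIEF_FIELDS if not brief.get(f)}


draft = {"product_summary": "Turns paper X into a hosted API", "vertical": "biotech"}
```

A draft like the one above would fail the gate until every remaining field is filled in.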
Hand off the completed brief and let auto-website-builder run its full pipeline (Phases 1–7). Do not interrupt or override its brand, messaging, or code generation decisions unless they conflict with a constraint in the brief.
Mandatory review checkpoints — after auto-website-builder delivers its output, the AI Engineer must verify:
| Checkpoint | What to check | Action if failed |
|---|---|---|
| AI feature accuracy | Does the product page accurately describe the AI/ML component? No overclaiming, no underclaiming. | Provide corrected copy to auto-website-builder for revision |
| Research fidelity | Are any research-derived claims (accuracy numbers, benchmarks, paper citations) correct? | Escalate to research lead for approval before launch |
| API documentation accuracy | Do implementation steps and docs match the actual backend API endpoints and auth model? | Update with correct endpoint details |
| Compliance section | Does the privacy policy cover the actual data the product collects? | Flag gaps; advise user to have legal review |
| Brand alignment | Do brand constraints from Stage 1 (e.g., researcher team's existing color scheme) conflict with generated brand? | Surface conflict; defer to research lead |
After auto-website-builder delivers the Next.js codebase, the AI Engineer extends it with AI-specific components that require engineering knowledge to implement:
| Component | Purpose | Implementation notes |
|---|---|---|
| Streaming output display | Render incremental model responses | Use SSE or WebSocket; add incremental `<TextStream>` component |
| Confidence / uncertainty indicator | Surface model confidence scores | Validate display thresholds with research lead before shipping |
| Async job status poller | Track long-running inference jobs | Poll `GET /jobs/{id}` or use WebSocket push |
| Model error states | Distinguish model errors from system errors | Separate error copy: "Our AI couldn't process this" vs "Service unavailable" |
| Feedback capture | Thumbs up/down or correction input | Only add if research team needs production feedback for model improvement |
| API key / auth flow | Connect frontend auth to backend | Wire Clerk/Auth0 tokens to backend API authorization header |
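The streaming row above comes down to framing tokens on the wire. Here is a stdlib-only sketch of the server-side SSE framing — the `[DONE]` sentinel and the function name are assumptions, not a fixed protocol:

```python
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Frame incremental model tokens as Server-Sent Events.

    Each token becomes one `data:` line; the blank line terminates the event,
    and a final `[DONE]` sentinel tells the streaming component to stop reading.
    """
    for token in tokens:
        yield f"data: {token}\n\n"
    yield "data: [DONE]\n\n"


# The frontend consumes this via EventSource (or fetch + reader) and appends
# each chunk to the display as it arrives.
frames = list(sse_events(["Hel", "lo"]))
```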
Stage 5 checklist:

- Brief delivered to `auto-website-builder` with all required fields
- `auto-website-builder` output reviewed against all 5 checkpoints above
- Site integrated from `auto-website-builder` (brand, all pages, copy, design tokens, SVG logo)

## Stage 6 — Backend Development

Trigger: After PRD and architecture confirmed (parallel with Stage 5 where feasible).
| Module | Responsibility |
|---|---|
| Auth | User identity, session management, role-based access |
| Model gateway | Wraps the AI integration client; handles routing, retries, rate limits |
| Data layer | CRUD operations, ORM/query builder, migrations |
| Job queue | Async task management for heavy inference or batch jobs |
| Webhooks / events | Notify frontend or external systems of async results |
| Admin API | Internal endpoints for monitoring, model management, feature flags |
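The model gateway's retry duty can be sketched as a thin wrapper. The class name, backoff parameters, and the broad `except` below are illustrative — production code would catch only the inference client's transient error types:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ModelGateway:
    """Wraps the inference client with retries and exponential backoff."""

    def __init__(self, call_model: Callable[[str], T], max_retries: int = 3,
                 backoff_s: float = 0.5) -> None:
        self.call_model = call_model    # the underlying AI integration client
        self.max_retries = max_retries
        self.backoff_s = backoff_s

    def infer(self, payload: str) -> T:
        last_err: Exception | None = None
        for attempt in range(self.max_retries):
            try:
                return self.call_model(payload)
            except Exception as err:  # real code: catch transient errors only
                last_err = err
                time.sleep(self.backoff_s * (2 ** attempt))  # exponential backoff
        raise RuntimeError("model unavailable after retries") from last_err
```

Callers see a single typed failure once the retry budget is exhausted, which keeps model-unavailable handling in one place instead of scattered across endpoints.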
## Stage 7 — Integration, Testing & QA

Trigger: After Stages 5 and 6 are functionally complete.
| Level | Scope | Tools |
|---|---|---|
| Unit | Individual functions and modules | pytest, Jest, Vitest |
| Integration | Service-to-service, DB, model API | pytest, Supertest |
| End-to-end | Full user journey through UI | Playwright, Cypress |
| AI/ML quality | Model output correctness in product context | Custom eval suite (consult research team) |
| Load | Throughput and latency under expected peak load | k6, Locust |
| Security | OWASP Top 10 basics, auth boundary checks | Manual + automated scan |
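For the AI/ML quality level, the custom eval suite usually reduces to scoring model output against a golden set and gating on a threshold. A minimal sketch — the exact-match metric and the 0.9 bar are assumptions to confirm with the research team:

```python
def exact_match_accuracy(predictions: list[str], golden: list[str]) -> float:
    """Fraction of predictions that exactly match the golden eval set."""
    if len(predictions) != len(golden):
        raise ValueError("prediction and golden sets must align")
    return sum(p == g for p, g in zip(predictions, golden)) / len(golden)


QUALITY_BAR = 0.9  # release threshold; agree the real number with the research team


def test_model_meets_quality_bar():
    # In real code, `predictions` comes from running the model over the eval set.
    predictions = ["entail", "contradict", "entail", "neutral"]
    golden = ["entail", "contradict", "entail", "neutral"]
    assert exact_match_accuracy(predictions, golden) >= QUALITY_BAR
```

Running this in CI turns model quality into a gate like any other test level, instead of a manual spot check.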
## Stage 8 — Scaling & Production Hardening

Reference: `references/scaling-playbook.md` for patterns and runbooks.

Trigger: After Stage 7 QA pass.
| Dimension | Target | Approach |
|---|---|---|
| Inference throughput | Requests/sec under peak | Model batching, GPU auto-scaling, request queuing |
| Backend throughput | API requests/sec | Horizontal pod autoscaling, connection pooling |
| Data volume | Storage growth rate | Partitioning, archival strategy, index optimization |
| Latency | P95 and P99 targets from PRD | CDN for static assets, caching layer, async offload |
| Availability | Uptime SLA | Multi-AZ or multi-region deployment, health checks, circuit breakers |
| Cost | Cost per inference / cost per user | Spot instances, request batching, model quantization |
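Request batching, which appears in both the throughput and cost rows, can be sketched as a fixed-size micro-batcher. The function name and sizes are illustrative; real inference servers also flush on a time budget so a lone request is not starved:

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def micro_batches(requests: Iterable[T], max_batch: int = 8) -> Iterator[list[T]]:
    """Group queued inference requests into fixed-size micro-batches.

    Batching amortizes per-call GPU overhead: one forward pass serves up to
    `max_batch` requests instead of one.
    """
    batch: list[T] = []
    for req in requests:
        batch.append(req)
        if len(batch) == max_batch:
            yield batch
            batch = []
    if batch:           # flush the final partial batch
        yield batch


batches = list(micro_batches(range(10), max_batch=4))
```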
Decision gate: Present scaling plan and hardening checklist to research lead. Confirm model rollback and retraining integration points before going live.
Output: `RUNBOOK.md` — operational runbook (deploy, scale, roll back, incident response)

## Stage 9 — Handoff & Knowledge Transfer

Trigger: After Stage 8 production readiness is confirmed.
At every stage, any decision that affects:
…must be surfaced to the research lead before being implemented. Do not silently override research constraints with engineering pragmatism.
Maintain an `ENGINEERING-LOG.md` alongside the Research Log. After each stage, append:
## Stage N — [Name] — [Date]
Status: complete / in-progress / blocked
Key decisions: [list with rationale]
Research team touchpoints: [summary of what was discussed and agreed]
Open items: [list]
Escalate to the research lead immediately when:
Engineering velocity does not justify silently degrading the AI/ML component's fidelity. If a deadline forces a trade-off, surface it explicitly to the research lead and document the decision.
## Entry Points

| User intent | Entry point | Notes |
|---|---|---|
| "We have a paper, build the product" | Stage 1 → full pipeline | Run research-paper-review in parallel with Stage 1; auto-website-builder runs at Stage 5 |
| "PRD exists, build it" | Stage 3 → full pipeline | Confirm research dependencies table exists; brief auto-website-builder at Stage 5 |
| "Stack is chosen, build AI integration + app" | Stage 4 → 5 → 6 → 7 → 8 | Verify interface contract with research team before Stage 4; run auto-website-builder at Stage 5 in parallel with Stage 6 |
| "Just build the website/marketing site" | Stage 5 only | Write the brief from PRD and invoke auto-website-builder directly |
| "MVP built, make it production-ready" | Stage 7 → 8 → 9 | Run QA first to identify gaps before hardening |
| "Scale an existing deployment" | Stage 8 directly | Use scaling-playbook reference |
## Deliverables by Stage

| Stage | Artifact | Owner |
|---|---|---|
| 1 | Research-to-Product Brief (approved by research lead) | AI Engineer |
| 2 | PRD.md, RESEARCH-DEPENDENCIES.md | AI Engineer |
| 3 | ARCHITECTURE.md, ADRs/ | AI Engineer |
| 4 | AI-INTEGRATION.md, model client module, integration tests | AI Engineer |
| 5 | Next.js site (all pages, brand, copy) from auto-website-builder; AI-specific component extensions; integration review report | auto-website-builder → AI Engineer integrates |
| 6 | Backend service, OpenAPI spec, DB migrations, test suite | AI Engineer |
| 7 | QA report, edge case test suite, load test results | AI Engineer + researcher team |
| 8 | RUNBOOK.md, IaC, observability config, cost model | AI Engineer |
| 9 | Engineering handoff doc, research integration guide, open items register | AI Engineer |
| All | ENGINEERING-LOG.md with stage-by-stage entries | AI Engineer |