Plugin

north-starr-genai

Agentic AI development workflow for Claude Code — AI-specific skills, agents, hooks, and project context for teams building AI automations

What's Inside

Agents (18)

agentic-designer

/agentic-designer

Design UI/UX patterns for AI-powered interfaces. Produces interaction specs for conversational UI, dashboards, approval workflows, confidence display, streaming UX, and error states. Spawned during BUILD when the plan includes a user-facing AI interface. Runs on a separate thread.

ai-architect

/ai-architect

Technical design agent for AI stories. Produces architecture decisions, model selection, and cost envelopes, then routes to ai-invert and cost-estimator. Reads prior decisions from DECISIONS.md. Runs on a separate thread.

ai-invert-analyst

/ai-invert-analyst

AI-specific inversion analysis agent. Given a requirement or feature description, produces `.plans/INVERT-<name>.md` covering prompt fragility, hallucination, cost, drift, data pipeline, guardrails, and observability. Runs on a separate thread. Invoked via `/ai-invert` skill or orchestrator dispatch on Q1/Q2 gate hits.

ai-ops

/ai-ops

Configure monitoring, alerting, and observability for AI automations. Designs dashboards, cost tracking, accuracy drift detection, and alerting rules. Runs on a separate thread.

auto-improver

/auto-improver

Autonomously improve any skill or agent prompt using a measure-change-test hill-climbing loop. Runs the target repeatedly, scores output against a yes/no checklist, makes one small change per round, keeps improvements, reverts regressions. Runs on a separate thread. Invoked via `/autoimprove` skill.
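The measure-change-test loop described for auto-improver can be sketched roughly as below. This is a minimal illustration, not the plugin's actual implementation: `hill_climb`, `run`, `checks`, and `propose` are hypothetical names, and the sketch scores each candidate from a single run, whereas the description above says the target is run repeatedly.

```python
from typing import Callable, List, Tuple


def hill_climb(
    prompt: str,
    run: Callable[[str], str],               # hypothetical: executes the target, returns its output
    checks: List[Callable[[str], bool]],     # yes/no checklist applied to the output
    propose: Callable[[str, int], str],      # hypothetical: returns the prompt with one small change
    rounds: int = 5,
) -> Tuple[str, int]:
    """Keep a prompt change only when it raises the checklist score."""

    def score(candidate: str) -> int:
        output = run(candidate)
        return sum(1 for check in checks if check(output))

    best, best_score = prompt, score(prompt)
    for round_index in range(rounds):
        candidate = propose(best, round_index)   # one small change per round
        candidate_score = score(candidate)
        if candidate_score > best_score:         # keep improvements
            best, best_score = candidate, candidate_score
        # Otherwise the candidate is dropped, which reverts the change.
    return best, best_score
```

A change that leaves the score unchanged is treated as a non-improvement here and discarded; whether ties are kept is a design choice the description does not settle.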

Skills (24)

ai-invert

/ai-invert

Run AI-specific inversion analysis on a requirement before implementation. Dispatches the `ai-invert-analyst` agent on a separate thread. Use before complex or high-stakes AI tasks that touch prompts, models, RAG, or AI-powered outputs.

ai-test

/ai-test

Generate executable pytest test files for AI outputs. Produces assertion-based tests for deterministic AI components (classification, extraction, routing, structured output) that run in CI/CD. Complements /eval-suite which produces statistical evaluation datasets for non-deterministic outputs.

analyze-code

/analyze-code

Analyze code modules and files for refactoring opportunities, code smells, and architectural pattern violations in any language or framework. Use this skill when the user asks to "analyze code smells", "find refactoring opportunities", "check for code quality issues", or "review architecture" for a specific module or file.

assess

/assess

Classify project type, recommend approach, identify needed agents, estimate complexity, and flag risks. Runs BEFORE /decompose to help North Starr adapt its pipeline to what is being built.

autoimprove

/autoimprove

Autonomously improve any skill or agent prompt via measure-change-test hill-climbing. Dispatches the `auto-improver` agent on a separate thread. Use when a skill gives inconsistent results, when asked to "improve/optimize/autoresearch" a skill, or when output quality needs iterative tightening.
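To make the /ai-test description above concrete, here is a small sketch of what assertion-based tests for a deterministic component might look like. Everything in it is assumed for illustration: `route_ticket` is a hypothetical keyword router standing in for a real AI routing component, and the files the skill actually generates may be structured differently. The functions use the `test_` prefix so pytest would collect them.

```python
# Hypothetical stand-in for a deterministic AI component (a ticket router).
# In a real project this would wrap a model call with fixed settings.
ALLOWED_LABELS = {"billing", "technical", "other"}


def route_ticket(text: str) -> dict:
    lowered = text.lower()
    if "invoice" in lowered or "refund" in lowered:
        label = "billing"
    elif "error" in lowered or "crash" in lowered:
        label = "technical"
    else:
        label = "other"
    return {"label": label, "input_chars": len(text)}


def test_output_has_expected_schema():
    # Structured output: exact keys, label drawn from a closed set.
    result = route_ticket("My invoice is wrong")
    assert set(result) == {"label", "input_chars"}
    assert result["label"] in ALLOWED_LABELS


def test_known_inputs_route_to_expected_labels():
    # Routing: fixed inputs must map to fixed labels on every run.
    cases = [
        ("I need a refund", "billing"),
        ("The app shows an error on login", "technical"),
        ("What are your opening hours?", "other"),
    ]
    for text, expected in cases:
        assert route_ticket(text)["label"] == expected
```

Exact-match assertions like these only suit components whose output is expected to be stable; outputs that vary between runs fall under /eval-suite, as the description notes.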

Hooks (1)

Event Hooks

File writes

2 hooks across 2 events

Stats

Version: 0.16.0

Released: May 9, 2026

Language: Shell

Stars: 1

Maintenance: Excellent

License: MIT

Last Commit: May 9, 2026

Added: Apr 17, 2026


Available In

north-starr-genai (1)

Safety Signals

Caution

Modifies files

Hook triggers on file write and edit operations

Uses power tools

Uses Bash, Write, or Edit tools


North Starr GenAI

Your North Starr for AI Development | v0.15.0

An agentic AI development agency framework — North Starr plans, designs, validates, and orchestrates while Claude Code writes code in YOUR codebase. Works with any project: RAG pipelines, agent harnesses, multi-agent systems, prompt chains, or AI platform components.

Agent Interaction Map

                 ┌──────────┐
                 │ /assess  │ ← classifies project type
                 └────┬─────┘
                      │
                 ┌────┴─────┐
                 │/discover │ ← elicits requirements (if needed)
                 └────┬─────┘
                      │
                 ┌────┴──────┐
                 │/decompose │ ← PRD → stories
                 └────┬──────┘
                      │
               ┌──────┴──────┐
               │ ORCHESTRATOR│
               └──────┬──────┘
                      │
           ┌──────────┴──────────┐
           │     chief-ai-po     │ TRIAGE
           └──────────┬──────────┘
                      │
           ┌──────────┴──────────┐
           │    ai-architect     │ DESIGN
           └────┬──────────┬─────┘
                │          │
      ┌─────────┘          └──────────┐
      ▼                               ▼
 ┌──────────┐                 ┌───────────────┐
 │ ai-invert│                 │cost-estimator │
 └────┬─────┘                 └───────┬───────┘
      └───────────────┬───────────────┘
                      ▼
             ┌──────────────────┐
             │ genai-layoutplan │ PLAN (tags tasks with specialists)
             └────────┬─────────┘
                      │
       ┌──────────────┼────────────────┬──────────────┐
       ▼              ▼                ▼              ▼
 ┌────────────┐ ┌────────────┐ ┌───────────────┐ ┌──────────┐
 │ prompt-    │ │ rag-       │ │ integration-  │ │ agentic- │
 │ engineer   │ │ advisor    │ │ planner       │ │ designer │
 └─────┬──────┘ └─────┬──────┘ └───────┬───────┘ └────┬─────┘
       └──────────────┼────────────────┴──────────────┘
                      ▼ BUILD
      ┌───────────────┼─────────────┐
      ▼               ▼             ▼
┌────────────┐ ┌────────────┐ ┌───────────┐
│ eval-      │ │ guardrails-│ │ ai-ops    │
│ designer   │ │ designer   │ │ (monitor) │
└─────┬──────┘ └──────┬─────┘ └─────┬─────┘
      │          ┌────┴─────┐       │
      │          │ prompt-  │       │
      │          │adversary │       │
      │          └──────────┘       │
      └───────────────┬─────────────┘
                      ▼ HARDEN
                ┌────────────┐
                │demo-builder│ DELIVER
                └────────────┘

Feedback loops:
  eval fails ──→ prompt-engineer (fix prompt)
  guardrails fail ──→ ai-architect (fix design)
  cost overrun ──→ ai-architect (cheaper model)
  same gate fails twice ──→ HUMAN escalation
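The feedback loops listed in the map amount to a small routing table plus a repeat counter. The sketch below is one possible reading, not the plugin's code: `FeedbackRouter`, the gate keys (`"eval"`, `"guardrails"`, `"cost"`), and the `"HUMAN"` return value are hypothetical names chosen for illustration.

```python
from collections import Counter

# Failure type -> agent that receives the work back, per the feedback loops above.
FEEDBACK_ROUTES = {
    "eval": "prompt-engineer",     # eval fails: fix prompt
    "guardrails": "ai-architect",  # guardrails fail: fix design
    "cost": "ai-architect",        # cost overrun: cheaper model
}


class FeedbackRouter:
    """Routes a failed gate to an agent and escalates on a repeat failure."""

    def __init__(self) -> None:
        self.failures: Counter = Counter()

    def route(self, gate: str) -> str:
        self.failures[gate] += 1
        if self.failures[gate] >= 2:  # same gate fails twice: hand off to a human
            return "HUMAN"
        return FEEDBACK_ROUTES[gate]
```

Failures are counted per gate, so an eval failure followed by a cost overrun does not escalate; only a second failure of the same gate does. An unknown gate name raises `KeyError` in this sketch.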

What It Does

North Starr GenAI is the brain of an AI development agency. It doesn't generate code — it generates the specs, designs, evaluations, and guardrails that make AI code production-grade.

CLIENT (you) → gives requirement
NORTH STARR (brain) → plans, designs, validates, orchestrates, quality-gates
CLAUDE CODE (hands) → reads North Starr's specs + writes code in YOUR codebase
/genai-bootstrap → makes Claude Code aware of your specific codebase patterns

The Pipeline

Requirement
→ /assess (classify project type)
→ /discover (elicit requirements if needed)
→ /decompose (PRD → stories with AI safety criteria)
→ /orchestrate (start the pipeline)
  → TRIAGE: chief-ai-po refines story
  → DESIGN: ai-architect → ADR + cost envelope
  → PLAN: genai-layoutplan → tasks with specialist tags
  → BUILD: specialists produce specs → Claude Code implements
  → HARDEN: eval + guardrails + ops validate (ALL must pass)
  → DELIVER: demo-builder packages for client
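The orchestrated stages and the "ALL must pass" rule at HARDEN can be modeled as an ordered enum with a gate check. This is a rough sketch under assumptions: the `Stage` enum, the validator keys, and the choice to hold at HARDEN until every validator passes are illustrative, not taken from the plugin.

```python
from enum import Enum
from typing import Dict, Optional


class Stage(Enum):
    TRIAGE = 1
    DESIGN = 2
    PLAN = 3
    BUILD = 4
    HARDEN = 5
    DELIVER = 6


# Assumed keys for the three HARDEN validators named in the pipeline.
HARDEN_VALIDATORS = ("eval", "guardrails", "ops")


def harden_passed(results: Dict[str, bool]) -> bool:
    """HARDEN passes only if every validator reported a pass."""
    return all(results.get(name, False) for name in HARDEN_VALIDATORS)


def next_stage(
    current: Stage, harden_results: Optional[Dict[str, bool]] = None
) -> Optional[Stage]:
    """Return the following stage, or None once DELIVER is done."""
    if current is Stage.HARDEN and not harden_passed(harden_results or {}):
        return Stage.HARDEN  # assumption: hold here until all validators pass
    if current is Stage.DELIVER:
        return None
    return Stage(current.value + 1)
```

A missing validator result counts as a failure, so a stage cannot advance to DELIVER because a check was skipped.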

The AI Complexity Gate

Before ANY code change, the gate catches AI-specific risks:

| Question | Why |
|---|---|
| Is current behavior covered by evals? | Eval-first discipline |
| Does this touch a production prompt or model config? | Prompt changes are high-risk |
| Does this change what data the model sees? | Data changes alter model behavior |
| Does this affect a client-facing output? | Client-visible changes need baselines |
| Could this change cost at scale? | Cost is a first-class concern |

Based on the answers, the gate routes the change through: ASSESS → BUILD (specialists auto-spawn) → HARDEN (validators auto-run) → COMPLETE → LEARN.
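One way to picture the gate is as five recorded answers that yield a list of flagged concerns. The sketch below is illustrative only: `GateAnswers` and `flagged_risks` are hypothetical names, and the polarity is an assumption (the first question flags a risk when the answer is "no", the other four when the answer is "yes"). How the real gate turns answers into routing is not specified here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GateAnswers:
    covered_by_evals: bool
    touches_prompt_or_model_config: bool
    changes_model_input_data: bool
    affects_client_facing_output: bool
    could_change_cost_at_scale: bool


def flagged_risks(answers: GateAnswers) -> List[str]:
    """Return the 'Why' entry for each question whose answer signals risk."""
    risks = []
    if not answers.covered_by_evals:  # assumed: missing eval coverage is the risky answer
        risks.append("Eval-first discipline")
    if answers.touches_prompt_or_model_config:
        risks.append("Prompt changes are high-risk")
    if answers.changes_model_input_data:
        risks.append("Data changes alter model behavior")
    if answers.affects_client_facing_output:
        risks.append("Client-visible changes need baselines")
    if answers.could_change_cost_at_scale:
        risks.append("Cost is a first-class concern")
    return risks
```

An empty list would correspond to a change that raises none of the five concerns.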

Routing Hooks (Claude Code only)

The plugin ships two hooks that fire automatically: