Skill

ai-governance-review

Review AI/ML features for governance compliance — risk classification, bias assessment, transparency, and guardrail verification.

Install

npx claudepluginhub hpsgd/turtlestack --plugin grc-lead

Tool Access

This skill is limited to using the following tools:

Read, Bash, Glob, Grep

Preview

Review AI governance for $ARGUMENTS.

SKILL.md

Last commit: Apr 16, 2026

Step 1: AI Use Case Classification (MANDATORY)

| Risk level | Criteria | Examples |
|---|---|---|
| Low | Human reviews output, no decisions about individuals, limited blast radius | Content generation, code completion, summarisation |
| Medium | Customer-facing responses, human approves decisions, moderate blast radius | Chatbot responses, data analysis, recommendations |
| High | Decisions affecting individuals, human makes final decision, wide blast radius | Financial decisions, access control, hiring/screening |
| Prohibited | Autonomous decisions affecting rights, safety, or wellbeing without human oversight | Fully autonomous hiring, unsupervised medical diagnosis |
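
The classification criteria above can be read as a small decision procedure. The sketch below is one possible encoding, not part of the skill itself: the `UseCase` fields are hypothetical simplifications of the table's criteria, and blast radius is left out because the table gives no way to measure it.

```python
from dataclasses import dataclass
from enum import Enum


class RiskLevel(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"
    PROHIBITED = "Prohibited"


@dataclass(frozen=True)
class UseCase:
    # Hypothetical fields; the table does not define a schema.
    affects_individuals: bool       # output feeds decisions about specific people
    affects_rights_or_safety: bool  # rights, safety, or wellbeing are at stake
    human_oversight: bool           # a human reviews or makes the final decision
    customer_facing: bool           # output is shown directly to customers


def classify(use_case: UseCase) -> RiskLevel:
    """Return the first matching level, checking the most severe row first."""
    if use_case.affects_rights_or_safety and not use_case.human_oversight:
        return RiskLevel.PROHIBITED
    if use_case.affects_individuals:
        # Assumption: any decision about individuals is at least High,
        # even when the table's other High criteria are not all met.
        return RiskLevel.HIGH
    if use_case.customer_facing:
        return RiskLevel.MEDIUM
    return RiskLevel.LOW
```

Checking the most severe row first means a use case that matches several rows always receives the stricter classification. A reviewer would still record the reasoning in prose, since the booleans hide judgment calls.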

Step 2: Risk Identification

| Risk | Questions to answer | Evidence to check |
|---|---|---|
| Bias | Does the model produce different outcomes for different demographic groups? Is the training/context data representative? When input data includes demographic-adjacent fields (age, location, name, gender) and the use case affects individuals (hiring, lending, access control), rate bias risk as High minimum. | Evaluation results across segments, training data composition, demographic-adjacent field inventory |
| Hallucination | Does the model present false information as fact? Is output grounded in retrieved context? | Factual accuracy eval, citation rate, grounding mechanism |
| Privacy | Does PII enter prompts? Can the model leak PII from context? Is consent obtained? | Grep for PII in prompt templates, data flow analysis |
| Transparency | Do users know AI is involved? Can they understand how decisions are made? | User disclosure, explainability mechanism |
| Dependency | Single model/provider? What happens if the provider is unavailable? | Provider count, fallback mechanism, vendor risk |
| Cost | Per-request cost? Monthly budget? Rate limits in place? | Cost monitoring, budget alerts, rate limiting code |
| Security | Prompt injection resistance? Data exfiltration prevention? Input/output validation? | Input sanitisation, output filtering, sandboxing |
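
The bias row sets a floor: demographic-adjacent inputs combined with decisions about individuals force the rating to at least High. A minimal sketch of that floor follows. The function name and the exact-match lookup on field names are assumptions; a real schema may use names such as `date_of_birth` that this simple check would miss.

```python
from typing import Iterable

# Field names taken from the bias row; matching on exact names is an assumption.
DEMOGRAPHIC_ADJACENT = {"age", "location", "name", "gender"}
LEVELS = ["Low", "Medium", "High"]  # ordered from least to most severe


def bias_rating(assessed: str, input_fields: Iterable[str], affects_individuals: bool) -> str:
    """Raise the assessed bias rating to High when the floor rule applies."""
    has_adjacent = any(field.lower() in DEMOGRAPHIC_ADJACENT for field in input_fields)
    if has_adjacent and affects_individuals:
        # Keep whichever of the two ratings is more severe.
        return max(assessed, "High", key=LEVELS.index)
    return assessed
```

For instance, `bias_rating("Low", ["name", "cv_text"], affects_individuals=True)` returns `"High"`, while the same fields in a use case that does not affect individuals keep their assessed rating.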

Step 3: Requirements Check by Risk Level

| Requirement | Low | Medium | High | Evidence to check |
|---|---|---|---|---|
| Human review of output | Recommended | Required | Mandatory | Review workflow, approval gates |
| Evaluation suite | Basic | Comprehensive | Comprehensive + adversarial | Eval scripts, test cases, results |
| Bias testing | Optional | Required | Required + external audit | Bias eval results, demographic splits |
| Audit trail | Logs | Logs + I/O recording | Full provenance trail | Logging config, storage, retention |
| Fallback mechanism | Graceful error | Alternative path | Human takeover | Error handling code, fallback logic |
| Cost controls | Monthly budget | Per-request limits | Per-request + approval for high-cost | Rate limiting, budget alerts |
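
Once a risk level is assigned, the requirements table works as a lookup. The sketch below transcribes it into a dictionary. The name `required_controls` is hypothetical, and raising an error for Prohibited is an assumption based on the table having no column for that level.

```python
# Transcribed from the requirements table; keys are requirement names.
REQUIREMENTS = {
    "Human review of output": {"Low": "Recommended", "Medium": "Required", "High": "Mandatory"},
    "Evaluation suite": {"Low": "Basic", "Medium": "Comprehensive", "High": "Comprehensive + adversarial"},
    "Bias testing": {"Low": "Optional", "Medium": "Required", "High": "Required + external audit"},
    "Audit trail": {"Low": "Logs", "Medium": "Logs + I/O recording", "High": "Full provenance trail"},
    "Fallback mechanism": {"Low": "Graceful error", "Medium": "Alternative path", "High": "Human takeover"},
    "Cost controls": {"Low": "Monthly budget", "Medium": "Per-request limits", "High": "Per-request + approval for high-cost"},
}


def required_controls(risk_level: str) -> dict:
    """Return the required level of each control for a Low, Medium, or High use case."""
    if risk_level not in ("Low", "Medium", "High"):
        # Assumption: Prohibited use cases have no compliant configuration.
        raise ValueError(f"No requirements defined for risk level {risk_level!r}")
    return {name: levels[risk_level] for name, levels in REQUIREMENTS.items()}
```

The returned dictionary can seed the "Required" column of the Requirements Compliance table in the output format.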

Step 4: Technical Guardrail Verification

| Guardrail | Status | Evidence |
|---|---|---|
| Input validation | PRESENT / ABSENT | [file:line or "not found"] |
| Output validation | PRESENT / ABSENT | [file:line] |
| Rate limiting | PRESENT / ABSENT | [file:line] |
| PII filtering | PRESENT / ABSENT | [file:line] |
| Fallback mechanism | PRESENT / ABSENT | [file:line] |
| Cost monitoring | PRESENT / ABSENT | [file:line or dashboard] |
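
A first pass over the guardrail table can be automated with a keyword scan that records the first `file:line` hit per guardrail. The sketch below is illustrative only: the regular expressions are hypothetical placeholders, the default glob assumes a Python codebase, and a keyword hit is a lead for the reviewer to read, not proof that the guardrail works.

```python
import re
from pathlib import Path

# Hypothetical search patterns; each codebase will need its own.
GUARDRAIL_PATTERNS = {
    "Input validation": re.compile(r"validate_input|sanitize", re.I),
    "Output validation": re.compile(r"validate_output|output_filter", re.I),
    "Rate limiting": re.compile(r"rate_limit|ratelimit", re.I),
    "PII filtering": re.compile(r"redact|pii", re.I),
    "Fallback mechanism": re.compile(r"fallback", re.I),
    "Cost monitoring": re.compile(r"budget|cost_tracker", re.I),
}


def check_guardrails(root: Path, glob: str = "**/*.py") -> dict:
    """Map each guardrail to (status, evidence), using the first matching line."""
    results = {name: ("ABSENT", "not found") for name in GUARDRAIL_PATTERNS}
    for path in sorted(root.glob(glob)):
        if not path.is_file():
            continue
        lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for name, pattern in GUARDRAIL_PATTERNS.items():
                # Only record the first hit so the evidence stays stable.
                if results[name][0] == "ABSENT" and pattern.search(line):
                    results[name] = ("PRESENT", f"{path.relative_to(root)}:{lineno}")
    return results
```

Calling `check_guardrails(Path("."))` from the repository root returns one `(status, evidence)` pair per row of the table, ready to paste into the Technical Guardrails section of the report.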

Step 5: Model Governance Check

| Check | Question to answer | Evidence |
|---|---|---|
| Documented owner | Is someone accountable for this model's behaviour? | Team/person named |
| Evaluation suite | Does an eval set exist? When was it last run? | Eval scripts, results, date |
| Cost budget | Is there a defined per-request and monthly cost limit? | Budget config, alerts |
| Version control | Are prompts versioned in the repo (not edited in dashboards)? | Prompt files in git |
| Change process | Do prompt/model changes run through eval before deployment? | CI pipeline, review process |

Step 6: Data Governance for AI

| Question | Finding |
|---|---|
| What data enters AI prompts? | [list data types] |
| Does any PII enter prompts? | [yes/no; if yes, is consent obtained?] |
| Where is prompt/response data stored? | [location, retention period] |
| Are AI outputs used to make decisions about individuals? | [yes/no; if yes, what is the review process?] |
| Can individuals request deletion of their data from AI systems? | [yes/no; mechanism] |
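
To help answer whether PII enters prompts, a reviewer can list the placeholders in each prompt template and flag those whose names suggest personal data. The sketch below assumes templates use Python `str.format` placeholders. The hint list and the function name `pii_placeholders` are hypothetical and would need tuning for each codebase.

```python
from string import Formatter

# Hypothetical substrings that suggest a placeholder carries personal data.
PII_HINTS = ("name", "email", "phone", "address", "dob", "ssn")


def pii_placeholders(template: str) -> list:
    """Return placeholder names in a str.format template that look like PII."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    fields = [field for _, field, _, _ in Formatter().parse(template) if field]
    return [f for f in fields if any(hint in f.lower() for hint in PII_HINTS)]
```

A non-empty result does not settle the question by itself; it points the reviewer at the data flow to trace and the consent basis to confirm.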

Output Format

# AI Governance Review: [feature/use case]

## Classification

- **Use case:** [description]
- **Risk level:** [Low / Medium / High / Prohibited]
- **Reasoning:** [why this classification]
- **Review date:** [date]
- **Reviewer:** [who performed this review]

## Deployment Decision

- **Decision:** [APPROVED / CONDITIONALLY APPROVED / BLOCKED]
- **Conditions (if conditional):** [specific gaps that must be closed before deployment]
- **Blocking gaps (if blocked):** [which findings prevent deployment and what remediation is required]

## Risk Assessment

| Risk category | Level | Controls in place | Gaps |
|---|---|---|---|
| Bias | Low/Medium/High (High minimum if demographic-adjacent + individual decisions) | [controls] | [gaps] |
| Hallucination | Low/Medium/High | [controls] | [gaps] |
| Privacy | Low/Medium/High | [controls] | [gaps] |
| Transparency | Low/Medium/High | [controls] | [gaps] |
| Dependency | Low/Medium/High | [controls] | [gaps] |
| Cost | Low/Medium/High | [controls] | [gaps] |
| Security | Low/Medium/High | [controls] | [gaps] |

## Requirements Compliance

| Requirement | Required | Status | Evidence |
|---|---|---|---|
| Human review | [level] | MET/PARTIAL/GAP | [evidence] |
| Evaluation suite | [level] | MET/PARTIAL/GAP | [evidence] |
| Bias testing | [level] | MET/PARTIAL/GAP | [evidence] |
| Audit trail | [level] | MET/PARTIAL/GAP | [evidence] |
| Fallback | [level] | MET/PARTIAL/GAP | [evidence] |
| Cost controls | [level] | MET/PARTIAL/GAP | [evidence] |

## Technical Guardrails

| Guardrail | Status | Evidence |
|---|---|---|
| [guardrail] | PRESENT/ABSENT | [file:line] |

## Model Governance

| Check | Status | Evidence |
|---|---|---|
| [check] | MET/GAP | [detail] |

## Data Governance

| Question | Finding |
|---|---|
| [question] | [answer] |

## Remediation Plan

| Gap | Severity | Remediation | Owner | Target date |
|---|---|---|---|---|
| [gap] | [level] | [action] | [person] | [date] |

## Review Schedule

- **Next review:** [date or trigger]
- **Review triggers:** [model changes, new data sources, regulatory updates, incidents]

Process (sequential — do not skip steps)

Step 1: AI Use Case Classification (MANDATORY)

Step 2: Risk Identification

Step 3: Requirements Check by Risk Level

Step 4: Technical Guardrail Verification

Step 5: Model Governance Check

Step 6: Data Governance for AI

Step 7: Findings and Remediation

Anti-Patterns (NEVER do these)

Output Format

Related Skills
