Plugin

flaker

Name: flaker
Author: mizchi

Set up the flaker test-intelligence CLI in a repository with GitHub Actions integration for Playwright E2E/VRT, then manage flaky tests through daily sampling runs, metrics review, advisory or required CI gates, test promotions and demotions, PR budget tuning, nightly triage, and quarantine tagging.

npx claudepluginhub mizchi/flaker --plugin flaker

Component Overview

Skills

Component Details

Skills (2)

flaker-management

/flaker-management

Operate @mizchi/flaker after setup. Use when the user asks how to run flaker day-to-day, review sampling and flaky metrics, design advisory vs required CI gates, promote or demote Playwright E2E or VRT checks, tune PR time budgets, run nightly triage, or manage quarantine and `@flaky` tags in an OSS repository. Targets @mizchi/flaker 0.7.0+ (declarative apply model).

flaker-setup

/flaker-setup

Set up @mizchi/flaker on a new repository. Use when the user asks to introduce flaker, configure flaker.toml, integrate flaker into GitHub Actions, or "start using flaker on this project". Encodes the onboarding flow for @mizchi/flaker 0.7.0+ (declarative apply model).

README

flaker

flaker is a test-intelligence toolkit for:

sampling a smaller local test run from history and changed files
detecting flaky tests in noisy CI environments
measuring how well local sampled runs predict CI
embedding the same core logic in MoonBit as a library

It is designed for repositories where:

the full test suite is too expensive to run on every change
CI failures are noisy because flaky tests are mixed with real regressions
developers need a smaller local test run that still correlates well with CI
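To make the "flaky tests mixed with real regressions" problem concrete, here is a minimal sketch of one common heuristic for separating the two. It is an illustration only, not flaker's actual detection logic: the `TestResult` shape and the `findFlakySuspects` function are hypothetical names introduced for this example. A test is treated as a flaky suspect when the same commit produced both a pass and a failure, or when it passed only after a retry; a test that fails consistently on a commit looks like a real regression instead.

```typescript
// Hypothetical record shape for illustration; not flaker's actual schema.
interface TestResult {
  testId: string;
  commit: string;
  status: "passed" | "failed";
  retryCount: number;
}

// Returns the sorted ids of tests that look flaky rather than broken.
function findFlakySuspects(results: TestResult[]): string[] {
  // Outcomes seen per (test, commit) pair.
  const outcomes = new Map<string, Set<string>>();
  const suspects = new Set<string>();

  for (const r of results) {
    // Passing only after a retry is a direct flakiness signal.
    if (r.status === "passed" && r.retryCount > 0) {
      suspects.add(r.testId);
    }
    const key = `${r.testId}\u0000${r.commit}`;
    const seen = outcomes.get(key) ?? new Set<string>();
    seen.add(r.status);
    outcomes.set(key, seen);
    // Same code, different outcomes: the test itself is unstable.
    if (seen.size > 1) {
      suspects.add(r.testId);
    }
  }
  return Array.from(suspects).sort();
}

const history: TestResult[] = [
  { testId: "login", commit: "abc", status: "failed", retryCount: 0 },
  { testId: "login", commit: "abc", status: "passed", retryCount: 0 },
  { testId: "checkout", commit: "abc", status: "failed", retryCount: 0 },
  { testId: "checkout", commit: "abc", status: "failed", retryCount: 0 },
  { testId: "search", commit: "abc", status: "passed", retryCount: 1 },
];

console.log(findFlakySuspects(history)); // [ 'login', 'search' ]
```

In this sample, `checkout` fails on every run of the same commit, so it is left out of the suspect list and would be investigated as a regression.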

flaker helps answer:

Which tests should I run for this change?
How much can I shrink local execution without losing too much confidence?
Which tests are actually flaky?
How well does local sampled execution predict CI outcomes?
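The last two questions can be turned into numbers. The sketch below is one possible way to do that, under stated assumptions: the `ChangeOutcome` shape, the `samplingKpi` function, and the two metric names are hypothetical and are not taken from flaker's API. It measures how many CI failures the local sample would have covered (failure recall) against how much of the suite the sample actually ran (sample ratio).

```typescript
// Hypothetical per-change record for illustration; not flaker's schema.
interface ChangeOutcome {
  sampled: string[];  // tests selected for the local run
  ciFailed: string[]; // tests that failed in the full CI run
  totalTests: number; // size of the full suite for this change
}

interface SamplingKpi {
  failureRecall: number;  // share of CI failures that the sample included
  avgSampleRatio: number; // average share of the suite that was run locally
}

function samplingKpi(changes: ChangeOutcome[]): SamplingKpi {
  let ciFailures = 0;
  let caught = 0;
  let ratioSum = 0;

  for (const c of changes) {
    const sampled = new Set(c.sampled);
    for (const t of c.ciFailed) {
      ciFailures += 1;
      if (sampled.has(t)) caught += 1;
    }
    ratioSum += c.totalTests > 0 ? c.sampled.length / c.totalTests : 0;
  }

  return {
    // With no CI failures there was nothing to miss, so recall is 1.
    failureRecall: ciFailures === 0 ? 1 : caught / ciFailures,
    avgSampleRatio: changes.length === 0 ? 0 : ratioSum / changes.length,
  };
}

const kpi = samplingKpi([
  { sampled: ["a", "b"], ciFailed: ["a"], totalTests: 8 },
  { sampled: ["c"], ciFailed: ["c", "d"], totalTests: 4 },
]);
console.log(kpi);
```

A high recall at a low sample ratio suggests the local run can shrink further; a falling recall suggests the sample has become too small or the selection weights need tuning.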

Upgrading from 0.0.x / 0.1.x? See docs/how-to-use.md#config-migration for the full key rename map. Starting with 0.2.0, the CLI refuses to start on legacy configs and points to the migration guide.

Upgrading from 0.4.x? See docs/migration-0.4-to-0.5.md or docs/migration-0.4-to-0.5.ja.md. 0.5.x keeps existing profiles working, but the recommended user-facing commands are now gate-oriented.

Install as a CLI

pnpm add -D @mizchi/flaker

Or run it without installing:

pnpm dlx @mizchi/flaker --help

Requirements:

Node.js 24+
pnpm 10+

Install as a Claude Code plugin

This repo also ships a Claude Code plugin with two skills:

flaker-setup: Introduce flaker on a fresh repository. Day 0 → Week 4 onboarding flow, decision points, copy-paste commands, and pitfalls.
flaker-management: Operate flaker after setup. Advisory vs required gating, nightly triage, quarantine, flaky tag management, and staged Playwright E2E / VRT rollout.

# In Claude Code
/plugin marketplace add mizchi/flaker
/plugin install flaker@flaker

Then ask the agent something like:

"I want to set up flaker on a new project"
"I want to decide the conditions for promoting flaker's advisory gate to required"
"I want to design nightly triage for E2E VRT"

Reference documents (each is available in English and Japanese):

Setup reference checklist: docs/new-project-checklist.md, docs/new-project-checklist.ja.md
0.4.x -> 0.5.x migration guide: docs/migration-0.4-to-0.5.md, docs/migration-0.4-to-0.5.ja.md
User guide: docs/usage-guide.md, docs/usage-guide.ja.md
Operations guide: docs/operations-guide.md, docs/operations-guide.ja.md
Operations quick start: docs/flaker-management-quickstart.md, docs/flaker-management-quickstart.ja.md

Use as a MoonBit Library

flaker also publishes a MoonBit library surface at mizchi/flaker.

The root package re-exports both:

pure computation APIs
the shared contract types they consume and return

If you prefer a stricter import boundary, the same types are still available from mizchi/flaker/contracts.

import {
  "mizchi/flaker" @flaker,
}

test "sample from historical runs" {
  let meta = @flaker.build_sampling_meta(
    [
      @flaker.SamplingHistoryRowInput::{
        suite: "tests/login.spec.ts",
        test_name: "login works",
        task_id: Some("web-login"),
        filter: None,
        variant: None,
        test_id: None,
        status: "passed",
        retry_count: 0,
        duration_ms: 1200,
        created_at: "2026-04-03T00:00:00.000Z",
      },
    ],
    [
      @flaker.SamplingListedTestInput::{
        suite: "tests/login.spec.ts",
        test_name: "login works",
        task_id: Some("web-login"),
        filter: None,
        variant: None,
        test_id: None,
      },
    ],
  )

  let sampled = @flaker.sample_weighted(meta, count=1, seed=1UL)
  assert_eq(sampled.length(), 1)
}
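The `seed` argument in the example above suggests that sampling is reproducible. As a rough illustration of what seeded, weighted sampling without replacement can look like, here is a self-contained TypeScript sketch. It is not the implementation behind `sample_weighted`: the `Candidate` shape, the `sampleWeighted` function, and the idea of deriving weights from past failures are assumptions made for this example, and the PRNG is the widely used mulberry32 generator.

```typescript
// Small deterministic PRNG (mulberry32): same seed, same sequence in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical candidate shape; the weight might come from past failure rates.
interface Candidate {
  id: string;
  weight: number;
}

// Picks up to `count` distinct ids, each draw proportional to remaining weights.
function sampleWeighted(cands: Candidate[], count: number, seed: number): string[] {
  const rand = mulberry32(seed);
  const pool = cands.filter((c) => c.weight > 0);
  const picked: string[] = [];

  while (picked.length < count && pool.length > 0) {
    const total = pool.reduce((sum, c) => sum + c.weight, 0);
    let r = rand() * total;
    let idx = pool.length - 1; // fallback guards against floating-point drift
    for (let i = 0; i < pool.length; i++) {
      r -= pool[i].weight;
      if (r < 0) {
        idx = i;
        break;
      }
    }
    picked.push(pool[idx].id);
    pool.splice(idx, 1); // without replacement: a test is picked at most once
  }
  return picked;
}

const candidates: Candidate[] = [
  { id: "tests/login.spec.ts", weight: 5 },
  { id: "tests/cart.spec.ts", weight: 1 },
  { id: "tests/search.spec.ts", weight: 1 },
];

console.log(sampleWeighted(candidates, 2, 1));
```

Because the generator is seeded, two calls with the same inputs and seed return the same selection, which makes a sampled run reproducible when a failure needs to be investigated.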

The root library surface intentionally re-exports pure logic only:

flaky detection: detect_flaky
sampling: build_sampling_meta, sample_random, sample_weighted, sample_hybrid
affected analysis: resolve_affected, build_affected_report, build_affected_report_from_input
stable identity: create_stable_test_id, resolve_test_identity
graph helpers: find_affected_nodes, expand_transitive, topological_sort
report reducers: summarize_report, classify_report_diff, aggregate_report
policy: summarize_quarantine, compute_quarantine_exit_code, run_config_check
metrics: build_sampling_kpi

Contracts remain separate so the API boundary stays explicit and reusable from other packages.

Experimental Direct MoonBit CLI

View full README on GitHub


Stats

Version: 0.3.0
Stars: 11
Forks: 1
Maintenance: Excellent
License: MIT
Last Commit: May 4, 2026
Added: Apr 10, 2026


