Skill

watch

Starts EvalView watch mode to auto-re-run regression checks when project files change, displaying live scorecards with pass/fail and score deltas.

Python

testing

developer-tools

Popularity

Stars

114

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/evalview:watch

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use this skill when the user wants continuous regression monitoring during development. Watch mode observes file changes and automatically re-runs `evalview check` with debounced triggers.

SKILL.md

62 lines · ~577 tokens

Stats

LanguagePython

Stars114

Forks20

MaintenanceExcellent

Last CommitJun 15, 2026

Actions

View Source View Plugin View on GitHub View README

Watch Mode

Use this skill when the user wants continuous regression monitoring during development. Watch mode observes file changes and automatically re-runs evalview check with debounced triggers.

What this does

EvalView's watch mode uses watchdog to monitor directories for file changes (.py, .yaml, .yml, .json, .md, .txt, .toml, .cfg, .ini). When a change is detected, it runs a regression check via the gate() API and displays a live scorecard with pass/fail status, score deltas, tool changes, and streak tracking.

How to start watch mode

Watch mode is a CLI command (not an MCP tool). Help the user run it:

evalview watch

Common options

--quick — Skip LLM judge, deterministic checks only ($0 cost, sub-second)
--path src/ --path tests/ — Watch specific directories (default: current directory)
--test "my-test" — Only check a specific test by name
--test-dir tests/evalview — Path to test cases directory (default: tests)
--interval 1 — Debounce interval in seconds (default: 2.0)
--fail-on REGRESSION,TOOLS_CHANGED — Comma-separated statuses that count as failure (default: REGRESSION)
--sound — Terminal bell on regression

Examples

# Basic: watch everything, full checks
evalview watch

# Fast development loop: no LLM judge, 1-second debounce
evalview watch --quick --interval 1

# Watch specific directories and one test
evalview watch --path src/ --path tests/ --test "calculator-division"

# Strict mode: fail on any behavioral change
evalview watch --fail-on REGRESSION,TOOLS_CHANGED,OUTPUT_CHANGED --sound

Prerequisites

Watch mode requires the watchdog package. If not installed:

pip install evalview[watch]

Notes

Watch mode excludes .evalview/, .git/, venv/, node_modules/, __pycache__/, and other common non-source directories automatically.
The initial check runs immediately on startup before watching begins.
Results include a live scorecard with pass counts, regression counts, health percentage, and streak info.
--quick mode is ideal for tight development loops since it costs nothing and runs in sub-second time.

watch

Popularity

Invocation

Context Preview

SKILL.md

watch

Popularity

Invocation

Context Preview

SKILL.md

Watch Mode

What this does

How to start watch mode

Common options

Examples

Prerequisites

Notes

Similar Skills

Watch Mode

What this does

How to start watch mode

Common options

Examples

Prerequisites

Notes

Similar Skills