newsroom-graph-audit

Markdown-driven validation + cache-regeneration capability for an installed newsroom. It walks the newsroom's primitive zones, validates each artifact's frontmatter against the bundled schemas, mechanically regenerates the canonical-vs-cache fields per the cardinality rules, runs the two-path provenance integrity audit over FROZEN edition snapshots, checks the reserved status-vocabulary separation (R61), and checks tag-vocabulary compliance — then writes a drift report. Manifest-first: it reads MANIFEST.md before globbing zones, and treats the meta/*-index caches as subordinate to the manifest (manifest wins on conflict). All JUDGMENT (tag normalization, provenance pass/fail interpretation, orphan/editorial-debt routing) stays in agent reasoning; the one bundled script does ONLY the mechanical cache recompute (E3 fence). Idempotent: a re-run on a clean newsroom produces a zero-drift report and no artifact writes. Use when the user asks to "audit the newsroom graph", "regenerate caches", "regenerate the meta indexes", "check provenance", "verify the provenance chain", "check canonical-vs-cache drift", "check status vocabulary", "check tag compliance", "find orphans", "graph audit", or any request about newsroom graph completeness. Also runs as a post-step after a distribution publish, as a preflight in the managing-editor briefing, and as the installer's P5 zero-drift dry-run baseline.

Invocation

Slash command: /newsroom-os:newsroom-graph-audit

User invocable, model invocable, inline context, default effort

Tool Access

This skill is limited to the following tools: Read, Write, Bash, Grep, Glob

Supporting Files

README.md
references/audit-runbook.md
references/cache-regen-runbook.md
references/drift-report-format.md
references/provenance-walk-runbook.md
scripts/regen_caches.py

SKILL.md

251 lines · ~3.5k tokens

Stats

Language: Python
Parent stars: 0
Maintenance: Excellent
Last commit: Jun 27, 2026

newsroom-graph-audit

The auditor that keeps an installed newsroom's graph in shape over time. It is a skill, not a CLI — the agent reads the invocation, picks the scope from how the user phrased the trigger, runs the relevant passes, and writes a drift report. All intelligence is the agent reading frontmatter and making judgments. The single bundled script does only the mechanical cache recompute (E3 fence below).

This skill operates on an installed newsroom (a project-workspace-contract@2 workspace). It reads the newsroom's MANIFEST.md, the installed schema + contract copies the installer wrote, and the installed meta/*-index caches. Its own runtime references resolve inside this plugin dir (R63): the audit-pass runbooks under this skill's references/ and the bundled cache-recompute script under scripts/. It conforms to the schemas/contracts the substrate ships (bundled children under newsroom-install/templates/substrate/); the operative copies at audit time are the newsroom's installed ones.

Phase 0 — manifest-first (R62, AC11)

Read MANIFEST.md at the newsroom root FIRST, before any files-first globbing of the primitive zone folders. The manifest is the routing surface: it lists which artifacts exist, their kind / status / path / edges. Discover artifacts via the manifest; use the zone folders (newsroom/dossiers/, newsroom/story-arcs/, newsroom/wire/, newsroom/commissions/, newsroom/campaigns/, newsroom/editions/, newsroom/editorial-plans/) only to resolve the concrete files the manifest points at. Files-first discovery — globbing a zone or ls-ing a directory to learn what exists — is the refused anti-pattern.

The meta/*-index caches (kind: newsroom-index) are DERIVED caches subordinate to MANIFEST.md: on any disagreement, the manifest wins and the index is stale and corrected, never trusted over the manifest. The audit regenerates the indexes; it never hand-authors them.
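The manifest-wins rule can be pictured as a one-way comparison. The following is a minimal sketch, assuming the manifest and one index cache have already been parsed into dicts keyed by artifact path; `reconcile_index` and the entry shape are hypothetical illustrations, not the skill's actual data model.

```python
from __future__ import annotations


def reconcile_index(
    manifest: dict[str, dict], index: dict[str, dict]
) -> tuple[dict[str, dict], list[str]]:
    """Return (corrected_index, stale_paths); the manifest wins on any conflict.

    Hypothetical shapes: both arguments map an artifact path to its routing
    fields (kind, status, ...), already parsed from MANIFEST.md and a cache.
    """
    stale = sorted(
        path
        for path in set(manifest) | set(index)
        if manifest.get(path) != index.get(path)
    )
    # The index is never merged or trusted: it is rebuilt from the manifest.
    corrected = {path: dict(entry) for path, entry in manifest.items()}
    return corrected, stale
```

An empty `stale` list means the index was already in sync, so no index write is needed.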

What the audit does (six passes)

The agent runs these passes; the per-pass procedure lives in the bundled runbooks under references/ (read the relevant one before executing). The passes run in this order so later comparisons aren't confused by stale caches.

(a) Frontmatter schema validation

For each artifact the manifest lists, validate frontmatter against the bundled schema for its kind (wire-note, dossier, story-arc, editorial-plan, commission, campaign, edition): required + conditionally-required fields present, enum membership, the validation rules each schema states. Surface violations as drift; never auto-fix a canonical field — flag it for the canonical writer (topic-editor, managing-editor, distribution, or the user).
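As an illustration of what "required fields present, enum membership" amounts to, here is a minimal sketch. The `KindSchema` shape and the choice of required fields are assumptions made for the example; the real rules live in the installed schemas, which are not reproduced here. The function only reports findings and never modifies the frontmatter.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class KindSchema:
    """Hypothetical, simplified stand-in for one installed schema."""
    required: set[str]
    enums: dict[str, set[str]] = field(default_factory=dict)


def validate_frontmatter(frontmatter: dict, schema: KindSchema) -> list[str]:
    """Return drift findings; canonical fields are flagged, never fixed."""
    findings = []
    for name in sorted(schema.required - set(frontmatter)):
        findings.append(f"missing required field: {name}")
    for name, allowed in sorted(schema.enums.items()):
        if name in frontmatter and frontmatter[name] not in allowed:
            findings.append(f"{name}: {frontmatter[name]!r} not in enum")
    return findings
```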

(b) Canonical→cache regeneration (the MECHANICAL pass — E3)

Regenerate the cache fields from their canonical sources per cardinality-rules.md, then rebuild the meta/*-index caches. This is the only pass that writes, and it writes only cache fields (and the indexes) — never a canonical field. The mechanical recompute is delegated to the bundled scripts/regen_caches.py (E3 fence): it re-derives the cache fields and detects drift, with no editorial judgment. The graph-audit-owned cache fields are enumerated in cardinality-rules.md §"Graph-audit-owned cache fields". Respect any frozen_at: marker on archived campaigns (skip cache regen for frozen campaigns).
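The mechanical character of this pass can be sketched as follows. This is not the contents of scripts/regen_caches.py; it is a minimal illustration, assuming the derived values have already been computed from canonical sources. The `edition_count` field used in the usage note is a made-up cache field, and skipping every artifact that carries `frozen_at` is a simplification of the frozen-campaign rule.

```python
def regen_cache_fields(artifact: dict, derived: dict) -> list[str]:
    """Overwrite stale cache fields in place; return the names that drifted.

    `derived` holds the freshly recomputed cache values (hypothetical input).
    Only keys present in `derived` are ever written, so canonical fields
    are untouched.
    """
    if artifact.get("frozen_at"):
        return []  # frozen: keep the cached values stable, report no drift
    drifted = [name for name, value in derived.items() if artifact.get(name) != value]
    for name in drifted:
        artifact[name] = derived[name]
    return drifted
```

Calling it twice with the same `derived` values, for example `{"edition_count": 2}`, returns the drifted names the first time and an empty list the second time, which is the no-op behavior an idempotent re-run needs.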

(c) Provenance integrity audit (two typed paths, FROZEN snapshots)

For each published edition, walk the provenance chain over the edition's FROZEN snapshot (included_story_arcs / included_quick_notes), selecting the path by item type:

Full-story path (per included_story_arcs[] entry): Edition → Commission → Story-Arc → Dossier → Wire → anchor. Every step must resolve to the next; the terminal wire must resolve to a concrete url / corpus_note / raw anchor.
Quick-note path (per included_quick_notes[] entry): Edition → included_quick_note → Wire (≥1 of wire_refs[]) → anchor. No arc, no required dossier; an empty wire_refs[] is a dead-end. When memory_disposition: promote_to_dossier, additionally check promoted_dossier resolves (enrichment edge, not a required link).

A dead-end on whichever path applies is reviewer-refusable — the audit refuses (records a HIGH-severity finding) but does NOT autonomously transition any status: / *_state: or act as a workflow gate beyond that refusal (R32 / R68 / P9). It validates the FROZEN snapshot, distinct from the live caches. Wikilinks that break over time (an upstream arc later renamed/retired) are flag-not-fail — reported, but they do not invalidate the historical edition. Path selection + walk behavior are in references/provenance-walk-runbook.md and io-contracts.md.
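The two paths can be sketched as two small walks over already-loaded artifacts. This is an illustration only, under loud assumptions: the `upstream` key linking each step to the next is hypothetical (the real edge fields are defined in io-contracts.md), and artifacts are assumed to be held in a dict keyed by id. Interpreting a reported dead-end remains agent judgment.

```python
from __future__ import annotations

ANCHOR_FIELDS = ("url", "corpus_note", "raw")  # anchor kinds named in the text


def has_anchor(wire: dict) -> bool:
    return any(wire.get(name) for name in ANCHOR_FIELDS)


def walk_full_story(commission_id: str, artifacts: dict[str, dict]) -> str | None:
    """Commission -> Story-Arc -> Dossier -> Wire -> anchor.

    Returns None when the chain resolves, else a dead-end description.
    'upstream' is a hypothetical edge field used only for this sketch.
    """
    current = commission_id
    for kind in ("commission", "story-arc", "dossier", "wire-note"):
        node = artifacts.get(current)
        if node is None or node.get("kind") != kind:
            return f"dead-end: expected {kind} at {current!r}"
        if kind != "wire-note":
            current = node.get("upstream")
    return None if has_anchor(node) else f"dead-end: wire {current!r} has no anchor"


def walk_quick_note(note: dict, artifacts: dict[str, dict]) -> str | None:
    """Quick note -> at least one wire in wire_refs[] -> anchor."""
    refs = note.get("wire_refs") or []
    if not refs:
        return "dead-end: empty wire_refs"
    if any(has_anchor(artifacts.get(ref, {})) for ref in refs):
        return None
    return "dead-end: no wire_ref resolves to an anchor"
```

The walks take the frozen snapshot's entries as their starting points, so a later rename in the live graph does not change the result for a historical edition.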

(d) Status-vocabulary separation (R61)

Check the reserved status: rule on every artifact. status: carries only the closed enum draft | review | greenlit | published | archived | deprecated, and each primitive's own lifecycle lives on its dedicated *_state: field (dossier_state, arc_state, plan_state, campaign_state, delivery_state). A *_state value written into status: (or vice versa), or a *_state: value appearing in a manifest routing status:, is DUAL-VOCABULARY-DRIFT and is reviewer-refusable.
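The status: side of this check is a plain enum-membership test. A minimal sketch using the enum stated above; `check_status_vocabulary` is a hypothetical name. The reverse direction (a reserved status value written into a *_state: field) needs each primitive's own state enum from the installed schemas, which is not reproduced here, so the sketch leaves it out.

```python
RESERVED_STATUS = {"draft", "review", "greenlit", "published", "archived", "deprecated"}


def check_status_vocabulary(frontmatter: dict) -> list[str]:
    """Flag a status: value that falls outside the reserved closed enum."""
    status = frontmatter.get("status")
    if status is None or status in RESERVED_STATUS:
        return []
    return [f"DUAL-VOCABULARY-DRIFT: status: {status!r} is outside the reserved enum"]
```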

(e) Tag-vocabulary compliance

Check every frontmatter tags: array against the installation's controlled vocabulary (knowledge/tag-vocabulary.md). Unknown and deprecated tags are drift; the agent proposes normalizations (judgment — never auto-rewrites a tag in an artifact) and routes proposed new terms to the vocabulary proposal queue for human approval (Commandment XII).
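The detection half of this pass is a lookup and can be sketched in a few lines. The sketch assumes the controlled vocabulary has already been read into a set of approved terms and a set of deprecated terms; the actual layout of knowledge/tag-vocabulary.md is not shown here, and `classify_tags` is a hypothetical name. Proposing a normalization for anything it reports stays agent judgment.

```python
from __future__ import annotations


def classify_tags(
    tags: list[str], approved: set[str], deprecated: set[str]
) -> dict[str, list[str]]:
    """Sort tags into ok / deprecated / unknown without rewriting any of them."""
    report: dict[str, list[str]] = {"ok": [], "deprecated": [], "unknown": []}
    for tag in tags:
        if tag in deprecated:
            report["deprecated"].append(tag)
        elif tag in approved:
            report["ok"].append(tag)
        else:
            report["unknown"].append(tag)
    return report
```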

(f) Orphan / dead-end / editorial-debt detection

Flag orphan dossiers, stale arc proposals, unfolded wire notes, stranded campaigns, reservation churn, dangling canonical wikilinks, and quick-note disposition dead-ends. Canon-broken findings (reservation churn, canonical dangling) contribute to drift; editorial-debt findings route to the managing-editor awareness loop.
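Of the findings listed, dangling wikilinks are the easiest to illustrate. The sketch below assumes the common `[[target]]`, `[[target|alias]]`, and `[[target#heading]]` wikilink forms and a pre-built set of known targets (for example, taken from the manifest); both are assumptions, and `dangling_wikilinks` is a hypothetical helper, not part of the skill.

```python
import re

# Captures the target, ignoring an optional #heading and |alias suffix.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def dangling_wikilinks(text: str, known_targets: set) -> list:
    """Return wikilink targets in `text` that are not in `known_targets`."""
    targets = [match.group(1).strip() for match in WIKILINK.finditer(text)]
    return [target for target in targets if target not in known_targets]
```

Whether a dangling link is canon-broken or merely historical is decided by the agent, not by a lookup like this one.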

The E3 fence — script does ONLY mechanical work

A hard line: the bundled script does only mechanical cache recompute; ALL judgment stays in agent reasoning. scripts/regen_caches.py re-derives the cache fields from canonical frontmatter and detects drift — nothing else. Tag normalization, provenance pass/fail interpretation, dossier proposals, orphan/editorial-debt routing, and tier/relevance judgment are agent reasoning, never script logic (P13 / R32). A reviewer (the helper-script-auditor) refuses any script that smuggles judgment in. If a deployment prefers not to ship code, pass (b) MAY be executed as the documented mechanical routine in references/cache-regen-runbook.md instead — same mechanical result, no judgment.

Authority + safety

The audit writes only the cache fields enumerated in cardinality-rules.md plus the meta/*-index caches. It never writes a canonical field — drift on a canonical field is flagged for its canonical writer.
The audit respects frozen_at: on campaigns (skips cache regen; keeps frozen caches stable).
The audit writes a drift report on every run (even clean) for auditability.

Idempotency + exit semantics

Re-running on an already-clean newsroom MUST produce a drift report with drift_count: 0 / exit_status: clean, zero artifact writes, and zero index writes (no-op detection). This is exactly the installer's P5 zero-drift baseline on the empty newsroom.

| Final state | Meaning |
|---|---|
| `clean` | no drift, no cache writes, no validation failures; indexes in sync |
| `drift-detected` | ≥1 of: drift finding, cache write performed, schema violation, provenance dead-end, R61 drift, tag violation, dangling canonical link |

If the audit cannot complete (e.g. a parse error blocks the run), it writes whatever it has, records the failure, and returns drift-detected with a HIGH-severity "audit-itself-failed" entry.
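The exit semantics reduce to a small decision. A minimal sketch, assuming findings are collected as dicts and cache writes are counted; the finding shape and the `final_state` name are hypothetical, and the real report fields are defined in references/drift-report-format.md.

```python
from __future__ import annotations


def final_state(
    findings: list[dict], cache_writes: int, completed: bool = True
) -> tuple[str, list[dict]]:
    """Return (exit_status, findings) following the rules described above."""
    findings = list(findings)  # do not mutate the caller's list
    if not completed:
        # An incomplete run is itself recorded as a HIGH-severity finding.
        findings.append({"severity": "HIGH", "type": "audit-itself-failed"})
    state = "drift-detected" if findings or cache_writes > 0 else "clean"
    return state, findings
```

Note that a cache write alone is enough to leave `clean`, which is why a second run on the same newsroom is the one expected to report zero drift.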

Scopes

The agent picks the scope from the trigger phrasing: a single artifact type (dossier, story-arc, wire-note, editorial-plan, commission, campaign, edition), or all (the full sweep — default when ambiguous). The provenance integrity audit (pass c) keys off editions; the R61 + tag passes (d, e) run across all artifacts.

Reading order

1. This file.
2. references/audit-runbook.md: the per-pass procedure + scope resolution.
3. references/cache-regen-runbook.md: the mechanical recompute recipe (and the no-code documented-routine alternative to the script).
4. references/provenance-walk-runbook.md: the two-path walk + frozen-snapshot handling.
5. references/drift-report-format.md: drift report schema, location, severity.
6. The installed newsroom's schemas + cardinality-rules.md + io-contracts.md (the operative copies the installer wrote) for what to validate against.

Language Handling (P12)

The audit's human-facing output language is config-driven, never hardcoded: the drift report prose, the proposed tag normalizations, and the surfaced findings follow the output_language declared in the installed config/company-context.md §7.

Detect the conversation language; respond in the user's language unless explicitly asked otherwise.
Write the drift report prose and the finding descriptions in the install's output_language (company-context §7) — honor that config declaration; do not hardcode a language.
Translate internal checklist labels and section headings naturally into the output language — do not force English headings into a non-English drift report.
Preserve legal, financial, and technical terms in their original language. Tag normalization proposals reference the installed controlled vocabulary (knowledge/tag-vocabulary.md) verbatim — never translate or transliterate a controlled-vocabulary term.
Preserve Unicode characters natively (ü, ö, ä, ß, etc.); never substitute transliterations (ue, oe, ae, ss).
Frontmatter field names and enum values — including the seven manifest kind values, every *_state, and the reserved status: enum — stay English regardless of content language (they are the interoperability surface this skill validates).

End-of-run — self-improvement (P15)

Standard runs end normally: the drift report (clean or drift-detected) + any regenerated caches + the surfaced findings routed to their canonical writers. Do not fire a generic "anything to add?" prompt. (Graph-audit carries genuine agent judgment — tag-normalization proposals, provenance pass/fail interpretation, orphan / editorial-debt routing — so it is not the stateless-utility exception class P15 allows to omit the prompt; the conditional prompt applies.)

When a deviation specific to graph-audit occurred during the run, surface it and ask whether to fold the change back into SKILL.md / the bundled references (the audit runbook, the cache-regen runbook, the provenance-walk runbook, the drift-report format) before going idle. Graph-audit-specific triggers:

a provenance walk hit a path shape the two-path runbook does not cleanly cover, or a flag-not-fail vs reviewer-refusable call was ambiguous;
the mechanical cache-recompute (the E3-fenced scripts/regen_caches.py or its no-code documented-routine alternative) disagreed with a manual expectation, suggesting a cardinality-rules edge case;
a tag-compliance finding needed a normalization the controlled vocabulary does not yet hold (route the proposal to the user-gated vocabulary queue);
an R61 status-vocabulary or orphan/editorial-debt finding revealed a check the drift-report format does not surface clearly.

Name the concrete artifact / pass and the reference doc the fix would touch. If nothing deviated, end without the prompt.

Related

[[newsroom-os/skills/newsroom-install/templates/substrate/contracts/cardinality-rules.md]]
[[newsroom-os/skills/newsroom-install/templates/substrate/contracts/io-contracts.md]]
[[newsroom-os/skills/newsroom-install/templates/substrate/governance/governance.md]]
[[newsroom-os/skills/newsroom-graph-audit/scripts/regen_caches.py]]
