From mozilla-bigquery-etl-skills
Use this skill to create or update README.md files for BigQuery ETL tables in the mozilla bigquery-etl repository. Follows layout conventions derived from comparing README files across the repo — rich style with emoji headings, Mermaid data flow diagram, graduated example queries, and concise metadata overview table. Requires schema.yaml with complete descriptions (run schema-enricher first if needed) and a complete metadata.yaml.
npx claudepluginhub mozilla/bigquery-etl-skills --plugin bigquery-etl-skills
This skill uses the workspace's default tool permissions.
**Prerequisites:** Run `schema-enricher` first if schema.yaml is missing descriptions; ensure metadata.yaml is present and complete.
When to use: Creating or updating README.md for any shared dataset, derived table, or table with multiple downstream consumers
BEFORE generating any README, review the following:
Layout conventions: READ references/layout_conventions.md
README template: READ assets/readme_template.md and COPY its structure
Fill every {placeholder} from the source files
Read all three files before writing anything:
sql/<project>/<dataset>/<table>/query.sql → source tables, GROUP BY dimensions, metrics, @param
sql/<project>/<dataset>/<table>/metadata.yaml → DAG, partitioning, clustering, retention, owners
sql/<project>/<dataset>/<table>/schema.yaml → field names, types, descriptions for Key Fields section
If only query.py exists (no query.sql): note it — the Data Flow and How It Works sections may be incomplete or require manual input. Fill what is possible from metadata.yaml and schema.yaml.
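The source-file check above could be sketched as follows. This is a minimal illustration, not part of the skill itself: the file names come from the list above, while `check_sources` and its return shape are hypothetical.

```python
from pathlib import Path

# File names taken from the list above; everything else here is illustrative.
REQUIRED = ("query.sql", "metadata.yaml", "schema.yaml")


def check_sources(table_dir):
    """Return (missing, notes) for one sql/<project>/<dataset>/<table> directory."""
    table_dir = Path(table_dir)
    # Collect the required files that are not present in the table directory.
    missing = [name for name in REQUIRED if not (table_dir / name).is_file()]
    notes = []
    # query.py without query.sql is the fallback case described above.
    if "query.sql" in missing and (table_dir / "query.py").is_file():
        notes.append(
            "Only query.py exists: Data Flow and How It Works may need manual input."
        )
    return missing, notes
```

Calling `check_sources("sql/<project>/<dataset>/<table>")` before writing anything surfaces a missing metadata.yaml or schema.yaml early, and the note can be carried into the final report.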
Extract and record:
@param_name for Implementation Notes
Version suffix from the directory name (e.g. _v1)
Check whether a README already exists:
ls sql/<project>/<dataset>/<table>/README.md
READ assets/readme_template.md and fill every placeholder:
📌 Overview table — use metadata.yaml for DAG/partition/cluster/retention/owner; derive Version from directory name.
🗺️ Data Flow — Mermaid flowchart TD with exactly 3 nodes:
**This query** with filter and GROUP BY description
Partitioned table with time and cluster annotation
🧠 How It Works: 4–5 numbered steps. Step 5 MUST explicitly state data inclusion/exclusion policy:
🧾 Key Fields — two sub-tables (Dimensions, Metrics). Use {a\|b\|c} shorthand for related field families. Group dimensions by: Date & Geo, Browser, Search, [Product] config, User. Omit dimension rows not applicable to this table.
🧩 Example Queries — exactly 3, graduated:
Rules:
SAFE_DIVIDE() for ratios, never raw division
GROUP BY 1, 2 shorthand
-- N. Description
🔧 Implementation Notes: 3–5 bullets extracted from query.sql logic.
📌 Notes & Conventions — bullet definitions for key fields from schema.yaml descriptions.
🗃️ Schema & Related Tables — one section; combine schema.yaml link + upstream + downstream.
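For the Version value in the Overview table, one possible way to derive it from the table directory name is sketched below. It assumes the directory ends in a `_v<number>` suffix, as in the example table names in this document. The function name and the `v1`-style output format are assumptions, not something the template prescribes.

```python
import re


def version_from_table_dir(table_dir_name):
    """Return the trailing version suffix (e.g. 'v1'), or None if there is none."""
    # Assumption: versions appear as a trailing "_v<digits>" on the directory name.
    match = re.search(r"_v(\d+)$", table_dir_name)
    return f"v{match.group(1)}" if match else None
```

Under that assumption, `version_from_table_dir("newtab_clients_daily_v2")` yields `"v2"`, and a directory without a version suffix yields `None` so the gap can be flagged rather than guessed.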
Before finalizing, verify:
GROUP BY 1, 2 shorthand
If over 170 lines, trim by: shortening SQL examples, collapsing Notes & Conventions bullets, abbreviating How It Works steps.
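Some of these checks are mechanical and could be automated. The sketch below covers three of them: the 170-line budget, leftover template placeholders, and the presence of a `flowchart TD` diagram. The placeholder pattern is an assumption (tokens shaped like `{snake_case_name}`), and `check_readme` is a hypothetical helper, not a tool shipped with the repository.

```python
import re

MAX_LINES = 170  # line budget stated above

# Assumption: template placeholders look like {snake_case_name}. Field-family
# shorthand such as {a\|b\|c} contains other characters and is not matched.
PLACEHOLDER = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")


def check_readme(text):
    """Return a list of human-readable problems found in the README text."""
    problems = []
    line_count = len(text.splitlines())
    if line_count > MAX_LINES:
        problems.append(f"{line_count} lines; trim to {MAX_LINES} or fewer")
    # Deduplicate and sort so the report is stable across runs.
    leftovers = sorted(set(PLACEHOLDER.findall(text)))
    if leftovers:
        problems.append("unfilled placeholders: " + ", ".join(leftovers))
    if "flowchart TD" not in text:
        problems.append("no Mermaid 'flowchart TD' diagram found")
    return problems
```

An empty list means these three checks passed. The remaining items still need a manual read, and a Mermaid node written with curly braces would be reported as a placeholder, so treat any hit as a prompt to look rather than a verdict.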
Write the README.md to:
sql/<project>/<dataset>/<table>/README.md
Then read back the written file and confirm:
No {placeholder} tokens remain unfilled (exception: if only query.py exists, Data Flow and How It Works may be partially filled; note which sections and why)
Mermaid diagram uses flowchart TD syntax
Report:
| Skill | When to invoke |
|---|---|
| schema-enricher | Run first if schema.yaml is missing descriptions (needed for Notes & Conventions) |
| create-pr | After README.md is written: stages, commits, and opens a draft PR |
Table has multiple downstream consumers OR is a shared dataset?
→ Rich style (this skill)
Table is a UDF, static reference, or simple single-consumer table?
→ Minimal style: title + ## Description with 5–10 bullet points
→ Do not use this skill for minimal style
Create a README.md for telemetry_derived.newtab_daily_interactions_aggregates_v1
Update the README.md for firefox_desktop_derived.newtab_clients_daily_v2 — add missing example queries
Generate README for ads_derived.impressions_v1