From Claude-Data-Wrangler
Create a data dictionary for a dataset (CSV, JSON, JSONL, Parquet, Excel) that documents every column/field — name, type, description, units, example values, nulls allowed, source. Use when a dataset has no accompanying documentation and the user wants one generated.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin Claude-Data-WranglerThis skill uses the workspace's default tool permissions.
Generate a data dictionary file alongside a dataset.
Conducts multi-round deep research on GitHub repos via API and web searches, generating markdown reports with executive summaries, timelines, metrics, and Mermaid diagrams.
Share bugs, ideas, or general feedback.
Generate a data dictionary file alongside a dataset.
data_dictionary.{md,yaml,json,csv} in its folder.Default to data_dictionary.md in the dataset's folder. Offer alternatives:
data_dictionary.yaml for structured/programmatic use.data_dictionary.json for schema-validator workflows.data_dictionary.csv for spreadsheet editing.# Data Dictionary: <dataset filename>
**Source**: <file path or URL>
**Format**: <csv|json|jsonl|parquet|xlsx>
**Rows**: <count>
**Generated**: <YYYY-MM-DD>
**Generated by**: Claude-Data-Wrangler add-data-dictionary skill
## Columns
| Column | Type | Description | Units | Nullable | Example | Notes |
|---|---|---|---|---|---|---|
| country | string | Country name as supplied in source | — | No | "France" | Standardised via standardise-country-names skill |
| iso3166_alpha2 | string | ISO 3166-1 alpha-2 country code | — | No | "FR" | Derived column |
| revenue_numeric | float | Revenue parsed from original text | USD | Yes | 4270000.0 | Original format: `$4.27M`, scale suffix applied |
| ... | ... | ... | ... | ... | ... | ... |
## Provenance / Transformations
- <YYYY-MM-DD>: Source file ingested.
- <YYYY-MM-DD>: Country names standardised (skill: standardise-country-names).
- <YYYY-MM-DD>: ISO 3166 codes added (skill: add-iso3166).
- <YYYY-MM-DD>: Currency enrichment applied (skill: enrich-with-currency).
## Known issues / limitations
- <list unresolved rows, ambiguous mappings, etc.>
dtype<TODO: describe> placeholders and let the user fill in later.update-data-dictionary).pip install pandas pyarrow openpyxl
json-restructure skill if the user wants restructuring.bytes and note "binary data, not profiled".