From Claude-Data-Wrangler
Scan one or more flat data files (CSV, Parquet, JSON, JSONL, Excel) to assess data cleanliness and identify columns likely to fail SQL ingestion — inconsistent types, mixed delimiters, malformed dates, nullability mismatches, duplicate keys, encoding issues, and out-of-range values. Produces a ranked issue report with concrete remediation suggestions.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin Claude-Data-WranglerThis skill uses the workspace's default tool permissions.
Audit flat data files and flag issues that would block SQL ingestion or cause analytical errors.
Conducts multi-round deep research on GitHub repos via API and web searches, generating markdown reports with executive summaries, timelines, metrics, and Mermaid diagrams.
Share bugs, ideas, or general feedback.
Audit flat data files and flag issues that would block SQL ingestion or cause analytical errors.
sql-load to catch issues early.object/string column that should be numeric or date, flag rows that don't conform and report the fraction."N/A", "null", "-", "", "#N/A") that are not real NaN.DD/MM/YYYY vs MM/DD/YYYY vs YYYY-MM-DD). Flag ambiguous orderings.{"Active": 42, "active": 17, "ACTIVE": 3}.é for é, ’ for ').country_code mixing US and USA).country_code populated but country_name null, or vice versa.end_date < start_date, total != sum(components) when components are present.chardet for encoding, dateutil for date parsing attempts.cleanliness_report.md:
standardise-country-names, text-to-numeric, etc.).cleanliness_report.json for programmatic consumption.pip install pandas pyarrow openpyxl chardet python-dateutil
json-restructure to flatten before deeper analysis.value column in an EAV table) — recognise the pattern and treat it differently rather than flagging as "mixed types".