From Claude-Data-Wrangler
Package a dataset as Parquet and/or JSONL for storage, distribution, or upload to data platforms (Hugging Face, S3, Wasabi, etc.). Handles partitioning, compression, schema enforcement, and side-by-side emission of both formats. Use when the user wants to produce analytics-friendly or ML-friendly files from a CSV/JSON/Excel source.
```
npx claudepluginhub danielrosehill/claude-code-plugins --plugin Claude-Data-Wrangler
```

This skill uses the workspace's default tool permissions.
Produce Parquet and/or JSONL output for a dataset, optionally partitioned and compressed.
For uploading to Hugging Face, hand off to the hf-dataset-push skill.

- Parquet compression: `snappy` (default, fast), `zstd` (better ratio), `gzip` (max portability).
- Engine: `pyarrow`. Offer 64k / 256k row groups for large datasets.
- Partitioning: partition by one or more columns (e.g. `country`, `year`). Produces a directory layout (`country=FR/data.parquet`).
- JSONL compression: `gzip`, `zstd` produce `.jsonl.gz` / `.jsonl.zst`.
- Output naming: `<stem>.parquet` (or a `<stem>/` directory for partitioned output); `<stem>.jsonl` or `<stem>.jsonl.gz`.

Dependencies:

```
pip install pandas pyarrow
# optional, for zstd-compressed JSONL
pip install zstandard
```
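A minimal sketch of the Parquet path, assuming the source loads into a pandas DataFrame; the file name (`sales.csv`), output stem, and partition columns (`country`, `year`) are illustrative, not part of the skill:

```python
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Load the source (CSV shown; pandas reads JSON/Excel the same way).
df = pd.read_csv("sales.csv")  # hypothetical input file
table = pa.Table.from_pandas(df, preserve_index=False)

# Single-file output: <stem>.parquet, zstd-compressed, 64k row groups.
pq.write_table(table, "sales.parquet",
               compression="zstd", row_group_size=64_000)

# Partitioned output: a sales/ directory with Hive-style subdirectories
# such as sales/country=FR/year=2024/....
pq.write_to_dataset(table, root_path="sales",
                    partition_cols=["country", "year"],  # assumed columns
                    compression="zstd")
```

`write_table` covers the single-file case; `write_to_dataset` produces the `country=FR/...` directory layout described above.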
- Nulls: missing values are written as JSON `null`. Empty strings are not null.
- Large datasets: stream with `pyarrow.parquet.ParquetWriter` and chunked JSONL writes, as sketched below.
- Nested or messy sources: restructure first (json-restructure) before packaging.
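For sources too large to hold in memory, a sketch of the streaming path under the same assumptions; the chunk size mirrors the 64k row-group option, and the file names remain illustrative:

```python
import gzip
import json

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

CHUNK_ROWS = 64_000  # illustrative; match the row-group size above

def jsonable(value):
    """Map pandas/numpy scalars to JSON-safe values: NaN/NaT -> None."""
    if pd.isna(value):
        return None            # missing values become JSON null
    if hasattr(value, "item"):
        return value.item()    # numpy scalar -> native Python scalar
    return value               # strings pass through; "" stays ""

writer = None
with gzip.open("sales.jsonl.gz", "wt", encoding="utf-8") as jsonl:
    for chunk in pd.read_csv("sales.csv", chunksize=CHUNK_ROWS):
        table = pa.Table.from_pandas(chunk, preserve_index=False)
        if writer is None:
            # The first chunk fixes the schema; ParquetWriter enforces it
            # on every later chunk. Pin dtypes in read_csv if inference
            # could drift between chunks.
            writer = pq.ParquetWriter("sales.parquet", table.schema,
                                      compression="zstd")
        writer.write_table(table)

        # Mirror the same rows to gzip-compressed JSONL, one object per line.
        for record in chunk.to_dict(orient="records"):
            row = {k: jsonable(v) for k, v in record.items()}
            jsonl.write(json.dumps(row, default=str) + "\n")

if writer is not None:
    writer.close()
```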