From duckdb-skills
Converts data files between formats such as CSV, Parquet, JSON, Excel, and GeoJSON using DuckDB. Supports remote inputs and binary output formats that Claude cannot generate natively.
Slash command
/duckdb-skills:convert-file
You are helping the user convert a data file from one format to another using DuckDB.
Input file: $0
Output file: ${1:-}
Input: $0. If it's a bare filename (no /), resolve to a full path with find "$PWD" -name "$0" -not -path '*/.git/*' 2>/dev/null | head -1.
Output: If $1 is provided, use it as the output path. If not, default to the same stem as the input with a .parquet extension (e.g., data.csv → data.parquet).
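The two resolution rules above can be sketched as small shell helpers. This is only an illustration: the function names resolve_input and default_output are hypothetical and not part of the skill, and the stem rule shown strips only the last extension.

```shell
# Hypothetical helpers illustrating the input/output resolution rules.

# Resolve a bare filename (no "/") to a full path under $PWD, skipping .git.
# Anything containing a slash (including remote URLs) is passed through as-is.
resolve_input() {
  case "$1" in
    */*) printf '%s\n' "$1" ;;
    *)   find "$PWD" -name "$1" -not -path '*/.git/*' 2>/dev/null | head -1 ;;
  esac
}

# Use the explicit output path ($2) if given; otherwise keep the input's
# directory and stem and swap the last extension for .parquet.
default_output() {
  local input="$1" explicit="${2:-}" dir base
  if [ -n "$explicit" ]; then
    printf '%s\n' "$explicit"
  else
    case "$input" in
      */*) dir="${input%/*}/" ;;
      *)   dir="" ;;
    esac
    base="${input##*/}"
    printf '%s%s.parquet\n' "$dir" "${base%.*}"
  fi
}
```

For instance, default_output data.csv yields data.parquet, while default_output data.csv out.json returns out.json unchanged.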
Infer the output format from the output file extension:
| Extension | Format clause | Requires |
|---|---|---|
| .parquet, .pq | (default, no clause needed) | |
| .csv | (FORMAT csv, HEADER) | |
| .tsv | (FORMAT csv, HEADER, DELIMITER '\t') | |
| .json | (FORMAT json, ARRAY true) | |
| .jsonl, .ndjson | (FORMAT json, ARRAY false) | |
| .xlsx | (FORMAT xlsx) | INSTALL excel; LOAD excel; |
| .geojson | (FORMAT GDAL, DRIVER 'GeoJSON') | LOAD spatial; |
| .gpkg | (FORMAT GDAL, DRIVER 'GPKG') | LOAD spatial; |
| .shp | (FORMAT GDAL, DRIVER 'ESRI Shapefile') | LOAD spatial; |
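The extension lookup can be expressed as a shell case statement. The sketch below mirrors the table row for row; the function names format_clause and output_loads are hypothetical, and matching is case-sensitive, which is an assumption rather than something the skill specifies.

```shell
# Hypothetical helpers: map an output path to its COPY format clause and to
# the extension loads that output format needs.

format_clause() {
  case "$1" in
    *.parquet|*.pq)   ;;  # default, no clause needed
    *.csv)            printf '%s\n' "(FORMAT csv, HEADER)" ;;
    *.tsv)            printf '%s\n' "(FORMAT csv, HEADER, DELIMITER '\t')" ;;
    *.json)           printf '%s\n' "(FORMAT json, ARRAY true)" ;;
    *.jsonl|*.ndjson) printf '%s\n' "(FORMAT json, ARRAY false)" ;;
    *.xlsx)           printf '%s\n' "(FORMAT xlsx)" ;;
    *.geojson)        printf '%s\n' "(FORMAT GDAL, DRIVER 'GeoJSON')" ;;
    *.gpkg)           printf '%s\n' "(FORMAT GDAL, DRIVER 'GPKG')" ;;
    *.shp)            printf '%s\n' "(FORMAT GDAL, DRIVER 'ESRI Shapefile')" ;;
    *) echo "unsupported output extension: $1" >&2; return 1 ;;
  esac
}

output_loads() {
  case "$1" in
    *.xlsx)                 printf '%s\n' "INSTALL excel; LOAD excel;" ;;
    *.geojson|*.gpkg|*.shp) printf '%s\n' "LOAD spatial;" ;;
  esac
}
```

Rejecting unknown extensions with a non-zero status is a design choice of this sketch; it avoids silently writing a file in an unintended format.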
Run a single DuckDB command. Prepend extension loads as needed based on both the input and output formats.
duckdb -c "
<EXTENSION_LOADS>
COPY (FROM '<INPUT_PATH>') TO '<OUTPUT_PATH>' <FORMAT_CLAUSE>;
"
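To make the template concrete, the sketch below assembles the SQL text from its four parts. The names sql_quote and build_convert_sql are hypothetical, and doubling single quotes in paths is an added precaution that the template itself does not mention.

```shell
# Hypothetical helpers: fill in the COPY template shown above.

# Double any single quotes so a path is safe inside a SQL string literal.
sql_quote() {
  printf '%s' "$1" | sed "s/'/''/g"
}

# $1 = extension loads (may be empty)   $2 = input path
# $3 = output path                      $4 = format clause (may be empty)
build_convert_sql() {
  if [ -n "$1" ]; then printf '%s\n' "$1"; fi
  printf "COPY (FROM '%s') TO '%s'" "$(sql_quote "$2")" "$(sql_quote "$3")"
  if [ -n "$4" ]; then printf ' %s' "$4"; fi
  printf ';\n'
}
```

Assuming duckdb is on the PATH, the result could be run with duckdb -c "$(build_convert_sql "INSTALL excel; LOAD excel;" in.csv out.xlsx "(FORMAT xlsx)")". Printing the SQL first also makes it easy to show the user what will be executed.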
For remote inputs (s3://, https://, etc.), prepend the same protocol setup as read-file:
| Protocol | Prepend |
|---|---|
| s3:// | LOAD httpfs; CREATE SECRET (TYPE S3, PROVIDER credential_chain); |
| gs:// / gcs:// | LOAD httpfs; CREATE SECRET (TYPE GCS, PROVIDER credential_chain); |
| https:// / http:// | LOAD httpfs; |
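The protocol table can likewise be sketched as a lookup on the input path's scheme. The function name remote_setup is hypothetical; local paths simply produce no output.

```shell
# Hypothetical helper: print the setup statements for a remote input path,
# following the protocol table above. Local paths print nothing.
remote_setup() {
  case "$1" in
    s3://*)
      printf '%s\n' "LOAD httpfs; CREATE SECRET (TYPE S3, PROVIDER credential_chain);" ;;
    gs://*|gcs://*)
      printf '%s\n' "LOAD httpfs; CREATE SECRET (TYPE GCS, PROVIDER credential_chain);" ;;
    https://*|http://*)
      printf '%s\n' "LOAD httpfs;" ;;
  esac
}
```

The output would be prepended to the other extension loads before the COPY statement.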
If the user mentions partitioning (e.g., "partition by year"), add PARTITION_BY (col) to the format clause. This only works with Parquet and CSV output.
If the user mentions compression (e.g., "use zstd"), add CODEC 'zstd' for Parquet output.
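As an illustration of how the two options combine for Parquet output, the sketch below builds an explicit clause. The function name parquet_options_clause is hypothetical, and spelling out FORMAT parquet once options are present is a choice made for this sketch.

```shell
# Hypothetical helper: build a Parquet format clause with optional
# partitioning column ($1) and compression codec ($2).
parquet_options_clause() {
  local partition_col="${1:-}" codec="${2:-}"
  local opts="FORMAT parquet"
  if [ -n "$partition_col" ]; then
    opts="$opts, PARTITION_BY ($partition_col)"
  fi
  if [ -n "$codec" ]; then
    opts="$opts, CODEC '$codec'"
  fi
  printf '(%s)\n' "$opts"
}
```

For a request like "partition by year, use zstd", parquet_options_clause year zstd produces (FORMAT parquet, PARTITION_BY (year), CODEC 'zstd'). With PARTITION_BY, DuckDB writes a directory of partitioned files rather than a single file, so the output path should be treated as a directory in that case.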
On success, report:
[…] ls -lh)
On failure:
duckdb: command not found → delegate to /duckdb-skills:install-duckdb
[…] /duckdb-skills:read-file first to inspect it