Skill

data-processing

Use when working with structured data files (CSV, JSON, YAML, TOML, Parquet) — querying, transforming, filtering, aggregating, or converting between formats

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/cli-power-skills:data-processing

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Bash(jq*), Bash(yq*), Bash(gron*), Bash(mlr*), Bash(xsv*), Bash(duckdb*), Read, Glob, Grep

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

- Querying, filtering, or transforming JSON files

SKILL.md

157 lines · ~1.3k tokens

Stats

Stars: 1

Maintenance: Excellent

Last Commit: Apr 14, 2026

Data Processing

When to Use

Querying, filtering, or transforming JSON files
Reading or converting YAML, TOML, or XML config files
Analyzing, aggregating, or joining CSV/TSV/Parquet files
Running SQL queries against local data files without a database server
Converting between data formats (JSON to YAML, CSV to JSON, etc.)
Exploring deeply nested JSON structures

Tools

Tool	Purpose	Structured output
jq	Query and transform JSON	Native JSON
yq	Query and transform YAML, TOML, XML	`-o json` for JSON output
gron	Flatten JSON into greppable `json.path.to.key = value;` lines	`--ungron` to reverse back to JSON
miller (mlr)	Transform CSV/JSON/TSV records with awk-like verbs	`--json` for JSON output
xsv	Fast CSV slicing, searching, joining, statistics	CSV native (pipe to `xsv table` for display)
DuckDB	SQL queries on CSV, JSON, Parquet files	`-json` flag for JSON output

Patterns

JSON: Filter array elements by field value

jq '.items[] | select(.status == "active")' data.json
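One detail worth spelling out: the filter above emits a stream of separate JSON objects, one per match, rather than a single array. A minimal sketch, assuming a made-up three-item `items` array, that contrasts the stream with an array built via `map(select(...))`:

```shell
# Hypothetical sample input; the ids and statuses are invented for illustration.
data='{"items":[{"id":1,"status":"active"},{"id":2,"status":"done"},{"id":3,"status":"active"}]}'

# Stream form: one JSON object per output line (-c keeps each on a single line).
stream=$(printf '%s\n' "$data" | jq -c '.items[] | select(.status == "active")')

# Array form: a single JSON array, convenient when the next tool expects one document.
array=$(printf '%s\n' "$data" | jq -c '.items | map(select(.status == "active"))')

printf 'stream:\n%s\n' "$stream"
printf 'array:\n%s\n' "$array"
```

The stream form suits line-oriented tools such as grep; the array form is usually the safer choice when the result is saved as a `.json` file.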

JSON: Extract specific fields from array

jq '[.users[] | {name: .name, email: .email}]' data.json

JSON: Count items grouped by field

jq '[.events[] | .type] | group_by(.) | map({type: .[0], count: length})' data.json
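To make the shape of the result concrete, here is the same filter run on a tiny input. The event types are invented for illustration, and `-c` is added only to keep the output on one line.

```shell
# Hypothetical sample data written to a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/data.json" <<'EOF'
{"events": [{"type": "error"}, {"type": "info"}, {"type": "error"}]}
EOF

# group_by(.) sorts the type strings and groups equal ones together;
# each group is then collapsed to its first element and its length.
counts=$(jq -c '[.events[] | .type] | group_by(.) | map({type: .[0], count: length})' "$tmp/data.json")
printf '%s\n' "$counts"
# prints [{"type":"error","count":2},{"type":"info","count":1}]
```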

YAML: Read a nested value

yq '.spec.containers[0].image' deployment.yaml

YAML: Convert entire file to JSON

yq -o json config.yaml

TOML: Read a value

yq -p toml '.database.host' config.toml

JSON: Explore unknown structure by grepping paths

gron data.json | grep -i "error"

CSV: Column statistics (min, max, mean, stddev)

xsv stats data.csv | xsv table

CSV: Search rows matching a pattern in a column

xsv search -s status "active" data.csv

CSV: Select specific columns

xsv select name,email,created_at users.csv

CSV: Sort by column

xsv sort -N -R -s revenue sales.csv

SQL: Query a CSV file

duckdb -c "SELECT department, COUNT(*) as cnt, AVG(salary) as avg_sal FROM 'employees.csv' GROUP BY department ORDER BY avg_sal DESC"

SQL: Query a JSON file

duckdb -c "SELECT * FROM read_json_auto('events.json') WHERE type = 'error' LIMIT 20"

SQL: Query Parquet files

duckdb -c "SELECT * FROM 'data/*.parquet' WHERE created_at > '2026-01-01'"

CSV/JSON: Transform records with miller

mlr --csv filter '$revenue > 1000' then sort -nr revenue sales.csv

Format conversion: CSV to JSON

mlr --icsv --ojson cat data.csv

Pipelines

YAML config → JSON → SQL query

yq -o json config.yaml | duckdb -c "SELECT key, value FROM read_json_auto('/dev/stdin') WHERE env = 'production'"

Each stage: yq converts YAML to JSON, DuckDB runs SQL on the JSON stream.

Grep nested JSON paths → reconstruct matching subset

gron large.json | grep "\.errors\[" | gron --ungron

Each stage: gron flattens JSON to paths, grep filters, ungron reconstructs valid JSON from matches.

CSV filter → aggregate with SQL

xsv search -s region "EU" sales.csv | duckdb -c "SELECT product, SUM(revenue) as total FROM read_csv_auto('/dev/stdin') GROUP BY product ORDER BY total DESC"

Each stage: xsv filters rows by region, DuckDB aggregates the filtered stream.

Join two CSV files with SQL

duckdb -c "SELECT u.name, u.email, o.total, o.date FROM 'users.csv' u JOIN 'orders.csv' o ON u.id = o.user_id ORDER BY o.date DESC"

Multi-format pipeline: JSON → CSV → stats

jq -r '["name","score"], (.records[] | [.name, .score]) | @csv' data.json | xsv stats

Each stage: jq emits a header row followed by one CSV row per record (xsv treats the first row as headers), xsv computes statistics.
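A small worked sketch of the jq stage on its own, assuming a made-up two-record input. `@csv` quotes strings and leaves numbers bare, and the filter here emits a header row first, since xsv treats the first row as headers unless `--no-headers` is passed.

```shell
# Hypothetical sample data; names and scores are invented for illustration.
tmp=$(mktemp -d)
cat > "$tmp/data.json" <<'EOF'
{"records": [{"name": "ada", "score": 90}, {"name": "bob", "score": 72}]}
EOF

# The comma operator yields the header array first, then one array per record;
# every array is then formatted as a CSV row.
csv=$(jq -r '["name","score"], (.records[] | [.name, .score]) | @csv' "$tmp/data.json")
printf '%s\n' "$csv"
# prints:
# "name","score"
# "ada",90
# "bob",72
```

Piping this output into `xsv stats` then reports statistics per named column.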

Prefer Over

Prefer DuckDB over Python/pandas for ad-hoc SQL queries on files — single command, no script, handles large files
Prefer jq over Python json module for one-off JSON transforms — single pipeline vs. multi-line script
Prefer xsv over awk/cut for CSV operations — correct CSV parsing, handles quoted fields and escapes
Prefer miller over awk for format-aware record transformations — understands CSV/JSON headers natively
Prefer yq over custom parsers for config file reads — handles YAML, TOML, XML with consistent syntax
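The xsv point can be seen with a field that contains a comma. A minimal sketch with a made-up two-line file: `cut` splits on every comma, including the one inside the quotes, while a CSV-aware tool keeps the field whole. The xsv call is guarded so the snippet still runs where xsv is not installed.

```shell
# Hypothetical sample file with a quoted field containing a comma.
tmp=$(mktemp -d)
printf 'name,city\n"Smith, Jane",Paris\n' > "$tmp/people.csv"

# Naive split: field 2 of the data row is the tail of the quoted name, not the city.
naive=$(cut -d, -f2 "$tmp/people.csv" | tail -n 1)
printf 'cut sees: [%s]\n' "$naive"

# CSV-aware selection by column name (runs only if xsv is available).
if command -v xsv >/dev/null 2>&1; then
  xsv select city "$tmp/people.csv"
fi
```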

Do NOT Use When

Data is already in a running database — query the database directly
File is under 10 lines — just use the Read tool and process in-context
Task requires complex multi-step logic with conditionals — write a Python script instead
JSON is simple enough to read by eye — use the Read tool, don't over-engineer
Working with binary or non-structured data formats
