From datafusion-skills
Visualize and analyze Apache DataFusion query execution plans. Shows logical/physical plans, detects bottlenecks like full scans/sorts/joins/repartitions, suggests optimizations. Supports EXPLAIN ANALYZE.
How this skill is triggered — by the user, by Claude, or both
Slash command
/datafusion-skills:explain-planThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are helping the user understand and optimize query execution plans in Apache DataFusion.
You are helping the user understand and optimize query execution plans in Apache DataFusion.
Input: $@
command -v datafusion-cli
If not found, delegate to /datafusion-skills:install-datafusion.
STATE_DIR=""
test -f .datafusion-skills/state.sql && STATE_DIR=".datafusion-skills"
PROJECT_ROOT="$(git rev-parse --show-toplevel 2>/dev/null || echo "$PWD")"
PROJECT_ID="$(echo "$PROJECT_ROOT" | tr '/' '-')"
test -f "$HOME/.datafusion-skills/$PROJECT_ID/state.sql" && STATE_DIR="$HOME/.datafusion-skills/$PROJECT_ID"
--analyze is present → use EXPLAIN ANALYZE (actually runs the query, shows real metrics)EXPLAIN (shows the plan without execution)Extract the SQL query (remove --analyze flag if present).
If the input is natural language, generate SQL first (see /datafusion-skills:query for SQL generation guidelines).
Physical plan (default — shows the execution plan as a visual tree):
datafusion-cli ${STATE_DIR:+--file "$STATE_DIR/state.sql"} -c "
EXPLAIN $SQL;
"
Verbose plan (full optimizer trace — initial logical plan, each optimization pass, initial physical plan, final physical plan with stats and schema):
datafusion-cli ${STATE_DIR:+--file "$STATE_DIR/state.sql"} -c "
EXPLAIN VERBOSE $SQL;
"
With actual metrics (if --analyze) (runs the query, reports per-operator row counts, timing, memory, spill stats):
datafusion-cli ${STATE_DIR:+--file "$STATE_DIR/state.sql"} -c "
EXPLAIN ANALYZE $SQL;
"
Parse the execution plan output and provide insights:
Full table scans → Look for TableScan without pushdown predicates
WHERE clauses or partitioningSort operations → SortExec or SortPreservingMergeExec
Hash joins vs merge joins → HashJoinExec vs SortMergeJoinExec
Repartitioning → RepartitionExec
Projection pushdown → Check if only needed columns are read
Predicate pushdown → Check if filters are pushed to the scan level
predicate in TableScan nodesCoalesce partitions → CoalescePartitionsExec
Structure the analysis as:
Brief description of what the plan does.
Present the plan as an indented tree (already DataFusion's default output format).
Actionable suggestions, such as:
If relevant, suggest DataFusion configuration changes:
-- Increase target partitions for more parallelism
SET datafusion.execution.target_partitions = 8;
-- Increase batch size for throughput
SET datafusion.execution.batch_size = 16384;
-- Enable/disable optimizations
SET datafusion.optimizer.enable_round_robin_repartition = true;
To explore these settings, try
/datafusion-skills:datafusion-docs configuration options.
npx claudepluginhub datafusion-contrib/datafusion-skills --plugin datafusion-skillsReads query execution plans to diagnose slow queries, row misestimations, and missing indexes. Walks through EXPLAIN ANALYZE output and provides interactive debugging guidance.
EXPLAIN PLAN analysis, index selection, join strategies, and slow query debugging.
Analyzes SQL queries for slow patterns (N+1, full scans, bad joins), reads EXPLAIN plans, recommends indexes, and rewrites queries with explanations.