Analyzes uploaded Excel/CSV files with DuckDB: schema inspection, SQL queries, stats/summaries/pivots, filtering/joins, multi-sheet support, exports to CSV/JSON/Markdown.
npx claudepluginhub joshuarweaver/cascade-content-creation-misc-1 --plugin bytedance-deer-flow-1This skill uses the workspace's default tool permissions.
This skill analyzes user-uploaded Excel/CSV files using DuckDB — an in-process analytical SQL engine. It supports schema inspection, SQL-based querying, statistical summaries, and result export, all through a single Python script.
Conducts multi-round deep research on GitHub repos via API and web searches, generating markdown reports with executive summaries, timelines, metrics, and Mermaid diagrams.
Dynamically discovers and combines enabled skills into cohesive, unexpected delightful experiences like interactive HTML or themed artifacts. Activates on 'surprise me', inspiration, or boredom cues.
Generates images from structured JSON prompts via Python script execution. Supports reference images and aspect ratios for characters, scenes, products, visuals.
This skill analyzes user-uploaded Excel/CSV files using DuckDB — an in-process analytical SQL engine. It supports schema inspection, SQL-based querying, statistical summaries, and result export, all through a single Python script.
When a user uploads data files and requests analysis, identify:
/mnt/user-data/uploads//mnt/user-dataFirst, inspect the uploaded file to understand its schema:
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/data.xlsx \
--action inspect
This returns:
Based on the schema, construct SQL queries to answer the user's questions.
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/data.xlsx \
--action query \
--sql "SELECT category, COUNT(*) as count, AVG(amount) as avg_amount FROM Sheet1 GROUP BY category ORDER BY count DESC"
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/data.xlsx \
--action summary \
--table Sheet1
This returns for each numeric column: count, mean, std, min, 25%, 50%, 75%, max, null_count. For string columns: count, unique, top value, frequency, null_count.
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/data.xlsx \
--action query \
--sql "SELECT * FROM Sheet1 WHERE amount > 1000" \
--output-file /mnt/user-data/outputs/filtered-results.csv
Supported output formats (auto-detected from extension):
.csv — Comma-separated values.json — JSON array of records.md — Markdown table| Parameter | Required | Description |
|---|---|---|
--files | Yes | Space-separated paths to Excel/CSV files |
--action | Yes | One of: inspect, query, summary |
--sql | For query | SQL query to execute |
--table | For summary | Table/sheet name to summarize |
--output-file | No | Path to export results (CSV/JSON/MD) |
[!NOTE] Do NOT read the Python file, just call it with the parameters.
Sheet1, Sales, Revenue)data.csv → data)"2024_Sales"-- Row count
SELECT COUNT(*) FROM Sheet1
-- Distinct values in a column
SELECT DISTINCT category FROM Sheet1
-- Value distribution
SELECT category, COUNT(*) as cnt FROM Sheet1 GROUP BY category ORDER BY cnt DESC
-- Date range
SELECT MIN(date_col), MAX(date_col) FROM Sheet1
-- Revenue by category and month
SELECT category, DATE_TRUNC('month', order_date) as month,
SUM(revenue) as total_revenue
FROM Sales
GROUP BY category, month
ORDER BY month, total_revenue DESC
-- Top 10 customers by spend
SELECT customer_name, SUM(amount) as total_spend
FROM Orders GROUP BY customer_name
ORDER BY total_spend DESC LIMIT 10
-- Join sales with customer info from different files
SELECT s.order_id, s.amount, c.customer_name, c.region
FROM sales s
JOIN customers c ON s.customer_id = c.id
WHERE s.amount > 500
-- Running total and rank
SELECT order_date, amount,
SUM(amount) OVER (ORDER BY order_date) as running_total,
RANK() OVER (ORDER BY amount DESC) as amount_rank
FROM Sales
-- Pivot: monthly revenue by category
SELECT category,
SUM(CASE WHEN MONTH(date) = 1 THEN revenue END) as Jan,
SUM(CASE WHEN MONTH(date) = 2 THEN revenue END) as Feb,
SUM(CASE WHEN MONTH(date) = 3 THEN revenue END) as Mar
FROM Sales
GROUP BY category
User uploads sales_2024.xlsx (with sheets: Orders, Products, Customers) and asks: "Analyze my sales data — show top products by revenue and monthly trends."
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/sales_2024.xlsx \
--action inspect
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/sales_2024.xlsx \
--action query \
--sql "SELECT p.product_name, SUM(o.quantity * o.unit_price) as total_revenue, SUM(o.quantity) as total_units FROM Orders o JOIN Products p ON o.product_id = p.id GROUP BY p.product_name ORDER BY total_revenue DESC LIMIT 10"
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/sales_2024.xlsx \
--action query \
--sql "SELECT DATE_TRUNC('month', order_date) as month, SUM(quantity * unit_price) as revenue FROM Orders GROUP BY month ORDER BY month" \
--output-file /mnt/user-data/outputs/monthly-trends.csv
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/sales_2024.xlsx \
--action summary \
--table Orders
Present results to the user with clear explanations of findings, trends, and actionable insights.
User uploads orders.csv and customers.xlsx and asks: "Which region has the highest average order value?"
python /mnt/skills/public/data-analysis/scripts/analyze.py \
--files /mnt/user-data/uploads/orders.csv /mnt/user-data/uploads/customers.xlsx \
--action query \
--sql "SELECT c.region, AVG(o.amount) as avg_order_value, COUNT(*) as order_count FROM orders o JOIN Customers c ON o.customer_id = c.id GROUP BY c.region ORDER BY avg_order_value DESC"
After analysis:
present_files toolThe script automatically caches loaded data to avoid re-parsing files on every call:
/mnt/user-data/workspace/.data-analysis-cache/This is especially useful when running multiple queries against the same data files (inspect → query → summary).
DATE_TRUNC, EXTRACT, etc.)"Column Name"