# fabric-consumption

Analyzes lakehouse data interactively using Fabric Livy sessions and PySpark/Spark SQL for DataFrames, cross-lakehouse joins, Delta time travel, and unstructured/JSON data. For advanced Python-based Spark analysis.

Install with `npx claudepluginhub microsoft/skills-for-fabric --plugin fabric-consumption`. This skill uses the workspace's default tool permissions.
> **Update Check — ONCE PER SESSION (mandatory)**
> The first time this skill is used in a session, run the `check-updates` skill before proceeding.
>
> - GitHub Copilot CLI / VS Code: invoke the `check-updates` skill.
> - Claude Code / Cowork / Cursor / Windsurf / Codex: compare the local vs. remote `package.json` version.
> - Skip if the check was already performed earlier in this session.
**CRITICAL NOTES**

- To find a workspace's details (including its ID) from its name: list all workspaces, then filter with JMESPath.
- To find an item's details (including its ID) from a workspace ID, item type, and item name: list all items of that type in the workspace, then filter with JMESPath.
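The two lookups above can be sketched as small shell helpers. The function names are hypothetical; the sketch assumes `az login` has already run and the `$FABRIC_*` variables from the environment-detection step are exported. Note that `az rest --query` applies the JMESPath filter client-side, after the list call returns.

```shell
# Hypothetical helpers for name -> ID resolution via JMESPath filtering.
# Assumes $FABRIC_RESOURCE_SCOPE and $FABRIC_API_URL are already set.

find_workspace_id() {
  local name="$1"
  # Filter the workspace list with JMESPath, then take the first match's id.
  az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
    --url "$FABRIC_API_URL/workspaces" \
    --query "value[?displayName=='${name}'] | [0].id" --output tsv
}

find_item_id() {
  local workspace_id="$1" item_type="$2" item_name="$3"
  # Same pattern, scoped to one item type within one workspace.
  az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
    --url "$FABRIC_API_URL/workspaces/${workspace_id}/items?type=${item_type}" \
    --query "value[?displayName=='${item_name}'] | [0].id" --output tsv
}
```

The `| [0]` pipe matters: `[?...][0]` indexes a projection and can return nothing, while piping first materializes the filtered list so `[0]` picks the first element.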
| Task | Reference | Notes |
|---|---|---|
| Fabric Topology & Key Concepts | COMMON-CORE.md § Fabric Topology & Key Concepts | |
| Environment URLs | COMMON-CORE.md § Environment URLs | |
| Authentication & Token Acquisition | COMMON-CORE.md § Authentication & Token Acquisition | Wrong audience = 401; read before any auth issue |
| Core Control-Plane REST APIs | COMMON-CORE.md § Core Control-Plane REST APIs | |
| Pagination | COMMON-CORE.md § Pagination | |
| Long-Running Operations (LRO) | COMMON-CORE.md § Long-Running Operations (LRO) | |
| Rate Limiting & Throttling | COMMON-CORE.md § Rate Limiting & Throttling | |
| OneLake Data Access | COMMON-CORE.md § OneLake Data Access | Requires storage.azure.com token, not Fabric token |
| Job Execution | COMMON-CORE.md § Job Execution | |
| Capacity Management | COMMON-CORE.md § Capacity Management | |
| Gotchas & Troubleshooting | COMMON-CORE.md § Gotchas & Troubleshooting | |
| Best Practices | COMMON-CORE.md § Best Practices | |
| Tool Selection Rationale | COMMON-CLI.md § Tool Selection Rationale | |
| Finding Workspaces and Items in Fabric | COMMON-CLI.md § Finding Workspaces and Items in Fabric | Mandatory; read first. Needed to find a workspace ID by name, or an item ID by name, item type, and workspace ID |
| Authentication Recipes | COMMON-CLI.md § Authentication Recipes | az login flows and token acquisition |
| Fabric Control-Plane API via az rest | COMMON-CLI.md § Fabric Control-Plane API via az rest | Always pass --resource https://api.fabric.microsoft.com or az rest fails |
| Pagination Pattern | COMMON-CLI.md § Pagination Pattern | |
| Long-Running Operations (LRO) Pattern | COMMON-CLI.md § Long-Running Operations (LRO) Pattern | |
| OneLake Data Access via curl | COMMON-CLI.md § OneLake Data Access via curl | Use curl not az rest (different token audience) |
| SQL / TDS Data-Plane Access | COMMON-CLI.md § SQL / TDS Data-Plane Access | sqlcmd (Go) connect, query, CSV export |
| Job Execution (CLI) | COMMON-CLI.md § Job Execution | |
| OneLake Shortcuts | COMMON-CLI.md § OneLake Shortcuts | |
| Capacity Management (CLI) | COMMON-CLI.md § Capacity Management | |
| Composite Recipes | COMMON-CLI.md § Composite Recipes | |
| Gotchas & Troubleshooting (CLI-Specific) | COMMON-CLI.md § Gotchas & Troubleshooting (CLI-Specific) | az rest audience, shell escaping, token expiry |
| Quick Reference: az rest Template | COMMON-CLI.md § Quick Reference: az rest Template | |
| Quick Reference: Token Audience / CLI Tool Matrix | COMMON-CLI.md § Quick Reference: Token Audience ↔ CLI Tool Matrix | Which --resource + tool for each service |
| Relationship to SPARK-AUTHORING-CORE.md | SPARK-CONSUMPTION-CORE.md § Relationship to SPARK-AUTHORING-CORE.md | |
| Data Engineering Consumption Capability Matrix | SPARK-CONSUMPTION-CORE.md § Data Engineering Consumption Capability Matrix | |
| OneLake Table APIs (Schema-enabled Lakehouses) | SPARK-CONSUMPTION-CORE.md § OneLake Table APIs (Schema-enabled Lakehouses) | Unity Catalog-compatible metadata; requires storage.azure.com token |
| Livy Session Management | SPARK-CONSUMPTION-CORE.md § Livy Session Management | Session creation, states, lifecycle, termination |
| Interactive Data Exploration | SPARK-CONSUMPTION-CORE.md § Interactive Data Exploration | Statement execution, output retrieval, data discovery |
| PySpark Analytics Patterns | SPARK-CONSUMPTION-CORE.md § PySpark Analytics Patterns | Cross-lakehouse 3-part naming, performance optimization |
| Must/Prefer/Avoid | SKILL.md § Must/Prefer/Avoid | MUST DO / AVOID / PREFER checklists |
| Quick Start | SKILL.md § Quick Start | CLI-specific Livy session setup and data exploration |
| Key Fabric Patterns | SKILL.md § Key Fabric Patterns | Spark pattern quick-reference table |
| Session Cleanup | SKILL.md § Session Cleanup | Clean up idle Livy sessions via CLI |
For simple SQL queries, use sqlcmd (see SQL / TDS Data-Plane Access), not Spark. Only use this skill when the user explicitly requests PySpark, DataFrames, or Spark-specific features.

Apply environment detection from COMMON-CORE.md § Environment Detection Pattern to set:

- `$FABRIC_API_BASE` and `$FABRIC_RESOURCE_SCOPE`
- `$FABRIC_API_URL` and `$LIVY_API_PATH` for Livy operations

Authentication: use token acquisition from COMMON-CLI.md § Environment Detection and API Configuration.
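As an illustration, the public-cloud values would look roughly like the following. These are assumptions for the commercial cloud only; sovereign clouds use different endpoints, so always run the COMMON-CORE.md detection pattern rather than hardcoding.

```shell
# Hypothetical public-cloud defaults; the COMMON-CORE.md detection pattern
# is authoritative and may substitute sovereign-cloud endpoints.
export FABRIC_RESOURCE_SCOPE="https://api.fabric.microsoft.com"
export FABRIC_API_URL="https://api.fabric.microsoft.com/v1"
export LIVY_API_PATH="livyApi/versions/2023-12-01"
```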
Preferred: use the COMMON-CLI.md item discovery patterns (§ Finding Workspaces and Items in Fabric) to find workspaces and items by name.

Fallback (interactive selection, when names are not known up front):
```shell
# List workspaces
az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
  --url "$FABRIC_API_URL/workspaces" \
  --query "value[].{name:displayName, id:id}" --output table
read -p "Workspace ID: " workspaceId

# List lakehouses in the workspace
az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
  --url "$FABRIC_API_URL/workspaces/$workspaceId/items?type=Lakehouse" \
  --query "value[].{name:displayName, id:id}" --output table
read -p "Lakehouse ID: " lakehouseId
```
```shell
# Check for an existing idle session (avoids resource waste).
# Pipe the filter into [0] so the first match is taken from a materialized list.
sessionId=$(az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
  --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions" \
  --query "sessions[?state=='idle'] | [0].id" --output tsv)

# Create one if none is available - force starter pool usage
if [[ -z "$sessionId" ]]; then
  cat > /tmp/body.json << 'EOF'
{
  "name": "analysis",
  "driverMemory": "56g",
  "driverCores": 8,
  "executorMemory": "56g",
  "executorCores": 8,
  "conf": {
    "spark.dynamicAllocation.enabled": "true",
    "spark.fabric.pool.name": "Starter Pool"
  }
}
EOF
  sessionId=$(az rest --method post --resource "$FABRIC_RESOURCE_SCOPE" \
    --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions" \
    --body @/tmp/body.json --query "id" --output tsv)

  echo "⏳ Waiting for starter pool session to be ready..."
  timeout=30  # Starter pool sessions are typically ready in 3-5 seconds
  while [ $timeout -gt 0 ]; do
    state=$(az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
      --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId" \
      --query "state" --output tsv)
    if [[ "$state" == "idle" ]]; then
      echo "✅ Session ready in starter pool!"
      break
    fi
    echo "   Session state: $state (${timeout}s remaining)"
    sleep 3
    timeout=$((timeout - 3))
  done
fi
```
```shell
# Execute a statement (the LLM supplies the Python/Spark code)
cat > /tmp/body.json << 'EOF'
{
  "code": "spark.sql(\"SHOW TABLES\").show(); df = spark.table(\"your_table\"); df.describe().show()",
  "kind": "pyspark"
}
EOF
az rest --method post --resource "$FABRIC_RESOURCE_SCOPE" \
  --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId/statements" \
  --body @/tmp/body.json
```
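Livy statements run asynchronously: the POST returns a statement id, and the output must be polled until the statement completes. A minimal sketch, with a hypothetical helper name, assuming the same `$workspaceId`/`$lakehouseId`/`$sessionId` variables as the Quick Start:

```shell
# Poll a Livy statement until it finishes, then print its text output.
# $1 = statement id returned by the POST above. Helper name is illustrative.
wait_for_statement() {
  local stmtId="$1"
  local url="$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId/statements/$stmtId"
  local state="waiting"
  # Statements move waiting -> running -> available (or error/cancelled).
  while [ "$state" = "waiting" ] || [ "$state" = "running" ]; do
    sleep 2
    state=$(az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
      --url "$url" --query "state" --output tsv)
  done
  # When finished, output.data carries the rendered result.
  az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
    --url "$url" --query 'output.data."text/plain"' --output tsv
}
```

Usage: `wait_for_statement 0` after submitting the first statement of a session (Livy numbers statements from 0).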
| Pattern | Code | Use Case |
|---|---|---|
| Table Discovery | `spark.sql("SHOW TABLES")` | List available tables |
| Cross-Lakehouse | `spark.sql("SELECT * FROM other_lakehouse.table_name")` | Query across lakehouses |
| Delta Features | `DeltaTable.forName(spark, "t").history()`; `spark.read.option("versionAsOf", 1).table("t")` | Time travel, versioning |
| Schema Inspection | `df.printSchema()` | Understand structure |
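For example, a Delta time-travel read can be submitted through the same statements endpoint. The sketch below only builds the request body; the table name `your_table` and version `1` are placeholders:

```shell
# Build a Livy statement body that reads an earlier Delta table version.
# Table name and version number are illustrative placeholders.
cat > /tmp/body.json << 'EOF'
{
  "code": "df_v1 = spark.read.option(\"versionAsOf\", 1).table(\"your_table\"); df_v1.show()",
  "kind": "pyspark"
}
EOF
# POST it with the same az rest call used in the Quick Start:
#   az rest --method post --resource "$FABRIC_RESOURCE_SCOPE" \
#     --url ".../sessions/$sessionId/statements" --body @/tmp/body.json
```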
```shell
# Clean up idle sessions (optional)
az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
  --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions" \
  --query "sessions[?state=='idle'].id" --output tsv \
  | xargs -I {} az rest --method delete --resource "$FABRIC_RESOURCE_SCOPE" \
      --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/{}"
```
Focus: this skill provides Fabric-specific REST API patterns. The LLM already knows Python/Spark syntax, so the emphasis here is Fabric integration, session management, and API endpoints.