Analyzes Fabric lakehouse data interactively via Livy API sessions using PySpark and Spark SQL for advanced analytics, DataFrames, cross-joins, Delta time-travel, and JSON data.
```bash
npx claudepluginhub microsoft/skills-for-fabric --plugin skills-for-fabric
```

This skill uses the workspace's default tool permissions.
> **Update Check — ONCE PER SESSION (mandatory)**
> The first time this skill is used in a session, run the `check-updates` skill before proceeding.
> - GitHub Copilot CLI / VS Code: invoke the `check-updates` skill.
> - Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote `package.json` version.
> - Skip if the check was already performed earlier in this session.
CRITICAL NOTES
- To find workspace details (including its ID) from a workspace name: list all workspaces, then filter with JMESPath (see the sketch below).
- To find item details (including its ID) from a workspace ID, item type, and item name: list all items of that type in the workspace, then filter with JMESPath.
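A minimal sketch of the first lookup, assuming the `$FABRIC_RESOURCE_SCOPE` and `$FABRIC_API_URL` variables defined in the Quick Start below; the workspace name `My Workspace` is a placeholder:

```bash
# Resolve a workspace ID from its display name (name is a placeholder)
workspaceId=$(az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" \
  --url "$FABRIC_API_URL/workspaces" \
  --query "value[?displayName=='My Workspace'] | [0].id" --output tsv)
```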
| Task | Reference | Notes |
|---|---|---|
| Fabric Topology & Key Concepts | COMMON-CORE.md § Fabric Topology & Key Concepts | |
| Environment URLs | COMMON-CORE.md § Environment URLs | |
| Authentication & Token Acquisition | COMMON-CORE.md § Authentication & Token Acquisition | Wrong audience = 401; read before any auth issue |
| Core Control-Plane REST APIs | COMMON-CORE.md § Core Control-Plane REST APIs | |
| Pagination | COMMON-CORE.md § Pagination | |
| Long-Running Operations (LRO) | COMMON-CORE.md § Long-Running Operations (LRO) | |
| Rate Limiting & Throttling | COMMON-CORE.md § Rate Limiting & Throttling | |
| OneLake Data Access | COMMON-CORE.md § OneLake Data Access | Requires storage.azure.com token, not Fabric token |
| Job Execution | COMMON-CORE.md § Job Execution | |
| Capacity Management | COMMON-CORE.md § Capacity Management | |
| Gotchas & Troubleshooting | COMMON-CORE.md § Gotchas & Troubleshooting | |
| Best Practices | COMMON-CORE.md § Best Practices | |
| Tool Selection Rationale | COMMON-CLI.md § Tool Selection Rationale | |
| Finding Workspaces and Items in Fabric | COMMON-CLI.md § Finding Workspaces and Items in Fabric | Mandatory — read first; needed to find a workspace ID by name, or an item ID by name, type, and workspace ID |
| Authentication Recipes | COMMON-CLI.md § Authentication Recipes | az login flows and token acquisition |
| Fabric Control-Plane API via az rest | COMMON-CLI.md § Fabric Control-Plane API via az rest | Always pass --resource https://api.fabric.microsoft.com or az rest fails |
| Pagination Pattern | COMMON-CLI.md § Pagination Pattern | |
| Long-Running Operations (LRO) Pattern | COMMON-CLI.md § Long-Running Operations (LRO) Pattern | |
| OneLake Data Access via curl | COMMON-CLI.md § OneLake Data Access via curl | Use curl, not az rest (different token audience) |
| SQL / TDS Data-Plane Access | COMMON-CLI.md § SQL / TDS Data-Plane Access | sqlcmd (Go) connect, query, CSV export |
| Job Execution (CLI) | COMMON-CLI.md § Job Execution | |
| OneLake Shortcuts | COMMON-CLI.md § OneLake Shortcuts | |
| Capacity Management (CLI) | COMMON-CLI.md § Capacity Management | |
| Composite Recipes | COMMON-CLI.md § Composite Recipes | |
| Gotchas & Troubleshooting (CLI-Specific) | COMMON-CLI.md § Gotchas & Troubleshooting (CLI-Specific) | az rest audience, shell escaping, token expiry |
| Quick Reference: az rest Template | COMMON-CLI.md § Quick Reference: az rest Template | |
| Quick Reference: Token Audience ↔ CLI Tool Matrix | COMMON-CLI.md § Quick Reference: Token Audience ↔ CLI Tool Matrix | Which --resource + tool for each service |
| Relationship to SPARK-AUTHORING-CORE.md | SPARK-CONSUMPTION-CORE.md § Relationship to SPARK-AUTHORING-CORE.md | |
| Data Engineering Consumption Capability Matrix | SPARK-CONSUMPTION-CORE.md § Data Engineering Consumption Capability Matrix | |
| OneLake Table APIs (Schema-enabled Lakehouses) | SPARK-CONSUMPTION-CORE.md § OneLake Table APIs (Schema-enabled Lakehouses) | Unity Catalog-compatible metadata; requires storage.azure.com token |
| Lakehouse Livy Session Management | SPARK-CONSUMPTION-CORE.md § Livy Session Management | Lakehouse Livy API: session creation, states, lifecycle, termination |
| Interactive Data Exploration | SPARK-CONSUMPTION-CORE.md § Interactive Data Exploration | Statement execution, output retrieval, data discovery |
| PySpark Analytics Patterns | SPARK-CONSUMPTION-CORE.md § PySpark Analytics Patterns | Cross-lakehouse 3-part naming, performance optimization |
| Must/Prefer/Avoid | SKILL.md § Must/Prefer/Avoid | MUST DO / AVOID / PREFER checklists |
| Quick Start | SKILL.md § Quick Start | CLI-specific Lakehouse Livy session setup and data exploration |
| Key Fabric Patterns | SKILL.md § Key Fabric Patterns | Spark pattern quick-reference table |
| Session Cleanup | SKILL.md § Session Cleanup | Clean up idle Lakehouse Livy sessions via CLI |
For plain SQL queries, prefer `sqlcmd`, not Spark. Only use this skill when the user explicitly requests PySpark, DataFrames, or Spark-specific features.

This skill manages Lakehouse Livy sessions only (`/lakehouses/{lhId}/livyapi/.../sessions`). Notebook Spark sessions are created internally when running a notebook via the Jobs API (RunNotebook) and are NOT managed through the Livy API. To run a notebook as a job, see SPARK-AUTHORING-CORE.md § Notebook Execution & Job Management.

Apply environment detection from COMMON-CORE.md § Environment Detection Pattern to set:
- `$FABRIC_API_BASE` and `$FABRIC_RESOURCE_SCOPE`
- `$FABRIC_API_URL` and `$LIVY_API_PATH` for Livy operations

Authentication: use token acquisition from COMMON-CLI.md § Environment Detection and API Configuration.
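A minimal sketch of what detection typically yields for the public cloud; these literal values are assumptions (sovereign clouds and API revisions differ), and COMMON-CORE.md remains authoritative:

```bash
# Assumed values for the public Fabric cloud (a sketch; verify against COMMON-CORE.md)
FABRIC_API_BASE="https://api.fabric.microsoft.com"
FABRIC_RESOURCE_SCOPE="https://api.fabric.microsoft.com"  # token audience for az rest
FABRIC_API_URL="$FABRIC_API_BASE/v1"
LIVY_API_PATH="livyapi/versions/2023-12-01"               # Lakehouse Livy path segment (assumed version)
```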
Preferred: use the COMMON-CLI.md item discovery patterns (§ Finding Workspaces and Items in Fabric) to find workspaces and items by name.
Fallback (interactive — list and prompt for IDs):
```bash
# List workspaces
az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces" --query "value[].{name:displayName, id:id}" --output table
read -p "Workspace ID: " workspaceId

# List lakehouses in the workspace
az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/items?type=Lakehouse" --query "value[].{name:displayName, id:id}" --output table
read -p "Lakehouse ID: " lakehouseId
```
Two types of Spark sessions in Fabric — this skill manages **Lakehouse Livy sessions**, created via the public Livy API endpoint (`/lakehouses/{lhId}/livyapi/.../sessions`). These are ad-hoc interactive sessions for remote clients. **Notebook Spark sessions** are a separate mechanism: they are created internally when a Fabric Notebook is executed (via the portal or the Jobs API `RunNotebook`) and are managed through the notebook lifecycle, not the Livy API.
```bash
# Check for an existing idle Lakehouse Livy session (avoid resource waste)
# Note: pipe to [0] stops the JMESPath projection so indexing takes the first match
sessionId=$(az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions" --query "sessions[?state=='idle'] | [0].id" --output tsv)
# Create if none available - FORCE STARTER POOL USAGE
if [[ -z "$sessionId" ]]; then
cat > /tmp/body.json << 'EOF'
{
"name":"analysis",
"driverMemory":"56g",
"driverCores":8,
"executorMemory":"56g",
"executorCores":8,
"conf": {
"spark.dynamicAllocation.enabled": "true",
"spark.fabric.pool.name": "Starter Pool"
}
}
EOF
sessionId=$(az rest --method post --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions" --body @/tmp/body.json --query "id" --output tsv)
echo "⏳ Waiting for starter pool session to be ready..."
# With starter pools, this should be 3-5 seconds
timeout=30 # Reduced from 90s since starter pools are fast
while [ $timeout -gt 0 ]; do
state=$(az rest --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId" --query "state" --output tsv)
if [[ "$state" == "idle" ]]; then
echo "✅ Session ready in starter pool!"
break
fi
echo " Session state: $state (${timeout}s remaining)"
sleep 3
timeout=$((timeout - 3))
done
fi
```
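Note that the wait loop falls through silently on timeout, and `state` is never set when an existing idle session was reused. A small optional guard, sketched here rather than taken from the original flow:

```bash
# Optional guard: re-check the session state before submitting statements
state=$(az rest --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId" --query "state" --output tsv)
if [[ "$state" != "idle" ]]; then
  echo "❌ Session $sessionId not ready (state: $state)" >&2
fi
```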
```bash
# Execute statement (LLM knows Python/Spark syntax)
cat > /tmp/body.json << 'EOF'
{
"code": "spark.sql(\"SHOW TABLES\").show(); df = spark.table(\"your_table\"); df.describe().show()",
"kind": "pyspark"
}
EOF
stmtId=$(az rest --method post --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId/statements" --body @/tmp/body.json --query "id" --output tsv)
```
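Statement execution is asynchronous: the POST returns immediately with a statement ID, and output must be fetched separately. A minimal polling sketch, assuming the standard Livy statement contract (`state` reaches `available`, results land under `output.data."text/plain"`):

```bash
# Poll the statement until it completes, then print its output
while true; do
  stmtState=$(az rest --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId/statements/$stmtId" --query "state" --output tsv)
  [[ "$stmtState" == "available" ]] && break
  sleep 2
done
az rest --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId/statements/$stmtId" --query 'output.data."text/plain"' --output tsv
```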
| Pattern | Code | Use Case |
|---|---|---|
| Table Discovery | spark.sql("SHOW TABLES") | List available tables |
| Cross-Lakehouse | spark.sql("SELECT * FROM other_workspace.table") | Query across workspaces |
| Delta Features | DeltaTable.forName(spark, "t").history(), spark.read.option("versionAsOf", 1).table("t") | Time travel, versioning |
| Schema Inspection | df.printSchema() | Understand structure |
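For example, the Delta patterns can be submitted through the same Livy session; a sketch, where `your_table` is a placeholder:

```bash
# Submit a Delta history + time-travel query as a session statement
cat > /tmp/body.json << 'EOF'
{
  "code": "from delta.tables import DeltaTable; DeltaTable.forName(spark, \"your_table\").history().show(); spark.read.option(\"versionAsOf\", 0).table(\"your_table\").show()",
  "kind": "pyspark"
}
EOF
az rest --method post --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/$sessionId/statements" --body @/tmp/body.json --query "id" --output tsv
```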
```bash
# Clean up idle Lakehouse Livy sessions (optional)
az rest --method get --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions" --query "sessions[?state=='idle'].id" --output tsv | xargs -I {} az rest --method delete --resource "$FABRIC_RESOURCE_SCOPE" --url "$FABRIC_API_URL/workspaces/$workspaceId/lakehouses/$lakehouseId/$LIVY_API_PATH/sessions/{}"
```
Focus: this skill provides Fabric-specific REST API patterns. The LLM already knows Python/Spark syntax, so the emphasis here is on Fabric integration, session management, and API endpoints.