# dataverse
Query, create, update, and bulk-import Dataverse records using the official Python SDK. Includes data profiling, CSV import, and Jupyter notebook patterns. Use when: "use the SDK", "query records", "create records", "bulk operations", "upsert", "Python script for Dataverse", "read data", "write data", "upload file", "bulk import", "CSV import", "load data", "data profiling", "data quality", "analyze data", "Jupyter notebook", "pandas", "notebook". Do not use when: creating forms/views (use dv-metadata with Web API), exporting solutions (use dv-solution with PAC CLI).
Install: `npx claudepluginhub anthropics/claude-plugins-official --plugin dataverse`

This skill uses the workspace's default tool permissions.
> **This skill uses Python exclusively.** Do not use Node.js, JavaScript, or any other language for Dataverse scripting. If you are about to run `npm install` or write a `.js` file, STOP — you are going off-rails. See the overview skill's Hard Rules.
Use the official Microsoft Power Platform Dataverse Client Python SDK for data operations and basic table management in scripts and automation.
Official SDK: https://github.com/microsoft/PowerPlatform-DataverseClient-Python
PyPI package: PowerPlatform-Dataverse-Client (this is the only official one — do not use dataverse-api or other unofficial packages)
Status: Preview — breaking changes are possible
pip install --upgrade PowerPlatform-Dataverse-Client
If MCP tools are available (list_tables, describe_table, read_query, create_record) and the task is a simple query, single-record read, or small CRUD (≤10 records), use MCP directly — no script needed. Only write a Python script when the task requires bulk operations, multi-step logic, schema creation, or analytics.
If an operation is in the "supports" list below, you MUST use the SDK — not urllib, requests, or raw HTTP. This is non-negotiable.
Decision flow:

- Operation in the "supports" list below? → Use the SDK
- N:N `$ref` association, collection-valued `$expand`, or `$apply`? → Use raw Web API (SDK doesn't support these)

Correct imports for SDK scripts:
from auth import get_credential, load_env
from PowerPlatform.Dataverse.client import DataverseClient
WRONG — do NOT use for SDK-supported operations:
from auth import get_token, load_env # WRONG for SDK-supported ops
import requests # WRONG for SDK-supported ops
get_token() and requests exist ONLY for the narrow set of operations listed in "What This SDK Does NOT Support" below.
**What This SDK Supports** (use the SDK for all of these):

- Bulk operations: `CreateMultiple`, `UpdateMultiple`, `UpsertMultiple`
- Queries: `select`, `filter`, `orderby`, `expand`, `top`, paging
- SQL queries (`sql=` parameter)
- Table management (`tables.get()`, `tables.list()`)

**What This SDK Does NOT Support** — these are the ONLY operations where raw Web API (`get_token()` + urllib/requests) is acceptable:

- Forms/views and advanced metadata (use dv-metadata)
- N:N record association (`$ref` POST — distinct from N:N relationship creation, which the SDK handles)
- N:N `$expand` (collection-valued navigation) — use the Web API directly
- `$apply` aggregation — use the Web API directly (or pull data with SDK + pandas)
- Special actions (e.g., `InstallSampleData`) — use the Web API directly

For anything not in this list, use the SDK. For anything in this list, use `scripts/auth.py` `get_token()` for token acquisition.
Dataverse uses two different naming conventions for properties. Getting this wrong causes 400 errors.
| Property type | Name convention | Example | When used |
|---|---|---|---|
| Structural (columns) | LogicalName (always lowercase) | new_name, new_priority | $select, $filter, $orderby, record payload keys |
| Navigation (relationships / lookups) | Navigation Property Name (case-sensitive, must match $metadata) | new_AccountId, new_DepartmentId | $expand, @odata.bind annotation keys |
- `$select`, `$filter`, `$orderby`: always lowercase logical names (`new_name`, `new_priority`)
- `$expand` navigation properties: Navigation Property Name, case-sensitive (`new_AccountId`, `new_DepartmentId`)
- `@odata.bind` keys: Navigation Property Name, case-sensitive (`new_AccountId@odata.bind`)

The SDK handles this correctly: it lowercases structural property keys but preserves `@odata.bind` key casing.
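As an illustration of that rule, a minimal sketch — `normalize_payload` is a hypothetical helper mimicking the described SDK behavior, not part of the SDK:

```python
def normalize_payload(payload: dict) -> dict:
    """Lowercase structural keys; preserve @odata.bind key casing
    (hypothetical helper illustrating the SDK's described behavior)."""
    out = {}
    for key, value in payload.items():
        if key.endswith("@odata.bind"):
            out[key] = value  # navigation property casing must survive
        else:
            out[key.lower()] = value
    return out

normalized = normalize_payload({
    "New_Name": "Q1 Budget",  # structural key -> lowercased to "new_name"
    "new_AccountId@odata.bind": "/accounts(11111111-1111-1111-1111-111111111111)",
})
```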
import os, sys
sys.path.insert(0, os.path.join(os.getcwd(), "scripts"))
from auth import get_credential, load_env
from PowerPlatform.Dataverse.client import DataverseClient
load_env()
client = DataverseClient(
base_url=os.environ["DATAVERSE_URL"],
credential=get_credential(),
)
get_credential() returns ClientSecretCredential (if CLIENT_ID + CLIENT_SECRET are in .env) or DeviceCodeCredential (interactive fallback). See scripts/auth.py.
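The selection logic inside `scripts/auth.py` presumably follows this shape — a sketch of the fallback decision only (the real helper returns `azure.identity` credential objects):

```python
import os

def credential_kind() -> str:
    """Mirror the described fallback: client-secret auth when CLIENT_ID and
    CLIENT_SECRET are set, otherwise interactive device-code auth.
    (Sketch only — scripts/auth.py is the source of truth.)"""
    if os.environ.get("CLIENT_ID") and os.environ.get("CLIENT_SECRET"):
        return "ClientSecretCredential"
    return "DeviceCodeCredential"
```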
guid = client.records.create("new_projectbudget", {
"new_name": "Q1 Marketing Budget",
"new_amount": 75000.00,
"new_status": 100000000,
# Lookup binding — see "@odata.bind rules" section below
"new_AccountId@odata.bind": "/accounts(<account-guid>)",
})
print(f"Created: {guid}")
Lookup binding (`@odata.bind`) notes:

- Choice (option set) fields take the integer value: `"new_status": 100000000` (not `"Draft"`)
- Find option values via `EntityDefinitions(LogicalName='...')/Attributes`

for page in client.records.get(
    "new_projectbudget",
    select=["new_name", "new_amount", "new_status"],
    filter="new_status eq 100000000",
    orderby=["new_name asc"],
    top=50,
):
    for r in page:
        print(r["new_name"], r["new_amount"])
record = client.records.get("new_projectbudget", "<record-guid>",
select=["new_name", "new_amount"])
print(record["new_name"])
client.records.update("new_projectbudget", "<record-guid>",
{"new_status": 100000001})
client.records.delete("new_projectbudget", "<record-guid>")
records = [{"new_name": f"Budget {i}"} for i in range(100)]
guids = client.records.create("new_projectbudget", records)
print(f"Created {len(guids)} records")
client.records.update("new_projectbudget",
[id1, id2, id3],
{"new_status": 100000001})
from PowerPlatform.Dataverse.models.upsert import UpsertItem
client.records.upsert("account", [
UpsertItem(
alternate_key={"accountnumber": "ACC-001"},
record={"name": "Contoso Ltd", "description": "Primary account"},
),
])
from enum import IntEnum
class BudgetStatus(IntEnum):
    DRAFT = 100000000
    APPROVED = 100000001
info = client.tables.create(
"new_ProjectBudget",
{
"new_Amount": "decimal",
"new_Status": BudgetStatus,
},
solution="MySolution",
primary_column="new_Name",
)
print(f"Created: {info['table_schema_name']}, entity set: {info['entity_set_name']}")
created = client.tables.add_columns("new_projectbudget", {
"new_Notes": "string",
"new_Active": "bool",
})
result = client.tables.create_lookup_field(
referencing_table="new_projectbudget",
lookup_field_name="new_AccountId",
referenced_table="account",
display_name="Account",
solution="MySolution",
)
print(f"Created lookup: {result.lookup_schema_name}")
# The nav property for @odata.bind is the lookup's Navigation Property Name (case-sensitive)
from PowerPlatform.Dataverse.models.relationship import ManyToManyRelationshipMetadata
result = client.tables.create_many_to_many_relationship(
ManyToManyRelationshipMetadata(
schema_name="new_employee_project",
entity1_logical_name="new_employee",
entity2_logical_name="new_project",
),
solution="MySolution",
)
When creating or updating a record that sets a lookup field, you must use @odata.bind with the correct navigation property name. Getting this wrong is the #1 cause of 400 errors.
<NavigationPropertyName>@odata.bind = "/<EntitySetName>(<target-guid>)"
- `<NavigationPropertyName>` is case-sensitive and must match `$metadata`. For custom lookups this is typically the SchemaName (e.g., `new_AccountId`). Do NOT use the lowercase logical name.
- `<EntitySetName>` is the plural entity set name (e.g., `accounts`, `contacts`).

| Lookup field | Correct @odata.bind key | Wrong |
|---|---|---|
| `new_AccountId` (custom) | `new_AccountId@odata.bind` | `new_accountid@odata.bind` |
| `customerid` (system polymorphic) | `customerid_account@odata.bind` | `customerid@odata.bind` |
| `parentcustomerid` (system) | `parentcustomerid_account@odata.bind` | `_parentcustomerid_value@odata.bind` |
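A tiny hypothetical helper can make the pattern harder to get wrong — it passes the navigation property casing through untouched (`bind` is illustrative, not part of the SDK):

```python
def bind(nav_property: str, entity_set: str, guid: str) -> dict:
    """Build a single @odata.bind entry (hypothetical helper).
    nav_property must be the case-sensitive Navigation Property Name."""
    return {f"{nav_property}@odata.bind": f"/{entity_set}({guid})"}

entry = bind("new_AccountId", "accounts", "11111111-1111-1111-1111-111111111111")
```

Merge the result into a record payload with `payload.update(entry)` before calling `client.records.create`.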
To find the navigation property name:

- If you created the lookup via the SDK, `result.lookup_schema_name` is the navigation property name.
- Otherwise, query metadata: `GET /api/data/v9.2/EntityDefinitions(LogicalName='<entity>')/ManyToOneRelationships?$select=ReferencingEntityNavigationPropertyName,ReferencedEntity`
- For custom lookups, the navigation property name usually matches the SchemaName (e.g., `new_AccountId`). Always verify against `$metadata` if unsure.

When using `$select` to include a lookup column in query results, use the `_<logicalname>_value` format:
# Correct — lookup value in $select uses underscore-wrapped format
for page in client.records.get("opportunity",
select=["name", "estimatedvalue", "_parentaccountid_value"],
top=10,
):
    for r in page:
        account_id = r.get("_parentaccountid_value")
        account_name = r.get("_parentaccountid_value@OData.Community.Display.V1.FormattedValue")
To get the full related record instead, use $expand:
for page in client.records.get("opportunity",
select=["name", "estimatedvalue"],
expand=["parentaccountid"], # system nav props are lowercase
top=10,
):
    for r in page:
        account = r.get("parentaccountid")
        if account:
            print(account["name"])
Note: System table navigation properties (e.g., `parentaccountid`, `ownerid`) are lowercase. Custom lookup navigation properties are case-sensitive and must match `$metadata` (e.g., `new_AccountId`). When in doubt, query the entity's `ManyToOneRelationships` metadata.
Use the correct navigation property names (case-sensitive, must match $metadata):
# Expand multiple lookups — e.g., project budgets with account and department details
for page in client.records.get(
"new_projectbudget",
select=["new_name", "new_amount", "new_status"],
filter="new_status eq 100000001",
expand=["new_AccountId", "new_DepartmentId"],
orderby=["new_amount desc"],
):
    for budget in page:
        acct = budget.get("new_AccountId") or {}
        dept = budget.get("new_DepartmentId") or {}
        print(f"{budget['new_name']}: {acct.get('name', '')} / {dept.get('new_name', '')}")
Important: `expand` uses the navigation property name (case-sensitive, e.g., `new_AccountId`), not the lowercase logical name (`new_accountid`). Using lowercase causes a 400 error.
The SDK does not support $expand on N:N collection-valued navigation properties. This is one of the few cases where raw Web API is required:
# Web API required — SDK does not support N:N $expand
import os, urllib.request, json
from auth import get_token, load_env  # get_token() is correct here — SDK can't do this

load_env()
env = os.environ["DATAVERSE_URL"].rstrip("/")
token = get_token()
url = f"{env}/api/data/v9.2/new_projectdocuments?$select=new_title&$expand=new_ProjectBudget_Documents($select=new_name)"
req = urllib.request.Request(url, headers={
"Authorization": f"Bearer {token}",
"OData-MaxVersion": "4.0", "OData-Version": "4.0", "Accept": "application/json",
})
with urllib.request.urlopen(req) as resp:
    documents = json.loads(resp.read())["value"]
Use the SDK for all queries except N:N expand. Do not fall back to `requests` or `urllib` for standard queries, lookups, or 1:N expand — the SDK handles these correctly.
from PowerPlatform.Dataverse.core.errors import HttpError
try:
    guid = client.records.create("account", {"name": "Contoso"})
except HttpError as e:
    print(f"Status {e.status_code}: {e.message}")
    if e.details:
        print(f"Details: {e.details}")
    if e.status_code == 400:
        # Bad request — check field names, @odata.bind format, required fields
        pass
    elif e.status_code == 403:
        # Permission denied — check security roles
        pass
    elif e.status_code == 404:
        # Table or record not found
        pass
    elif e.status_code == 429:
        # Rate limited — the SDK handles retry automatically,
        # but if you hit this, reduce batch sizes or add delays
        pass
For loading records from a CSV file, use the SDK directly — it handles batching via CreateMultiple internally.
| Volume | Tool | Why |
|---|---|---|
| 1–10 records | MCP create_record | Simple, conversational |
| 10+ records | Python SDK client.records.create(table, list) | Built-in batching, error handling, retry |
import csv, os, sys
sys.path.insert(0, os.path.join(os.getcwd(), "scripts"))
from auth import get_credential, load_env
from PowerPlatform.Dataverse.client import DataverseClient
from PowerPlatform.Dataverse.core.errors import HttpError
load_env()
client = DataverseClient(
base_url=os.environ["DATAVERSE_URL"],
credential=get_credential(),
)
# Read CSV
with open("data/tickets.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))
# Map CSV columns to Dataverse fields
records = []
for row in rows:
    records.append({
        "new_name": row["Title"],
        "new_description": row["Description"],
        "new_priority": int(row["Priority"]),
        # Lookup binding — use navigation property name (case-sensitive)
        "new_AccountId@odata.bind": f"/accounts({row['AccountGuid']})",
    })
# Bulk create — SDK uses CreateMultiple internally
guids = client.records.create("new_ticket", records)
print(f"Created {len(guids)} tickets")
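Before calling `client.records.create`, it can pay to validate rows up front so one malformed row doesn't fail a whole batch. A sketch using the column names from the example above (`validate_row` and its checks are illustrative, not part of the SDK):

```python
import csv, io, re

# 8-4-4-4-12 hex GUID
GUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")

def validate_row(row: dict) -> list:
    """Return a list of problems for one CSV row (hypothetical checks)."""
    problems = []
    if not row.get("Title", "").strip():
        problems.append("missing Title")
    if not row.get("Priority", "").isdigit():
        problems.append("non-integer Priority: %r" % row.get("Priority"))
    if not GUID_RE.match(row.get("AccountGuid", "")):
        problems.append("bad AccountGuid: %r" % row.get("AccountGuid"))
    return problems

rows = list(csv.DictReader(io.StringIO(
    "Title,Description,Priority,AccountGuid\n"
    "Fix login,Broken SSO,2,11111111-1111-1111-1111-111111111111\n"
    ",No title,x,not-a-guid\n"
)))
```

Report the problem rows (with their line numbers) and only pass the clean ones to the bulk create.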
Before bulk-creating records in a system table (account, contact, opportunity), create a single test record first:
- Create one test record: `client.records.create("account", {...})`
- If it fails with `HttpError` status 400, the error message names the missing required field
- Some required fields are not obvious from metadata (`describe_table`) and are only triggered by certain field combinations

For simple aggregation, pull data with the SDK and use pandas (preferred). For server-side `$apply` aggregation, the SDK does not support it — use the Web API:
# Web API required — SDK does not support $apply
import os, sys, json, urllib.request
sys.path.insert(0, os.path.join(os.getcwd(), "scripts"))
from auth import get_token, load_env  # get_token() is correct here — SDK can't do this
load_env()
env = os.environ["DATAVERSE_URL"].rstrip("/")
token = get_token()
# Example: count opportunities by status
url = f"{env}/api/data/v9.2/opportunities?$apply=groupby((statuscode),aggregate($count as count))"
req = urllib.request.Request(url, headers={
"Authorization": f"Bearer {token}",
"OData-MaxVersion": "4.0",
"OData-Version": "4.0",
"Accept": "application/json",
})
with urllib.request.urlopen(req) as resp:
    data = json.loads(resp.read())
for row in data["value"]:
    print(f"Status {row['statuscode']}: {row['count']}")
For complex analytics (duplicates, cross-table joins, filtered aggregates), pull data into pandas:
import pandas as pd
# Pull all records into a DataFrame
all_records = []
for page in client.records.get("opportunity",
select=["name", "estimatedvalue", "statuscode", "_parentaccountid_value"],
):
    all_records.extend(page)
df = pd.DataFrame(all_records)
# Now use pandas for analysis: groupby, pivot, merge, duplicates, etc.
print(df.groupby("statuscode")["estimatedvalue"].sum())
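For example, flagging likely duplicates — shown here on a small synthetic frame; with real data you would run the same call on the `df` built above:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Contoso", "Contoso", "Fabrikam"],
    "estimatedvalue": [1000.0, 1200.0, 5000.0],
})
# keep=False marks every member of a duplicate group, not just the repeats
dupes = df[df.duplicated(subset=["name"], keep=False)]
```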
Use a Jupyter notebook for interactive analysis with visualizations.
pip install --upgrade PowerPlatform-Dataverse-Client pandas matplotlib seaborn azure-identity
# Cell 1: Setup
import os
from azure.identity import InteractiveBrowserCredential
from PowerPlatform.Dataverse.client import DataverseClient
credential = InteractiveBrowserCredential()
client = DataverseClient(
base_url="https://<org>.crm.dynamics.com", # replace with your URL
credential=credential,
)
# Cell 2: Load data into pandas
import pandas as pd
accounts = []
for page in client.records.get("account",
select=["name", "industrycode", "revenue", "numberofemployees"],
):
    accounts.extend(page)
df_accounts = pd.DataFrame(accounts)
# Cell 3+: Analysis with pandas, matplotlib, seaborn
When writing Python scripts on Windows (especially in Git Bash / Claude Code):

- Avoid `python -c` for anything beyond trivial one-liners — multiline `python -c` commands break on Windows due to quoting differences and can raise a SyntaxError. Write a `.py` file instead.
- Generate GUIDs with `str(uuid.uuid4())` inside the script rather than backtick-substitution in shell commands.
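For example, generating the GUID in Python rather than via shell substitution:

```python
import uuid

def new_guid() -> str:
    """Generate a GUID inside the script — avoids shell quoting pitfalls on Windows."""
    return str(uuid.uuid4())
```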