Make predictions and generate prediction explanations (SHAP/XEMP) from DataRobot deployments, with support for batch scoring, dataset templates, and data validation.
Slash command: /datarobot-agent-skills:datarobot-predictions
This skill provides comprehensive guidance for working with DataRobot predictions, including real-time predictions, batch scoring, and generating prediction datasets.
Most common use case: Generate predictions for a deployment
1. get_deployment_features(deployment_id) to understand required columns
2. generate_prediction_data_template(deployment_id, n_rows) to create CSV structure
3. deployment.predict_batch(...) (works for both single-row “real-time” and batch scoring)

Example: "Generate a prediction dataset template for deployment abc123 with 10 rows"
To also explain predictions: pass --max-explanations N to make_prediction.py (or the
max_explanations=N kwarg in code). See Prediction Explanations below.
Use this skill when you need to:
For post-hoc explanations against a training project / leaderboard model (not a deployment), use the
datarobot-model-explainability skill instead. This skill covers deployment-time explanations returned alongside scoring.
Before making predictions, you need to understand what features a deployment requires:
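As a minimal sketch of this step, the helper below turns a feature list into a readable summary, most important feature first. It assumes each feature is a dict with name, feature_type, and importance keys, which is the shape deployment.get_features() is described as returning later in this skill; summarize_features and fetch_features are hypothetical helper names, not part of the SDK.

```python
from typing import Any, Dict, List


def summarize_features(features: List[Dict[str, Any]]) -> List[str]:
    """Format feature dicts as 'name (type)' strings, most important first."""
    # A missing or None importance is treated as 0.0 so sorting never fails.
    ranked = sorted(features, key=lambda f: f.get("importance") or 0.0, reverse=True)
    return [f"{f['name']} ({f['feature_type']})" for f in ranked]


def fetch_features(deployment_id: str) -> List[Dict[str, Any]]:
    """Fetch the feature list for a deployment (assumes dr.Client is already configured)."""
    import datarobot as dr  # imported lazily so the formatter above works without the SDK

    return dr.Deployment.get(deployment_id).get_features()
```

For example, summarize_features(fetch_features("abc123")) would list the columns a prediction dataset needs for that deployment.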
Create properly formatted prediction datasets:
Validate datasets before making predictions:
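A rough sketch of what such a validation pass could check, using only the standard library. The function name validate_csv and the shape of the report are assumptions made for illustration and are not the behavior of scripts/validate_prediction_data.py. Blank cells are reported rather than rejected, since whether a blank is acceptable depends on the model.

```python
import csv
import io
from typing import Any, Dict, Iterable


def validate_csv(csv_text: str, required_features: Iterable[str]) -> Dict[str, Any]:
    """Compare CSV columns against the required features and report blank cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = set(reader.fieldnames or [])
    required = set(required_features)

    blank_cells = []
    # The header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        for name in sorted(required & columns):
            if (row.get(name) or "").strip() == "":
                blank_cells.append((line_number, name))

    return {
        "missing": sorted(required - columns),      # required but absent from the file
        "unexpected": sorted(columns - required),   # present but not required
        "blank_cells": blank_cells,                 # (line number, column) pairs
    }
```

An empty "missing" list is the main signal that the file is structurally ready for scoring; the other two entries are hints worth reviewing.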
Execute predictions using various methods:
User request: "I want to predict sales for next week for store_A with temperatures of 75°F each day and no promotions."
Agent workflow:
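For a request like this one, the agent first has to turn the description into rows. The sketch below builds one row per day for a week; the column names (date, store_id, temperature, promotion) are hypothetical and would need to match whatever get_deployment_features reports for the real deployment.

```python
from datetime import date, timedelta
from typing import Any, Dict, List


def build_week_rows(
    start: date, store_id: str, temperature_f: float, promotion: int
) -> List[Dict[str, Any]]:
    """Build seven daily rows starting at `start`, with constant feature values."""
    return [
        {
            "date": (start + timedelta(days=offset)).isoformat(),  # hypothetical column
            "store_id": store_id,                                  # hypothetical column
            "temperature": temperature_f,                          # hypothetical column
            "promotion": promotion,                                # hypothetical column
        }
        for offset in range(7)
    ]
```

The resulting list can be passed to pandas.DataFrame(...) and scored with the deployment once the column names are aligned with the deployment's required features.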
User request: "Score all records in my prediction_data.csv file using deployment abc123."
Agent workflow:
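One possible shape for this workflow, sketched under the assumption that the deployment object exposes predict_batch(source) as listed in the SDK methods in this skill. score_csv is a hypothetical wrapper name; it only adds a file check before handing the path to the SDK.

```python
from pathlib import Path


def score_csv(deployment, csv_path):
    """Score every row of a CSV file with deployment.predict_batch and return the result."""
    path = Path(csv_path)
    if not path.is_file():
        # Fail early with a clear message instead of an upload error.
        raise FileNotFoundError(f"Prediction file not found: {path}")
    return deployment.predict_batch(source=str(path))


# Example with a real deployment (requires a configured dr.Client):
#   import datarobot as dr
#   results = score_csv(dr.Deployment.get("abc123"), "prediction_data.csv")
#   results.to_csv("predictions_output.csv", index=False)
```

Running a column check on the file before calling score_csv keeps schema problems from surfacing only after the upload.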
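This skill guides you to use the DataRobot Python SDK directly. Install the SDK if needed; the explanation examples below also import datarobot_predict, which ships as the separate datarobot-predict package:
pip install datarobot datarobot-predict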
Use these DataRobot SDK methods to work with predictions:
Deployment Information:
- dr.Deployment.get(deployment_id) - Get deployment details
- deployment.get_features() - Get required features (name/type/importance)

Predictions:
- deployment.predict_batch(source) - Convenience batch prediction API (CSV path, file object, or pandas DataFrame)
- dr.BatchPredictionJob.score(deployment=deployment, ...) - Advanced batch prediction control
- job.get_result_when_complete() - Wait for batch scoring to finish and download results

Data Management:
- dr.Dataset.create_from_file(file_path) - Upload dataset
- dr.Dataset.get(dataset_id) - Get dataset info

See the Common Patterns section below for complete examples.
Deployments can return per-row explanations (top feature contributions) alongside predictions. Two algorithms are available depending on how the deployment was configured:
- SHAP (shap): SHapley Additive exPlanations. Available on tree-based models when SHAP was enabled at deployment time. Returns signed contributions in the model's score space.
- XEMP (xemp): DataRobot's eXemplar-based Explanations of Model Predictions. Default when SHAP is not enabled. Returns top-N strongest features with a qualitative strength (+++, --, etc.).

If you omit explanation_algorithm, the deployment's default is used.
Pass max_explanations=N (and any optional filters) when calling datarobot_predict.deployment.predict:
import datarobot as dr
import pandas as pd
from datarobot_predict.deployment import predict as dr_predict

dr.Client(token=..., endpoint=...)
deployment = dr.Deployment.get("abc123")

result = dr_predict(
    deployment=deployment,
    data_frame=pd.DataFrame([{"feature1": 10, "feature2": 20}]),
    max_explanations=3,            # top 3 contributors per row
    explanation_algorithm="shap",  # or "xemp"; omit for deployment default
    # threshold_high=0.8,          # optional: only explain rows scoring > 0.8
    # threshold_low=0.2,           # optional: only explain rows scoring < 0.2
    # passthrough_columns="all",   # optional: echo input columns through to output
)
print(result.dataframe.to_dict(orient="records"))
The result DataFrame includes columns like EXPLANATION_1_FEATURE_NAME,
EXPLANATION_1_ACTUAL_VALUE, EXPLANATION_1_STRENGTH, EXPLANATION_1_QUALITATIVE_STRENGTH for
each of the top-N contributors.
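Because the explanations arrive as flat, numbered columns, it can help to regroup them per row. The sketch below does that for a single record dict, such as one element of result.dataframe.to_dict(orient="records"). The helper name collect_explanations is hypothetical, and the function relies only on the EXPLANATION_<n>_<FIELD> naming pattern described above.

```python
import re
from typing import Any, Dict, List

# Matches columns such as EXPLANATION_1_FEATURE_NAME or EXPLANATION_2_STRENGTH.
_EXPLANATION_COLUMN = re.compile(r"^EXPLANATION_(\d+)_(.+)$")


def collect_explanations(record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Group EXPLANATION_<n>_<FIELD> columns into one dict per explanation, ordered by n."""
    grouped: Dict[int, Dict[str, Any]] = {}
    for column, value in record.items():
        match = _EXPLANATION_COLUMN.match(column)
        if match is None:
            continue  # not an explanation column (prediction, passthrough, ...)
        rank = int(match.group(1))
        field = match.group(2).lower()  # e.g. "feature_name", "strength"
        grouped.setdefault(rank, {})[field] = value
    return [grouped[rank] for rank in sorted(grouped)]
```

Each returned dict then carries the feature name, actual value, and strength fields for one contributor, which is easier to print or serialize than the wide column layout.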
| Parameter | Purpose |
|---|---|
| max_explanations | Top-N contributors per row. 0 (default) disables explanations. |
| max_ngram_explanations | Text models only: cap text-segment explanations per row. |
| threshold_high | Only explain rows with prediction probability above this (0–1). |
| threshold_low | Only explain rows with prediction probability below this (0–1). |
| explanation_algorithm | "shap" or "xemp"; omit to use deployment default. |
| passthrough_columns | "all" or set of input column names to echo through to output. |
python scripts/make_prediction.py abc123 '{"feature1": 10, "feature2": 20}' \
--max-explanations 3 --explanation-algorithm shap
- threshold_high is useful when only positive (high-risk / fraud / churn-likely) predictions need explaining; it saves compute on a large batch.
- threshold_low is the mirror image for low-probability rows.
- Set both to explain only rows outside the [low, high] band.
- max_explanations ignored / no explanation columns in output: confirm you're calling datarobot_predict.deployment.predict(...) and that the deployment has explanations enabled. The deployment.predict_batch() convenience wrapper on the SDK is intended for plain scoring; use datarobot_predict.deployment.predict when you need explanation kwargs.

This skill includes executable helper scripts that Claude can run directly:
- scripts/get_deployment_features.py - Get deployment feature requirements
- scripts/generate_prediction_data_template.py - Generate CSV template
- scripts/validate_prediction_data.py - Validate prediction data
- scripts/make_prediction.py - Make real-time predictions

Usage example:
# Get deployment features
python scripts/get_deployment_features.py abc123
# Generate template
python scripts/generate_prediction_data_template.py abc123 10 template.csv
# Validate data
python scripts/validate_prediction_data.py abc123 prediction_data.csv
# Make prediction
python scripts/make_prediction.py abc123 '{"feature1": 10, "feature2": 20}'
# Make prediction with top-3 SHAP explanations
python scripts/make_prediction.py abc123 '{"feature1": 10, "feature2": 20}' \
--max-explanations 3 --explanation-algorithm shap
Claude can run these scripts directly or use them as reference when writing code.
import datarobot as dr
import os
import pandas as pd
from datarobot_predict.deployment import predict as dr_predict

# Initialize client
dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT"),
)
deployment = dr.Deployment.get("abc123")

# Placeholder values; replace with real data for every required feature
prediction_data = {
    "feature1": 10,
    "feature2": 20,
    # ... all required features (excluding target)
}

# Score one row. Add max_explanations=N to get top-N explanations per row.
result = dr_predict(
    deployment=deployment,
    data_frame=pd.DataFrame([prediction_data]),
    max_explanations=3,            # optional; 0/omit to disable explanations
    explanation_algorithm="shap",  # optional; omit to use deployment default
)
print(result.dataframe.to_dict(orient="records"))
import datarobot as dr
import pandas as pd
import os

# Initialize client
client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT"),
)

# Get the features needed to make predictions on this deployment
deployment = dr.Deployment.get("abc123")
features = deployment.get_features()  # list of dicts with name, feature_type, importance

# Build one placeholder row based on each feature's type
sample_row = {}
for feature in features:
    if feature["feature_type"] == "Numeric":
        sample_row[feature["name"]] = 0.0
    elif feature["feature_type"] == "Categorical":
        sample_row[feature["name"]] = "sample_value"
    else:
        sample_row[feature["name"]] = ""

# Create a 100-row template in one step instead of concatenating row by row
template_df = pd.DataFrame([sample_row] * 100, columns=[f["name"] for f in features])

# Save template
template_df.to_csv("prediction_template.csv", index=False)

# Fill template with actual data (modify CSV as needed)
# ...

# Submit batch prediction (score() takes the deployment or its ID as `deployment`)
job = dr.BatchPredictionJob.score(
    deployment=deployment.id,
    intake_settings={
        "type": "localFile",
        "file": "prediction_template.csv",
    },
    output_settings={
        "type": "localFile",
        "path": "predictions_output.csv",
    },
)

# Wait for the job to finish, then check its status
job.wait_for_completion()
status = job.get_status()["status"]
print(f"Job status: {status}")

# With localFile output, results are written to the output path
if status == "COMPLETED":
    results = pd.read_csv("predictions_output.csv")
Common errors and solutions:
Use get_deployment_features to get the complete list of required features.

pip install datarobot
import datarobot as dr
import os

# Initialize client with API credentials
client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT", "https://app.datarobot.com"),
)
Set these environment variables or pass them directly:
- DATAROBOT_API_TOKEN - Your DataRobot API token
- DATAROBOT_ENDPOINT - Your DataRobot endpoint (default: https://app.datarobot.com)