Guides DataRobot model training workflows: project creation, dataset upload, AutoML configuration, progress monitoring, feature engineering, and model selection using the Python SDK.
npx claudepluginhub datarobot-oss/datarobot-agent-skills --plugin datarobot-agent-skills
This skill uses the workspace's default tool permissions.
This skill provides guidance for the complete model training workflow in DataRobot, from project creation through model selection and validation.
Most common use case: Create a project and train models
upload_dataset(file_path, dataset_name) to upload training data
create_project(dataset_id, project_name) to create new project
start_automl(project_id, mode) to begin AutoML training
Example: "Create a new project with sales_data.csv, set 'revenue' as target, and start Quick AutoML training"
Use this skill when you need to:
User request: "Create a new project using my sales_data.csv file, predict 'revenue' as the target, and start AutoML training."
Agent workflow:
User request: "Train a model with time series settings: datetime column 'date', series ID 'store_id', forecast window 1-7 days."
Agent workflow:
This skill guides you to use the DataRobot Python SDK directly. Install the SDK if needed:
pip install datarobot
Use these DataRobot SDK methods for model training:
Projects:
dr.Project.create_from_dataset(dataset_id, project_name) - Create project
dr.Project.get(project_id) - Get project details
dr.Project.list() - List all projects
project.set_target(target, mode) - Set target variable; this call also starts AutoML training (newer SDK releases expose the same step as project.analyze_and_model)
Training:
project.wait_for_autopilot() - Block until AutoML training finishes
project.get_status() - Check training status
project.get_models() - List trained models
dr.Model.get(project_id, model_id) - Get model details
Model Analysis:
model.metrics - Performance metrics, keyed by metric name and then by partition
model.get_feature_impact() - Get feature importance (feature impact must already be computed; model.get_or_request_feature_impact() computes it if needed)
See the Common Patterns section below for complete examples.
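As a small illustration of the model analysis step, the sketch below ranks feature impact records. It assumes each record is a dictionary with `featureName` and `impactNormalized` keys, which is the shape the feature impact calls return in recent SDK releases. `top_features` is a hypothetical helper, not part of the SDK, and the sample records are hand-written so the sketch runs offline.

```python
from typing import Any


def top_features(feature_impact: list[dict[str, Any]], n: int = 5) -> list[tuple[str, float]]:
    """Return the n features with the highest normalized impact."""
    ranked = sorted(feature_impact, key=lambda row: row["impactNormalized"], reverse=True)
    return [(row["featureName"], row["impactNormalized"]) for row in ranked[:n]]


# Hand-written sample records (not real results) so the sketch runs without an account.
sample_impact = [
    {"featureName": "region", "impactNormalized": 0.42},
    {"featureName": "price", "impactNormalized": 1.0},
    {"featureName": "weekday", "impactNormalized": 0.17},
]
print(top_features(sample_impact, n=2))
```

With a live model, the input list would come from the feature impact call on that model instead of `sample_impact`.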
This skill includes executable helper scripts that Claude can run directly:
scripts/create_project.py - Create a new project from a dataset
scripts/start_training.py - Start AutoML training
scripts/list_models.py - List trained models with metrics
Usage example:
# Create project and set target
python scripts/create_project.py dataset_123 "Sales Prediction" revenue
# Start training
python scripts/start_training.py project_456 Quick
# List models
python scripts/list_models.py project_456 AUC
Claude can run these scripts directly or use them as reference when writing code.
import os
import datarobot as dr
# Initialize client
client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT"),
)
# Upload dataset (the catalog name defaults to the file name, so rename it afterwards)
dataset = dr.Dataset.create_from_file(file_path="training_data.csv")
dataset.modify(name="Sales Data")
# Create project
project = dr.Project.create_from_dataset(
    dataset_id=dataset.id,
    project_name="Sales Prediction",
)
# Set target and start AutoML (Quick mode); set_target launches Autopilot itself,
# so no separate start call is needed
project.set_target(
    target="revenue",
    mode=dr.AUTOPILOT_MODE.QUICK,
)
# Monitor training: block until Autopilot finishes, checking every 30 s for up to one hour
project.wait_for_autopilot(check_interval=30, timeout=3600)
# Get trained models; get_models() returns them in leaderboard order for the project metric
models = project.get_models()
best_model = models[0]
# AUC does not apply to a numeric target such as revenue, so report the project metric
metric = project.metric
score = best_model.metrics[metric]["validation"]
print(f"Best model: {best_model.id}, {metric}: {score}")
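If an explicit loop is preferred for monitoring, it helps to bound it with a timeout. The sketch below assumes `get_status()` returns a dictionary with an `autopilot_done` flag and a `stage` string, as in recent SDK releases. `wait_until_done` and the `FakeProject` stand-in are hypothetical names that exist only so the loop can be exercised without a DataRobot account.

```python
import time


def wait_until_done(project, poll_seconds: float = 30.0, timeout_seconds: float = 3600.0) -> dict:
    """Poll project.get_status() until Autopilot finishes or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = project.get_status()
        if status.get("autopilot_done"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Autopilot still running at stage {status.get('stage')!r}")
        time.sleep(poll_seconds)


class FakeProject:
    """Stand-in that reports completion on the third status check."""

    def __init__(self) -> None:
        self.calls = 0

    def get_status(self) -> dict:
        self.calls += 1
        return {"autopilot_done": self.calls >= 3, "stage": "modeling"}


fake = FakeProject()
final_status = wait_until_done(fake, poll_seconds=0.01, timeout_seconds=5)
print(final_status["autopilot_done"], fake.calls)
```

Passing a real project object in place of `fake` gives the same behavior with the default 30 second polling interval.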
import datarobot as dr
# Upload dataset, then give it a readable catalog name
dataset = dr.Dataset.create_from_file(file_path="sales_data.csv")
dataset.modify(name="Sales Forecast Data")
# Create project
project = dr.Project.create_from_dataset(
    dataset_id=dataset.id,
    project_name="Sales Forecast",
)
# Configure time series settings
time_series_spec = dr.DatetimePartitioningSpecification(
    datetime_partition_column="date",
    use_time_series=True,
    multiseries_id_columns=["store_id"],
    forecast_window_start=1,
    forecast_window_end=7,
)
# Set target and start training; set_target launches Autopilot itself
project.set_target(
    target="sales",
    mode=dr.AUTOPILOT_MODE.COMPREHENSIVE,
    partitioning_method=time_series_spec,
)
# Wait for completion (up to two hours) and get results
project.wait_for_autopilot(timeout=7200)
models = project.get_models()
When selecting models, consider:
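One point that is easy to get wrong in model selection is metric direction: error metrics such as RMSE should be minimized, while scores such as AUC should be maximized. The sketch below assumes each model exposes a `metrics` dictionary keyed by metric name and then by partition (for example `"validation"`), which matches the shape used by recent SDK releases. `best_model`, `HIGHER_IS_BETTER`, and the sample models are hypothetical and only illustrate the idea.

```python
from types import SimpleNamespace

# Metrics where a larger value is better; anything else is treated as an error
# metric to minimize. This set is illustrative, not exhaustive.
HIGHER_IS_BETTER = {"AUC", "R Squared", "Gini Norm"}


def best_model(models, metric: str, partition: str = "validation"):
    """Pick the model with the best score for `metric` on `partition`."""
    scored = [
        (m.metrics[metric][partition], m)
        for m in models
        if m.metrics.get(metric, {}).get(partition) is not None
    ]
    if not scored:
        raise ValueError(f"No model reports {metric!r} on {partition!r}")
    pick = max if metric in HIGHER_IS_BETTER else min
    return pick(scored, key=lambda pair: pair[0])[1]


# Hand-written stand-ins for trained models (not real results).
sample_models = [
    SimpleNamespace(id="m1", metrics={"RMSE": {"validation": 12.5}}),
    SimpleNamespace(id="m2", metrics={"RMSE": {"validation": 9.8}}),
    SimpleNamespace(id="m3", metrics={"RMSE": {"validation": None}}),
]
print(best_model(sample_models, "RMSE").id)
```

Models without a score on the chosen partition are skipped rather than treated as zero, so a missing metric cannot win or lose the comparison by accident.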
Common errors and solutions:
pip install datarobot
import os
import datarobot as dr
# The endpoint must include the API path; this default is the managed cloud URL
client = dr.Client(
    token=os.getenv("DATAROBOT_API_TOKEN"),
    endpoint=os.getenv("DATAROBOT_ENDPOINT", "https://app.datarobot.com/api/v2"),
)
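A missing token tends to surface later as an authentication error, so it can help to check the environment before building the client. The sketch below is one way to do that. `load_datarobot_config` is a hypothetical helper, and the default endpoint is an assumption that fits the managed cloud service; self-managed installations use their own URL.

```python
import os


def load_datarobot_config(env=None) -> dict:
    """Read client settings from the environment, failing early if the token is missing."""
    env = os.environ if env is None else env
    token = env.get("DATAROBOT_API_TOKEN")
    if not token:
        raise RuntimeError("DATAROBOT_API_TOKEN is not set")
    # Assumed default: the managed cloud API URL, including the /api/v2 path.
    endpoint = env.get("DATAROBOT_ENDPOINT", "https://app.datarobot.com/api/v2")
    return {"token": token, "endpoint": endpoint.rstrip("/")}


# A plain dictionary stands in for os.environ so the sketch runs anywhere.
config = load_datarobot_config({"DATAROBOT_API_TOKEN": "example-token"})
print(config["endpoint"])
```

The returned dictionary has the same keys as the client arguments used above, so it can be passed on as `dr.Client(**config)`.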