Instruments Python AI agents (LangChain, LangGraph, CrewAI, LlamaIndex, Google ADK) with OpenTelemetry to send traces, logs, metrics to DataRobot for monitoring.
```shell
npx claudepluginhub datarobot-oss/datarobot-agent-skills --plugin datarobot-agent-skills
```

This skill uses the workspace's default tool permissions.
This skill helps you instrument any AI agent — regardless of framework or deployment environment — to send OpenTelemetry telemetry (traces, logs, metrics) to DataRobot. It also creates a shell deployment in DataRobot as the telemetry routing target.
Most common use case: Instrument an existing agent project for DataRobot monitoring
Example: "Instrument my agent in ./my_agent for DataRobot monitoring"
Use this skill when you need to instrument an AI agent so that its traces, logs, and metrics flow to DataRobot. Framework detection and instrumentation strategies:
| Framework | Detection | OTel Strategy |
|---|---|---|
| Google ADK | google-adk in deps or google.adk in imports | Lazy trace injection via callback (ADK overwrites TracerProvider) |
| LangChain / LangGraph | langchain or langgraph in deps/imports | Auto-instrumentor + standard setup |
| CrewAI | crewai in deps/imports | Auto-instrumentor + standard setup |
| LlamaIndex | llama-index or llama_index in deps/imports | Auto-instrumentor + standard setup |
| PydanticAI | pydantic-ai or pydantic_ai in deps/imports | Standard setup (respects global TracerProvider) |
| Generic Python | None of the above detected | Manual span instrumentation |
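The detection pass in the table above can be sketched as a simple dependency scan. This is an illustrative approximation only, not the skill's actual detection code; `FRAMEWORK_MARKERS` and `detect_framework` are made-up names:

```python
# Map dependency-name markers to the frameworks in the table above.
# Order matters: more specific markers are checked first.
FRAMEWORK_MARKERS = {
    "google-adk": "Google ADK",
    "langgraph": "LangChain / LangGraph",
    "langchain": "LangChain / LangGraph",
    "crewai": "CrewAI",
    "llama-index": "LlamaIndex",
    "llama_index": "LlamaIndex",
    "pydantic-ai": "PydanticAI",
    "pydantic_ai": "PydanticAI",
}

def detect_framework(requirements_text: str) -> str:
    """Return the framework name for a requirements.txt-style dependency list."""
    deps = [line.split("==")[0].strip().lower() for line in requirements_text.splitlines()]
    for marker, framework in FRAMEWORK_MARKERS.items():
        if any(marker in dep for dep in deps):
            return framework
    return "Generic Python"

print(detect_framework("langgraph==0.2.0\nrequests"))  # LangChain / LangGraph
print(detect_framework("requests\nnumpy"))             # Generic Python
```

A real implementation would also parse pyproject.toml and scan imports, as described in the workflow below.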
Follow these steps in order. Present the plan to the user and wait for approval before executing.
Analyze the project:
- Locate the dependency file (requirements.txt, pyproject.toml, setup.py, poetry.lock, or uv.lock)
- Check for existing OTel setup (opentelemetry imports, existing TracerProvider/LoggerProvider/MeterProvider configuration)
- Identify the framework using the reference files in the frameworks/ directory next to this SKILL.md:
  - frameworks/google-adk.md
  - frameworks/langchain-langgraph.md
  - frameworks/crewai.md
  - frameworks/llamaindex.md
  - frameworks/pydantic-ai.md
  - frameworks/generic-python.md

Check credentials and environment:
- Verify the DATAROBOT_API_TOKEN env var is set. If not, ask the user to provide it.
- Verify the DATAROBOT_ENDPOINT env var is set. If not, ask the user (default: https://app.datarobot.com/api/v2).
- Derive DATAROBOT_OTEL_ENDPOINT automatically: if DATAROBOT_ENDPOINT ends with /api/v2, strip it and append /otel (e.g., https://app.datarobot.com/api/v2 → https://app.datarobot.com/otel).
- Verify the datarobot Python SDK is available. If not, install it: pip install datarobot.

Security note: Never echo API tokens or .env file contents into chat transcripts or logs. Use environment variables or CI secrets for credential management. If credentials are accidentally exposed, rotate them immediately.
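The DATAROBOT_OTEL_ENDPOINT derivation rule can also be expressed directly in shell, if the user prefers to set the variable themselves (the endpoint value here is just the documented default):

```shell
export DATAROBOT_ENDPOINT="https://app.datarobot.com/api/v2"
# Strip a trailing /api/v2 suffix, then append /otel
export DATAROBOT_OTEL_ENDPOINT="${DATAROBOT_ENDPOINT%/api/v2}/otel"
echo "$DATAROBOT_OTEL_ENDPOINT"   # https://app.datarobot.com/otel
```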
Tell the user what you detected and present the changes you will make:
- New files to create: dr_otel_config.py, and optionally dr_agent_metrics.py for frameworks with custom metrics

Wait for user approval before executing. If the user has already given explicit consent to implement or deploy, that counts as approval — no need to re-ask.
Add dependencies to the project's dependency file:
- opentelemetry-sdk
- opentelemetry-api
- opentelemetry-exporter-otlp-proto-http

Generate dr_otel_config.py using the generic pattern below, adapted per the framework reference file.
Wire into agent entrypoint: Add import and call to configure_otel() at startup. Follow the framework reference file for specific wiring instructions (auto-instrumentors, callbacks, etc.).
Generate dr_agent_metrics.py if the framework reference file specifies custom metrics callbacks.
Create shell deployment: Run the helper script:
```shell
python <skill_scripts_dir>/create_shell_deployment.py \
  --name "<project_name> Monitoring" \
  --description "OTel telemetry sink for <framework> agent"
```
The script automatically enables prediction row storage and automatic association ID generation on the deployment.
Report results: Show the deployment ID and a copy-paste env var block for the user's runtime:
```shell
export DATAROBOT_API_TOKEN="<token>"
export DATAROBOT_ENTITY_ID="deployment-<id>"
export DATAROBOT_OTEL_ENDPOINT="<otel_endpoint>"
```
Optionally run the verification script:
```shell
DATAROBOT_API_TOKEN=<token> \
DATAROBOT_ENTITY_ID=deployment-<id> \
DATAROBOT_OTEL_ENDPOINT=<endpoint>/otel \
python <skill_scripts_dir>/verify_otel_connection.py
```
Provide the user with the env vars to set in their deployment environment:
- DATAROBOT_API_TOKEN — DataRobot API key
- DATAROBOT_ENTITY_ID — deployment-<id> (from shell deployment creation)
- DATAROBOT_OTEL_ENDPOINT — {DATAROBOT_ENDPOINT}/otel

Explain what they'll see in DataRobot: traces in the tracing UI (Data Exploration > Traces), plus exported logs and metrics associated with the shell deployment.
This is the core configure_otel() function to generate for every project. Framework-specific files layer additional setup on top.
Critical rules:
- Pass endpoint= and headers= directly to exporters — NEVER use OTEL_EXPORTER_OTLP_* env vars (some frameworks detect these and create conflicting providers)
- Use SimpleSpanProcessor (not Batch) to avoid flush-before-shutdown issues

Generated dr_otel_config.py template:
"""DataRobot OpenTelemetry configuration.
Configures traces, logs, and metrics export to DataRobot's OTel endpoint.
Call configure_otel() at application startup, before any agent code runs.
Required env vars at runtime:
DATAROBOT_API_TOKEN - DataRobot API key
DATAROBOT_ENTITY_ID - deployment-<deployment_id>
DATAROBOT_OTEL_ENDPOINT - https://<your-instance>.datarobot.com/otel
"""
import logging
import os
from opentelemetry import metrics, trace
from opentelemetry._logs import set_logger_provider
from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import SimpleLogRecordProcessor
from opentelemetry.sdk.metrics import Counter, Histogram, MeterProvider, ObservableCounter
from opentelemetry.sdk.metrics.export import (
AggregationTemporality,
PeriodicExportingMetricReader,
)
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
def _build_dr_headers():
"""Build DataRobot authentication headers for OTel exporters."""
api_key = os.environ.get("DATAROBOT_API_TOKEN", "")
entity_id = os.environ.get("DATAROBOT_ENTITY_ID", "")
if not api_key:
logging.warning("DATAROBOT_API_TOKEN not set — OTel export to DataRobot will fail")
if not entity_id:
logging.warning("DATAROBOT_ENTITY_ID not set — OTel export to DataRobot will fail")
return {
"X-DataRobot-Entity-Id": entity_id,
"X-DataRobot-Api-Key": api_key,
}
def _get_endpoint():
"""Get DataRobot OTel endpoint, auto-deriving from DATAROBOT_ENDPOINT if needed."""
endpoint = os.environ.get("DATAROBOT_OTEL_ENDPOINT", "")
if endpoint:
return endpoint.rstrip("/")
# Auto-derive from DATAROBOT_ENDPOINT (e.g. https://app.datarobot.com/api/v2 → .../otel)
api_endpoint = os.environ.get("DATAROBOT_ENDPOINT", "")
if api_endpoint:
base = api_endpoint.rstrip("/")
if base.endswith("/api/v2"):
base = base[: -len("/api/v2")]
return f"{base}/otel"
return ""
def configure_otel():
"""Configure OpenTelemetry to export traces, logs, and metrics to DataRobot.
This function is additive — it adds DataRobot as an additional exporter
alongside any existing OTel setup. It does not replace existing providers.
"""
headers = _build_dr_headers()
endpoint = _get_endpoint()
if not endpoint:
logging.warning("DATAROBOT_OTEL_ENDPOINT not set — skipping OTel configuration")
return
resource = Resource.create()
# --- Traces ---
dr_span_processor = SimpleSpanProcessor(
OTLPSpanExporter(endpoint=f"{endpoint}/v1/traces", headers=headers)
)
existing_provider = trace.get_tracer_provider()
if hasattr(existing_provider, "add_span_processor"):
existing_provider.add_span_processor(dr_span_processor)
else:
provider = TracerProvider(resource=resource)
provider.add_span_processor(dr_span_processor)
trace.set_tracer_provider(provider)
# --- Logs ---
log_exporter = OTLPLogExporter(endpoint=f"{endpoint}/v1/logs", headers=headers)
logger_provider = LoggerProvider(resource=resource)
set_logger_provider(logger_provider)
logger_provider.add_log_record_processor(SimpleLogRecordProcessor(log_exporter))
handler = LoggingHandler(level=logging.NOTSET, logger_provider=logger_provider)
# Custom formatter ensures OTLP log bodies are never empty
# (some libraries emit records with empty getMessage())
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logging.getLogger().addHandler(handler)
# --- Metrics ---
preferred_temporality = {
Counter: AggregationTemporality.DELTA,
Histogram: AggregationTemporality.DELTA,
ObservableCounter: AggregationTemporality.DELTA,
}
metric_exporter = OTLPMetricExporter(
endpoint=f"{endpoint}/v1/metrics",
headers=headers,
preferred_temporality=preferred_temporality,
)
meter_provider = MeterProvider(
metric_readers=[PeriodicExportingMetricReader(metric_exporter)],
resource=resource,
)
metrics.set_meter_provider(meter_provider)
OTel provider initialization order warning:
Some frameworks override the global TracerProvider at startup (notably Google ADK). When this happens, the standard trace setup above will lose the DataRobot exporter. The framework reference files document which frameworks have this issue and provide alternative patterns (e.g., lazy injection via callbacks). Always check the framework reference file.
Existing OTel setups (e.g., exporters to Jaeger, Datadog, Google Cloud Trace) are preserved when possible — DataRobot is added alongside, not replacing. However, note that OTel has a single global provider per signal. Whoever calls set_tracer_provider() last wins. The additive pattern above avoids calling set_tracer_provider() when a provider already exists, instead adding a processor to the existing one.
DataRobot's tracing UI (Data Exploration > Traces) maps specific span attributes to table columns. Using the correct attribute names is critical for data to appear in the dashboard.
| Tracing Table Column | Span Attribute | Aggregation Rule |
|---|---|---|
| Prompt | gen_ai.prompt | First span with this attribute wins |
| Completion | gen_ai.completion | Last span with this attribute wins |
| Tools | tool_name | Lists all unique values across all spans in the trace |
| Cost | datarobot.moderation.cost | Summed across all spans in the trace |
Important: DataRobot looks for tool_name (underscore), NOT tool.name (dot). Some frameworks (e.g., LangGraph) do not set tool_name by default — you must add it manually as a span attribute inside each tool call.
| Attribute | Description | Example |
|---|---|---|
| gen_ai.prompt | User input / prompt text | "Analyze policy XYZ" |
| gen_ai.completion | Model output / response | "Policy matched..." |
| gen_ai.request.model | Model used for the call | "gpt-4o" |
| gen_ai.usage.prompt_tokens | Input token count | 150 |
| gen_ai.usage.completion_tokens | Output token count | 320 |
| tool_name | Name of tool/function called (required for Tools column) | "search_database" |
| tool.parameters | Tool call parameters (JSON string) | '{"query": "..."}' |
| datarobot.moderation.cost | Cost of this span (summed for trace total) | 0.0023 |
Creates a shell deployment in DataRobot as a telemetry routing target.
```shell
python <scripts_dir>/create_shell_deployment.py \
  --name "My Agent Monitoring" \
  --description "OTel telemetry sink for my agent"
```
Requires env vars: DATAROBOT_API_TOKEN, DATAROBOT_ENDPOINT
Returns JSON:
```json
{
  "deployment_id": "abc123",
  "entity_id": "deployment-abc123",
  "otel_endpoint": "https://app.datarobot.com/otel"
}
```
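If you want to turn that JSON output into the env var block used in the report step, a small helper could look like this (an illustrative sketch, not part of the skill's scripts; the JSON literal mirrors the example above):

```python
import json

# Stand-in for the captured stdout of create_shell_deployment.py
output = '''{
    "deployment_id": "abc123",
    "entity_id": "deployment-abc123",
    "otel_endpoint": "https://app.datarobot.com/otel"
}'''

info = json.loads(output)
# Emit the copy-paste export block for the user's runtime
print(f'export DATAROBOT_ENTITY_ID="{info["entity_id"]}"')
print(f'export DATAROBOT_OTEL_ENDPOINT="{info["otel_endpoint"]}"')
```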
Sends test telemetry to verify the OTel pipeline is working.
```shell
python <scripts_dir>/verify_otel_connection.py
```
Requires env vars: DATAROBOT_API_TOKEN, DATAROBOT_ENTITY_ID, DATAROBOT_OTEL_ENDPOINT
Returns JSON:
```json
{
  "status": "success",
  "traces": "sent",
  "logs": "sent",
  "metrics": "sent"
}
```
Required for instrumentation (added to user's project):
```
opentelemetry-sdk
opentelemetry-api
opentelemetry-exporter-otlp-proto-http
```
Required for shell deployment creation (available in the skill's script environment):
datarobot
- Call configure_otel() before any agent/framework initialization — some frameworks capture the provider at import time
- Never set OTEL_EXPORTER_OTLP_* env vars — pass endpoint and headers directly to exporters to avoid conflicts
- Prefer SimpleSpanProcessor over BatchSpanProcessor — avoids flush issues on short-lived processes

Common errors and solutions:
| Error | Cause | Solution |
|---|---|---|
| Traces not appearing in DataRobot | Framework overwrites TracerProvider | Use lazy injection pattern (see framework reference) |
| 401 Unauthorized from OTel endpoint | Invalid API token | Verify DATAROBOT_API_TOKEN is correct |
| 404 from OTel endpoint | Wrong endpoint URL | Ensure DATAROBOT_OTEL_ENDPOINT ends with /otel |
| Metrics not appearing | OTEL_EXPORTER_OTLP_* env vars set | Remove env vars, use direct exporter config |
| DATAROBOT_ENTITY_ID format error | Missing deployment- prefix | Must be deployment-<id>, not just <id> |
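The last error in the table can be caught before any telemetry is sent with a simple format check. This is a hypothetical helper, not part of the skill's scripts:

```python
def is_valid_entity_id(value: str) -> bool:
    """Check the DATAROBOT_ENTITY_ID format: must be "deployment-<id>", not a bare id."""
    prefix = "deployment-"
    return value.startswith(prefix) and len(value) > len(prefix)

print(is_valid_entity_id("deployment-abc123"))  # True
print(is_valid_entity_id("abc123"))             # False
```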