Skill

mlflow-traces

Queries, tags, evaluates, and manages MLflow traces from video-research-mcp Gemini API calls using MCP tools such as search_traces, get_trace, and log_feedback. Useful for debugging, performance analysis, feedback logging, and custom scorers.

Install

npx claudepluginhub galbaz1/video-research-mcp

Tool Access

This skill uses the workspace's default tool permissions.

Stats

Stars: 18

Forks: 3

Last Commit: Feb 28, 2026

MLflow Trace Management — video-research-mcp

Overview

Query, tag, evaluate, and manage MLflow traces captured from video-research-mcp Gemini API calls. Uses mcp__mlflow-mcp__* MCP tools — no code writing needed for most operations.

Core principle: Search first, then act. Always verify before destructive operations.

Quick Reference

Task	Tool	Key Params
Find traces	`search_traces`	`experiment_id`, `filter_string`, `extract_fields`
Get details	`get_trace`	`trace_id`, `extract_fields`
Tag trace	`set_trace_tag`	`trace_id`, `key`, `value`
Log score	`log_feedback`	`trace_id`, `name`, `value`, `rationale`
Run scorers	`evaluate_traces`	`experiment_id`, `trace_ids`, `scorers`
List scorers	`list_scorers`	—

Canonical Field Paths

CRITICAL — only use fields that actually exist:

Path	Content	Common mistake
`info.trace_id`	Trace identifier	—
`info.state`	Status: OK, ERROR	NOT `info.status`
`info.request_time`	Timestamp	NOT `info.timestamp_ms`
`info.execution_duration_ms`	Duration in ms	NOT `info.execution_duration`
`info.request_preview`	First ~100 chars of request	—
`info.response_preview`	First ~100 chars of response	—
`info.tags`	All tags as object	Use `info.tags.*` for all
`data.spans.*.name`	Span names	Must include `data.` prefix
`data.spans.*.status_code`	Span status	NOT `data.spans.*.status`
`data.spans.*.inputs`	Span inputs	Moderate size
`data.spans.*.outputs`	Span outputs	Moderate size
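
The table above can be turned into a small pre-flight check for an extract_fields string. The following is a minimal sketch in Python, not part of the skill or of the MLflow MCP server: `check_extract_fields` is a hypothetical helper, the `spans.*.name` entry is inferred from the "Must include `data.` prefix" note, and paths outside the table are only flagged for review, since the table may not be exhaustive.

```python
# Canonical paths copied from the table above.
CANONICAL_PATHS = {
    "info.trace_id", "info.state", "info.request_time",
    "info.execution_duration_ms", "info.request_preview",
    "info.response_preview", "info.tags",
    "data.spans.*.name", "data.spans.*.status_code",
    "data.spans.*.inputs", "data.spans.*.outputs",
}

# Wildcards that the skill itself uses in its examples.
ALLOWED_WILDCARDS = {"info.*", "info.tags.*"}

# Commonly mistyped names mapped to the path that actually exists.
COMMON_MISTAKES = {
    "info.status": "info.state",
    "info.timestamp_ms": "info.request_time",
    "info.execution_duration": "info.execution_duration_ms",
    "data.spans.*.status": "data.spans.*.status_code",
    "spans.*.name": "data.spans.*.name",  # inferred: missing "data." prefix
}


def check_extract_fields(extract_fields: str) -> list:
    """Return a list of human-readable problems; empty means no known issue."""
    problems = []
    for path in (part.strip() for part in extract_fields.split(",")):
        if not path:
            continue
        if path in COMMON_MISTAKES:
            problems.append(f"{path}: use {COMMON_MISTAKES[path]}")
        elif path not in CANONICAL_PATHS and path not in ALLOWED_WILDCARDS:
            problems.append(f"{path}: not in the canonical table, verify before use")
    return problems


print(check_extract_fields("info.trace_id,info.status,data.spans.*.status"))
```

Running the check before a tool call catches the renamed fields early, instead of discovering an empty column in the response.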

extract_fields Discipline

Always use extract_fields. video-research-mcp traces contain video URIs, cached content references, and full Gemini prompts and responses. A single get_trace call without extract_fields can flood your context window.

// BAD - pulls everything
get_trace({ trace_id: "tr-..." })
search_traces({ experiment_id: "2" })

// GOOD - selective fields
get_trace({ trace_id: "tr-...",
  extract_fields: "info.*,data.spans.*.name,data.spans.*.status_code" })
search_traces({ experiment_id: "2", max_results: 10,
  extract_fields: "info.trace_id,info.state,info.execution_duration_ms" })

Never request data.spans.*.attributes unqualified. The extraction silently drops attribute keys that contain dots, and the attributes can hold massive payloads.
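
One plausible way dotted keys get lost is a path resolver that splits on every dot. The Python sketch below illustrates that failure mode only; it is an assumption about the general technique, not the MLflow MCP server's actual implementation, and the attribute names in it are hypothetical.

```python
def resolve(path, obj):
    """Naive dot-path lookup: split on '.', then descend one key at a time."""
    current = obj
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # silently gives up, no error raised
        current = current[part]
    return current


# Hypothetical span attributes: one plain key, one key that contains dots.
attributes = {"model": "gemini", "usage.total_tokens": 42}

print(resolve("model", attributes))               # plain key resolves
print(resolve("usage.total_tokens", attributes))  # dotted key is split into "usage" -> "total_tokens" and lost
```

The dotted key exists in the dictionary, but the resolver looks for a nested `usage` object that is not there and returns nothing without raising.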

Filter String vs Extract Fields — DIFFERENT NAMING!

CRITICAL: filter_string and extract_fields use DIFFERENT field names:

Data	`filter_string` syntax	`extract_fields` syntax
Status	`status = 'ERROR'`	`info.state`
Timestamp	`timestamp_ms > 170000...`	`info.request_time`
Duration	`execution_time_ms > 5000`	`info.execution_duration_ms`
Tags	`tags.reviewed = 'true'`	`info.tags.*`
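
The naming split in the table above can be captured as a lookup that rewrites extract_fields-style names accidentally pasted into a filter_string. This is a minimal Python sketch; `fix_filter_string` is a hypothetical helper, and it uses plain substring replacement, so it would also rewrite a matching name inside a quoted value.

```python
# extract_fields name -> filter_string name, taken from the table above.
EXTRACT_TO_FILTER = {
    "info.state": "status",
    "info.request_time": "timestamp_ms",
    "info.execution_duration_ms": "execution_time_ms",
    "info.tags": "tags",
}


def fix_filter_string(filter_string: str) -> str:
    """Replace extract_fields-style names with their filter_string equivalents."""
    fixed = filter_string
    # Longest names first so a short name never rewrites part of a longer one.
    for extract_name in sorted(EXTRACT_TO_FILTER, key=len, reverse=True):
        fixed = fixed.replace(extract_name, EXTRACT_TO_FILTER[extract_name])
    return fixed


print(fix_filter_string("info.execution_duration_ms > 5000"))
print(fix_filter_string("info.tags.reviewed = 'true'"))
```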

Common Workflows

Debug failed traces

search_traces({ experiment_id: "<id>", filter_string: "status='ERROR'", max_results: 20,
  extract_fields: "info.trace_id,info.state,info.execution_duration_ms,info.request_preview" })

get_trace({ trace_id: "tr-abc123",
  extract_fields: "info.*,data.spans.*.name,data.spans.*.status_code" })

set_trace_tag({ trace_id: "tr-abc123", key: "needs_investigation", value: "true" })

Find slow traces

search_traces({ experiment_id: "<id>", filter_string: "execution_time_ms > 5000",
  max_results: 20, extract_fields: "info.trace_id,info.execution_duration_ms,data.spans.*.name" })
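
Once the search returns, the results usually need ranking. The Python sketch below assumes each result is a dictionary nested the same way as the extract_fields paths (`info.trace_id`, `info.execution_duration_ms`); the actual tool output may be shaped differently, and `slowest_traces` and the sample data are hypothetical.

```python
def slowest_traces(traces, top_n=3):
    """Return (trace_id, duration_ms) pairs, slowest first."""

    def duration_ms(trace):
        # Missing or null durations are treated as 0 so they sort last.
        return trace.get("info", {}).get("execution_duration_ms") or 0

    ranked = sorted(traces, key=duration_ms, reverse=True)
    return [(t.get("info", {}).get("trace_id"), duration_ms(t)) for t in ranked[:top_n]]


# Hypothetical search results, shaped after the requested fields.
sample = [
    {"info": {"trace_id": "tr-a", "execution_duration_ms": 5200}},
    {"info": {"trace_id": "tr-b", "execution_duration_ms": 9100}},
    {"info": {"trace_id": "tr-c", "execution_duration_ms": None}},
]

print(slowest_traces(sample, top_n=2))
```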

Log human feedback

log_feedback({ trace_id: "tr-abc123", name: "response_quality", value: 4.5,
  source_type: "human", rationale: "Accurate analysis, good structure" })

Run built-in scorers

// List available scorers first
list_scorers()

// Run evaluation
evaluate_traces({ experiment_id: "<id>", trace_ids: "tr-abc,tr-def",
  scorers: "Correctness,RelevanceToQuery" })

Search before delete

// Step 1: Preview
search_traces({ experiment_id: "<id>", filter_string: "timestamp_ms < 1704067200000",
  max_results: 10, extract_fields: "info.trace_id,info.request_time" })

// Step 2: Verify count and IDs, then delete
delete_traces({ experiment_id: "<id>", max_timestamp_millis: 1704067200000 })
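
The cutoff used in both steps is a Unix timestamp in milliseconds; 1704067200000 corresponds to 2024-01-01 00:00:00 UTC. A short Python sketch for deriving and double-checking such a value before a delete (the helper names are illustrative, not part of any tool):

```python
from datetime import datetime, timezone


def cutoff_millis(year, month, day):
    """Midnight UTC on the given date, as epoch milliseconds."""
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    return int(moment.timestamp() * 1000)


def millis_to_iso(millis):
    """Convert epoch milliseconds back to an ISO 8601 UTC string for review."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc).isoformat()


cutoff = cutoff_millis(2024, 1, 1)
print(cutoff)                 # 1704067200000
print(millis_to_iso(cutoff))  # 2024-01-01T00:00:00+00:00
```

Converting the number back to a readable date is a cheap sanity check that the same cutoff is passed to both the preview search and the delete.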

Field Selection Recipes

"info.trace_id,info.state"                                    // Minimal overview
"info.trace_id,info.execution_duration_ms,data.spans.*.name"  // Performance
"info.*,data.spans.*.name,data.spans.*.status_code"           // Full context (safe)
"info.trace_id,info.tags.*"                                   // Tags only
"info.trace_id,info.assessments.*.feedback.value"             // Feedback scores

video-research-mcp Context

Setting	Value
Tracking server	`http://127.0.0.1:5001` (default)
Experiment name	`video-research-mcp`
Env var	`MLFLOW_TRACKING_URI`
Autolog captures	All `GeminiClient` generate/generate_structured calls
Trace spans	Gemini API calls with model, thinking level, tokens, cost

Traces are captured automatically when MLFLOW_TRACKING_URI is set. No code changes needed — mlflow.gemini.autolog() hooks into the google-genai SDK.

Troubleshooting

MCP tools not available / connection refused

The MLflow tracking server must be running:

MLFLOW_TRACKING_URI=http://127.0.0.1:5001 mlflow server --port 5001

Then restart Claude Code to reconnect.
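
Before restarting, it can help to confirm that something is actually listening on the tracking port. The Python sketch below only tests TCP reachability with the standard library; it does not verify that the listener is an MLflow server, and `tracking_server_reachable` is a hypothetical helper name.

```python
import socket
from urllib.parse import urlparse


def tracking_server_reachable(uri="http://127.0.0.1:5001", timeout=2.0):
    """Return True if a TCP connection to the URI's host and port succeeds."""
    parsed = urlparse(uri)
    host = parsed.hostname or "127.0.0.1"
    # Fall back to the scheme's default port when the URI omits one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(tracking_server_reachable())
```

A result of False points at the server process or port, while True with missing tools points at the MCP client connection instead.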

No traces found

Check MLFLOW_TRACKING_URI is set in the server environment
Verify the experiment name: search with max_results: 1 across experiment IDs
Confirm traces are being captured: run a tool call, then search again

Wrong experiment

The default experiment is video-research-mcp. If traces land in Default (experiment 0), the MLFLOW_EXPERIMENT_NAME env var is not set.
