Queries, tags, evaluates, and manages MLflow traces from video-research-mcp Gemini API calls using MCP tools such as search_traces, get_trace, and log_feedback. Use it for debugging, performance analysis, feedback logging, and custom scorers.
npx claudepluginhub galbaz1/video-research-mcp
This skill uses the workspace's default tool permissions.
Query, tag, evaluate, and manage MLflow traces captured from video-research-mcp Gemini API calls. Uses `mcp__mlflow-mcp__*` MCP tools — no code writing needed for most operations.
Core principle: Search first, then act. Always verify before destructive operations.
| Task | Tool | Key Params |
|---|---|---|
| Find traces | search_traces | experiment_id, filter_string, extract_fields |
| Get details | get_trace | trace_id, extract_fields |
| Tag trace | set_trace_tag | trace_id, key, value |
| Log score | log_feedback | trace_id, name, value, rationale |
| Run scorers | evaluate_traces | experiment_id, trace_ids, scorers |
| List scorers | list_scorers | — |
CRITICAL — only use fields that actually exist:
| Path | Content | Common mistake |
|---|---|---|
| info.trace_id | Trace identifier | — |
| info.state | Status: OK, ERROR | NOT info.status |
| info.request_time | Timestamp | NOT info.timestamp_ms |
| info.execution_duration_ms | Duration in ms | NOT info.execution_duration |
| info.request_preview | First ~100 chars of request | — |
| info.response_preview | First ~100 chars of response | — |
| info.tags | All tags as object | Use info.tags.* for all |
| data.spans.*.name | Span names | Must include data. prefix |
| data.spans.*.status_code | Span status | NOT data.spans.*.status |
| data.spans.*.inputs | Span inputs | Moderate size |
| data.spans.*.outputs | Span outputs | Moderate size |
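The "Common mistake" column can be checked mechanically before a call is made. The sketch below is a hypothetical helper, not part of the MCP server: `COMMON_MISTAKES` and `check_extract_fields` are names invented for illustration, and the mapping only covers the mistakes listed in the table.

```python
# Hypothetical lookup built from the "Common mistake" column of the table:
# wrong path -> path that actually exists.
COMMON_MISTAKES = {
    "info.status": "info.state",
    "info.timestamp_ms": "info.request_time",
    "info.execution_duration": "info.execution_duration_ms",
    "data.spans.*.status": "data.spans.*.status_code",
}


def check_extract_fields(extract_fields: str) -> list[str]:
    """Return one warning per field path that matches a known mistake."""
    warnings = []
    for raw in extract_fields.split(","):
        field = raw.strip()
        if field in COMMON_MISTAKES:
            warnings.append(f"{field}: use {COMMON_MISTAKES[field]} instead")
        elif field.startswith("spans."):
            # Span paths must carry the data. prefix.
            warnings.append(f"{field}: use data.{field} instead")
    return warnings


if __name__ == "__main__":
    for warning in check_extract_fields("info.status,spans.*.name"):
        print(warning)
```

An empty list means none of the listed mistakes were found; it does not prove that every path exists.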
Always use extract_fields. Video-research-mcp traces contain video URIs, cached content references, full Gemini prompts/responses. A single get_trace without extract_fields can flood your context window.
// BAD - pulls everything
get_trace({ trace_id: "tr-..." })
search_traces({ experiment_id: "2" })
// GOOD - selective fields
get_trace({ trace_id: "tr-...",
extract_fields: "info.*,data.spans.*.name,data.spans.*.status_code" })
search_traces({ experiment_id: "2", max_results: 10,
extract_fields: "info.trace_id,info.state,info.execution_duration_ms" })
Never request data.spans.*.attributes unqualified — it silently drops dotted keys and can contain massive payloads.
CRITICAL: filter_string and extract_fields use DIFFERENT field names:
| Data | filter_string syntax | extract_fields syntax |
|---|---|---|
| Status | status = 'ERROR' | info.state |
| Timestamp | timestamp_ms > 170000... | info.request_time |
| Duration | execution_time_ms > 5000 | info.execution_duration_ms |
| Tags | tags.reviewed = 'true' | info.tags.* |
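One way to keep the two vocabularies apart is to store the pairing in a single place. This is a minimal sketch assuming only the four rows in the table; `FILTER_TO_EXTRACT` and `build_search_args` are hypothetical names, and the returned dictionary simply mirrors the search_traces parameters used elsewhere in this document.

```python
# Pairs copied from the table: filter_string name -> extract_fields path.
FILTER_TO_EXTRACT = {
    "status": "info.state",
    "timestamp_ms": "info.request_time",
    "execution_time_ms": "info.execution_duration_ms",
    "tags": "info.tags.*",
}


def build_search_args(experiment_id, filter_field, operator, value, max_results=20):
    """Build search_traces arguments that filter on one field and extract its counterpart."""
    # "tags.reviewed" is looked up under its "tags" root.
    root = filter_field.split(".")[0]
    if root not in FILTER_TO_EXTRACT:
        raise ValueError(f"unknown filter field: {filter_field}")
    # String values are quoted in filter_string; numbers are not.
    literal = f"'{value}'" if isinstance(value, str) else str(value)
    return {
        "experiment_id": experiment_id,
        "filter_string": f"{filter_field} {operator} {literal}",
        "max_results": max_results,
        "extract_fields": f"info.trace_id,{FILTER_TO_EXTRACT[root]}",
    }


if __name__ == "__main__":
    print(build_search_args("2", "execution_time_ms", ">", 5000))
```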
// Find failed traces
search_traces({ experiment_id: "<id>", filter_string: "status='ERROR'", max_results: 20,
  extract_fields: "info.trace_id,info.state,info.execution_duration_ms,info.request_preview" })
// Inspect one trace
get_trace({ trace_id: "tr-abc123",
  extract_fields: "info.*,data.spans.*.name,data.spans.*.status_code" })
// Tag it for follow-up
set_trace_tag({ trace_id: "tr-abc123", key: "needs_investigation", value: "true" })
// Find slow traces (over 5000 ms)
search_traces({ experiment_id: "<id>", filter_string: "execution_time_ms > 5000",
  max_results: 20, extract_fields: "info.trace_id,info.execution_duration_ms,data.spans.*.name" })
// Log human feedback
log_feedback({ trace_id: "tr-abc123", name: "response_quality", value: 4.5,
  source_type: "human", rationale: "Accurate analysis, good structure" })
// List available scorers first
list_scorers()
// Run evaluation
evaluate_traces({ experiment_id: "<id>", trace_ids: "tr-abc,tr-def",
scorers: "Correctness,RelevanceToQuery" })
// Step 1: Preview
search_traces({ experiment_id: "<id>", filter_string: "timestamp_ms < 1704067200000",
  max_results: 10, extract_fields: "info.trace_id,info.request_time" })
// Step 2: Verify count and IDs, then delete
delete_traces({ experiment_id: "<id>", max_timestamp_millis: 1704067200000 })
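The cutoff in the delete example is a Unix timestamp in milliseconds; 1704067200000 corresponds to 2024-01-01 00:00:00 UTC. A small sketch for deriving such a value from a calendar date, using only the Python standard library (`to_epoch_millis` is a hypothetical helper name):

```python
from datetime import datetime, timezone


def to_epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to Unix epoch milliseconds."""
    if dt.tzinfo is None:
        # Naive datetimes are interpreted in local time, which shifts the cutoff.
        raise ValueError("pass a timezone-aware datetime")
    return int(dt.timestamp() * 1000)


if __name__ == "__main__":
    cutoff = to_epoch_millis(datetime(2024, 1, 1, tzinfo=timezone.utc))
    print(cutoff)  # 1704067200000
```

Using the same value in both the preview filter and max_timestamp_millis keeps the preview and the delete aligned.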
"info.trace_id,info.state" // Minimal overview
"info.trace_id,info.execution_duration_ms,data.spans.*.name" // Performance
"info.*,data.spans.*.name,data.spans.*.status_code" // Full context (safe)
"info.trace_id,info.tags.*" // Tags only
"info.trace_id,info.assessments.*.feedback.value" // Feedback scores
| Setting | Value |
|---|---|
| Tracking server | http://127.0.0.1:5001 (default) |
| Experiment name | video-research-mcp |
| Env var | MLFLOW_TRACKING_URI |
| Autolog captures | All GeminiClient generate/generate_structured calls |
| Trace spans | Gemini API calls with model, thinking level, tokens, cost |
Traces are captured automatically when MLFLOW_TRACKING_URI is set. No code changes needed — mlflow.gemini.autolog() hooks into the google-genai SDK.
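For orientation, the opt-in behaviour described here could look roughly like the sketch below. It is an illustration under stated assumptions, not the server's actual startup code: `resolve_tracing_config` and `enable_tracing` are hypothetical names, and the fallback to the `Default` experiment is assumed from the troubleshooting note in this document. The `mlflow.set_tracking_uri`, `mlflow.set_experiment`, and `mlflow.gemini.autolog` calls are standard MLflow APIs.

```python
import os


def resolve_tracing_config(env):
    """Return (tracking_uri, experiment_name), or None when tracing stays off."""
    uri = env.get("MLFLOW_TRACKING_URI", "").strip()
    if not uri:
        return None
    # Assumption: without MLFLOW_EXPERIMENT_NAME, traces go to "Default".
    experiment = env.get("MLFLOW_EXPERIMENT_NAME", "").strip() or "Default"
    return uri, experiment


def enable_tracing(env=os.environ):
    """Turn on Gemini autologging if a tracking URI is configured."""
    config = resolve_tracing_config(env)
    if config is None:
        return False
    import mlflow  # imported lazily so this module loads without mlflow installed

    uri, experiment = config
    mlflow.set_tracking_uri(uri)
    mlflow.set_experiment(experiment)
    mlflow.gemini.autolog()  # patches the google-genai SDK calls
    return True
```

With `MLFLOW_TRACKING_URI` unset, `enable_tracing()` returns `False` and nothing is patched.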
The MLflow tracking server must be running:
MLFLOW_TRACKING_URI=http://127.0.0.1:5001 mlflow server --port 5001
Then restart Claude Code to reconnect.
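Before restarting, it can help to confirm that the server is actually answering. The sketch below assumes the tracking server exposes MLflow's `/health` endpoint at the configured URI; `tracking_server_is_up` is a hypothetical helper that uses only the standard library.

```python
import urllib.error
import urllib.request


def tracking_server_is_up(tracking_uri: str, timeout: float = 2.0) -> bool:
    """Return True if GET <tracking_uri>/health answers with HTTP 200."""
    health_url = tracking_uri.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print(tracking_server_is_up("http://127.0.0.1:5001"))
```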
Verify that MLFLOW_TRACKING_URI is set in the server environment.
Run search_traces with max_results: 1 across experiment IDs to find where traces are landing.
The default experiment is video-research-mcp. If traces land in Default (experiment 0), the MLFLOW_EXPERIMENT_NAME env var is not set.