From DeepEval
Instruments AI applications (LLM apps, agents, RAG pipelines) with DeepEval's native tracing for span-by-span visibility in Confident AI's Observatory. Supports framework integrations and manual @observe.
npx claudepluginhub confident-ai/deepeval --plugin deepeval
How this skill is triggered (by the user, by Claude, or both):
Slash command
/deepeval:deepeval-tracing
The summary Claude sees in its skill listing, used to decide when to auto-load this skill:
Use this skill to instrument an **AI application** — an LLM app, agent, RAG
Use this skill to instrument an AI application — an LLM app, agent, RAG
pipeline, or chatbot — with DeepEval's native tracing so its execution is
visible span by span in Confident AI's Observatory. The work is: pick a
supported integration when one exists, fall back to manual @observe
otherwise, give each span a meaningful type, and add tags and metadata.
This skill stops at producing well-formed traces. Attaching evaluation metrics
and running evals is the deepeval skill's job.
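
As a rough illustration of the manual fallback, the sketch below types each span and attaches tags and metadata to the trace. It assumes `observe` and `update_current_trace` can be imported from `deepeval.tracing` and accept the keyword arguments shown. The function names, the one-entry corpus, and the tag and metadata values are hypothetical. The no-op fallback is there only so the sketch runs where deepeval is not installed.

```python
try:
    from deepeval.tracing import observe, update_current_trace
except ImportError:
    # No-op stand-ins so this sketch runs without deepeval installed.
    def observe(_func=None, **_kwargs):
        def decorate(func):
            return func
        return decorate(_func) if callable(_func) else decorate

    def update_current_trace(**_kwargs):
        return None


@observe(type="retriever")
def retrieve(query: str) -> list[str]:
    # Hypothetical stand-in for a vector search call.
    corpus = {"refund": "Refunds are issued within 5 days."}
    return [text for key, text in corpus.items() if key in query.lower()]


@observe(type="llm")
def generate(query: str, context: list[str]) -> str:
    # Hypothetical stand-in for a model call.
    return f"Answer to {query!r} using {len(context)} document(s)."


@observe(type="agent")
def answer(query: str) -> str:
    # Assumed usage: tags and metadata are set on the enclosing trace.
    update_current_trace(tags=["support-bot"], metadata={"env": "dev"})
    return generate(query, retrieve(query))


print(answer("How do I get a refund?"))
```

Calling `answer` is intended to yield one trace with an agent span that contains a retriever span and an llm span.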
Instrument only the AI parts of the system — agent loops and planning, LLM
calls, retrieval / vector search, and tool calls. The span types (llm,
retriever, tool, agent) describe AI components. Do not trace non-AI
software (web servers, CRUD backends, infrastructure). If the target has no
LLM, agent, retrieval, or tool-calling component, this skill does not apply.
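
The scoping rule can be sketched as follows: the request-parsing helper is plain software and stays undecorated, while the agent and its tool call get spans. This is a sketch under assumptions, not the skill's prescribed layout. It assumes `observe` is importable from `deepeval.tracing` and accepts `type`. `parse_request`, `lookup_order`, `support_agent`, and `handle` are hypothetical names.

```python
try:
    from deepeval.tracing import observe
except ImportError:
    # No-op stand-in so this sketch runs without deepeval installed.
    def observe(_func=None, **_kwargs):
        def decorate(func):
            return func
        return decorate(_func) if callable(_func) else decorate


def parse_request(raw: dict) -> str:
    # Non-AI plumbing: deliberately left untraced.
    return str(raw.get("question", "")).strip()


@observe(type="tool")
def lookup_order(order_id: str) -> dict:
    # Hypothetical tool the agent calls.
    return {"order_id": order_id, "status": "shipped"}


@observe(type="agent")
def support_agent(question: str) -> str:
    # The agent span wraps the tool span it triggers.
    order = lookup_order("A-100")
    return f"Order {order['order_id']} is {order['status']}."


def handle(raw: dict) -> str:
    # Entry point: only the AI call inside it produces spans.
    return support_agent(parse_request(raw))


print(handle({"question": " where is my order? "}))
```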
deepeval and deepeval-otel Skills
- This skill (deepeval-tracing): instrument an app with the DeepEval SDK
  (@observe, framework integrations) so traces reach Confident AI.
- deepeval skill: build pytest eval suites (datasets, metrics, traced
  evals, deepeval test run, iteration). It runs evals against an app this
  skill instrumented.
- deepeval-otel skill: instrument with the vendor-neutral OpenTelemetry
  SDK instead of the DeepEval SDK (raw OTLP, including non-Python apps).

The three are complementary. If unsure between this skill and deepeval-otel:
use this one when the app is Python and you want the DeepEval SDK; use
deepeval-otel when you want raw OpenTelemetry or the app is not Python.

- Install the SDK: pip install deepeval.
- Authenticate with deepeval login, or an exported CONFIDENT_API_KEY
  (preferred for CI and non-interactive runs).

1. Read references/integrations.md and the exact integration doc for what was
   detected. Prefer a native integration over manual instrumentation.
2. Where no integration applies, fall back to manual @observe. Read
   references/tracing.md.
3. Give each span a meaningful type (llm, retriever, tool, agent) and
   capture inputs/outputs.
4. Authenticate with deepeval login or CONFIDENT_API_KEY, then verify traces
   appear in the Confident AI Observatory.

- Instrument only AI components: llm, retriever, tool, agent spans.
  Never trace non-AI software.
- Prefer a native integration over manual @observe. Manual tracing is the
  fallback for unsupported frameworks and app-owned wrapper boundaries.
- Evaluation belongs to the deepeval skill; raw OpenTelemetry export belongs
  to deepeval-otel.

| Topic | File |
|---|---|
| Manual instrumentation: @observe, span types, tags, metadata | references/tracing.md |
| Integration selection rule and framework / model / vector-DB doc index | references/integrations.md |
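
For CI and non-interactive runs, where an exported CONFIDENT_API_KEY is preferred, a small preflight check can fail fast before any traced code runs. The sketch below uses only the standard library, and `confident_api_key_present` is a hypothetical helper name. It inspects the environment variable only, so credentials saved by `deepeval login` are not detected.

```python
import os


def confident_api_key_present(env=None) -> bool:
    # Accept an explicit mapping so the check is easy to test.
    env = os.environ if env is None else env
    # Treat a missing, empty, or whitespace-only value as "not set".
    return bool(env.get("CONFIDENT_API_KEY", "").strip())


if __name__ == "__main__":
    if confident_api_key_present():
        print("CONFIDENT_API_KEY is set")
    else:
        print("CONFIDENT_API_KEY is not set; export it or run `deepeval login`")
```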