Skill

deepeval-tracing

Instruments AI applications (LLM apps, agents, RAG pipelines, chatbots) with DeepEval's native tracing for span-by-span visibility in Confident AI's Observatory. Supports framework integrations and manual @observe instrumentation.

Python

Popularity

Stars

16,733

Forks

1,634

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/deepeval:deepeval-tracing

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use this skill to instrument an **AI application** — an LLM app, agent, RAG

Supporting Files

LICENSE
references/integrations.md
references/tracing.md

SKILL.md

100 lines · ~1.3k tokens

Stats

Language: Python

Stars: 16,733

Forks: 1,634

Maintenance: Excellent

Last Commit: Jul 9, 2026


DeepEval Tracing

Use this skill to instrument an AI application — an LLM app, agent, RAG pipeline, or chatbot — with DeepEval's native tracing so its execution is visible span by span in Confident AI's Observatory. The work is: pick a supported integration when one exists, fall back to manual @observe otherwise, give each span a meaningful type, and add tags and metadata.

This skill stops at producing well-formed traces. Attaching evaluation metrics and running evals is the deepeval skill's job.

Scope: AI Applications Only

Instrument only the AI parts of the system — agent loops and planning, LLM calls, retrieval / vector search, and tool calls. The span types (llm, retriever, tool, agent) describe AI components. Do not trace non-AI software (web servers, CRUD backends, infrastructure). If the target has no LLM, agent, retrieval, or tool-calling component, this skill does not apply.
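One rough way to apply the scope rule above is to scan the project's dependencies for packages that suggest an AI component. The sketch below is illustrative only: `AI_DEPENDENCY_HINTS`, `detect_ai_components`, and `skill_applies` are hypothetical names, and the package-to-component mapping is an assumption, not an official DeepEval support matrix.

```python
# Hypothetical mapping from dependency names to the AI component they suggest.
# The package list is illustrative, not an official DeepEval support matrix.
AI_DEPENDENCY_HINTS = {
    "openai": "llm",
    "anthropic": "llm",
    "langchain": "agent",
    "llama-index": "retriever",
    "chromadb": "retriever",
}


def detect_ai_components(requirements: list[str]) -> set[str]:
    """Return the AI component kinds hinted at by a list of requirement lines."""
    found = set()
    for line in requirements:
        # Drop trailing comments, then version specifiers such as "openai>=1.0".
        name = line.split("#", 1)[0].strip().lower()
        for sep in ("==", ">=", "<=", "~=", ">", "<", "["):
            name = name.split(sep, 1)[0]
        name = name.strip()
        if name in AI_DEPENDENCY_HINTS:
            found.add(AI_DEPENDENCY_HINTS[name])
    return found


def skill_applies(requirements: list[str]) -> bool:
    """The skill applies only when at least one AI component is present."""
    return bool(detect_ai_components(requirements))
```

A dependency scan is only a first signal. An app may call a model over plain HTTP without any of these packages, so reading the code remains the real check.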

When to Use vs the `deepeval` and `deepeval-otel` Skills

- This skill (`deepeval-tracing`): instrument an app with the DeepEval SDK (`@observe`, framework integrations) so traces reach Confident AI.
- `deepeval` skill: build pytest eval suites (datasets, metrics, traced evals, `deepeval test run`, iteration). It runs evals against an app this skill instrumented.
- `deepeval-otel` skill: instrument with the vendor-neutral OpenTelemetry SDK instead of the DeepEval SDK (raw OTLP, including non-Python apps).

The three are complementary. If unsure between this skill and deepeval-otel: use this one when the app is Python and you want the DeepEval SDK; use deepeval-otel when you want raw OpenTelemetry or the app is not Python.
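The selection rule above can be written down as a small decision function. This is only a sketch of the rule as stated here: `choose_skill` and the `goal` labels are hypothetical and not part of any DeepEval API.

```python
def choose_skill(goal: str, is_python: bool = True, wants_raw_otel: bool = False) -> str:
    """Pick a skill name from the goal and the app's constraints.

    `goal` is a hypothetical label: "evaluate" means building and running
    eval suites; anything else is treated as an instrumentation request.
    """
    if goal == "evaluate":
        return "deepeval"
    # Raw OpenTelemetry, or any non-Python app, goes to the OTel skill.
    if wants_raw_otel or not is_python:
        return "deepeval-otel"
    return "deepeval-tracing"
```

For example, `choose_skill("instrument", is_python=False)` returns `"deepeval-otel"`, matching the rule that non-Python apps use raw OpenTelemetry.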

Prerequisites

- A Python AI application with DeepEval installed (`pip install deepeval`).
- For traces to reach Confident AI: run `deepeval login`, or export `CONFIDENT_API_KEY` (preferred for CI and non-interactive runs).
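For non-interactive runs, a quick pre-flight check can confirm that the key is exported before any tracing code executes. The helper below is a minimal sketch; `confident_key_configured` is a hypothetical name, and it inspects only the environment variable, so a `False` result does not rule out credentials stored by `deepeval login`.

```python
import os


def confident_key_configured(env=None) -> bool:
    """Return True when CONFIDENT_API_KEY is set to a non-empty value.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise in tests. The key itself is never printed.
    """
    env = os.environ if env is None else env
    return bool(env.get("CONFIDENT_API_KEY", "").strip())
```

In CI, this could gate the job: fail early with a clear message when the function returns `False` instead of discovering later that no traces arrived.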

Workflow

1. Confirm the target is an AI application (it has LLM calls, an agent loop, retrieval, or tool calls). If it has none of these, stop: this skill does not apply.
2. Detect the framework, model provider, agent SDK, and vector database in use.
3. Read `references/integrations.md` and the exact integration doc for what was detected. Prefer a native integration over manual instrumentation.
4. If no native integration fits, instrument manually with `@observe`. Read `references/tracing.md`.
5. Give each span a meaningful type (`llm`, `retriever`, `tool`, `agent`) and capture inputs/outputs.
6. Add trace-level tags and metadata where they help diagnose failure patterns. Never trace secrets, credentials, or raw sensitive data.
7. Confirm `deepeval login` or `CONFIDENT_API_KEY`, then verify traces appear in the Confident AI Observatory.
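The manual fallback in the workflow might look like the sketch below. It assumes the `observe` decorator is importable from `deepeval.tracing` and accepts a `type` keyword, as the span types named in this skill suggest; check `references/tracing.md` for the exact signature. The functions `retrieve`, `generate`, and `answer` are hypothetical placeholders, and the no-op fallback exists only so the sketch runs where deepeval is not installed.

```python
try:
    from deepeval.tracing import observe
except ImportError:
    # Fallback so this sketch still runs where deepeval is not installed:
    # a no-op decorator factory that leaves the function unchanged.
    def observe(*_args, **_kwargs):
        def decorator(fn):
            return fn
        return decorator


@observe(type="retriever")
def retrieve(query: str) -> list[str]:
    # Placeholder for a real vector search.
    return [f"doc about {query}"]


@observe(type="llm")
def generate(query: str, context: list[str]) -> str:
    # Placeholder for a real model call.
    return f"Answer to '{query}' using {len(context)} document(s)"


@observe(type="agent")
def answer(query: str) -> str:
    # With deepeval installed, the nested calls are expected to show up
    # as child spans of this agent span.
    context = retrieve(query)
    return generate(query, context)
```

Span names are left to default to the function names here, and only the span type is set explicitly.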

Core Principles

- Instrument AI components only: `llm`, `retriever`, `tool`, and `agent` spans. Never trace non-AI software.
- Prefer a supported integration over manual `@observe`. Manual tracing is the fallback for unsupported frameworks and app-owned wrapper boundaries.
- Read the exact integration doc before writing tracing code.
- Give spans meaningful types; let names default to function names unless there is a strong reason to override.
- Never trace secrets, credentials, API keys, or raw sensitive user data.
- Producing traces is the scope. Attaching metrics and running evals belong to the `deepeval` skill; raw OpenTelemetry export belongs to `deepeval-otel`.
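To support the rule about secrets, metadata can be passed through a redaction step before it is attached to a trace. The helper below is a minimal sketch in plain Python: `redact_metadata` and `SENSITIVE_MARKERS` are hypothetical names, not DeepEval features, and the marker list is an assumption that a real project would need to extend.

```python
# Substrings that mark a metadata field name as sensitive (illustrative list).
SENSITIVE_MARKERS = ("key", "token", "secret", "password", "credential")


def redact_metadata(metadata: dict) -> dict:
    """Return a copy of `metadata` with sensitive-looking values masked."""
    cleaned = {}
    for name, value in metadata.items():
        # Match on the field name only; values are not inspected.
        if any(marker in name.lower() for marker in SENSITIVE_MARKERS):
            cleaned[name] = "[REDACTED]"
        else:
            cleaned[name] = value
    return cleaned
```

Substring matching errs toward over-redaction (a field named `keyboard_layout` would also be masked), which is usually the safer failure mode. It checks only top-level field names, so secrets embedded in values or nested structures still need separate handling.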

References

| Topic | File |
|---|---|
| Manual instrumentation: `@observe`, span types, tags, metadata | `references/tracing.md` |
| Integration selection rule and framework / model / vector-DB doc index | `references/integrations.md` |
